Request For Comment: Vpype2 - Next Gen Primitive #588
Thanks for the extensive, thought-provoking write-up! First, I agree with the premises:
Objectives

Now, IMO the "v2 data model" should be designed against a clear set of objectives in terms of:

Before getting to your proposal, let me brain-dump what I currently have in mind for the above. This is obviously TBD and subject to evolve, but I guess it can serve as a starting point.

Feature set

Must have:
Not sure:
Excluded:
TODO:
Expected performance

Though they would be useful, I don't have hard numbers/benchmarks for this. Here is my subjective take for the time being:
API

vpype is both a library (the …)

In theory, any data model can possibly be wrapped inside a nice API. In practice, the required implementation/maintenance effort will vary a lot depending on the chosen data model.

Side note: from my experience with adding some metadata support to vpype, the question of metadata handling and its impact on both the API and the implementation of actual, high-level features (i.e. vpype commands) is probably greater and trickier than the purely geometric aspects. Proper metadata handling must be well thought-out and properly factored into the data model.

My comments on the "Numpy Nx5" proposal

In light of the feature set, your proposal covers what I think would be needed for vpype 2. As I implied earlier, I reject the idea of a custom implementation of advanced computational geometry features. I'd rather use Shapely for that.

In terms of performance, you make a strong point that this is probably the most performant implementation that pure Python can get. However, is this warranted? I'm not sure. In my experience, most real-world files contain paths that are either polylines, or compound paths with very few primitives (e.g. a rounded rect: four lines and four arcs). In this context, the naive use of …

As you might expect by now, my biggest issue is the work that would be required to wrap this into a proper API, compared to the "naive" implementation, which could be made cleanly enough that no layering would be needed API-wise.

Misc. specific issues I'm seeing with this format:
Going forward

Building a reference implementation

I don't see myself going for a radical choice like your proposal without building a reference prototype with "regular", Pythonic code to explore the baseline in terms of performance and API. I've been thinking a bit about this lately, and might make strides in that direction part of my new year resolutions :) This should enable proper benchmarking and API exploration.

Improving on your proposal

As I suggested already, I think your proposal could be much improved by using structured arrays. In particular, the …

vpype2 sandbox

I'm going to make a sandbox branch with a …
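For illustration, such a structured dtype could look something like the following sketch (field names and layout are purely illustrative assumptions, not a settled design):

```python
import numpy as np

# One possible structured dtype mirroring the Nx5 rows, keeping the same
# 80-byte record size while making the fields self-describing.
seg_dtype = np.dtype([
    ("start", np.complex128),
    ("control1", np.complex128),
    ("seg_type", np.uint32),     # would replace info.real
    ("settings", np.uint32),     # would replace info.imag
    ("reserved", np.float64),    # padding to preserve the 80-byte row
    ("control2", np.complex128),
    ("end", np.complex128),
])

segments = np.zeros(4, dtype=seg_dtype)
segments["end"] = 10 + 5j  # column-style access per field
lengths = np.abs(segments["end"] - segments["start"])
```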
SVG Arc is a dumpster fire.
I would advise against supporting the SVG arc primitive there. It's an absolute dumpster fire of a primitive. You can easily simulate the arc with quads, cubics, or circular arcs (which are really nice), unlike the weird parameters of the SVG arc. You can do circular arcs with just a start, an end, and a control point. Including that monstrosity basically makes all math impossible. I would highly recommend just removing them and replacing them with some circular arcs or with some bezier curves. The circular arc code is easy, supported by gcode controllers and a lot more graphics packages for display. Pick how many circular arcs you want to use, then subdivide whatever curve you want to simulate, and pick three points that occur on the curve. Since you can always make a circular arc with …

SvgPathtools has some fantastic math for various primitives, and it always gets really fuzzy for arcs.
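For reference, the three-point construction is a few lines of complex arithmetic (the helper name is mine; the formula is the standard circumcenter in complex form):

```python
def circle_through(z1, z2, z3, eps=1e-12):
    """Center and radius of the circle through three complex points.

    Returns (None, inf) when the points are collinear, i.e. the arc is
    degenerate and a caller would emit a plain line instead.
    """
    num = (abs(z1) ** 2 * (z2 - z3)
           + abs(z2) ** 2 * (z3 - z1)
           + abs(z3) ** 2 * (z1 - z2))
    den = (z1.conjugate() * (z2 - z3)
           + z2.conjugate() * (z3 - z1)
           + z3.conjugate() * (z1 - z2))
    if abs(den) < eps:
        return None, float("inf")
    center = num / den
    return center, abs(z1 - center)

# e.g. the arc through 0, 1+1j, 2 lies on the circle centered at 1, radius 1:
# circle_through(0j, 1 + 1j, 2 + 0j)  ->  ((1+0j), 1.0)
```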
Gcode export

That's not necessarily directly related to line collections. I have some generic writer stuff for embroidery; it's just that there are a lot of commands and it has some notably complex parts. https://github.com/EmbroidePy/pyembroidery/blob/master/pyembroidery/GenericWriter.py

Segment level metadata
ArcGIS does some things with this sort of stuff; they have associated dictionaries for settings or something, where the actual objects can have a bunch of data. You don't need a different layer and could even invent layers on the fly. You would be able to set a series of points and give them all an index number to draw a connect-the-dots.

Numpy Nx5
In theory the code should work for data arranged in this way regardless of how it's created. And for speed there's always a lot of utility in making something fairly uniform out of primitives.

Structured Arrays

I tried to use this stuff before and it didn't really turn out to be extremely helpful, and seemed kind of slow accordingly. You lose some very useful commands. You also lose the somewhat easy ability to convert to lists of pure python primitives of the same type. And if you're at the point of using a structured array, you'd maybe be better off writing normal C++ classes and hooking them in, which is basically the core of Shapely in GEOS, or whatever.

Addendum / Shapely doesn't do curves

Part of the hope here is to make something really easy to use that fits generally into a lot of different use cases. I've done work in lasers, embroidery, etc., and I really like keeping good curves around as long as possible. Shapely is sometimes nice, but notably it doesn't have curves. And you'd be surprised how quickly svg paths get really, really slow when you load up 20k heavy objects. And while you can almost always get by with enough lines, there are some things like gcode where you can actually draw circular arcs directly, and without any methods of dealing with them you can't get to that step. It seems like a lot of the math isn't too hard or weird, especially when you get rid of elliptical arcs and use circular arcs with very nice parameterizations.

I also do a bunch of cut planning for the laser software, and it's led to me using like 3 different types of path-like classes. And this was the simplest thing that would speed things up a lot, that could also include things like …
It recently occurred to me that your proposed memory representation for curves could serve as a basis for an extremely performant GPU-based renderer. That data could be parsed by a tessellation shader to generate vertices, which could in turn be processed by my existing "fat line" geometry shader. A highly compact buffer for fast CPU-GPU transfer and 100% of the work done on the GPU. This should be fast.
I am of the rather firm opinion that using memory-based struct objects that are not always uniform tends to suffer some compatibility problems; not insurmountable, but often inefficient. But if I say these 5 complexes or 10 floats are my shape, then we can do a lot of very nice and very fast operations on the structure. I think I've run into problems just streaming the memory straight to disk and reloading other structures, which makes me pretty confident that a set number of uniform core primitives is going to be more useful. Even numpy has some serious limitations with its own struct objects (a lot of commands don't work well with them).

And yeah, a goodly amount of that is because you can store them in memory in very clearly sized blocks. So each 80 bytes is specifically able to be loaded as these complex objects merely by reading the entire chunk of memory and assigning it as complexes, and likely doing the same and assigning them as floats. This also makes the format quite cross-platform and pretty compact to save.

If I could get you onboard with this, or something decidedly very similar to this, it might make things easier on me (and you). I could likely use your highly performant GPU rendering or use my 2-opt algorithm fairly interchangeably. Since all shapes would be fairly standardized, the interchange between different formats would be really easy. Even if using a system that doesn't take complexes, these can be turned into floats. The file format to save this would basically be: take this chunk of memory, write it directly to disk; load this same chunk of memory from disk and assign those values as this type. At the most basic level, floats and ints tend to work like this, so if you can build a very uniform structure and just set some rules to govern how we treat this structure, that definition would work not just GPU vs CPU but also in any other programming system, since we're loading not just a primitive but one that is massively well supported (assuming we can just treat these as floats when needed). You may run into issues if your memory goes …

Yes, please strongly reconsider. Despite your earlier criticism, I still think this is the way to go. Or something decidedly very similar to this. Obviously there are parts of the definitions and even shape types that could be converted and extended, but it really does seem like something you could store in highly structured memory and move around is the way to go.
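A minimal sketch of that save/load round-trip for an (N, 5) complex128 array (note that raw `tofile`/`fromfile` records neither dtype nor byte order, so `np.save`/`np.load` would be the safer cross-platform variant):

```python
import numpy as np

# Each row is 5 complex128 values = 80 bytes, so the whole geometry is
# one contiguous, evenly sized block of memory.
segments = np.zeros((100, 5), dtype=np.complex128)

# Write the chunk of memory straight to disk...
segments.tofile("geometry.bin")

# ...and reassign the raw bytes as complexes on the way back in.
loaded = np.fromfile("geometry.bin", dtype=np.complex128).reshape(-1, 5)

# The same bytes can equally be viewed as floats, for consumers
# that don't take complexes.
as_floats = loaded.view(np.float64)  # shape (N, 10)
```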
To be entirely transparent, I've recently embarked on an entirely different trip. For a number of unrelated reasons, I've been intensively learning Rust over the last month or so, and now I'm exploring how this could play out in the vpype landscape. This is highly speculative stuff, and it tends to massively shift paradigms left and right. My current (again, highly speculative) question is "what if vpype-core (e.g. the current …

That said, my earlier point was actually tangential to this discussion. Your data format would be very good for a GPU-based renderer, even if that buffer is constructed from other data structures for display purposes only. As a matter of fact, the vpype viewer already constructs such a buffer, except it's vertices only, since only polylines are currently supported. I should also say that, although my viewer has plenty of limitations (due to the very narrow scope) and some visual glitches, I have yet to find anything coming close in terms of performance, even in the (high-level) Rust crates I've been exploring so far. (It could of course be very efficiently re-implemented with low-level GPU crates such as …)

Unrelated but of potential interest to you: the …
Yeah, it might be made correctly from what I can see. Basically svgelements is a bit misdesigned: the emphasis on the render tree being the same as the parsed tree is wrong. The correct core methodology would be to parse the DOM entirely, then use the given DOM tree to render shapes and paths, which could use …

Basically, reading to a DOM-tree state and doing the regular bootstrapping that browsers do is the correct methodology there. My code currently goes SVG -> Shapes. I did write a super basic save routine since people kept asking, and it was a hundred lines of code or whatever and I've written a few different copies here and there, but it's notably going to be lossy since I mostly already processed everything.

I did try rewriting some old code I did in Java into python and it got notably slower, so I was going to likely rewrite the core part of the algorithm into C++, though I guess I could look at Rust for that. I still want to use python, but maybe through some bindings. Since Rust and C should compile the same or so, it would just end up being safer in Rust.

Also, the Zingl stuff (https://zingl.github.io/bresenham.html) for rendering curves, though not able to be wildly parallelized, is pretty impressive, since you can render at the 1-pixel level using pure integer math without trying to figure out the optimal subdivision/segmentization math. I've considered a few times going through a process like porting that stuff to GRBL or something else, since really the carry-forward error routines to figure out the discrete steps are wildly under-utilized currently.
If you're looking for a fast SVG loader and you agree with the choices made by …

edit: I'd be more than happy to help with a …
Some quick and dirty benchmark with what I have so far, which includes SVG parsing, cropping (at polyline level for vpype, at bezier level for the rust prototype), and linearising at the default .1mm tolerance: vpype takes ~140ms after discounting the loading time (…)

The comparison is somewhat unfair, because there is some additional metadata stuff in vpype's case. The way tolerance is dealt with is also different: vpype uses it as max segment length regardless of curvature, whereas my rust prototype uses it as max polyline vs. true curve distance, which is obviously much better.

Anyways, these numbers obviously need to be taken with lots of grains of salt, but they clearly highlight the potential... I plan to polish the display a little bit (white background, …
I've been working on a bunch of geometry stuff with more sophisticated segment types and numpy (https://github.com/meerk40t/meerk40t/blob/main/meerk40t/tools/geomstr.py), and I'd like your take on it as a core geometry primitive.
Why
When you choose your primitives, you tend to choose the limitations of what your geometry can do. You have a bunch of line segments stored in complexes, and that's quite useful. But you have made some notes about including beziers in the future or in other projects, and there are clear reasons to want to do that. Obviously we want to use numpy as best we can, and we want other segment types that are currently not well supported. But we want everything to quickly work together, so we want all of our segments to be generally compatible.
Primitive
The proposed core structure is rows of 5 complex numbers. These are

`start, control, info, control2, end`

The `info` element is located directly in the center. The `.real` component of `info` dictates the segment type, and the `.imag` component points to a `.settings` object stored elsewhere.

Segment Types

There are chiefly 8 segment types. The 5 geometric ones are `point`, `line`, `quad`, `cubic`, and `arc`. These are entirely parameterized by positions, with some mere data values. The `arc` means circular arc and is parameterized by `start`, `control`, and `end`. When there is only one control point, as is the case for `quad` and `arc`, the control point is duplicated into the `control2` position as well. For the `point` type, the `start` and `end` positions are also coincident. Other segments, like `end`, simply bookend runs of connected (or disjointed) segments, and `vertex` is expected to represent graphs.
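To make the layout concrete, here is a minimal sketch of how such rows could be built (the `TYPE_*` codes and the exact control-slot duplication rules are illustrative assumptions, not necessarily the geomstr values):

```python
import numpy as np

# Hypothetical type codes stored in info.real (actual values may differ).
TYPE_LINE = 1
TYPE_QUAD = 2

def line(start, end, settings=0):
    # info.real is the segment type; info.imag indexes a settings object.
    info = complex(TYPE_LINE, settings)
    # No control points apply; the control slots mirror the endpoints.
    return [start, start, info, end, end]

def quad(start, control, end, settings=0):
    info = complex(TYPE_QUAD, settings)
    # The lone control point is duplicated into the control2 slot.
    return [start, control, info, control, end]

segments = np.array(
    [
        line(0 + 0j, 10 + 0j),
        quad(10 + 0j, 15 + 5j, 20 + 0j),
    ],
    dtype=complex,
)  # shape (N, 5)
```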
Features

This arrangement is important for a couple of reasons; most notably, it is purely symmetric. If you call `np.flip` you get the same segment backwards, with `cubic` segments actually remaining the same, having flipped the control points. You can do `np.flip` over the other axis too, and this reverses our paths and the segments within our paths.

Also, there's always going to be more and more stuff that needs to be added as metadata, and pointing to an arbitrary bit of `.settings` metadata for each and every segment is needed. There's always a need to add `color`, `linewidth`, `penspeed`, `z`, `number_of_unicorns`, `name`, etc. And with this we can do that on a per-segment basis. Also, when the segments all point to the same data, we can point them to literally the same data.

All of our segments are the same size, and are flagged with which points are positions and thus should be treated as positions, letting us easily represent complex paths, etc.
Interoperability
For most of the vpype use cases, we can easily port the functions, though rather than a single line of N complexes we'd have N x 5 complex numbers, and we'd define the segments rather than just the points. The lower nibble of the `type` gives us the bits that are positions.

So when we're doing something like a scale, we can easily just apply that, and regardless of the segment type all the points are modified, which means our shapes are correctly modified like that. This gives us all the affine transformations in numpy, so we have full and fast access to our transformations.
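For instance, a uniform transform could be as simple as the following sketch (`POSITION_COLS` and the mask handling are illustrative; a fuller version would consult the per-type position bits):

```python
import numpy as np

POSITION_COLS = (0, 1, 3, 4)  # column 2 is info, never a position

def affine(segments, mul=1 + 0j, add=0 + 0j):
    """Apply z -> mul * z + add to every position of an (N, 5) array.

    With complex coordinates this covers scaling, rotation
    (mul = exp(1j * angle)), and translation (add) in one shot. A fuller
    version would consult the per-type position bits (the lower nibble
    of the type code) to skip slots holding raw data rather than points.
    """
    out = segments.copy()
    out[:, POSITION_COLS] = out[:, POSITION_COLS] * mul + add
    return out

# e.g. scale by 2 about the origin, then translate by (10, 5):
# doubled = affine(segments, mul=2, add=10 + 5j)
```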
In some highly dissimilar cases, we do `np.where(infos.astype(int) == TYPE_LINE)` and the same for the other geometric primitives, to keep things numpy.

We can also fairly easily do the `npoint`-like lookup of positions within these curves with respect to `t`. And with that we can do things like using numpy to calculate the intersections for any segments (see: https://github.com/meerk40t/meerk40t/blob/955c0c7568c61e3be22725c97ec9ad324166ff62/meerk40t/tools/geomstr.py#L1569).

We can do more complex things like two-opt distance optimizations (which rely heavily on path flipping) (see: https://github.com/meerk40t/meerk40t/blob/955c0c7568c61e3be22725c97ec9ad324166ff62/meerk40t/tools/geomstr.py#L2298).
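For example, a per-type evaluation could be sketched like this (type codes and branch coverage are illustrative; only line and quad are shown):

```python
import numpy as np

TYPE_LINE, TYPE_QUAD = 1, 2  # hypothetical type codes, as above

def positions_at(segments, t):
    """Evaluate every segment of an (N, 5) complex array at parameter t.

    Sketch only: lines interpolate linearly, quads use the Bernstein
    form; cubic and arc would get analogous per-type branches.
    """
    starts, ctrls, ends = segments[:, 0], segments[:, 1], segments[:, 4]
    types = segments[:, 2].real.astype(int)

    line_pts = starts + t * (ends - starts)
    quad_pts = (1 - t) ** 2 * starts + 2 * (1 - t) * t * ctrls + t ** 2 * ends

    return np.where(types == TYPE_QUAD, quad_pts, line_pts)
```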
This also captures most of the Shapely primitives: `Point`, `MultiPoint`, `LineString`, `LinearRing`, `MultiLineString`, `Polygon`, `MultiPolygon`, and `GeometryCollection`. Which also makes it fit with the GeoJSON values (mostly), but also things like the ESRI GIS shapefile format, since our geometry also has attribute data (https://en.wikipedia.org/wiki/Shapefile) (mostly). And it does so in an evenly sized array.

So we can subsume the functionality of the original vpype, however our `circle` would likely be made out of arcs instead of lines, etc.

Polygon Clipping, Segment Clipping, Constructive Area Geometry
This sort of segment-based style is extremely useful if we want to do polygon-polygon clipping or polygon-line clipping (see Martinez' algorithm: https://sean.cm/a/polygon-clipping-pt2). The relevant understanding is that we find all the self-intersections and split the segments there, and all intersections between two different lists of segments (and split both segments there). Then we can work out, for all segments, 4 bits worth of info: which regions lie on each side of the segment.
If we have an open shape, it doesn't contain any regions, and neither side of any segment contains that shape. We then just query for this data. If we want polygon-line clipping, we're looking for all segments which have clip-region on both sides of that segment; throw out all other segments. If we're looking for the intersection of two closed shapes, we want all segments that contain both clip-region and subject-region on one side and do not contain both on the other side.
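A sketch of how those per-segment queries could look in numpy, assuming the 4 bits encode subject/clip membership on each side (the flag names and encoding are my own illustration):

```python
import numpy as np

# Hypothetical encoding of the 4 bits of per-segment info: which of the
# two regions lies on which side of the segment after splitting.
SUBJECT_LEFT, SUBJECT_RIGHT, CLIP_LEFT, CLIP_RIGHT = 1, 2, 4, 8

def clip_polyline(flags):
    # Polygon-line clipping: keep segments with clip-region on both sides.
    both_sides = CLIP_LEFT | CLIP_RIGHT
    return (flags & both_sides) == both_sides

def intersection(flags):
    # Closed-shape intersection: both regions on one side, not on the other.
    left = (flags & (SUBJECT_LEFT | CLIP_LEFT)) == (SUBJECT_LEFT | CLIP_LEFT)
    right = (flags & (SUBJECT_RIGHT | CLIP_RIGHT)) == (SUBJECT_RIGHT | CLIP_RIGHT)
    return left ^ right  # exactly one side is inside both regions

keep = intersection(np.array([0b0101, 0b1111, 0b0011], dtype=np.uint8))
# -> [ True, False, False ]
```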
This gives us fairly solid access to a close-to-native clipping algorithm for arbitrary shapes, and it is generally faster than Vatti. Also, nothing about this breaks with regard to our other advanced geometric segments (we can still find intersections).
n*point inside Polygon
Some needed functions are a bit cryptic (https://github.com/meerk40t/meerk40t/blob/955c0c7568c61e3be22725c97ec9ad324166ff62/meerk40t/tools/geomstr.py#L289), like the ability to check an array of points and see if they exist within an arbitrary closed shape.
This involves creating a numpy array of active edges and y-events of a scanbeam: calculating the sum of whether our y-component is greater than the y-events, which gives us the event index; looking up the active edges for that event index; calculating all the x-intercepts of those segments at our point's y-component; and calculating the sum of whether our x-component is greater than each x-intercept, modded by 2. That tells us whether each point in the array is inside or outside the shape, in pure numpy.
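A much-simplified sketch of that idea, without the scanbeam/active-edge acceleration: a plain vectorized even-odd crossing count (the function name is mine):

```python
import numpy as np

def points_in_polygon(points, polygon):
    """Even-odd test: which complex points lie inside a closed polygon
    (given as complex vertices)? The scanbeam version pre-sorts edges
    by y-events so only active edges are intersected per point."""
    x, y = points.real[:, None], points.imag[:, None]
    a, b = polygon, np.roll(polygon, -1)
    ax, ay = a.real[None, :], a.imag[None, :]
    bx, by = b.real[None, :], b.imag[None, :]

    # Edges whose y-span straddles the point's y-component.
    straddles = (ay > y) != (by > y)
    with np.errstate(divide="ignore", invalid="ignore"):
        # x-intercept of each edge at the point's y-component.
        x_int = ax + (y - ay) * (bx - ax) / (by - ay)
        # Count crossings to the right of the point; odd means inside.
        crossings = np.sum(straddles & (x < x_int), axis=1)
    return crossings % 2 == 1

square = np.array([0 + 0j, 10 + 0j, 10 + 10j, 0 + 10j])
print(points_in_polygon(np.array([5 + 5j, 20 + 5j]), square))  # [ True False ]
```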
Conclusion
While there are a lot of other little details I may have glossed over, we can do things like `bbox` and `position` (`.point`, but "point" is used to describe actual points), `.length`, `.split`, and likely some stuff like the normal and tangent calculations. Or the standard of calling most of these functions with either an integer (index), an array of integers (multiple indexes), or an Nx5 complex array (explicit segments). It seems like this sort of functional primitive is a fairly ideal candidate for any next-generation sort of vpype stuff you'd be looking at. We can get the very basic primitive things while also implementing much more complex higher-level algorithms, and keep nearly everything able to be done with numpy.

As I'm building this out, are there any concerns or problems you foresee? Even if you're not intending to redo/adapt a lot of your work and change the most basic element of your data structure, what sorts of roadblocks do you think would crop up with this sort of change? For example, are there structures that the primitives couldn't build that you've potentially wanted, or some algorithms that would really get stuck using this sort of thing? Or other more general or specific concerns?