-
Notifications
You must be signed in to change notification settings - Fork 36
Improve perf of CRS #446
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Improve perf of CRS #446
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR significantly improves CRS (Coordinate Reference System) serialization performance through caching and string allocation optimizations. The changes achieve a ~20x speedup for authority code comparisons and ~10x speedup for lnglat comparisons, reducing execution times from 61ms to 3.4ms for auth codes and 37ms to 3.4ms for lnglat.
Key Changes:
- Introduced LRU cache for CRS deserialization results
- Refactored
deserialize_crsto accept&strinstead of&Value, eliminating unnecessary JSON parsing - Consolidated
authorityandcodefields into singleauth_codestring inAuthorityCodestruct to reduce allocations
Reviewed changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| rust/sedona-schema/src/crs.rs | Core CRS implementation with LRU cache, string-based deserialization, and optimized AuthorityCode structure |
| rust/sedona-schema/src/datatypes.rs | Updated to handle both string and object CRS values during deserialization |
| rust/sedona-schema/benches/crs_benchmarks.rs | Added benchmark suite for CRS equality operations |
| rust/sedona-schema/Cargo.toml | Added lru dependency and benchmark configuration |
| rust/sedona-functions/src/st_srid.rs | Updated tests to use string-based CRS deserialization |
| rust/sedona-functions/src/st_setsrid.rs | Simplified CRS deserialization calls |
| c/sedona-proj/src/st_transform.rs | Simplified CRS deserialization calls |
| c/sedona-proj/src/sd_order_lnglat.rs | Removed unused import and simplified CRS deserialization |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <[email protected]>
jesspav
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment in line for @paleolimbot
paleolimbot
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for investigating and working on this!
I think it's worth keeping a version of deserialize_crs() around that does operate on a Value specifically for deserializing a GeoArrow or GeoParquet crs (where the JSON has already been parsed and reserializing it just to parse it again is probably slow). Or alternatively, add some very long PROJJSON string benchmarks to make sure!
rust/sedona-schema/src/crs.rs
Outdated
| /// LRU cache for CRS deserialization | ||
| static CRS_CACHE: OnceLock<Mutex<LruCache<String, Crs>>> = OnceLock::new(); | ||
|
|
||
| fn get_crs_cache() -> &'static Mutex<LruCache<String, Crs>> { | ||
| CRS_CACHE.get_or_init(|| { | ||
| // Cache up to 256 CRS strings | ||
| Mutex::new(LruCache::new(NonZeroUsize::new(256).unwrap())) | ||
| }) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can/should this be thread local to avoid contention when CRS computations are happening on many partitions at once?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes!!! That should have an impact on the perf of a single thread, too, since it avoids the lock.
any opinion on the magic number for size of the cache? this could get very large if all the keys are json!!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made it 50 for now. I think this should be configurable given the perf implications. I'll can add an issue so we don't forget.
In flamegraphs for PR #430 revealed that the CRS serialization had some opportunities for improvement that would be essential for raster/geo per item CRS functions. This was going to be especially important for CRS comparisons in joins.
Added a simple benchmarks for lnglat and auth code equality that was run repeatedly with different performance optimizations. At the start it took 61ms for auth codes and 37ms for lnglat. By the end auth codes have improved by over 20x and lnglat by over 10x, with both running in 2.7ms.
There are basically two changes: an LRU cache and avoiding unnecessary string allocations. At the end, checked to see if we no longer needed the cache, but that brought the time back to 12ms.
Breakdown of performance changes: