Cycle 5: Improving astropy units parsing
#494
|
Hi, Also can you clarify who will be the payee? Will there be a subcontractor, do you have a name? Thanks! |
f220f39 to ae81ce6
I'd be doing the work myself. |
|
Hi! I like this idea. It’s definitely important to fix the many problems with our unit parsing. One question I wanted to raise: in an earlier PR of yours (#17652) we talked about possibly moving away from formatter classes toward formatter instances and the many ways it would improve the unit formatters. Do you think it’s in scope for this work to include that transition, or would that be better handled separately? |
|
The main challenge with switching from formatter classes to formatter instances is how to avoid breaking formatters implemented in downstream packages. I would be willing to investigate that as a part of this project, if the |
|
Yes, it's definitely challenging; some kind of deprecation would be necessary. |
|
@nstarman - To change to formatter instances is one of those things that would be nice to have in theory (I know I am one of those who suggested it...), but I think in practice it will probably have relatively little impact for what will be quite a bit of work, especially with deprecations, etc. It doesn't seem obvious that it would be effort and money well spent. More on the topic here, it is certainly a long-standing problem that function units can only be parsed by general and cds, but I should probably mention that one of the reasons FITS doesn't have it, is that the FITS standard committee was less than helpful in thinking through how arguably the most relevant units ( p.s. On the proposal itself, I think it is a nice (if modest) improvement, but it does seem to me that the equivalent of six 40-hour weeks is rather a lot for the described work. (For reference for just the function unit part of it, I don't remember exactly, but ensuring cds could represent EDIT: explicit suggestion: if the proposal is meant to also cover (part of) continued general maintenance and cleanup, do include that. |
|
I agree that parsing function units shouldn't take too much time, but I don't expect designing the API for defining custom units to be a quick process. For example, we need to ensure custom units do not interfere needlessly with standard units, but we cannot simply forbid all interference either because users might be trying to read a non-standard file for which some overriding of standard units is necessary. And the very nature of custom units means there is little guidance from external standards. One more complication is that the custom unit functionality should be available through the |
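The tension described in this comment (custom units should not silently shadow standard ones, yet deliberate overrides must remain possible) can be sketched in plain Python. This is only an illustration of the design question, not astropy's API: the `UnitRegistry` class, its methods, and the `override` flag are all hypothetical names invented for this sketch.

```python
class UnitRegistry:
    """Hypothetical toy registry; NOT part of astropy."""

    def __init__(self, standard):
        self._standard = dict(standard)  # units defined by a standard
        self._custom = {}                # units added by the user

    def add_custom(self, name, meaning, *, override=False):
        # Refuse to shadow a standard unit unless the caller opts in explicitly.
        if name in self._standard and not override:
            raise ValueError(
                f"{name!r} is a standard unit; pass override=True to replace it"
            )
        self._custom[name] = meaning

    def lookup(self, name):
        # Custom entries are checked first so that explicit overrides win.
        if name in self._custom:
            return self._custom[name]
        return self._standard[name]


registry = UnitRegistry({"m": "metre", "s": "second"})
registry.add_custom("smoot", "non-standard length")  # no clash, accepted

try:
    registry.add_custom("m", "mile")  # accidental clash is rejected
except ValueError as exc:
    print(exc)

registry.add_custom("m", "mile", override=True)  # deliberate override
print(registry.lookup("m"), "|", registry.lookup("s"))
```

The explicit opt-in is one possible policy among several; it keeps accidental interference loud while still allowing a non-standard file to be read.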
|
Yes, the custom units might take a bit more time. My sense is that the basic infrastructure is actually in place (FITS files give warnings that tell what to do, but perhaps not very clearly or even misleadingly, see astropy/astropy#15313), so it might mostly be documenting it more clearly. But agreed that it needs time to even think through what actually goes wrong/is needed. p.s. A sideways related long-standing wish-list item is for FITS headers to return quantities if they list a unit in their string -- and vice versa, allow quantities to be written to headers. See astropy/astropy#9332 and an aborted attempt at astropy/astropy#11849. |
Some of the basic infrastructure seems to be in place for the happy path, but my impression is that cases off the happy path have not received enough attention, and those are the cases that stand out for the users. |
ae81ce6 to 4681373
|
It would be useful for this funding request to have a budget range. |
4681373 to f711481
|
Responding to #494 (comment)
A part of the issue is that But if a different factor is used we get a verbose error message that is mostly about custom units: And setting A better behavior would be that
That might be straightforward to implement, but this would have to be done for each parser rule individually, and we have several different parsers, which does add up to quite a lot. |
|
Agreed that |
|
My main worry would be that we spend a lot of time on fixing a corner case that in practice occurs rarely. The present message was an attempt at fixing an issue raised at astropy - maybe we should not worry about further fixes until we've got more concrete use cases (e.g., actual Vizier Catalogues that the current code cannot read, etc.)? |
|
Please react to this comment to vote on this proposal (👍, 👎, or no reaction for +0). |
|
The Cycle 5 funding request process has been hugely successful! On the downside, that means our funds are severely oversubscribed. Even after the Finance Committee and SPOC have taken into consideration community feedback/voting and alignment with the roadmap, there are still more funding requests than we can afford in 2026. We would like to stretch the budget as far as possible, and to fund as many activities as possible, while making sure the Project remains volunteer-driven. Hence, we would like to know if this project will still meet its deliverables if your minimum budget is reduced by 25%, 50%, or 100%. Or if there’s some other minimum, feel free to specify that instead. As a reminder, there will be more funding for 2027 and we expect the Cycle 6 call for 2027 funding requests to begin in the Fall of 2026. Thank you for your engagement and understanding as we continue to optimize our funding and budgeting processes and the balance of volunteer vs funded work! (@eerovaher) |
|
Improving |
I am submitting a finance request for work on astropy units parsing, in particular parsing function units and custom units. The latter is important for reading existing data files that might be using non-standard units.