ENH: Expand CDMS testing #3095

Closed
2 tasks
keflavich opened this issue Sep 19, 2024 · 5 comments

Comments

@keflavich
Contributor

Following #3094, it is clear that there is a range of untested parameters in the CDMS module that require tests.

TODO includes:

  • Regression test for 3094 (check that 100501 can be parsed)
  • Systematic check that all molecules can be queried
@keflavich
Contributor Author

The systematic search will begin with something like:

from astroquery.linelists.cdms import CDMS
from astropy import units as u

sptbl = CDMS.get_species_table()

for row in sptbl:
    # `tag` was undefined in the original print call; take it from the row
    tag = str(row['tag'])
    result = CDMS.query_lines(1*u.GHz, 1*u.THz, molecule=tag, parse_name_locally=False)
    print(tag, row['molecule'], len(result))

but either CDMS is down right now or they're blocking the airport IP.
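
A slightly more defensive version of that loop might record failures instead of stopping at the first one, so that timeouts and parse errors end up in a single report. This is only a sketch: `survey_tags` and its `query` argument are hypothetical names, not part of astroquery, and the real call would be passed in as shown in the trailing comment.

```python
def survey_tags(tags, query):
    """Run query(tag) for every tag; return (row counts, failures).

    `query` is any callable that takes a tag string and returns a sized result.
    """
    counts, failures = {}, {}
    for tag in tags:
        try:
            counts[tag] = len(query(tag))
        except Exception as exc:  # broad on purpose: timeouts and parse errors both matter
            failures[tag] = f"{type(exc).__name__}: {exc}"
    return counts, failures


# Intended use with the real service (needs network access, not run here):
#   from astroquery.linelists.cdms import CDMS
#   from astropy import units as u
#   counts, failures = survey_tags(
#       [str(row['tag']) for row in CDMS.get_species_table()],
#       lambda tag: CDMS.query_lines(1*u.GHz, 1*u.THz, molecule=tag,
#                                    parse_name_locally=False),
#   )
```

Injecting the query as a callable keeps the bookkeeping testable offline, which may matter while the server is unreachable.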

@kmaitreys

kmaitreys commented Sep 19, 2024

Yeah, I tried something like this, but some of my requests timed out as well. I still managed to fetch transitions for most of the molecules in the database, though. While doing that, I encountered one more bug:

ValueError: invalid literal for int() with base 10: '1 1 2' for molecule 028528 

Here 028528 is another molecule which has some formatting bug. I will add a fix for this as well soon on my #3094 PR.

EDIT: I am not sure that only these two TAG values are problematic, though, so a more exhaustive check would be nice.
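
For reference, that message is exactly what Python's `int()` produces when it is handed several space-separated values at once, which would presumably happen if a fixed-width slice covers more than one quantum-number column. A minimal sketch of a more tolerant field reader follows; `read_int_fields` is a hypothetical helper for illustration, not the fix that went into #3094.

```python
def read_int_fields(text):
    """Split a slice on whitespace and convert each piece to int."""
    return [int(piece) for piece in text.split()]


field = "1 1 2"  # the literal from the traceback above

try:
    int(field)
except ValueError as exc:
    message = str(exc)  # same wording as the reported error

values = read_int_fields(field)  # three separate integers
```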

@bsipocz
Member

bsipocz commented Sep 19, 2024

Yep, adding just those one or two to #3094 would be enough, and this issue can be addressed in a follow-up PR that does the exhaustive checks.

@keflavich
Contributor Author

Some notes to self: 028528 = H2NC has a format that simply doesn't resemble others:

<pre>
   H2NC8.3280       0.01    -4.4969 3    0.0091  6 -285282357 1 0 1 2 2 1 3 0 0 0 1 1 1 2
   H2NC9.0770       0.01    -4.8318 3    0.0094  4 -285282357 1 0 1 2 1 1 2 0 0 0 1 1 1 1
   H2NC1.8380       0.01     -5.068 3    0.0004  4 -285282357 1 0 1 2 3 1 2 0 0 0 1 2 1 2
   H2NC4.2190       0.01    -4.0711 3    0.0084  8 -285282357 1 0 1 2 3 1 4 0 0 0 1 2 1 3
   H2NC7.2620       0.01    -4.9743 3    0.0094  4 -285282357 1 0 1 1 1 1 2 0 0 0 1 1 1 1
   H2NC8.2110     0.0039    -5.0637 3    0.0004  2  285282357 1 0 1 2 2 1 1 0 0 0 1 2 1 2
   H2NC9.8310       0.01    -4.8718 3    0.0091  4 -285282357 1 0 1 2 1 1 2 0 0 0 1 1 1 2
   H2NC2.5710       0.01    -4.4705 3    0.0000  4 -285282357 1 0 1 2 3 1 2 0 0 0 1 2 1 1
   H2NC6.5450       0.01    -4.9014 3    0.0004  2 -285282357 1 0 1 2 1 1 1 0 0 0 1 2 1 2
   H2NC8.5020       0.01    -4.4981 3    0.0084  6 -285282357 1 0 1 2 2 1 3 0 0 0 1 2 1 3
   H2NC8.9370       0.01    -4.9016 3    0.0000  2 -285282357 1 0 1 2 2 1 1 0 0 0 1 2 1 1
   H2NC9.4500       0.01    -4.4716 3    0.0004  4 -285282357 1 0 1 2 2 1 2 0 0 0 1 2 1 2
   H2NC0.9470       0.01     -4.199 3    0.0004  6 -285282357 1 0 1 2 3 1 3 0 0 0 1 2 1 2
   H2NC2.3830       0.01    -4.4983 3    0.0091  6 -285282357 1 0 1 1 2 1 3 0 0 0 1 1 1 2
   H2NC7.2550       0.01    -5.0661 3    0.0000  2 -285282357 1 0 1 2 1 1 1 0 0 0 1 2 1 1
   H2NC7.8550       0.01    -4.7682 3    0.0094  4 -285282357 1 0 1 1 2 1 2 0 0 0 1 1 1 1
   H2NC9.9960       0.01    -4.8464 3    0.0084  4 -285282357 1 0 1 2 1 1 2 0 0 0 1 2 1 3
   H2NC0.1780       0.01    -5.0721 3    0.0000  4 -285282357 1 0 1 2 2 1 2 0 0 0 1 2 1 1
   H2NC2.8540       0.01    -4.7046 3    0.0091  2 -285282357 1 0 1 1 1 1 1 0 0 0 1 1 1 2
   H2NC5.3100       0.01    -4.7043 3    0.0094  2 -285282357 1 0 1 1 2 1 1 0 0 0 1 1 1 1
   H2NC8.1550       0.01    -4.5573 3    0.0084  4 -285282357 1 0 1 1 1 1 2 0 0 0 1 2 1 3
   H2NC8.5830       0.01    -4.6054 3    0.0091  4 -285282357 1 0 1 1 2 1 2 0 0 0 1 1 1 2
   H2NC2.5480       0.01    -4.5008 3    0.0084  6 -285282357 1 0 1 1 2 1 3 0 0 0 1 2 1 3
</pre>

compare to 100501:

<pre>
    1222.5020     0.0001    -7.1791 2    0.0000  3 100501 101 1           0              SiC6
    2445.0039     0.0003    -6.2761 2    0.0408  5 100501 101 2           1              SiC6
    3667.5055     0.0004     -5.748 2    0.1223  7 100501 101 3           2              SiC6
    4890.0068     0.0005    -5.3735 2    0.2447  9 100501 101 4           3              SiC6
    6112.5074     0.0006    -5.0832 2    0.4078 11 100501 101 5           4              SiC6
    7335.0070      0.002    -4.8461 2    0.6117 13-100501 101 6           5              SiC6
...
</pre>

The order and the spacing are completely different. CDMS may be unsupportable if they don't provide some kind of metadata about the table format... but I'm not willing to remove support, as I use the damned thing.
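
The mismatch is easy to see by tokenizing one row from each listing. The sketch below uses only whitespace splitting, so it makes no assumption about column widths; `looks_standard` is a hypothetical check, assuming that a well-formed row starts with a numeric frequency.

```python
# First row of each listing above, copied verbatim.
h2nc_row = "   H2NC8.3280       0.01    -4.4969 3    0.0091  6 -285282357 1 0 1 2 2 1 3 0 0 0 1 1 1 2"
sic6_row = "    1222.5020     0.0001    -7.1791 2    0.0000  3 100501 101 1           0              SiC6"


def looks_standard(row):
    """Return True if the first whitespace-separated token parses as a float."""
    try:
        float(row.split()[0])
    except (ValueError, IndexError):
        return False
    return True


for name, row in [("028528", h2nc_row), ("100501", sic6_row)]:
    tokens = row.split()
    # Show the verdict, the token count, and the leading tokens for comparison.
    print(name, looks_standard(row), len(tokens), tokens[:7])
```

In the H2NC row the species name appears to have overwritten the leading digits of the frequency, and the tag appears fused with the following format code (`-285282357`), whereas the SiC6 row keeps them as separate tokens.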

@keflavich
Contributor Author

I think the answer here is that the search tool reformats (malformats) the output. Users should be directed to the native .cat files instead. I'm implementing that, but decoding the QN formats is still more than I can manage.
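
As a starting point for the quantum-number decoding, the format code can be split arithmetically. This sketch assumes the JPL/SPCAT catalog convention QNFMT = 100*Q + 10*H + NQN (Q: molecule type, H: half-integer flags, NQN: quantum numbers per state); whether the CDMS output follows it in every case is not confirmed here. `decode_qnfmt` is a hypothetical helper name.

```python
def decode_qnfmt(qnfmt):
    """Split a QNFMT code into (Q, H, NQN), assuming QNFMT = 100*Q + 10*H + NQN."""
    q, rest = divmod(int(qnfmt), 100)
    h, nqn = divmod(rest, 10)
    return q, h, nqn


# Format codes as they appear in the two listings earlier in this thread
# (reading "-285282357" as tag 28528 followed by format code 2357).
for code in (101, 2357):
    print(code, decode_qnfmt(code))
```

Under that assumption 2357 gives NQN = 7, which is at least consistent with the 14 trailing integers in each H2NC row (7 upper, 7 lower), and 101 gives NQN = 1, matching the single upper and lower values in the SiC6 rows.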


3 participants