Where to download test data? #32

Open
milechin opened this issue May 22, 2023 · 6 comments

Comments

@milechin

I would like to use this gpt tests framework to test an install of gpt on our cluster. I am unable to find the test data. At the bottom of the README file there is reference to an S3 bucket that contains the data. The section indicates I need to go to the "Confluence page" : https://senbox.atlassian.net/wiki/spaces/SENBOX/pages/2490433537/S3+bucket

But I do not have permission to view this page.

Where can I get a copy of this test data?

@TomBlock
Contributor

Dear Dennis,

These data are intended for internal testing only. We will discuss in the team whether we want to, and can, make the data publicly available. Setting up the GPT test suite is a rather complex task, and the dataset covers ~600GB, including data that is not free of charge to use.

You can safely assume that our team ran the complete test-suite before shipping, so that for your cluster it would probably be sufficient to run a simpler test with one or two nodes just to test the installation.

Cheers,
Tom

@milechin
Author

Hello Tom,

Thank you for the prompt response. Do you have a recommendation for a simpler test? I am not the primary user for this software, so a predefined test with included test data would be most helpful for me. We periodically update the operating system for our cluster and so we need to have a test on file that we can run to confirm the software is still operational for our researchers.

Thank you for the help.

Dennis

@TomBlock
Contributor

Hi Dennis,

Without knowing which data is being processed on the cluster, it is difficult to judge, and even more difficult to generate a test graph.

Probably the best advice I can give is to ask the scientists for a processing graph they use (it can be a simple re-projection) and take the confirmed result as your reference dataset. Then you can run the graph every time the cluster environment changes and compare the output with the reference dataset (use band maths to subtract the same variable in the actual output from the reference; if the mean value of the difference is not 0, something is wrong).

Cheers,
Tom
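
The comparison step suggested above can be sketched outside of SNAP as well. This is a minimal illustration in plain Python, assuming the same band from the reference product and from the new output has already been read into two flat sequences of floats; how the values are read from the product files is not shown, and the function name `compare_bands` and the `tolerance` parameter are hypothetical, not part of SNAP or gpt.

```python
from math import isnan


def compare_bands(reference, actual, tolerance=0.0):
    """Compare two equally sized bands pixel by pixel.

    Returns (ok, mean_abs_diff, max_abs_diff). Hypothetical helper,
    not part of SNAP; inputs are flat sequences of floats.
    """
    if len(reference) != len(actual):
        raise ValueError("bands differ in size")

    diffs = []
    nodata_mismatches = 0
    for ref, act in zip(reference, actual):
        ref_nan, act_nan = isnan(ref), isnan(act)
        if ref_nan and act_nan:
            # No-data in both products: nothing to compare.
            continue
        if ref_nan or act_nan:
            # No-data in only one product counts as a difference.
            nodata_mismatches += 1
            continue
        diffs.append(abs(ref - act))

    mean_abs = sum(diffs) / len(diffs) if diffs else 0.0
    max_abs = max(diffs) if diffs else 0.0
    ok = nodata_mismatches == 0 and max_abs <= tolerance
    return ok, mean_abs, max_abs


if __name__ == "__main__":
    # Made-up values purely for illustration.
    reference = [0.10, 0.20, float("nan"), 0.40]
    actual = [0.10, 0.20, float("nan"), 0.40]
    print(compare_bands(reference, actual))
```

The sketch uses the absolute difference rather than the signed one, because positive and negative deviations can cancel out in a plain mean. A small non-zero `tolerance` may be appropriate if the processing is not expected to be bit-identical across systems.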

@TomBlock
Contributor

More information and examples can be found here:

https://senbox.atlassian.net/wiki/spaces/SNAP/pages/70503475/Bulk+Processing+with+GPT

@milechin
Author

Hi Tom,

Thank you for the additional information. I ran the example provided and I was able to determine that the GDAL package that came with the software is not compatible with our system. So this was a good example to run.

Do you know if there are other external tools, like GDAL, used by the program that should be tested?

Thank you,
Dennis

@TomBlock
Contributor

TomBlock commented May 24, 2023

Hi Dennis,

SNAP uses some external native libraries. These are extracted on startup to "userhome"/.snap/auxdata, namely:

  • gdal
  • netcdf
  • openjpeg

Also (if using the s1tbx): jblas (requires libgfortran5)

I hope that's all ...

Cheers,
Tom
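
As a rough pre-check after an operating system update, one could test whether such native libraries can still be loaded at all. The sketch below is an assumption-laden illustration in Python, not something SNAP provides: the soname `libgfortran.so.5` is what the `libgfortran5` package typically installs on Linux, the sub-layout of the `.snap/auxdata` directory is not known here (so the code simply searches it recursively), and all function names are hypothetical.

```python
import ctypes
from pathlib import Path

# Assumed soname for a typical Linux system; adjust for your platform.
SYSTEM_LIBRARIES = {"libgfortran5 (needed by jblas)": "libgfortran.so.5"}


def can_load(library):
    """Return True if the dynamic loader can open the given library."""
    try:
        ctypes.CDLL(str(library))
    except OSError:
        return False
    return True


def check_libraries(candidates):
    """Map each label to whether its library could be loaded."""
    return {label: can_load(name) for label, name in candidates.items()}


def find_shared_objects(root):
    """List files below root whose names look like Linux shared objects."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob("*.so*") if p.is_file())


if __name__ == "__main__":
    for label, ok in check_libraries(SYSTEM_LIBRARIES).items():
        print(f"{label}: {'ok' if ok else 'NOT loadable'}")

    # The extraction directory mentioned above, relative to the user home.
    auxdata = Path.home() / ".snap" / "auxdata"
    for path in find_shared_objects(auxdata):
        print(f"{path}: {'ok' if can_load(path) else 'NOT loadable'}")
```

A library reported as not loadable here may only be missing a dependency on the loader path, so a failure is a hint for further investigation rather than proof that SNAP itself is broken. Running an actual processing graph remains the more reliable test.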
