WQX v3.0 Testing, Development and Updates #529
Comments
@cefergus I added some details here from our emails for tracking:
If you run:
Let's start by just testing whether all our functions can still run if we use the new beta services/profiles but the old names. Eventually we’ll need to switch the code base and function outputs to use/reference the new names too. For this, it will still be helpful to have a function that can easily convert the columns back and forth between the “legacy” and the new names.
Note on performance: The services have been working really well lately, but folks doing big queries (like our group) still might need to adjust their expectations of what can come back from a single query. Let's wait to update our big data retrieval functions (automatic chunking of pulls for users if needed) until after we switch to the new services.
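The back-and-forth column conversion mentioned above could look roughly like the sketch below. It is only an illustration: `rename_wqx_columns()` and the `wqx3_to_legacy` crosswalk are hypothetical names, and the three name pairs are placeholders that would need to be checked against the published WQX schema before use.

```r
# Hypothetical crosswalk: WQX 3.0 name -> legacy name.
# These pairs are placeholders for illustration, not a verified mapping.
wqx3_to_legacy <- c(
  Location_Identifier = "MonitoringLocationIdentifier",
  Activity_StartDate  = "ActivityStartDate",
  Result_Measure      = "ResultMeasureValue"
)

# Hypothetical helper (not an existing TADA function): renames columns in
# either direction and leaves columns without a crosswalk entry untouched.
rename_wqx_columns <- function(df, crosswalk, to = c("legacy", "wqx3")) {
  to <- match.arg(to)
  # Build a lookup whose names are the current column names
  # and whose values are the target column names.
  lookup <- if (to == "legacy") {
    crosswalk
  } else {
    stats::setNames(names(crosswalk), unname(crosswalk))
  }
  hits <- names(df) %in% names(lookup)
  names(df)[hits] <- unname(lookup[names(df)[hits]])
  df
}

# Small made-up data frame to show the round trip
wqx3_df <- data.frame(
  Location_Identifier = "SITE-1",
  Result_Measure = 7.2,
  OtherColumn = "unchanged"
)
legacy_df <- rename_wqx_columns(wqx3_df, wqx3_to_legacy, to = "legacy")
round_trip <- rename_wqx_columns(legacy_df, wqx3_to_legacy, to = "wqx3")
```

Keeping the mapping in a single named vector means the same crosswalk drives both directions, so the two conversions cannot drift apart.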
Here are the lines in TADA that we will need to update to use the new profiles:
# Retrieve all 3 profiles
print("Downloading WQP query results. This may take some time depending upon the query size.")
print(WQPquery)
results.DR <- dataRetrieval::readWQPdata(WQPquery,
  dataProfile = "resultPhysChem",
  ignore_attributes = TRUE
)
# check if any results are available
if ((nrow(results.DR) > 0) == FALSE) {
  print("Returning empty results dataframe: Your WQP query returned no results (no data available). Try a different query. Removing some of your query filters OR broadening your search area may help.")
  TADAprofile.clean <- results.DR
} else {
  sites.DR <- dataRetrieval::whatWQPsites(WQPquery)
  projects.DR <- dataRetrieval::readWQPdata(WQPquery,
    ignore_attributes = TRUE,
    service = "Project"
  )
  TADAprofile <- TADA_JoinWQPProfiles(
    FullPhysChem = results.DR,
    Sites = sites.DR,
    Projects = projects.DR
  )
More from Laura D on USGS dataRetrieval:
Just to let you all know on my personal vocabulary - I usually now refer to our classic WQP calls as "legacy". At the moment, that's probably not the best term since it is still the production version of the Portal (WQP). But, we're working towards a release where the beta services on WQP become production. When that happens, the current system will be considered legacy and the new system will just be the default. I have NO idea when that might be.
I'll check on what's going on with the missing columns. Most of my tests for "legacy" don't usually specify the resultPhysChem profile (they are even more legacy-y, dating back to when WQP didn't even have profiles). I'll let you know if I push up a fix on GitHub.
If you want to see how the new profiles look, you can try this:
results.DR <- dataRetrieval::readWQPdata(WQPquery,
  service = "ResultWQX3",
  dataProfile = "basicPhysChem",
  ignore_attributes = TRUE
)
sites.DR <- dataRetrieval::whatWQPsites(WQPquery,
  legacy = TRUE
)
At the moment... I ran some tests and simple queries and they do seem to be completing today. There's currently no WQX3 version of the "Project" dataProfile, so that's something that you would need to wait for anyway.
@cefergus I created a branch for these edits. See: https://github.com/USEPA/EPATADA/tree/WQX3.0betatesting I started editing the TADA_DataRetrieval function to use the new 3.0 full phys chem profile.
Pushed a function that renames WQX3.0 column names back to the WQX2.0 legacy names, referencing the online schema. However, we noticed that the 2.0 names and the names used in TADA differ in their special characters. The next step is to identify which column names need to be changed to match TADA. Once this is fixed, we can test how well TADA_AutoClean works with data retrieved using the new service.
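One way to find the names that still need changing is a plain set comparison, sketched below. The two name vectors are made-up placeholders, and the assumption that the mismatch comes from characters such as "/" being turned into "." when R reads the data has not been confirmed in this thread.

```r
# Placeholder name vectors. In practice these would come from names() of the
# renamed WQX 3.0 data frame and from the columns TADA_AutoClean expects.
renamed_cols <- c(
  "ActivityStartDate",
  "ResultMeasure/MeasureUnitCode",
  "ActivityStartTime/Time"
)
tada_cols <- c(
  "ActivityStartDate",
  "ResultMeasure.MeasureUnitCode",
  "ActivityStartTime.Time"
)

# Names that appear on one side only
only_in_renamed <- setdiff(renamed_cols, tada_cols)
only_in_tada <- setdiff(tada_cols, renamed_cols)

# make.names() replaces characters that are not valid in R names with ".",
# which resolves this kind of mismatch if the assumption above holds.
normalised <- make.names(renamed_cols)
still_unmatched <- setdiff(normalised, tada_cols)
```

Anything left in `still_unmatched` would need an explicit entry in the renaming function.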
Updated the TADA_RenameColumn function to rename WQX3.0 columns to the legacy names and/or the names used in TADA_AutoClean. Calling TADA_DataRetrieval with applyautoclean = TRUE is working. Will test it on other queries to see if there are other misalignment issues to work through.
Is your feature request related to a problem? Please describe:
The WQP 3.0 beta services are still under development and testing. There is interest in hearing from us if we notice issues with TADA workflows once we start our updates, especially if there are example workflows that might be useful for testing.
Describe the solution you'd like:
Pushing the WQP 3.0 format changes through in TADA will likely be held back for at least half a year (until around March 2025). Before this push, we will need to validate any impacts on TADA workflows and processes while the updates to the USGS dataRetrieval package are being made, so that the downstream process is smooth, quick and efficient. There is a stable CRAN release as well as a development version on GitHub, and testing should be done with the GitHub version.
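For testing against the GitHub version, an install along these lines should work. This is a sketch that assumes the `remotes` package is available and that the development version lives in the DOI-USGS/dataRetrieval repository.

```r
# install.packages("remotes")  # uncomment if remotes is not installed yet

# Install the development version of dataRetrieval from GitHub
remotes::install_github("DOI-USGS/dataRetrieval")

# Confirm which version is now installed
packageVersion("dataRetrieval")
```

Reinstalling from CRAN with `install.packages("dataRetrieval")` returns to the stable release.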
Additional context:
Please see the USGS GitHub links below for updates, plans and documentation on the dataRetrieval functions.
dataRetrieval status updates: https://doi-usgs.github.io/dataRetrieval/articles/Status.html
dataRetrieval development plan: https://doi-usgs.github.io/dataRetrieval/articles/wqx3_development_plan.html
Lee Stanish said the dev version is using the 3.0 profiles but is still very much in development.