Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

read_vpts() #590

Merged
merged 159 commits into from
Jul 7, 2023
Merged

read_vpts() #590

merged 159 commits into from
Jul 7, 2023

Conversation

peterdesmet
Copy link
Collaborator

@peterdesmet peterdesmet commented May 29, 2023

This PR is incomplete, but contains the start of a new read_vpts() function that supports both hdf5 vp files and VPTS CSV files (#551). It would be very useful if the latter was included in bioRad 0.7.

  • The public read_vpts(files) function checks if all provided file extensions are the same and then uses a switch() to read the data with an appropriate helper function: either read_vpts_csv() or read_vpts_hdf5().
    • read_vpts_csv() contains new functionality, but relies heavily on frictionless to read local and remote csv, gzipped or not (all supported by underlying readr dependency) and compares the data with the VPTS CSV table schema. It provides good warning/errors.
    • read_vpts_hdf5() makes use of read_vpfiles() and then bind_into_vpts().
  • A previous function with the name read_vpts() has been renamed to read_stdout(). It is not integrated in the new read_vpts() because 1) that read function requires a lot more metadata parameters (lat, lon, height) that I did not want to integrate in the new read_vpts() and 2) support for vol2bird stdout format will be deprecated in the future.
  • A helper script /tests/testthat/helper.R has been added to aid in the download and temporary storage of hdf5 and csv files required for testing. When tests are run, a temporary folder /temp is created and subsequently removed at the end of the test-read_vpts.R script.

TODO

  • Provide documentation
  • Create a public read_vpts() function
  • Create a private read_vpts_hdf5() function
  • Create a private read_vpts_csv() function
  • read_vpts_csv() can return data as df (useful for the future VPTS objects as data.frame #568)
  • read_vpts_csv() can return data as a vpts object. A start for this has been made, code can take inspiration from read_stdout()
  • Write tests for read_vpts(). Necessary tests are listed, but not written
  • Write tests for read_stdout(). These were not provided yet, but are not required as part of this PR
  • Add deprecation rerouting to read_stdout() if read_vpts() is used with parameters lat, lon, etc.

Unfortunately, I don't have the time to finish this. @iskandari would have the opportunity to complete this functionality? Ping me if you have any questions.

@peterdesmet peterdesmet requested review from iskandari and removed request for iskandari May 29, 2023 09:46
@peterdesmet peterdesmet mentioned this pull request May 29, 2023
@codecov
Copy link

codecov bot commented May 29, 2023

Codecov Report

Merging #590 (66cec50) into master (ca7e2f8) will increase coverage by 0.55%.
The diff coverage is 88.50%.

❗ Current head 66cec50 differs from pull request most recent head 12d81ad. Consider uploading reports for the commit 12d81ad to get more accurate results

@@            Coverage Diff             @@
##           master     #590      +/-   ##
==========================================
+ Coverage   81.53%   82.09%   +0.55%     
==========================================
  Files          59       61       +2     
  Lines        3537     3692     +155     
==========================================
+ Hits         2884     3031     +147     
- Misses        653      661       +8     
Impacted Files Coverage Δ
R/download_pvolfiles.R 93.84% <ø> (ø)
R/hooks.R 100.00% <ø> (ø)
R/regularize_vpts.R 86.56% <ø> (ø)
R/read_stdout.R 84.32% <84.32%> (ø)
R/utils.R 85.86% <85.71%> (+0.15%) ⬆️
R/as.vpts.R 93.10% <93.10%> (ø)
R/read_vpts.R 94.33% <93.87%> (+12.58%) ⬆️
R/as.data.frame.R 100.00% <100.00%> (ø)
R/calculate_vp.R 90.90% <100.00%> (+0.77%) ⬆️
R/sunrise_sunset.R 83.33% <100.00%> (ø)

@iskandari
Copy link
Collaborator

Absolutely!

@adokter adokter mentioned this pull request Jul 7, 2023
R/read_vpts.R Show resolved Hide resolved
inst/extdata/example_vpts.csv Outdated Show resolved Hide resolved
R/as.vpts.R Show resolved Hide resolved
tests/testthat/test-read_vpts.R Outdated Show resolved Hide resolved
@adokter adokter self-requested a review July 7, 2023 15:56
iskandari and others added 24 commits July 7, 2023 12:29
…into read_vpts"

This reverts commit 565a786, reversing
changes made to 2b29cc9.
)

# Check whether time series is regular
heights <- as.integer(unique(data[["height"]]))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why convert to integer? we may have situations where the height is a float

# Throw error if nrows per height are not identical

assertthat::assert_that(
isFactor(dim(data)[1], length(unique(data$height))) > 0,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we use lowercase throughout and separating words by _, so this should be rename is_factor. may be confusing though because factor is a specific data type in R, so would use a different word

@iskandari iskandari merged commit 18e931e into master Jul 7, 2023
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

frictionless schema obscures NaN values
5 participants