Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

skip test that uses vcr if casssette is not UTF-8 #99

Open
sckott opened this issue Apr 16, 2019 · 4 comments
Open

skip test that uses vcr if casssette is not UTF-8 #99

sckott opened this issue Apr 16, 2019 · 4 comments

Comments

@sckott
Copy link
Collaborator

sckott commented Apr 16, 2019

Arose from recent realization that strings in cassettes, any string in the cassette (not just the response body), that are not UTF-8 can cause problems on some platforms, e.g.,

  • rcrossref (with commit 7ac23d58eca2e3a125e6dc0879d1350730097041)
    • tools::showNonASCIIfile("tests/fixtures/cr_citation_count.yml")
    • tools::showNonASCIIfile("tests/fixtures/cr_cn_different_base_urls.yml")
  • rbison (with commit ba1241005fce12f555f3902bbecfcca3dec8294c)
    • tools::showNonASCIIfile("tests/fixtures//bison.yml")
  • rdatacite (with commit 2f409f9c56a0a6eba5ac53246c7bce153c6fe552)
    • tools::showNonASCIIfile("tests/fixtures/vcr_cassettes/dc_search_csv.yml")

Options:

  • include helper fxn exported from this pkg to skip tests with vcr usage, e.g., skip_vcr_non_utf8()
  • no fxn, but document the issue and examples of how to fix
  • vcr exported fxn that behaves similar to the privacy key filtering tool to fix utf-8 strings in cassettes prior to reading them if possible? or something like that
@sckott
Copy link
Collaborator Author

sckott commented May 3, 2019

I think best approach is to take the string from the cassette and replace any non ascii strings with unicode equivalent, or if none found, then drop the non-ascii, or something else? ideally would use stringi, but the pkg is so heavy so not sure want to import it. but could have as a Suggest and if not installed, then do something with base R. I like the idea of a filter before loading into yaml because then we don't touch the real request/response

probably happen on the line after this or so https://github.com/ropensci/vcr/blob/master/R/serializers-yaml.R#L76

maybe:

text_cleaner <- function(x) {
    if (requireNamespace("stringi")) {
        stringi::stri_escape_unicode(x)
    } else {
        # some other w/o stringi
    }
}

@sckott
Copy link
Collaborator Author

sckott commented May 3, 2019

another reason to get that JSON serializer done sooner than later, it may help avoid these yaml pkg problems

@maelle
Copy link
Member

maelle commented Nov 27, 2020

another reason to get that JSON serializer done sooner than later, it may help avoid these yaml pkg problems

It seems so at least for my encoding problems. :-D

@sckott
Copy link
Collaborator Author

sckott commented Nov 30, 2020

right. it'd be nice to have a better yaml pkg, but its easier here to just recommend people use json if they have any issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants