Sheets un-even columns range_speedread incorrect #309

HugoGit39 · 2023-12-28T14:05:55Z

Hi

I have a large Google Sheet with uneven column lengths.

When I use range_speedread it doesnt read the last columns correct. Why?

See this example:

Google sheets:

https://docs.google.com/spreadsheets/d/1t2JYOCsCvK05Layi3loXFWwCBGXjXNUDMUAi9xudiHg/

test <- range_speedread(as_id("1t2JYOCsCvK05Layi3loXFWwCBGXjXNUDMUAi9xudiHg"), show_col_types = FALSE)


jennybc · 2024-01-15T18:54:53Z

I'm not entirely sure what you mean by "doesnt read the last columns correct".

But I think you're just noticing the trickiness of column type guessing in the presence of lots of missing data?

The docs for range_speedread() outline various gotchas of this function and point out that, ultimately, readr::read_csv() gets used.

You can read about readr's column type guessing here:

https://readr.tidyverse.org/articles/column-types.html
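The effect can be reproduced locally, without the Sheet. Below is a minimal sketch, assuming readr 2.x is installed. The CSV text is made up for illustration (column `b` is blank in all but the last of 1,500 rows), so it only mimics the shape of the real data.

```r
library(readr)

# Made-up data: column b is blank for 1,499 rows, then holds a single number
csv_text <- paste(
  c("a,b", paste0(seq_len(1499), ","), "1500,10.2"),
  collapse = "\n"
)

# Default guess_max is 1000, so the type of b is guessed from blank cells only
default_guess <- read_csv(I(csv_text), show_col_types = FALSE)

# Inspect the guessed type of b and what ended up in the last row
class(default_guess$b)
tail(default_guess, 1)

# List any values that could not be parsed under the guessed type
problems(default_guess)
```

If `b` comes back as something other than a double and the 10.2 in the last row is lost, that is likely the same symptom as with the Sheet.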

But one solution for this dataset is just to instruct readr to use all the rows to guess column type, instead of the first 1000.

library(googlesheets4)
gs4_deauth()

test2 <- range_speedread(
  "1t2JYOCsCvK05Layi3loXFWwCBGXjXNUDMUAi9xudiHg",
  guess_max = Inf
)
#> ✔ Reading from "Test".
#> ℹ Export URL:
#>   <https://docs.google.com/spreadsheets/d/1t2JYOCsCvK05Layi3loXFWwCBGXjXNUDMUAi9xudiHg/export?format=csv>
#> Rows: 4741 Columns: 7
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> dbl (7): X0, X1, X2, X3, X4, X5, X6
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
test2
#> # A tibble: 4,741 × 7
#>       X0    X1    X2    X3    X4    X5    X6
#>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#>  1 0.3      NA    NA    NA    NA    NA    NA
#>  2 0.300    NA    NA    NA    NA    NA    NA
#>  3 0.295    NA    NA    NA    NA    NA    NA
#>  4 0.299    NA    NA    NA    NA    NA    NA
#>  5 0.299    NA    NA    NA    NA    NA    NA
#>  6 0.298    NA    NA    NA    NA    NA    NA
#>  7 0.32     NA    NA    NA    NA    NA    NA
#>  8 0.323    NA    NA    NA    NA    NA    NA
#>  9 0.323    NA    NA    NA    NA    NA    NA
#> 10 0.327    NA    NA    NA    NA    NA    NA
#> # ℹ 4,731 more rows
tail(test2)
#> # A tibble: 6 × 7
#>       X0    X1    X2    X3    X4      X5    X6
#>    <dbl> <dbl> <dbl> <dbl> <dbl>   <dbl> <dbl>
#> 1 42272.  NA    NA    NA   NA         NA  NA  
#> 2 43605.  NA    NA    NA   NA         NA  NA  
#> 3 43870.  NA    NA    NA   NA         NA  NA  
#> 4 44010.  10.2  10.8   7.3  7.54 9193097  42.9
#> 5 43769.  NA    NA    NA   NA         NA  NA  
#> 6 43098.  NA    NA    NA   NA         NA  NA

Created on 2024-01-15 with reprex v2.1.0.9000
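Another option, hinted at by the "Specify the column types" message in the output above, is to skip guessing altogether. The sketch below assumes every column should be a double, as the column specification above suggests, and uses made-up local CSV text rather than the Sheet. Since the docs say range_speedread() ultimately uses readr::read_csv(), the same `col_types` argument should presumably pass through, but that is not verified here.

```r
library(readr)

# Made-up data shaped like the Sheet: X1 is blank until the very last row
csv_text <- paste(
  c("X0,X1", paste0(seq_len(1499), ","), "1500,10.2"),
  collapse = "\n"
)

# Declare every column as double up front, so no type guessing takes place
typed <- read_csv(
  I(csv_text),
  col_types = cols(.default = col_double()),
  show_col_types = FALSE
)

# The late value in X1 is kept as a number
tail(typed, 1)
```

Compared with `guess_max = Inf`, this avoids scanning every row to guess types, at the cost of having to know the intended types in advance.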

HugoGit39 · 2024-01-25T20:55:39Z

Thank you!

jennybc closed this as completed Jan 15, 2024
