Skip to content

POSIXt: tzone and differences #660

@r2evans

Description

@r2evans

When I upload POSIXt that has no timezone set, it is interpreted differently on the round-trip query.

bqcon <- DBI::dbConnect(bigrquery::bigquery(), ...) # the connection works
ds <- bigrquery::bq_table(..., table = "r2test")
tbl <- tibble(bool=c(T,F,NA), int=c(0:1, NA), dbl=c(0, 1, NA), chr=c("Yay","Nay",NA),
              dat=c(rep(Sys.Date(),2),NA), psx=c(rep(Sys.time(), 2), NA)) |>
  mutate(psxdbl = as.numeric(psx))
bigrquery::bq_table_upload(ds, tbl)
tbl2 <- DBI::dbGetQuery(bqcon, "select * from r2test") 
# all tables are inserted _backwards_?
tbl2 <- select(tbl2, all_of(names(tbl)))
all.equal(tbl, tbl2, check.attributes = FALSE)
# [1] "Component “psx”: 'tzone' attributes are inconsistent ('' and 'UTC')" "Component “psx”: Mean absolute difference: 18000.06"                

tbl
# # A tibble: 3 × 7
#   bool    int   dbl chr   dat        psx                      psxdbl
#   <lgl> <int> <dbl> <chr> <date>     <dttm>                    <dbl>
# 1 TRUE      0     0 Yay   2026-01-06 2026-01-06 21:05:53 1767733553.
# 2 FALSE     1     1 Nay   2026-01-06 2026-01-06 21:05:53 1767733553.
# 3 NA       NA    NA <NA>  NA         NA                          NA 
tbl2
# # A tibble: 3 × 7
#   bool    int   dbl chr   dat        psx                      psxdbl
#   <lgl> <int> <int> <chr> <date>     <dttm>                    <dbl>
# 1 TRUE      0     0 Yay   2026-01-06 2026-01-06 16:05:53 1767733553.
# 2 FALSE     1     1 Nay   2026-01-06 2026-01-06 16:05:53 1767733553.
# 3 NA       NA    NA <NA>  NA         NA                          NA 
tbl$psx[1] - tbl2$psx[1]
# Time difference of 5.000018 hours

When I do the same thing but with "tzone" set on the data, it is different (and more correct, within 65 milliseconds).

DBI::dbExecute(bqcon, "drop table r2test")
attr(tbl$psx, "tzone") <- "UTC"
bigrquery::bq_table_upload(ds, tbl)
tbl3 <- DBI::dbGetQuery(bqcon, "select * from r2test") 
tbl3 <- select(tbl3, all_of(names(tbl)))
all.equal(tbl, tbl3, check.attributes = FALSE)
# [1] "Component “psx”: Mean absolute difference: 0.06479907"

tbl$psx[1] - tbl3$psx[1]
# Time difference of 0.06479907 secs
tbl$psxdbl[1] - tbl3$psxdbl[1]
# [1] -9.536743e-07

Problems:

  1. If attr(x, "tzone") is NULL on upload, the round-trip data is wrong.
  2. 64.8 milliseconds is a lot of error in the round-trip of POSIXt values, how can we control this? If dbl's precision is maintained through 6 digits, I'd hope that seconds would go to the same precision at least. This seems related to Need some way to control digits used by jsonlite #320, though the fact that it's been open five years with no discussion is certainly frustrating and concerning.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions