Unable to access newly created columns in BigQuery table schema

Bigrquery seems to assume that the table schema won't change, which can cause issues. The reprex demonstrates that if you create a table in BigQuery and then add a column, you can't access the new column using dplyr syntax. I'm told that this is due to bigrquery getting the schema from a cache. 

We have a couple of work arounds, but neither are ideal:
1. If you disconnect the connection, wait up to a minute or so and then reconnect, it can see the new column. 
2. Redefining `bq_table_meta` (https://github.com/r-dbi/bigrquery/blob/main/R/bq-table.R) as below works but has been inconsistent when I have tested it. Sometimes just waiting the minute or so works, or sometimes I have had to run `cars_new_col_qu2 = tbl(con, cars_table)` in the reprex twice to get it to work.

```r
bq_table_meta <- function (x, fields = NULL) {
    x <- as_bq_table(x)
    url <- bq_path(x$project, x$dataset, x$table)
    bq_get(url, useQueryCache = FALSE, query = list(fields = fields))
}
``` 

Reprex (uses the standard bigrquery `bq_table_meta`, not the custom one above):
``` r
library(DBI)
library(bigrquery)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(dbplyr)
#> 
#> Attaching package: 'dbplyr'
#> The following objects are masked from 'package:dplyr':
#> 
#>     ident, sql
library(glue)

# Replace xxxx with your values
project = "xxxx"
dataset = "xxxx"
test_id = round(as.numeric(Sys.time()))
cars_table = glue("cars_test_{test_id}")

con = dbConnect(
  bigquery(),
  project = project,
  dataset = dataset,
  billing = project
  )

# Create table in database
cars_df = as.data.frame(mtcars)
dbWriteTable(con, cars_table, cars_df)

# Verify it is there
cars_qu = tbl(con, cars_table)
colnames(cars_qu)
#>  [1] "am"   "carb" "vs"   "qsec" "wt"   "drat" "disp" "hp"   "cyl"  "gear"
#> [11] "mpg"

# Make a new column and write to the table
cars_new_col_qu = cars_qu |> 
  mutate(new_col = cyl / mpg)

cars_new_col_qu_string = sql_render(cars_new_col_qu)
dbExecute(con, glue("create or replace table {cars_table} as {cars_new_col_qu_string}"))
#> [1] 0

# Query the table with the new column (this is the same query as cars_qu)
cars_new_col_qu2 = tbl(con, cars_table) 

# Doesn't show new_col
colnames(cars_new_col_qu2)
#>  [1] "am"   "carb" "vs"   "qsec" "wt"   "drat" "disp" "hp"   "cyl"  "gear"
#> [11] "mpg"

# Gives an error because new_col isn't in the schema
cars_new_col_qu3 = cars_new_col_qu2 |> 
  select(mpg, cyl, new_col) 
#> Error in `select()`:
#> ! Can't subset columns that don't exist.
#> ✖ Column `new_col` doesn't exist.

# But it does exist which we can see by querying the table directly
cars_table_schema = dbColumnInfo(dbSendQuery(con, glue("select * from `{project}.{dataset}.{cars_table}` limit 0")))
print(cars_table_schema)
#>       name    type
#> 1       am INTEGER
#> 2     carb INTEGER
#> 3       vs INTEGER
#> 4     qsec   FLOAT
#> 5       wt   FLOAT
#> 6     drat   FLOAT
#> 7     disp   FLOAT
#> 8       hp INTEGER
#> 9      cyl INTEGER
#> 10    gear INTEGER
#> 11     mpg   FLOAT
#> 12 new_col   FLOAT

# Clean up
dbExecute(con, glue("drop table {cars_table}"))
#> [1] 0
```

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Unable to access newly created columns in BigQuery table schema #659

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Unable to access newly created columns in BigQuery table schema #659

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions