Generate BigQuery SQL views from JSON
This CLI tool reads a JSON file and produces BigQuery compatible SQL views from the given JSON. It assumes you have a JSON blob column in a BigQuery dataset that you will reference with the created view.
npx @nealwp/blobview@latest [options] <filepath>
Arguments:
filepath path to valid JSON file
Options:
-t TABLE, --table=TABLE specify a table name to use in FROM clause. default: "<table>"
-d DATASET, --dataset=DATASET specify a dataset to use in FROM clause. default: "<dataset>"
-h, --help show help
Default output to STDOUT:
npx @nealwp/blobview ./path/to/file.json
Dataset and table as input options:
npx @nealwp/blobview --dataset=myDataset --table=myTable ./path/to/file.json
# shorthand options
npx @nealwp/blobview -d myDataset -t myTable ./path/to/file.json
Redirect output to file:
npx @nealwp/blobview@latest ./path/to/file.json > my-view-file.sql
- Produces valid BigQuery syntax SQL
- Creates separate query for each nested object
- Detects STRING, DECIMAL, and INTEGER types from JSON data and casts column datatypes accordingly
- Detects GeoJSON Feature Collection type and formats to JSON string
- Auto-formats column names to snake_case from camelCase and PascalCase
- Detects deeply-nested objects and formats to JSON string
- Pre-populates FROM clause with BigQuery-style placeholders
- BigQuery dataset and table name can be supplied as input options
- Does not detect DATE or TIMESTAMP types, or other types like BOOLEAN
- Arrays will get formatted as JSON strings
- Assumes the BigQuery source dataset column name is always
json_blob
- Does not create SQL views in any syntax other than BigQuery
- Requires a local JSON file to read
- Does not include option to write queries to separate files instead of STDOUT
- BigQuery project name cannot be supplied as input
// sample-data.json
{
"stringField": "any string goes here",
"integerField": 1234,
"decimalField": 3.1234,
"childField1": {
"gender": "male",
"latitude": 32.667598
},
"childField2": {
"favoriteFruit": "banana",
"longitude": 4.219472
},
"childWithNestedObject": {
"isNormal": "i sure hope so",
"nestedObject": {
"favoritePhilosopher": "Kant",
"shoeSize": 14.5
},
},
"examplGeoJson": {
"type": "FeatureCollection",
"features": [
{
"type": "Feature",
"id": 16,
"geometry": {
"type": "Polygon",
"coordinates": [
[
[-32.667598, -4.239272],
[-32.767999, -4.319272]
]
]
}
}
]
}
}
# terminal command
npx @nealwp/blobview sample-data.json
The above command will produce the following output:
/* stdout */
SELECT
CAST(JSON_VALUE(json_blob.stringField) as STRING) as string_field
, CAST(JSON_VALUE(json_blob.integerField) as INTEGER) as integer_field
, CAST(JSON_VALUE(json_blob.decimalField) as DECIMAL) as decimal_field
, TO_JSON_STRING(json_blob.exampleGeoJson) as example_geo_json
FROM <project>.<dataset>.<table>
/**/
SELECT
CAST(JSON_VALUE(json_blob.childField1.gender) as STRING) as gender
, CAST(JSON_VALUE(json_blob.childField1.latitude) as DECIMAL) as latitude
FROM <project>.<dataset>.<table>
/**/
SELECT
CAST(JSON_VALUE(json_blob.childField2.favoriteFruit) as STRING) as favorite_fruit
, CAST(JSON_VALUE(json_blob.childField2.longitude) as DECIMAL) as longitude
FROM <project>.<dataset>.<table>
/**/
SELECT
CAST(JSON_VALUE(json_blob.childWithNestedObject.isNormal) as STRING) as is_normal
, TO_JSON_STRING(json_blob.childWithNestedObject.nestedObject) as nested_object
FROM <project>.<dataset>.<table>
# terminal command with input options
npx @nealwp/blobview --dataset=myDataset --table=myTable sample-data.json
Will produce the following output:
/* stdout */
SELECT
CAST(JSON_VALUE(json_blob.stringField) as STRING) as string_field
, CAST(JSON_VALUE(json_blob.integerField) as INTEGER) as integer_field
, CAST(JSON_VALUE(json_blob.decimalField) as DECIMAL) as decimal_field
, TO_JSON_STRING(json_blob.exampleGeoJson) as example_geo_json
FROM <project>.myDataset.myTable
/**/
SELECT
CAST(JSON_VALUE(json_blob.childField1.gender) as STRING) as gender
, CAST(JSON_VALUE(json_blob.childField1.latitude) as DECIMAL) as latitude
FROM <project>.myDataset.myTable
/**/
SELECT
CAST(JSON_VALUE(json_blob.childField2.favoriteFruit) as STRING) as favorite_fruit
, CAST(JSON_VALUE(json_blob.childField2.longitude) as DECIMAL) as longitude
FROM <project>.myDataset.myTable
/**/
SELECT
CAST(JSON_VALUE(json_blob.childWithNestedObject.isNormal) as STRING) as is_normal
, TO_JSON_STRING(json_blob.childWithNestedObject.nestedObject) as nested_object
FROM <project>.myDataset.myTable