POC nullable field schema migration #Proof of Concept #70

Open
wants to merge 1 commit into
base: master
Aric1088
Contributor

@Aric1088 Aric1088 commented Aug 14, 2020

To solve schema incompatibility between bq-notifier versions, the following is proposed:

All fields in the BigQuery table are declared NULLABLE, except for essential fields such as ProjectID and ID.

On initial deployment, the bq-notifier checks the schema of the table whose URI is referenced in the notifier setup config:

Any inferred fields (from the bq-notifier schema) that are not found in the existing table's schema are appended, preserving every field already present in the existing table's schema.

Any inferred field that has the same name as a field in the existing table's schema but a different type will cause the setup to fail. This is a fail-safe that prevents bq-notifier from overwriting tables that may not have been initialized by the bq-notifier.
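The check described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Field` is a hypothetical, simplified stand-in for a BigQuery field schema, and `mergeSchemas` is an assumed helper name.

```go
package main

import "fmt"

// Field is a hypothetical, simplified stand-in for a BigQuery field schema.
type Field struct {
	Name     string
	Type     string // e.g. "STRING", "TIMESTAMP"
	Required bool   // false means NULLABLE
}

// mergeSchemas (assumed name) appends every field of inferred that is missing
// from existing, keeping all existing fields in their original order. It
// returns an error if a field with the same name has a different type.
func mergeSchemas(existing, inferred []Field) ([]Field, error) {
	byName := make(map[string]Field, len(existing))
	for _, f := range existing {
		byName[f.Name] = f
	}
	// Copy so the caller's slice is never modified.
	merged := append([]Field(nil), existing...)
	for _, f := range inferred {
		old, found := byName[f.Name]
		if !found {
			merged = append(merged, f)
			byName[f.Name] = f
			continue
		}
		if old.Type != f.Type {
			return nil, fmt.Errorf("field %q: existing type %s conflicts with inferred type %s",
				f.Name, old.Type, f.Type)
		}
	}
	return merged, nil
}

func main() {
	existing := []Field{
		{Name: "ID", Type: "STRING", Required: true},
		{Name: "Status", Type: "STRING"},
	}
	inferred := []Field{
		{Name: "ID", Type: "STRING", Required: true},
		{Name: "Tags", Type: "STRING"}, // new NULLABLE field
	}
	merged, err := mergeSchemas(existing, inferred)
	fmt.Println(merged, err)
}
```

Because the merge only ever appends, a field dropped from the inferred schema stays in the table schema, which is what keeps older records valid.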

This allows the following kinds of schema changes to occur between bq-notifier versions:

  1. New NULLABLE fields are added to the Schema.

The data in the table will migrate successfully to the new Schema. Since the new field is NULLABLE, past records that do not possess the field will still satisfy the new schema. Adding a new REQUIRED field, by contrast, would cause the table schema constraints to fail.

  2. Existing NULLABLE fields are removed from the Schema.

The data in the table will migrate successfully to the new Schema. Since schema migrations only ever append to the existing table's schema, the "removed" field will remain in the table schema. New data will simply omit the removed field, while existing data containing the field will remain intact.

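Adding or removing REQUIRED fields will break the schema constraints during migrations: existing records lack a newly added REQUIRED field, and new records omit a removed one that the append-only table schema still requires.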
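To illustrate why NULLABLE changes are safe and REQUIRED changes are not, here is a small sketch. It is a simplified model under stated assumptions, not BigQuery's actual validation: `Field`, `missingRequired`, and the `Tags` field are hypothetical names used only for this example.

```go
package main

import "fmt"

// Field is a hypothetical, simplified schema field: a name and a mode.
type Field struct {
	Name     string
	Required bool // false means NULLABLE
}

// missingRequired (assumed name) returns the REQUIRED fields of schema that
// are absent from row or set to nil. An empty result means the row satisfies
// the schema's REQUIRED constraints in this simplified model.
func missingRequired(schema []Field, row map[string]interface{}) []string {
	var missing []string
	for _, f := range schema {
		if !f.Required {
			continue // NULLABLE fields may always be omitted
		}
		if v, ok := row[f.Name]; !ok || v == nil {
			missing = append(missing, f.Name)
		}
	}
	return missing
}

func main() {
	// A record written before the hypothetical Tags field existed.
	oldRow := map[string]interface{}{"ID": "b-1", "ProjectID": "p-1"}

	base := []Field{{Name: "ID", Required: true}, {Name: "ProjectID", Required: true}}
	withNullable := append(append([]Field(nil), base...), Field{Name: "Tags"})
	withRequired := append(append([]Field(nil), base...), Field{Name: "Tags", Required: true})

	fmt.Println(missingRequired(withNullable, oldRow)) // nothing missing
	fmt.Println(missingRequired(withRequired, oldRow)) // Tags is missing
}
```

The same function covers the removal case: a new record that omits a removed NULLABLE field still passes against the append-only table schema, while one that omits a formerly REQUIRED field does not.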

@Aric1088 Aric1088 requested review from LOZORD and jessieliu1 August 14, 2020 17:16