Skip to content

spanner-truncate is a tool to delete all rows from the tables in a Cloud Spanner database without deleting tables themselves.

License

Notifications You must be signed in to change notification settings

cloudspannerecosystem/spanner-truncate

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

47 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

spanner-truncate

run-tests

Overview

spanner-truncate is a tool to delete all rows from the tables in a Cloud Spanner database without deleting tables themselves.

Please feel free to report issues and send pull requests, but note that this application is not officially supported as part of the Cloud Spanner product.

Use cases

  • Delete rows from the database while keeping the underlying splits, which are typically for database pre-warming before the launch.
  • Delete rows from the database without requiring strong IAM permissions for deleting tables or databases.

Motivation

At a glance deleting all rows from the database looks an easy task, but there are several issues we could encounter when we want to delete rows from the real-world databases.

  • If the table size is huge, a simple DELETE statement like DELETE FROM table WHERE true could easily exceed the transaction mutation limit.
  • Rows in interleaved tables which have PARENT ON DELETE NO ACTION must be deleted first before deleting the rows from the parent table, otherwise it will cause a constraint violation error.
  • Rows in the tables which reference other tables with FOREIGN KEY constraints must be deleted first before deleting rows in the referenced tables, otherwise it will cause a constraint violation error.
  • It would take a lot of time if we delete rows from the tables one by one.

How this tool works

To solve the preceding issues, this tool works as follows.

  • Use Partitioned DML to delete all rows from the table to overcome the single transaction mutation limit.
  • Delete rows from multiple tables in parallel to minimize the total time for deletion.
  • Automatically discover the constraints between tables and delete rows from the tables in proper order without violating database constraints.

Limitations

  • This tool does not guarantee the atomicity of deletion. If you access the rows that are being deleted, you will get the inconsistent view of the database.
  • This tool does not delete rows which were inserted while the tool was running.
  • This tool does not support truncating tables that use foreign key constraints in some scenarios:
    • If there is a circular dependency among the tables, truncation will be failed.
    • If --tables is used for a table that is referenced by other tables with ON DELETE CASCADE, such tables will also be truncated.
    • If --exclude-tables is used only for the referencing table that has ON DELETE CASCADE, that table will be truncated by cascade-deletion of the referenced table.

Install

GO111MODULE=on go get github.com/cloudspannerecosystem/spanner-truncate

How to use

Usage:
  spanner-truncate [OPTIONS]

Application Options:
  -p, --project=  (required) GCP Project ID. [$SPANNER_PROJECT_ID]
  -i, --instance= (required) Cloud Spanner Instance ID. [$SPANNER_INSTANCE_ID]
  -d, --database= (required) Cloud Spanner Database ID. [$SPANNER_DATABASE_ID]
  -q, --quiet     Disable all interactive prompts.
  -t, --tables=   Comma separated table names to be truncated. Default to truncate all tables if not specified. If an interleaved table is specified, its descendants tables are also truncated.
  -e, --exclude-tables Comma separated table names to be exempted from truncating. 'tables' and 'exclude-tables' cannot co-exist. If an interleaved table is specified, its ancestors tables are also excluded.
Help Options:
  -h, --help      Show this help message

Example:

$ spanner-truncate -p myproject -i myinstance -d mydb
Fetching table information from projects/myproject/instances/myinstance/databases/mydb
Albums
Concerts
Singers
Songs

Rows in these tables will be deleted. Do you want to continue? [Y/n] Y
Concerts: completed    13s [============================================>] 100% (1,200 / 1,200)
Singers:  completed    13s [============================================>] 100% (6,000 / 6,000)
Albums:   completed    12s [============================================>] 100% (1,800 / 1,800)
Songs:    completed    11s [============================================>] 100% (3,600 / 3,600)

Done! All rows have been deleted successfully.

Import as a Go package

You can also use spanner-truncate as a Go library from your Go application. The entry point is Run function in truncate package. If you have some subsequent processes using a client, you can use RunWithClient. You can pass the externally generated client to the function and avoids the use of redundant clients.

About

spanner-truncate is a tool to delete all rows from the tables in a Cloud Spanner database without deleting tables themselves.

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages