feat(bigtable): Hot backups #11215

Open · wants to merge 5 commits into main
Conversation

@bhshkh (Contributor) commented Dec 3, 2024

Fixes: b/365774730

More about hot backups: https://cloud.google.com/bigtable/docs/backups#hot-backups

Generated protos have 2 new fields in the Backup struct. These fields are being added to various methods in this PR:

// Indicates the backup type of the backup.
BackupType Backup_BackupType `protobuf:"varint,11,opt,name=backup_type,json=backupType,proto3,enum=google.bigtable.admin.v2.Backup_BackupType" json:"backup_type,omitempty"`
// The time at which the hot backup will be converted to a standard backup.
// Once the `hot_to_standard_time` has passed, Cloud Bigtable will convert the
// hot backup to a standard backup. This value must be greater than the backup
// creation time by:
// - At least 24 hours
//
// This field only applies for hot backups. When creating or updating a
// standard backup, attempting to set this field will fail the request.
HotToStandardTime *timestamppb.Timestamp `protobuf:"bytes,12,opt,name=hot_to_standard_time,json=hotToStandardTime,proto3" json:"hot_to_standard_time,omitempty"`


To create a hot backup that expires after 4 days and is converted to a standard backup after 2 days:

hotToStdTime := time.Now().Add(2 * 24 * time.Hour)
adminClient.CreateBackupWithOptions(ctx, "TableID", "Cluster", "BkpName",
		time.Now().Add(24 * 4 * time.Hour), HotToStandardBackup(hotToStdTime))

or, without specifying HotToStandardTime (i.e. the hot backup never gets converted to a standard backup):

adminClient.CreateBackupWithOptions(ctx, "TableID", "Cluster", "BkpName",
		time.Now().Add(24 * 4 * time.Hour), HotBackup())

To update the hot_to_standard_time of a hot backup:

hotToStdTime := time.Now().Add(2 * 24 * time.Hour)
adminClient.UpdateBackupHotToStandardTime(ctx, "Cluster", "BkpName", &hotToStdTime)

To remove the hot_to_standard_time of a hot backup:

adminClient.UpdateBackupHotToStandardTime(ctx, "Cluster", "BkpName", nil)

@bhshkh bhshkh requested review from a team as code owners December 3, 2024 20:22
@bhshkh changed the title from "feat(bigtable): hot backups" to "feat(bigtable): Hot backups" on Dec 3, 2024
@product-auto-label product-auto-label bot added the api: bigtable Issues related to the Bigtable API. label Dec 4, 2024
Comment on lines 2191 to 2193
func HotBackup(hotToStandardTime *time.Time) BackupOption {
	return hotBackupOption{htsTime: hotToStandardTime}
}
Contributor:

Why is this option provided as a pointer instead of passing in a value and converting it to a reference inside the library?

Contributor (Author):

hotToStandardTime is optional here, same as in the protos:

// Indicates the backup type of the backup.
BackupType Backup_BackupType `protobuf:"varint,11,opt,name=backup_type,json=backupType,proto3,enum=google.bigtable.admin.v2.Backup_BackupType" json:"backup_type,omitempty"`
// The time at which the hot backup will be converted to a standard backup.
// Once the `hot_to_standard_time` has passed, Cloud Bigtable will convert the
// hot backup to a standard backup. This value must be greater than the backup
// creation time by:
// - At least 24 hours
//
// This field only applies for hot backups. When creating or updating a
// standard backup, attempting to set this field will fail the request.
HotToStandardTime *timestamppb.Timestamp `protobuf:"bytes,12,opt,name=hot_to_standard_time,json=hotToStandardTime,proto3" json:"hot_to_standard_time,omitempty"`

Contributor:

I've also been trying to come up with ways to represent an optional Time parameter here, even looking at https://stackoverflow.com/questions/30716354/how-do-i-do-a-literal-int64-in-go for ideas. You could do something like HotBackup(time.Time) plus another backup option like HotBackupNil(). Not sure it's the cleanest solution to this problem, but I'd also like to hear from others.

Contributor (Author):

Changed the options to HotToStandardBackup and HotBackup. Open to other naming suggestions.

Contributor:

I'd leave the final decision on the names to the Bigtable team, but setting hot_to_standard_time in the proto means setting the time at which the backup gets converted, and doing a hot backup without a provided value means it never gets converted, right?

In that case I'd rename the two to something like HotBackupWithConversion(hotToStandardTime time.Time) and HotBackupWithoutConversion(), as it's more clear to the user what those options are doing.

@@ -2371,6 +2474,29 @@ func (ac *AdminClient) UpdateBackup(ctx context.Context, cluster, backup string,
return err
}

// UpdateBackupHotToStandardTime updates the HotToStandardTime of a hot backup.
func (ac *AdminClient) UpdateBackupHotToStandardTime(ctx context.Context, cluster, backup string, hotToStandardTime *time.Time) error {
Contributor:

Could we use HotBackupOptions here to eliminate another *time.Time parameter?

Contributor (Author):

Having the same options for create and update could complicate this later.
Updated the PR to add 2 separate methods, similar to an existing implementation:

// UpdateTableDisableChangeStream updates a table to disable change stream for table ID.
func (ac *AdminClient) UpdateTableDisableChangeStream(ctx context.Context, tableID string) error {
	req, err := ac.newUpdateTableRequestProto(tableID)
	if err != nil {
		return err
	}
	req.UpdateMask.Paths = []string{changeStreamConfigFieldMask}
	return ac.updateTableAndWait(ctx, req)
}

// UpdateTableWithChangeStream updates a table with the given table ID and change stream config.
func (ac *AdminClient) UpdateTableWithChangeStream(ctx context.Context, tableID string, changeStreamRetention ChangeStreamRetention) error {
	req, err := ac.newUpdateTableRequestProto(tableID)
	if err != nil {
		return err
	}
	req.UpdateMask.Paths = []string{changeStreamConfigFieldMask + "." + retentionPeriodFieldMaskPath}
	req.Table.ChangeStreamConfig = &btapb.ChangeStreamConfig{}
	req.Table.ChangeStreamConfig.RetentionPeriod = durationpb.New(changeStreamRetention.(time.Duration))
	return ac.updateTableAndWait(ctx, req)
}

@DerekLeeCS left a comment:

overall logic looks good, just have a few suggestions for the implementation

@@ -2150,9 +2150,60 @@ func (ac *AdminClient) RestoreTableFrom(ctx context.Context, sourceInstance, tab
return longrunning.InternalNewOperation(ac.lroClient, op).Wait(ctx, &resp)
}

type backupOptions struct {


can we include expireTime in backupOptions?

IMO it seems more intuitive, b/c right now backupOptions is only for the hot backup fields

Contributor (Author):

ExpireTime is a required field, not an optional one. Adding it to backupOptions would indicate otherwise:

// Required. The expiration time of the backup.
// When creating a backup or updating its `expire_time`, the value must be
// greater than the backup creation time by:
// - At least 6 hours
// - At most 90 days
//
// Once the `expire_time` has passed, Cloud Bigtable will delete the backup.
ExpireTime *timestamppb.Timestamp `protobuf:"bytes,3,opt,name=expire_time,json=expireTime,proto3" json:"expire_time,omitempty"`


yeah, we can rename backupOptions if that's the issue

my suggestion was to have all configurable fields for a backup in a single struct

Contributor:

It's not unreasonable to subsume expiration time into the set of variadic options if you explicitly validate that backupOptions has the necessary fields in CreateWithOptions.

The other way forward might be if there's a reasonable default expiration (7d, etc.). However, in that case I'd probably recommend that the service make those behavioral choices via an optional field and document the defaults. Otherwise you end up with a potential consistency problem across client implementations.

// Once the `hot_to_standard_time` has passed, Cloud Bigtable will convert the
// hot backup to a standard backup. This value must be greater than the backup
// creation time by at least 24 hours.
type hotBackupOption struct {


do we need these hotBackupOption lines and functions?

I'm wondering how useful these will be, as opposed to a user creating their own backupOptions to create a hot backup

Contributor (Author):

hotBackupOption is a private struct and not visible to the user. Having it simplifies the user experience. To create a hot backup, the user can just do the following instead of having to create their own struct:

adminClient.CreateBackupWithOptions(ctx, "TableID", "Cluster", "BkpName",
		time.Now().Add(24 * 4 * time.Hour), HotBackup())

This is the common pattern followed across google-cloud-go libraries for optional parameters, e.g.:

// WithBackoff allows configuration of the backoff timing used for retries.
// Available configuration options (Initial, Max and Multiplier) are described
// at https://pkg.go.dev/github.com/googleapis/gax-go/v2#Backoff. If any fields
// are not supplied by the user, gax default values will be used.
func WithBackoff(backoff gax.Backoff) RetryOption {
	return &withBackoff{
		backoff: backoff,
	}
}

type withBackoff struct {
	backoff gax.Backoff
}

func (wb *withBackoff) apply(config *retryConfig) {
	config.backoff = &wb.backoff
}

// WithMaxAttempts configures the maximum number of times an API call can be made
// in the case of retryable errors.
// For example, if you set WithMaxAttempts(5), the operation will be attempted up to 5
// times total (initial call plus 4 retries).
// Without this setting, operations will continue retrying indefinitely
// until either the context is canceled or a deadline is reached.
func WithMaxAttempts(maxAttempts int) RetryOption {
	return &withMaxAttempts{
		maxAttempts: maxAttempts,
	}
}

type withMaxAttempts struct {
	maxAttempts int
}

func (wb *withMaxAttempts) apply(config *retryConfig) {
	config.maxAttempts = &wb.maxAttempts
}

// RetryPolicy describes the available policies for which operations should be
// retried. The default is `RetryIdempotent`.
type RetryPolicy int

const (
	// RetryIdempotent causes only idempotent operations to be retried when the
	// service returns a transient error. Using this policy, fully idempotent
	// operations (such as `ObjectHandle.Attrs()`) will always be retried.
	// Conditionally idempotent operations (for example `ObjectHandle.Update()`)
	// will be retried only if the necessary conditions have been supplied (in
	// the case of `ObjectHandle.Update()` this would mean supplying a
	// `Conditions.MetagenerationMatch` condition is required).
	RetryIdempotent RetryPolicy = iota

	// RetryAlways causes all operations to be retried when the service returns a
	// transient error, regardless of idempotency considerations.
	RetryAlways

	// RetryNever causes the client to not perform retries on failed operations.
	RetryNever
)

// WithPolicy allows the configuration of which operations should be performed
// with retries for transient errors.
func WithPolicy(policy RetryPolicy) RetryOption {
	return &withPolicy{
		policy: policy,
	}
}

type withPolicy struct {
	policy RetryPolicy
}

func (ws *withPolicy) apply(config *retryConfig) {
	config.policy = ws.policy
}


alright, I'll defer to the readability reviewer

return ac.createBackup(ctx, table, cluster, backup, expireTime, opts...)
}

func (ac *AdminClient) createBackup(ctx context.Context, table, cluster, backup string, expireTime time.Time, opts ...BackupOption) error {


if we go with adding expireTime to backupOptions, then we can have the original CreateBackup (that takes expireTime as an arg) create its own backupOptions and pass it to this function

and then we won't need expireTime as an arg to this function

Contributor (Author):

ExpireTime is a required field, not an optional one. Adding it to backupOptions would indicate otherwise:

// Required. The expiration time of the backup.
// When creating a backup or updating its `expire_time`, the value must be
// greater than the backup creation time by:
// - At least 6 hours
// - At most 90 days
//
// Once the `expire_time` has passed, Cloud Bigtable will delete the backup.
ExpireTime *timestamppb.Timestamp `protobuf:"bytes,3,opt,name=expire_time,json=expireTime,proto3" json:"expire_time,omitempty"`

@bhshkh bhshkh enabled auto-merge (squash) December 10, 2024 20:52
}

// BackupOption can be used to specify parameters for backup operations.
type BackupOption interface {
Contributor:

I think my main question about the option style here is how complex you expect the options to become. I see only a single option with two behavior fields, so struct-based options that satisfy the apply() contract may be more than you need.

By way of comparison, if the contract is simplified to the core backupOptions and some funcs that can modify it:

type BackupOption func(*backupOptions)

func WithHotBackupOption(t time.Time) BackupOption {
  return func(bo *backupOptions) {
     bo.backupType = blah
     bo.hotToStandardTime = blah
  }
}

func WithExpiration(t time.Time) BackupOption {
  return func(bo *backupOptions) {
     bo.expireTime = blah
  }
}

