-
Notifications
You must be signed in to change notification settings - Fork 14
Conversation
Container string // Example: trainer, loader | ||
Name string // Example: model, model-saved, data | ||
Mounts []BucketMount // Example: model, data, logs | ||
ReadOnly bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this type correspond to a single fs mount or all of them on the container? Mounts
has me think it captures all of them but ReadOnly
should probably only apply to a single mount point and not all right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is why it works, only if dataset is nil `r.Cloud.MountBucket gets called another time with a single additional single BucketMount which adds a readonly mount for dataset. See example below:
if err := r.Cloud.MountBucket(&job.Spec.Template.ObjectMeta, &job.Spec.Template.Spec, model, cloud.MountBucketConfig{
Name: "model",
Mounts: []cloud.BucketMount{
{BucketSubdir: "model", ContentSubdir: "model"},
{BucketSubdir: "logs", ContentSubdir: "logs"},
},
Container: containerName,
ReadOnly: false,
}); err != nil {
return nil, fmt.Errorf("mounting model: %w", err)
}
if dataset != nil {
if err := r.Cloud.MountBucket(&job.Spec.Template.ObjectMeta, &job.Spec.Template.Spec, dataset, cloud.MountBucketConfig{
Name: "dataset",
Mounts: []cloud.BucketMount{
{BucketSubdir: "data", ContentSubdir: "data"},
},
Container: containerName,
ReadOnly: true,
}); err != nil {
return nil, fmt.Errorf("mounting dataset: %w", err)
}
}
if baseModel != nil {
if err := r.Cloud.MountBucket(&job.Spec.Template.ObjectMeta, &job.Spec.Template.Spec, baseModel, cloud.MountBucketConfig{
Name: "basemodel",
Mounts: []cloud.BucketMount{
{BucketSubdir: "model", ContentSubdir: "saved-model"},
},
Container: containerName,
ReadOnly: true,
}); err != nil {
return nil, fmt.Errorf("mounting base model: %w", err)
}
}
I do think it's a bit confusing since both Brandon and I were similarly confused by this.
edit: Below was my original response:
nice catch. I think ReadOnly needs to move to BucketMount type. Otherwise how would you express this?
model(readonly=False), model-saved(readonly=True), data(readonly=True)
The code further down makes me believe this might be a bug indeed: https://github.com/substratusai/substratus/pull/109/files#diff-a1c985bf4c51fa451b98fc1431485adc3910b38bf897053cd4a881323083008fR66
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GCSFuse allows for specifying readonly on an entire volume. The reason why I have multiple .volumes
being appended is: I wanted to respect the bucket url reported by a given object instead of assuming that all objects in a given cluster are stored in the same bucket. One nice side effect of this: a Dataset object could eventually be imported into a GKE cluster that references a dataset from S3.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TLDR on rational: Every time MountBucket is called it is given an Object to mount from. If this Object has a .status.artifacts.url
, MountBucket will respect this url. This means MountBucket is 1:1 with a new .volume
. So the context of the .ReadOnly
field is OK to be scoped at the .volumes
level as well as the .volumeMounts
level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should probably rename to type ObjectBucketMounts struct
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation - makes good sense now!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few questions worth addressing but overall this is huge and like 95% there
merging and then tagging and releasing v0.5.0 |
cloud
pkgparams.json
(fixes Support for params.json #114)