-
Notifications
You must be signed in to change notification settings - Fork 555
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rbd: support QoS based on capacity for rbd volume #5016
base: devel
Are you sure you want to change the base?
Conversation
8a47010
to
c07a41a
Compare
63ea87d
to
f60f7c3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! This is really a nice feature to have. A couple of comments that need to be addressed to improve the maintainability and usability.
- move all qos related functions, structs, constants to a new
qos.go
file - all new functions that receive a *rbdVolume as argument should become a function of the rbdVolume struct (see example for the
SetQOS
function) - new parameters for the StorageClass need to be added to the documentation
- documentation should explain that the QoS settings are currently only used in combination with the NBD mounter
- ideally add unit tests for functions that do not interact with a Ceph cluster
- ideally add some form or e2e testing, maybe set QoS parameters in the StorageClass, and verify that those are set on an RBD-image?
Thanks again, this is a great start!
2b895ad
to
31ce867
Compare
/test ci/centos/mini-e2e/k8s-1.31 |
31ce867
to
2b56d76
Compare
internal/rbd/controllerserver.go
Outdated
@@ -424,6 +430,19 @@ func (cs *ControllerServer) CreateVolume( | |||
return nil, err | |||
} | |||
|
|||
// Apply Qos parameters to rbd image. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CreateVolume function is already at the limit of the complexity (number of if/else and loop statements). Maybe it makes sense to move these two parts to the createBackingImage()
function?
https://github.com/ceph/ceph-csi/actions/runs/12390057314/job/34584308943?pr=5016#step:3:841
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I full agree.
internal/rbd/controllerserver.go
Outdated
if err != nil { | ||
log.DebugLog(ctx, "failed apply QOS for rbd image: %v", err) | ||
|
||
return nil, status.Error(codes.Internal, err.Error()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note that a CSI function like CreateVolume
needs to return a status.Error()
or status.Errorf()
generated type of error. Functions like createBackingImage()
should return a 'normal' error. This is not in-sync with the error returned from SaveQOS()
in the call below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Get
Looks quite complete, many thanks for adding the unit and e2e tests! |
/test ci/centos/mini-e2e/k8s-1.31 |
not sure why that did not work, but the CI is running now (takes ~2 hours to complete): https://jenkins-ceph-csi.apps.ocp.cloud.ci.centos.org/job/mini-e2e_k8s-1.31/283/display/redirect |
Write it briefly to verify that e2e testing is available. More scenes need to be added. :-) |
4d90640
to
0c09729
Compare
47bd9ab
to
795ea78
Compare
/test ci/centos/mini-e2e/k8s-1.31 |
efeffcd
to
bd4d17e
Compare
1. QoS provides settings for rbd volume read/write iops and read/write bandwidth. 2. All QoS parameters are placed in the SC, send QoS parameters from SC to Cephcsi through PVC create request. 3. We need provide QoS parameters in the SC as below: - BaseReadIops - BaseWriteIops - BaseReadBytesPerSecond - BaseWriteBytesPerSecond - ReadIopsPerGB - WriteIopsPerGB - ReadBpsPerGB - WriteBpsPerGB - BaseVolSizeBytes There are 4 base qos parameters among them, when users apply for a volume capacity equal to or less than BaseVolSizebytes, use base qos limit. For the portion of capacity exceeding BaseVolSizebytes, QoS will be increased in steps set per GB. If the step size parameter per GB is not provided, only base QoS limit will be used and not associated with capacity size. 4. If PVC has resize request, adjust the QoS limit according to the QoS parameters after resizing. Signed-off-by: Yite Gu <[email protected]>
bd4d17e
to
4774a82
Compare
/test ci/centos/mini-e2e/k8s-1.31 |
1 similar comment
/test ci/centos/mini-e2e/k8s-1.31 |
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions. |
There are 4 base qos parameters among them, when users apply for a volume capacity equal to or less than BaseVolSizebytes, use base qos limit. For the portion of capacity exceeding BaseVolSizebytes, QoS will be increased in steps set per GB. If the step size parameter per GB is not provided, only base QoS limit will be used and not associated with capacity size.
Describe what this PR does
Provide some context for the reviewer
Is there anything that requires special attention
Do you have any questions?
Is the change backward compatible?
Are there concerns around backward compatibility?
Provide any external context for the change, if any.
For example:
Related issues
Mention any github issues relevant to this PR. Adding below line
will help to auto close the issue once the PR is merged.
Fixes: #issue_number
Future concerns
List items that are not part of the PR and do not impact it's
functionality, but are work items that can be taken up subsequently.
Checklist:
guidelines in the developer
guide.
Request
notes
updated with breaking and/or notable changes for the next major release.
Show available bot commands
These commands are normally not required, but in case of issues, leave any of
the following bot commands in an otherwise empty comment in this PR:
/retest ci/centos/<job-name>
: retest the<job-name>
after unrelatedfailure (please report the failure too!)