test: Retina e2e scale test #720
Conversation
I realize that this is a draft, and I ordinarily don't review drafts since things are subject to change... but I figured these would be good suggestions early on if you're planning to write some more here.
@@ -62,6 +62,20 @@ func DeleteTestInfra(subID, clusterName, location string) *types.Job {
	return job
}

func InstallRetina(kubeConfigFilePath, chartPath string) *types.Job {
Why do you need this? Can you refactor InstallAndTestRetinaBasicMetrics?
I preferred not to change InstallAndTestRetinaBasicMetrics, since it might be in use by someone else and its scope seems different.
The plan for the new pipeline is to Create Infra, Install Retina, Scale Up, and Run Tests, so my thinking is to create a separate job for each stage so that each job can be reused elsewhere. In the future, instead of using InstallAndTestRetinaBasicMetrics, one could [re-]use InstallRetina, Test Basic Metrics, and perhaps add some other set of tests (see the sketch below). Does that make sense?
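For illustration, the standalone install job might look roughly like this (a sketch only; the helm-install step and its field values are assumptions based on the existing jobs, not verified here):

```go
// Hypothetical sketch of the standalone install job. Step and field names
// are assumed from the existing e2e framework and may differ.
func InstallRetina(kubeConfigFilePath, chartPath string) *types.Job {
	job := types.NewJob("Install Retina")

	job.AddStep(&kubernetes.InstallHelmChart{
		Namespace:          "kube-system",
		ReleaseName:        "retina",
		KubeConfigFilePath: kubeConfigFilePath,
		ChartPath:          chartPath,
		TagEnv:             generic.DefaultTagEnv,
	}, nil)

	return job
}
```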
"k8s.io/client-go/tools/clientcmd" | ||
) | ||
|
||
type CreateNetworkPolicies struct { |
There is a create network policy step in the stdlib; can we either move this there or converge on one way of creating netpols?
https://github.com/microsoft/retina/blob/main/test/e2e/framework/kubernetes/create-network-policy.go
Yes, that could be done with some refactoring.
The kubernetes.CreateDenyAllNetworkPolicy step creates one specific NetworkPolicy (NP) and is encapsulated so that only the Namespace and a single podSelector label can be provided.
The scaletest.CreateNetworkPolicies step creates a number of NPs that are also specific, with some podSelector labels needed for the scale tests.
The first one would need more fields to allow more customization of the generated NP, but that would also require checking and changing the code everywhere it's currently used.
If it were done that way, how should a step that generates a single NP be reused to generate many NPs? Should the ScaleTest function that generates the job (file test/e2e/jobs/scale.go) add a loop that adds this same step several times (roughly as sketched below)? Or should a new step be created that encapsulates these multiple NP creations? (I'm not sure the framework currently allows creating/reusing steps inside of steps.)
Anyway, if this is the best approach, maybe it could be done in a separate refactoring task. Do you agree?
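The loop option I'm asking about would be something like this inside the ScaleTest job (a sketch only; the step fields are assumed from the existing create deny netpol step, and the count and labels are placeholders):

```go
// Hypothetical: add the single-NP step once per policy needed by the scale test.
for i := 0; i < numNetworkPolicies; i++ {
	job.AddStep(&kubernetes.CreateDenyAllNetworkPolicy{
		NetworkPolicyNamespace: "scale-test",
		DenyAllLabelSelector:   fmt.Sprintf("scale-test-pod-%d", i),
	}, nil)
}
```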
I see the reasoning. While there might be a way to accommodate both scenarios, that may be out of scope for this PR given current limitations. The existing cases verify granular drop behavior, unlike this suite, so we can shelve this for later.
"k8s.io/client-go/tools/clientcmd" | ||
) | ||
|
||
type CreateResources struct { |
Is it possible to reuse something like this?
https://github.com/microsoft/retina/blob/main/test/e2e/framework/kubernetes/create-resource.go#L23
Pigeonholing these relatively common actions into specific tests is going to result in a lot of redundant code.
It's being reused inside the Run() method (line 66). This step puts together the generation of all the objects required for this scale test and then reuses the CreateResource function you mention to apply them to the cluster, roughly as sketched below.
What would you suggest to reuse it in a better way?
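Simplified sketch of the current approach (getResources, the clientset field, and the exact CreateResource signature are placeholders for the real code):

```go
// The step only generates the scale-test objects; applying them is delegated
// to the shared CreateResource helper from test/e2e/framework/kubernetes.
func (c *CreateResources) Run() error {
	resources := c.getResources() // generate every object needed for the scale test

	for _, resource := range resources {
		if err := kubernetes.CreateResource(context.TODO(), resource, c.clientset); err != nil {
			return fmt.Errorf("error creating resource: %w", err)
		}
	}

	return nil
}
```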
My thought stems from the fact that creating/destroying Kubernetes resources is a core component of these suites, and I was wondering whether there is a way to make this CreateResources step shareable amongst other suites.
In other such cases we've seen people create resources that require a pod/job/daemonset to be ready, but the resource-creation part of the test just creates the resource in the apiserver and moves on without waiting for the job or pod, which then results in faulty tests. It then becomes necessary to add a wait-for-pod-ready step like the one added in this PR and/or verify the cluster is stable, etc.
I don't have a clear solution for this at the moment so I won't block, but it seems everybody has a different way of creating test resources, so I'm just starting a dialog around the topic.
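As a rough illustration of the kind of wait I mean (a client-go based sketch, not the framework's actual step):

```go
import (
	"context"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/kubernetes"
)

// waitForPodsRunning polls until every pod in the namespace is Running
// (or the timeout expires), instead of returning right after creation.
func waitForPodsRunning(ctx context.Context, clientset *kubernetes.Clientset, namespace string) error {
	return wait.PollUntilContextTimeout(ctx, 5*time.Second, 10*time.Minute, true,
		func(ctx context.Context) (bool, error) {
			pods, err := clientset.CoreV1().Pods(namespace).List(ctx, metav1.ListOptions{})
			if err != nil {
				return false, err
			}
			for _, pod := range pods.Items {
				if pod.Status.Phase != corev1.PodRunning {
					return false, nil
				}
			}
			return len(pods.Items) > 0, nil
		})
}
```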
var (
	kapingerPort       = int32(8080)
	KapingerDeployment = appsv1.Deployment{
Is there a reason this is a dupe of this?
https://github.com/microsoft/retina/blob/main/test/e2e/framework/kubernetes/create-kapinger-deployment.go
It has already proven difficult, when we tweak kapinger, to track down all of the places that use it; having another place where a DeploymentSpec is defined makes that more difficult.
I think I just had not seen it before. Reusing the previous implementation.
// Returning an error will cause the test to fail
func (po *ValidateAndPrintOptions) Run() error {

	log.Printf(`Starting to scale with folowing options:
Use the %+v formatting directive to print struct keys and values:

-	log.Printf(`Starting to scale with folowing options:
+	log.Printf("Starting to scale with the following options: %+v", po.Options)
ok
	KubeConfigFilePath string
}

// Useful when wanting to do parameter checking, for example
this comment is a bit redundant when copy-pasted to every struct :)
This comment comes from the interface. I use a plugin that automatically creates the methods of the interface I want to implement, so it also copies the comment over :)
)

var (
	NetworkPolicy = netv1.NetworkPolicy{
+1, this doesn't necessarily seem scale-test specific; let's put it in a more common directory and/or reuse it with the existing create deny netpol steps.
I can put it in the folder test/e2e/framework/kubernetes/templates, and the existing create deny netpol step can be changed in another task to reuse it and make the needed changes. Is that ok?
)

var (
	KapingerClusterRole = rbacv1.ClusterRole{
+1, same for this: let's stick to one way of deploying kapinger and tweak it if we need to.
Done
type CreateNamespace struct {
	Namespace          string
	KubeConfigFilePath string
	DryRun             bool
Do we have any strong use cases for dry run? In a lot of these cases so far, all of the core step code is gated by this conditional, and when dry run is set to true the step just pulls the kubeconfig.
For what it's worth, the prevalidate step is effectively a dry run.
The script this code is based on has a flag to stop execution after generating the objects and before applying them to the cluster, to allow inspection of the generated objects. The idea here is the same: print the manifests to the logs without applying them (rough sketch below). Perhaps this could be removed. @vipul-21 what do you think about this?
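What I had in mind is roughly this inside the step's Run() (a sketch only; namespaceObj is a placeholder for the generated object, and sigs.k8s.io/yaml is assumed for marshaling):

```go
if c.DryRun {
	// Log the generated manifest instead of applying it to the cluster.
	out, err := yaml.Marshal(namespaceObj)
	if err != nil {
		return fmt.Errorf("error marshaling namespace: %w", err)
	}
	log.Printf("dry run, generated manifest:\n%s", string(out))
	return nil
}
```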
cc: @huntergregory for his thoughts
I think we can remove this for simplicity
This was primarily used to calculate how many IP sets, et cetera, npm will create.
DryRun removed
Description
Create Retina E2E scale tests using the test/e2e framework.