Commit 6b79b0b

Draft E2E test design
Signed-off-by: Rei1010 <[email protected]>
Signed-off-by: wen.rui <[email protected]>
1 parent 6545472 commit 6b79b0b

2 files changed

+175
-0
lines changed

docs/proposals/e2e_test.md

+175
@@ -0,0 +1,175 @@
1+
# Support E2E Testing
2+
3+
<!-- toc -->
4+
5+
- [Summary](#summary)
6+
- [Motivation](#motivation)
7+
- [Goals](#goals)
8+
- [Non-Goals](#non-goals)
9+
- [Proposal](#proposal)
10+
- [Test Scope](#test-scope)
11+
- [Implementation Details](#implementation-details)
12+
- [User Stories (Optional)](#user-stories-optional)
13+
- [Story 1](#story-1)
14+
- [Story 2](#story-2)
15+
- [Story 3](#story-3)
16+
- [Risks and Mitigations](#risks-and-mitigations)
17+
- [Design Details](#design-details)
18+
19+
<!-- /toc -->
20+
21+
## Summary
22+
23+
<!--
24+
This section is incredibly important for producing high-quality, user-focused
25+
documentation such as release notes or a development roadmap. It should be
26+
possible to collect this information before implementation begins, in order to
27+
avoid requiring implementors to split their attention between writing release
28+
notes and implementing the feature itself. KEP editors and SIG Docs
29+
should help to ensure that the tone and content of the `Summary` section is
30+
useful for a wide audience.
31+
32+
A good summary is probably at least a paragraph in length.
33+
34+
Both in this section and below, follow the guidelines of the [documentation
35+
style guide]. In particular, wrap lines to a reasonable length, to make it
36+
easier for reviewers to cite specific portions, and to minimize diff churn on
37+
updates.
38+
39+
[documentation style guide]: https://github.com/kubernetes/community/blob/master/contributors/guide/style-guide.md
40+
-->
41+
42+
This KEP proposes adding end-to-end (E2E) testing for HAMi to verify its functionality and compatibility
43+
within the Kubernetes ecosystem.
44+
45+
It introduces mechanisms to validate the entire workflow of the feature and
46+
help ensure that it meets production-level requirements.
47+
48+
## Motivation
49+
50+
<!--
51+
This section is for explicitly listing the motivation, goals, and non-goals of
52+
this KEP. Describe why the change is important and the benefits to users. The
53+
motivation section can optionally provide links to [experience reports] to
54+
demonstrate the interest in a KEP within the wider Kubernetes community.
55+
56+
[experience reports]: https://github.com/golang/go/wiki/ExperienceReports
57+
-->
58+
59+
End-to-end (E2E) tests validate the complete functionality of a system, ensuring that the end-user experience
60+
aligns with developer specifications.
61+
62+
While unit and integration tests provide valuable feedback, they are often insufficient in distributed systems.
63+
Minor changes may pass unit and integration tests but still introduce unforeseen issues at the system level.
64+
65+
Comprehensive E2E test coverage is essential to mitigate the risks of regressions, improve reliability,
66+
and maintain confidence in the system's seamless integration with Kubernetes.
67+
Without it, HAMi's robustness and user trust may be compromised.
68+
69+
### Goals
70+
71+
<!--
72+
List the specific goals of the KEP. What is it trying to achieve? How will we
73+
know that this has succeeded?
74+
-->
75+
76+
- Set up a basic E2E testing environment.
77+
- Define the scope and scenarios for E2E testing of HAMi.
78+
- Implement E2E tests that cover key workflows and edge cases.
79+
- Ensure integration with Kubernetes CI pipelines (e.g., Prow).
80+
- Establish a reliable and repeatable test framework for future enhancements.
81+
82+
### Non-Goals
83+
84+
<!--
85+
What is out of scope for this KEP? Listing non-goals helps to focus discussion
86+
and make progress.
87+
-->
88+
89+
- Unit or integration testing of the feature (covered elsewhere).
90+
- Performance benchmarking beyond basic scenarios.
91+
92+
## Proposal
93+
94+
<!--
95+
This is where we get down to the specifics of what the proposal actually is.
96+
This should have enough detail that reviewers can understand exactly what
97+
you're proposing, but should not include things like API designs or
98+
implementation. What is the desired outcome and how do we measure success?.
99+
The "Design Details" section below is for the real
100+
nitty-gritty.
101+
-->
102+
103+
This proposal aims to integrate E2E testing for HAMi. Tests will be implemented using the
104+
tooling that Kubernetes E2E tests use (e.g., Ginkgo) and adhere to community best practices.
105+
106+
### Test Scope
107+
108+
- Core functionality: Validate basic operations and workflows of the feature.
109+
- Edge cases: Test unusual scenarios or invalid inputs to ensure robustness.
110+
- Compatibility:
111+
  - Verify that the feature integrates with different heterogeneous devices.
112+
  - Verify that the feature integrates with different operating systems.
113+
  - Verify that the feature integrates with different Kubernetes versions.
114+
- Error handling: Ensure appropriate error messages and recovery mechanisms are in place.
115+
116+
### Implementation Details
117+
118+
- The test environment will be hosted in DaoCloud's local environment.
119+
- Tests will be written using the [Ginkgo](https://onsi.github.io/ginkgo/) testing framework.
120+
- All tests will use isolated namespaces to avoid conflicts.
121+
- Resource cleanup will be automated after each test run.
122+
- CI integration will ensure tests run against PRs, daily builds, and releases.
123+
124+
### User Stories (Optional)
125+
126+
<!--
127+
Detail the things that people will be able to do if this KEP is implemented.
128+
Include as much detail as possible so that people can understand the "how" of
129+
the system. The goal here is to make this feel real for users without getting
130+
bogged down.
131+
-->
132+
133+
#### Story 1
134+
Automating E2E testing with Helm deployment
135+
136+
#### Story 2
137+
Automating E2E testing with resource validation
138+
139+
#### Story 3
140+
Automating E2E testing with Kubernetes resource deployment
141+
142+
### Risks and Mitigations
143+
144+
<!--
145+
What are the risks of this proposal, and how do we mitigate? Think broadly.
146+
For example, consider both security and how this will impact the larger
147+
Kubernetes ecosystem.
148+
149+
How will security be reviewed, and by whom?
150+
151+
How will UX be reviewed, and by whom?
152+
153+
Consider including folks who also work outside the SIG or subproject.
154+
-->
155+
156+
- Resource Limitations
157+
  - During E2E testing, testing clusters may encounter resource constraints,
158+
    such as insufficient CPU, memory, or storage. This could lead to test failures,
159+
    degraded performance, or timeouts during deployments.
160+
- Environment Instability
161+
  - Instabilities in the testing environment, such as network latency, intermittent network failures,
162+
    or cluster node failures, can cause tests to fail or behave inconsistently.
163+
164+
## Design Details
165+
166+
<!--
167+
This section should contain enough information that the specifics of your
168+
change are understandable. This may include API specs (though not always
169+
required) or even code snippets. If there's any ambiguity about HOW your
170+
proposal will be implemented, this is the place to discuss them.
171+
-->
172+
173+
174+
![gpu_utilization](e2e_test.png)
175+

docs/proposals/e2e_test.png

539 KB