-
Notifications
You must be signed in to change notification settings - Fork 9.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore(e2e): extend logging test to 3 node cluster #18806
Conversation
Hi @ghouscht. Thanks for your PR. I'm waiting for a etcd-io member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
cd7b0f9
to
7b37673
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted filessee 21 files with indirect coverage changes @@ Coverage Diff @@
## main #18806 +/- ##
==========================================
+ Coverage 68.73% 68.75% +0.02%
==========================================
Files 420 420
Lines 35512 35512
==========================================
+ Hits 24410 24418 +8
+ Misses 9675 9663 -12
- Partials 1427 1431 +4 Continue to review full report in Codecov by Sentry.
|
tests/e2e/logging_test.go
Outdated
|
||
var lines []string | ||
for i := range tc.clusterSize { | ||
lines = append(lines, epc.Procs[i].Logs().Lines()...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think we need to collect log lines after we close the server, did you encounter error with that?
If I remember safely get logs from closed process you need to get handle of logs epc.Procs[i].Logs()
before closing process and call .Lines()
after closing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did encounter the allowed error but apparently not all the time. It indeed does make a difference if .Lines()
is called after closing. By doing so the resulting log lines seem to be constant. Will update the PR.
{ | ||
name: "single node cluster", | ||
clusterSize: 1, | ||
allowedErrors: map[string]bool{"setting up serving from embedded etcd failed.": true}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good to followup why we get this log.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, saw the comment in the issue. From what I saw in release-3.5
branch this does not happen. I'll have a look where exactly it comes from.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, thanks @ghouscht !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that there are multiple sources for this error. One example is:
{"level":"error","ts":"2024-10-30T09:17:44.048443+0100","caller":"embed/etcd.go:875","msg":"setting up serving from embedded etcd failed.","error":"http: Server closed" ...
and that is because the errHandler
function (https://github.com/ghouscht/etcd/blob/issue-17329/server/embed/etcd.go#L873-L875) does not check for the http.ErrServerClosed
error. This error is always returned by Serve
methods after a call to Close
or Shutdown
: https://pkg.go.dev/net/http#ErrServerClosed.
After ignoring the error mentioned above I also saw this:
{"level":"error","ts":"2024-10-30T09:17:44.048443+0100","caller":"embed/etcd.go:875","msg":"setting up serving from embedded etcd failed.","error":"accept tcp 127.0.0.1:20000: use of closed network connection" ...
Currently trying to figure out where this comes from.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error logs are generated when closing/shutting down etcd. The listeners are closed firstly, then the accepter will return errors. They should be expected. One possible additional test case is to verify that there is no any error logs before closing etcd, which of course can be discussed & implemented separately.
{"level":"info","ts":"2024-10-30T08:25:35.363864Z","caller":"osutil/interrupt_unix.go:65","msg":"received signal; shutting down","signal":"interrupt"}
{"level":"info","ts":"2024-10-30T08:25:35.363960Z","caller":"embed/etcd.go:398","msg":"closing etcd server","name":"default","data-dir":"default.etcd","advertise-peer-urls":["http://localhost:2380"],"advertise-client-urls":["http://localhost:2379"]}
......
{"level":"error","ts":"2024-10-30T08:25:35.364157Z","caller":"embed/etcd.go:874","msg":"setting up serving from embedded etcd failed.","error":"http: Server closed","stacktrace":"go.etcd.io/etcd/server/v3/embed.(*Etcd).errHandler\n\tgo.etcd.io/etcd/server/v3/embed/etcd.go:874\ngo.etcd.io/etcd/server/v3/embed.(*serveCtx).serve.func3\n\tgo.etcd.io/etcd/server/v3/embed/serve.go:176"}
{"level":"error","ts":"2024-10-30T08:25:35.364250Z","caller":"embed/etcd.go:874","msg":"setting up serving from embedded etcd failed.","error":"accept tcp 127.0.0.1:2379: use of closed network connection","stacktrace":"go.etcd.io/etcd/server/v3/embed.(*Etcd).errHandler\n\tgo.etcd.io/etcd/server/v3/embed/etcd.go:874\ngo.etcd.io/etcd/server/v3/embed.(*Etcd).serveClients.func1\n\tgo.etcd.io/etcd/server/v3/embed/etcd.go:776"}
...
{"level":"error","ts":"2024-10-30T08:25:35.394822Z","caller":"embed/etcd.go:874","msg":"setting up serving from embedded etcd failed.","error":"accept tcp 127.0.0.1:2380: use of closed network connection","stacktrace":"go.etcd.io/etcd/server/v3/embed.(*Etcd).errHandler\n\tgo.etcd.io/etcd/server/v3/embed/etcd.go:874\ngo.etcd.io/etcd/server/v3/embed.(*Etcd).servePeers.func3\n\tgo.etcd.io/etcd/server/v3/embed/etcd.go:614"}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this is what I figured out as well. In that case these errors are expected and we could simply adapt the errorHandler
function to ignore them, e.g. like this:
if err != nil &&
!errors.Is(err, http.ErrServerClosed) &&
!errors.Is(err, net.ErrClosed) {
e.GetLogger().Error("setting up serving from embedded etcd failed.", zap.Error(err))
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's difficult to be certain whether http.ErrServerClose
or net.ErrClosed
is expected by just one error itself. Suggest not to change the production code for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I think it is. https://cs.opensource.google/go/go/+/refs/tags/go1.23.2:src/net/http/server.go;l=3286-3288;drc=773767def0e0f29584a69bd760430167b7479d7d
http.ErrServerClose means.
Signed-off-by: Thomas Gosteli <[email protected]>
/ok-to-test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ahrtr, ghouscht, serathius The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
See also #17329
This is how I would extend the logging test to include both a single node and a three node cluster. If it is ok like that I'd submit PRs for backporting as written in the linked issue
Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.