-
Notifications
You must be signed in to change notification settings - Fork 681
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle limited open Connections due to keepalive connections #9916
base: master
Are you sure you want to change the base?
Conversation
a2c3068
to
3ef99c5
Compare
This change set created a linkage error (android), the 'usual' type_info, vtable thing. |
net/Socket.cpp
Outdated
@@ -1298,6 +1385,52 @@ LocalServerSocket::~LocalServerSocket() | |||
# define LOG_CHUNK(X) | |||
#endif | |||
|
|||
bool StreamSocket::checkForcedRemoval(std::chrono::steady_clock::time_point now) noexcept |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might need to move this impl to after
"#endif // !MOBILEAPP" (if mobile needs it) and/or put ifdef MOBILEAPP around it in the header if it doesn't or a !MOBILE stub, something like that anyway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah. Great eyes. Thx!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @sgothel. There are some really good stuff here!
I understand this is a draft, but it reads like a mix of random fixes and improvements (some of it is good and useful, like adding const
and = default
, make_shared
, etc.), re-inventing methods for doing things already supported, and debugging/tracing support for your understanding (to be removed).
It's very hard to read, as there is no clear theme or purpose.
I left various comments, but I didn't dive too deep in a number of areas because it was unclear why we are either replacing an existing approach, or inventing a new one when we can improve on the existing one (where there are shortcomings).
At a minimum, I'd separate the backtrace improvements from the socket improvements, and likely the misc. improvements from new code/features/tests.
Finally, I'm troubled by vformatString
and the re-working a lot of the socket states and internals with a huge set of new members that change the existing semantics with unclear benefits.
Perhaps there is some context I'm missing. Perhaps this was agreed on and I'm unaware.
Thank you @Ashod for the review.
The draft status was motivated by my intention to discuss the actual semantics and goals, Which brings us to the next point I guess.
When reading commit by commit, I believe everything is well split apart semantically. Indeed, I added a few unrelated single commits into this review. Understanding that we allow
I am not aware of really replacing anything. I merely moved and added functionality.
As described above, a zero side-effect change in its own commit. The Backtrace object in particular is useful for all, as we can produce an instance
I just like printf formatting as it is more performant and flexible (at least to me).
Allow me to continue answering your other remarks below. |
2d51475
to
508ccb0
Compare
I think it'll be easier to manage this if its split up so that the mostly uncontroversial misc cleanups of existing stuff where no change in logic is intended, the make_unique, make_shared, privator ctor -> = delete, constifying, expansion of auto& to real type, emplace_back, etc goes into a separate pr. The backtrace debugging aid looks probably useful to have to me, but can also be its own pr I think (and probably not worth fighting the "CodeQL scanning - action / Analyze" step which apparently can't handle the constexpr use which is blocking ci from passing this pr). And generally trim this pr down to the core change to make it easier to think about it |
Thanks for the responses, @sgothel! We've discussed some offline, so didn't reply here (e.g. printf-style formatting being fast--in fact it's very slow compared to only serialization).
In general, agreed. But I'll echo @caolanm's comment here that breaking up into PRs, that can be decided to merge independent of other parts, is always helpful. I should also point out that this PR is well above a 1000 lines. This is a rare size. Considering that some parts change the interface of a large hierarchy (the sockets), which is a fundamental core part of the whole project and very delicate (can easily break things catastrophically), it requires reading and studying far more code (i.e. the context and consequences) to review properly. An isolated and low-risk change of a few 100 lines is fine. A complex few 100 line change (such as in the sockets) is challenging and time consuming. Sometimes we have no choice because we can't break it up. But a 1000 lines that can be easily broken up is hardly necessary. Finally, even though we can review individual commits, it's preferable to have a single theme to a PR. Helps the reviewer. In some cases a few misc. changes can go together (comment fixes, cosmetics, formatting, logging, etc.). But mixing functional changes with non-functional changes dilutes the process and can reduce the quality of the review. More so with high-risk changes (i.e. socket code) mixed with debugging code (i.e. backtracing). Regardless, you have a lot of great improvements here, and I don't want to undermine the importance of your contribution! |
508ccb0
to
cb801db
Compare
Thank you both for your review and kind words. I have split up single commits into 4 branches, i.e. this pull-request branch and
For this feature work, I need to complete the unit test incl. the timeout configuration. |
{ | ||
_socketHandler->onDisconnect(); | ||
if( isOpen() ) { | ||
// FIXME: Ensure proper semantics of onDisconnect() |
Check notice
Code scanning / CodeQL
FIXME comment Note
void sendPong(std::chrono::steady_clock::time_point now, | ||
const char* data, const size_t len, | ||
const std::shared_ptr<StreamSocket>& socket) | ||
{ | ||
if (!_isClient) | ||
LOG_WRN("Servers should not send pongs, only clients"); | ||
// assert(!_isClient); |
Check notice
Code scanning / CodeQL
Commented-out code Note
cb801db
to
6fc374e
Compare
6fc374e
to
31c1b94
Compare
@sgothel, there were multiple suggestions to split this large PR up to reduce its size. That was when it had 18 commits and was "only" a 1000 lines. Now it's past 1500 lines and is 1 commit! I'm sensing this isn't going in the direction of the feedback. What's the plan, please? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As said needs to be split up; lets try to get the less controversial pieces in and merged as separate pull requests - to slim this down =) Can you work on that Sven ?
will do the split later then (tomorrow) + finishing the tests |
As I wrote, I will split out the config and TimeAverage (~400 lines or so) and put the remainder at the bottom -> 3 commits within this pull request. Perhaps more, w/ a fresh look. But splitting this up in multiple pull request (PR) makes would probably makes no sense, as these pieces are required and each PR must be tested. IMHO. |
d6d5711
to
661dcde
Compare
…own and queries. SocketPoll::poll robustness Preparation for cool#9833. Socket Restructuring: - Unify closing, query, stats(bytes, time-points) - Add type, port info. - Add std::string queries/toString SocketPoll::poll robustness - Ensure erasing Socket if !isOpen() WebSocketHandler::shutdown* - Add shutdownSilent(), preparation to force shutdown on timeout w/o closing frame. Signed-off-by: Sven Göthel <[email protected]> Change-Id: I7e1a9329e0848c40a210f6250e29e26950da6fbc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So - some good things here; but Socket* is the foundation of our I/O work.
As such - expect lots of scrutiny and review here, far more than elsewhere.
Again I would like to separate the cosmetic re-working of things from functional changes.
I'd like each functional change to be its own separate thing. Eg. silent-shutdown - great feature,non-controversial, we need it - pull it out into its own commit. Lets get that in early.
Pushing down the bytesSent etc. statistics from StreamSocket -> Socket - interesting, not sure of the rationale - perhaps harmless, but would like to know why - where does this simplify something else ?
And of course, other things - not so convinced about ;-)
Thanks for persisting and iterating! - we will get there.
{ | ||
if (type != Type::Unix) | ||
if (_type != Type::Unix) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again changing this in lots of places is not my favourite change =)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type became a private field (underscore) .. and I wanted to clarify access (see above).
The overall work was to clean up socket, side effects etc .. while reading and understanding the code.
Mixed field access is very often a PIA .. and nowadays our C++ "constexpr" getter or queries are validated and inlined, see e.g. isIPType(). Perhaps a matter of taster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking of the extra braces burning vertical space =) nothing more.
const int _fd; | ||
/// True if this socket is open. | ||
bool _open; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to care about size here. Can we use bit-fields to pack _type, _clientPort, and _open into the same space ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I often use bitfields when having considerable flags indeed.
However, each socket is acompanied by much larger buffers (1-8k) and I doubt 3 x sizeof(int) .. but well, keeping it in mind that its OK for you. An artifical encoding of different types (int + bool) is however .. overkill?
Sure, if we would talk about an object handle, function-ptr or any short lived type in quantities of >= 100k w/o any other side-effects or memory costs otherwise (sure) .. but I don't see it here.
Was thinking about the added memory costs of Socket with added statistics, correct. But as described above .. I doubt this is material here. I could be wrong ofc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not lift _closed
from StreamSocket
to its parent, here?
At a minimum we should review StreamSocket::_closed
, now that we have _open
. Otherwise, horrors when these two go out of sync and the left hand fights the right.
Of course it's also good when adding such common flags to review how we dealt with them for so long, to avoid redundancy and unify the design. Perhaps _open
was enough and this is redundant. Perhaps not. Worth sharing your analysis either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
StreamSocket::_closed has been moved to Socket::_open
…s with efficient types Also have their `toString` string literal conversion method return `const char*`. Signed-off-by: Sven Göthel <[email protected]> Change-Id: I7e1a9329e0848c40a210f6250e29e26950da6fbc
661dcde
to
17a7f5c
Compare
@Ashod and @mmeeks .. I have cleaned up the commits and followed most if not all of your ideas/recommendations. For this .. I have rearranged the single commits, so that the first ones merely cleanup/restructure Socket etc w/o features. Then I completed the feature and unit tests, also based on latest discussion (max-connection -> keep old-connections alive but block new ones as already implemented via MAX_CONNECTIONS). So this would be a double-safety-net. Since I am quite happy now, I move this to 'review'. |
.. oops, scan detected a merge conflict of mine .. one sec. |
17a7f5c
to
3863949
Compare
…uns only for short constexpr getter. Socket::notifyBytes*() -> protected This change has been condensed based on our code review, same as with part-2. get<Noun>() might imply more processing, hence using plain <noun>() as a non-processing get accessor. The accessor are helpful to split read/write access as writing for certain fields is only allowed for derived classes. Signed-off-by: Sven Göthel <[email protected]> Change-Id: I7e1a9329e0848c40a210f6250e29e26950da6fbc
…eout and maxConnections) Preparation for cool#9833. Using runtime configuration for network properties allows us to test the socket limitation code as well as to fine-tune certain properties for clients. Costs are removal of constexpr properties, using local (cached immutable) variables, hence rendering the code less optimized and more pessimistic. However, the singleton instance is fetched at object creation w/o cache hit using a static-local, networking timing outweights memory reads and no hot-spot performance heavy loop is affected. Hence rendering the feature testable at runtime shall outweight the lesser costs. - All network limits and timeouts are configurable now, - see net::Config in NetUtil.hpp - see COOLWSD { "net.ws.ping.timeout", "2000000" }, // WebSocketHandler ping timeout in us (2s). Zero disables metric. { "net.ws.ping.period", "3000000" }, // WebSocketHandler ping period in us (3s), i.e. duration until next ping. { "net.http.timeout", "30000000" }, // http::Session timeout in us (30s). Zero disables metric. { "net.maxconnections", "100000" }, // Socket maximum connections (100000). Zero disables metric. { "net.maxduration", "43200" }, // Socket maximum duration in seconds (12h). Zero disables metric. { "net.minbps", "0" }, // Socket minimum bits per seconds throughput (0). Increase for debugging. Zero disables metric. { "net.socketpoll.timeout", "64000000" }, // SocketPoll timeout in us (64s). Signed-off-by: Sven Göthel <[email protected]> Change-Id: I7e1a9329e0848c40a210f6250e29e26950da6fbc
Preparation for cool#9833, to be used to provide a time-based ping response time average. TimeAverage allows us to calculate the time based average of a value while adding new time-point/value tuples at arbitrary time-points. The result is stable against small fluctuations and shall provide a reasonable actionable criteria. Signed-off-by: Sven Göthel <[email protected]> Change-Id: I7e1a9329e0848c40a210f6250e29e26950da6fbc
…oint as creationTime to ctor This change has been condensed based on our code review, same as with parts [2-3]. This could reduce costs involved via std::chrono::steady_clock::now(), but also allows socket creation using a dedicated time-point. Further moved `StreamSocket::_lastSeenHTTPHeader` to `std::chrono::steady_clock::time_point &lastHTTPHeader` where it is being required - reduces general footprint. Signed-off-by: Sven Göthel <[email protected]> Change-Id: I7e1a9329e0848c40a210f6250e29e26950da6fbc
- ProtocolHandlerInterface::checkTimeout(..) is now called via StreamSocket::checkRemoval() to ensure tests even w/o input data (was: handlePoll()) - ProtocolHandlerInterface::checkTimeout(..) - Add bool return value: true -> shutdown connection, caller shall stop processing - Implemented for http::Session - Timeout (30s) net::Config::HTTPTimeout with missing response - Implemented for WebSocketHandler - Timeout (2s = net::Config::WSPingTimeout) after missing WS native frame ping/pong (server only) - StreamSocket::checkRemoval(..) - called directly from SocketPoll::poll() - only for IPv4/v6 network connections - similar to ProtocolHandlerInterface::checkTimeout(..) - calls ProtocolHandlerInterface::checkTimeout(..) - added further criteria (age, throughput, ..) - Timeout (64s = net::Config::SocketPollTimeout) if (now - lastSeen) > timeout - Timeout (12 hours) net::Config::MaxDuration if (now - creationTime) > timeout - TODO: Throughput/bandwitdh disabled, find proper metrics/timing - TODO: Add maximimal IPv4/IPv6 socket-count criteria, drop oldest. - SocketPoll::poll() - Additionally erases if !socket->isOpen() || socket->checkRemoval() Signed-off-by: Sven Göthel <[email protected]> Change-Id: I7e1a9329e0848c40a210f6250e29e26950da6fbc
Adding connection limit where registered via SocketPoll::setLimiter(), i.e. DocumentBrokerPoll and WebServerPoll only. SocketPoll::poll() will drop _new_ overhead connections exceeding MaxConnections. This has been discussed, in favor of dropping the oldest connections. Aligned net::Config::MaxConnections w/ pre-existing MAX_CONNECTIONS and COOLWSD::MaxConnections. COOLWSD / net::Config - Aligned MAX_CONNECTIONS/COOLWSD::MaxConnections with config "net.maxconnections", net::Config::MaxConnections, having a minimum of 3 - defaults to 9999. SocketPoll::setLimiter(): - Increments given non-zero connectionLimit by one for WS upgrade socket. Added http::StatusCode::None(0), allowing to use for debugging Unit Tests: - Using base class UnitTimeoutBase for UnitTimeoutConnections + UnitTimeoutNone, - Testing http, ws-ping and wsd-chat-ping on multiple parallel connections w/ and w/o a connection limit - UnitTimeoutSocket tests socket max-duration on http + WS sessions - UnitTimeoutWSPing tests WS Ping (native frame) timeout limit on WS sessions Signed-off-by: Sven Göthel <[email protected]> Change-Id: I7e1a9329e0848c40a210f6250e29e26950da6fbc
3863949
to
b8ec82a
Compare
Re-arranged sneaking in 2 missing commits in the chain. |
Thanks Sven - sounds very promising =) good stuff. |
I'm confused. I still see some patches eg. this one: |
I didn't remove this one (Socket and WebSocketHandler) as visible by the date + hash. The next parts 2, 3 address single issues and also fix the std::string 'abuse' for osstream, All of these are orthogonal to the feature - so could be seen as bisectable ex-feature. A later Socket change for the passing of time-point, then the 2 feature commits. Perhaps not all 100% to your preference in regards to sliced down commits .. If you insist on re-splitting it (i.e. this 1st commit perhaps), advise .. would do it tomorrow then (edit: re this web-UI, yes esp when commenting etc .. I usually use git commandline, git-cola, gitk and qgit .. also when reviewing for the overview) |
Well - CI passes which is a good sign; I'm really not happy at having cosmetic cleanups merged with drastic functional change - today you can see where the functional change is amid the big chunk of non-functional change: how about in two years ? =) We can get this in, but not again for socket code. My only concern now is what happens during save-on shutdown; I had a nightmare last night (literally) that the lifecycle of ClientSession is essentially owned by the SocketPoll (is it?) - and if that is the last ClientSession that has been hard disconnected from, and we have un-saved data - we need to keep that ClientSession alive until it is saved. It struck me that this might well be why we have shutdown but not deleted Socket's in the SocketPoll - or did I miss something there? @Ashod can you do a last quick review for that use-case, and double-check we have a unit test that covers it effectively ? =) Thanks! |
Hi @mmeeks,
It's complicated and one of the most sensitive parts of the code, with a lot of history.
Sort of. SocketPoll does in fact control when we find out that the socket is disconnected (
Indeed, as you expect, that's what we do. We don't blindly remove the I have really tried to understand if this PR could affect this in any way, but I can't with any confidence say I found out the answer. On the face of it, nothing stood out. But the sockets are such a foundational part of everything we do, that it's hard to say with a straight face there are no side-effects to worry about.
I can't tell. So much has changed, from socket lifetime management, to the socket poll, to refactorings of socket open/closed state, to the timeout management, that I cannot estimate the impact of all of this. (My worry is that we've mangled things badly enough that we either randomly disconnect sockets, or we don't properly handle edge cases that many other functions depend on.) But I do know that we had far more benign code than this, that had harbored great ills and nasty surprises.
We do have unit-tests for the last-editor disconnecting with modifications to save. Unfortunately, the unit-tests related to save and upload have become very unstable of late, with very high rate of random failure. So that's no consolation. P.S. This 5th pass was anything but quick. Considering there is ~1000 new lines changed since my last review, at ~2400 lines, I'm surprised it only took me 3 hours. (Actually, no, I'm not surprised; I had to stop.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @sgothel for not making this any larger.
I left some concerns and thoughts. I refrained from repeating myself. My questions aren't rhetorical. Some decisions in this code were discussed, and I had made various suggestions, but they were ignored.
But perhaps there has been discussions and decisions that I wasn't part of.
At any rate, I leave my comments for your consideration.
Thank you.
EofFlushWrites, // finish up writes and close | ||
Closed }; | ||
|
||
static const char* toString(State s) noexcept { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't using STATE_ENUM
be better?
Let's reuse and improve what we have.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had the same thought, but then didn't want to change even more code. So yes, we could reuse our macro .. even if it would render the enum underling bigger (I used uint8_t .. the macro uses int in general, no biggy).
net/Socket.cpp
Outdated
std::string Socket::toStringImpl() const noexcept | ||
{ | ||
std::string s("Socket[#"); | ||
s.append(std::to_string(getFD())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using to_string()
is a good hint that you want ostringstream
instead.
This code is simply inefficient. It's also not very readable. So what's the advantage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
has been revised earlier, is now std::ostream& Socket::streamImpl(std::ostream& os) const
net/Socket.cpp
Outdated
|
||
std::string Socket::toString() const noexcept | ||
{ | ||
return toStringImpl(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not have the contents of toStringImpl()
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1st .. the code has changed to offer the ostream Socket::stream()
version.
The reason for the private implementation variant is that the Socket::stream()
is virtual
to allow speciation to give more detailed information.
Then I use the private Socket implementation only for ctor + dtor, as they should not call into virtual methods.
net/Socket.cpp
Outdated
return "HTTP"; | ||
} | ||
|
||
std::string StreamSocket::toString() const noexcept |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this noexcept? It certainly isn't because it doesn't throw, nor because it is fatal if it throws. It seems unnecessarily strict to take the whole server down simply because this raises an exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has been revised to std::ostream& StreamSocket::stream(std::ostream& os) const
in my latest change set.
Pls note that I also have removed noexcept
as of late for more complex methods.
if( added <= std::numeric_limits<size_t>::max() - res ) { | ||
res += added; | ||
} else { | ||
// overflow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fun fact: If we create 4 billion sockets per second, even after 136 years, we would still not overflow size_t
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know, this was my paranoia code .. instead of forcing w/ a static_cast.
LOG_WRN("SocketPoll::ConnectionCount: Underflow " << res << " - " << removed); | ||
res = 0; | ||
} | ||
StatsConnectionCount.store(res, std::memory_order_relaxed); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not the proper way of doing this.
The atomic must be used for incrementing and decrementing.
Using intermediate variables and adding and subtracting and then storing is a certain way of writing thread unsafe code.
Either we don't need the atomic, and this function is fine, or we do need the atomic, and this function is broken. Either way, this code is wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Writing happens in a locked state, no data-race.
Reading is allowed to see older values (just in case), so using a relaxed atomic.
This is an IMHO nice pattern to still allow lock-free decisions while ensuring correct modification.
In this use case max-connections IMHO warranted.
The actual lock-free read to make a decision is in the poll() where we may drop newSockets.
If this lazy decision is deemed too risky, OK - then we have to hard lock it in poll for dropping newSockets as well (not for simple logging of course).
In case the whole max-connection handling is redundant here (see above re MAX_CONNECTIONS), great - then we shall drop this altogether.
std::chrono::steady_clock::time_point _lastPingSentTime; | ||
int _pingTimeUs; | ||
size_t _openPingCount; | ||
Util::TimeAverage _pingMicroS; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the difference between pingPeriod, pingTimeout, and pingMicroS ? And why we need a ping count? Isn't it enough to compare now
with the time we got the last data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_pingMicroS is a TimeAverage for this value, i.e. averaging out multiple measurements over time, which removes one-offs etc.
_openPingCount is the open ping count, i.e. pings from server w/o client reply. Initially I intended this for debugging only, 1st validation.
Reviewing its use in WebSocketHandler::checkTimeout() is wrongly used IMHO:
if (_openPingCount > 0 && _pingTimeout.count() > std::numeric_limits<double>::epsilon() && _pingMicroS.average() >= _pingTimeout.count()) {
instead it should just read
if (_pingTimeout.count() > std::numeric_limits<double>::epsilon() && _pingMicroS.average() >= _pingTimeout.count()) {
So yes, _openPingCount
should not be material and be removed.
--_openPingCount; | ||
} | ||
LOGA_TRC(WebSocket, "Pong received: " << us << "us, avg " << avg_us << "us over " | ||
<< (int)_pingMicroS.duration() << "s, open pings left " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of all the excessive renaming in this PR, _pingMicroS
is perhaps the only one that is now misleading and should've been changed. Alas.
And why not log the fractional seconds?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As replied below (my 2nd one) .. TimeAverage manages arbitrary values, here ping-micro-seconds.
The TimeAverage::duration() in seconds only gives us a rough idea for how long this average has been taken, where values are added to the average .. like every 3 seconds by default (net::Config::WSPingPeriod).
The granularity in seconds is enough for the logging and manual validation.
const auto now = std::chrono::steady_clock::now(); | ||
const int64_t us = std::chrono::duration_cast<std::chrono::microseconds> | ||
(now - _lastPingSentTime).count(); | ||
const double avg_us = _pingMicroS.add(_lastPingSentTime, static_cast<double>(us)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_pingMicroS.duration()
returns seconds, but we are adding here micro seconds. Is this intentional? TimeAverage is so convoluted and unnecessarily over-engineered that it's not obvious what the units are (and no useful comments).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a bug of mine, outch .. yes, should pass
static_cast<double>(us)/1000000.0
. Great catch, thank you! Note: My unit test didn't catch this either, b/c fluctuations are very low but I should have done a better job in interpretation of the values. So I take the blame here - bad mistake.
The us
value in microseconds managed by TimeAverage is correct! (and so is the unit test)
Sorry, your comment confused myself .. the value managed by TimeAverage is of arbitrary nature
(and dimension), as mentioned in TimeAverage::add
So here microseconds is used.
The confusing part is probably that this value itself reflects time duration, however, TimeAverage::duration
is orthogonal, as it only refers over the measured total time and not the semantics of this value (ping response duration in microseconds).
So yes, annotations in TimeAverage probably should be more emphasized, only the add
method does. sigh
So I did a bit of reading myself; this is fine: I thought SocketPoll controlled lifecycle of the ClientSession - but in fact the DocumentBroker member:
template holds a hard reference on it - so not a big deal =) |
...
I was testing (~1 week ago) the socket-close on these criteria of this patch with firefox editing a document. The connection was build-up again and no artifacts were visible. |
Summary
(This commit is a WIP, pushed for discussion and to be amended!)
All network limits and timeouts are configurable now, see COOLWSD
and net::Defaults in NetUtil.hpp
ProtocolHandlerInterface::checkTimeout(..)
is now called via StreamSocket::checkRemoval()
to ensure tests even w/o input data (was: handlePoll())
ProtocolHandlerInterface::checkTimeout(..)
after missing WS native frame ping/pong (server only)
StreamSocket -> Socket (properties moved)
Socket (added properties)
if (now - lastSeen) > timeout
if (now - creationTime) > timeout
SocketPoll::poll()
TODO
Checklist
make prettier-write
and formatted the code.make check
make run
and manually verified that everything looks okay