Don't cascade reconnect requests on cascading failures
The way the SerialFallbackProvider's fallback logic was written, if 8
requests came in while a provider had recently closed, the requests
would all trigger reconnects, cascading through any fallback providers
and ultimately creating a storm of reconnects. Across multiple serial
fallback providers, the problem was magnified, creating a tremendous
volume of connection/reconnection loops when one provider failed.

Some very light tweaks now check whether a retry/reconnect is already
in progress. If one is, the request is retried on the new provider. If
one has not yet been initiated, the request triggers it. This ensures
that only one reconnect is attempted at a time.
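
The check described above can be sketched in isolation. This is a simplified illustration, not the code from the commit: `FallbackState`, `onConnectionFailure`, and `reconnectCount` are hypothetical names, and the sketch assumes that switching providers advances `currentProviderIndex` by one. Each request remembers the provider index it was sent on and compares it with the current index when it fails.

```typescript
// Hypothetical, simplified stand-in for the provider's fallback state.
// None of these names come from the actual SerialFallbackProvider.
type FailureAction = "retry-on-current" | "switch-provider" | "exhausted"

class FallbackState {
  currentProviderIndex = 0
  reconnectCount = 0
  providerCount: number

  constructor(providerCount: number) {
    this.providerCount = providerCount
  }

  // `indexAtSend` is the provider index captured when the request was sent
  // (the role `existingProviderIndex` plays in the diff).
  onConnectionFailure(indexAtSend: number): FailureAction {
    // Another failed request already moved us to a different provider,
    // so retry there instead of triggering a second reconnect.
    if (this.currentProviderIndex !== indexAtSend) {
      return "retry-on-current"
    }
    // This request is the first to notice the failure: advance exactly once.
    // (Assumption: switching providers increments the index.)
    if (this.currentProviderIndex + 1 < this.providerCount) {
      this.currentProviderIndex += 1
      this.reconnectCount += 1
      return "switch-provider"
    }
    return "exhausted"
  }
}

// Eight requests were all sent while provider 0 was current, then all fail.
const state = new FallbackState(3)
const actions = Array.from({ length: 8 }, () => state.onConnectionFailure(0))
```

In this sketch only the first failing request switches providers; the other seven see that the index has already moved and retry on the new provider, so `reconnectCount` ends at 1 rather than 8.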

Still missing is any sort of backoff if all providers fail.
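
This commit does not implement that backoff. Purely as an illustration of what one might look like, here is a capped exponential backoff helper; `backoffDelayMs`, its parameters, and the default delays are all hypothetical and not taken from the repository.

```typescript
// Hypothetical helper, not part of this commit: capped exponential backoff.
// `failedRounds` counts how many times every provider has failed in a row.
function backoffDelayMs(
  failedRounds: number,
  baseDelayMs = 1_000,
  maxDelayMs = 30_000,
): number {
  // Double the delay for each failed round, starting from the base delay.
  const uncapped = baseDelayMs * 2 ** Math.max(0, failedRounds)
  // Never wait longer than the configured maximum.
  return Math.min(maxDelayMs, uncapped)
}

// Delays for the first seven consecutive all-provider failures.
const delays = [0, 1, 2, 3, 4, 5, 6].map((round) => backoffDelayMs(round))
```

A caller would wait for the returned delay before cycling back to the first provider; adding random jitter is a common refinement when many clients might reconnect at once.
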
Shadowfiend committed Dec 23, 2024
1 parent 9b914b2 commit 8490364
Showing 1 changed file with 17 additions and 1 deletion.
18 changes: 17 additions & 1 deletion background/services/chain/serial-fallback-provider.ts
@@ -383,6 +383,7 @@ export default class SerialFallbackProvider extends JsonRpcProvider {
(this.currentProvider._pendingBatch as { length: number } | undefined)
: undefined
const pendingBatchSize = pendingBatch?.length
+ const existingProviderIndex = this.currentProviderIndex

if (
pendingBatch &&
@@ -514,6 +515,19 @@ export default class SerialFallbackProvider extends JsonRpcProvider {
/WebSocket is already in CLOSING|bad response|missing response|we can't execute this request|failed response|TIMEOUT|NETWORK_ERROR/,
)
) {
+ // If a new provider is already in the process of being tried, go ahead
+ // and fire off into the new provider.
+ if (this.currentProviderIndex !== existingProviderIndex) {
+   logger.debug(
+     "Retrying on newly connected provider on chain",
+     this.chainID,
+     ": ",
+     method,
+     params,
+   )
+   return await this.routeRpcCall(messageId)
+ }
+
// If there is another provider to try - try to send the message on that provider
if (this.currentProviderIndex + 1 < this.providerCreators.length) {
return await this.attemptToSendMessageOnNewProvider(messageId)
@@ -534,6 +548,8 @@ export default class SerialFallbackProvider extends JsonRpcProvider {
stringifiedError.match(/bad result from backend/)
) {
if (
+ // If the current provider is the one we tried with initially.
+ this.currentProviderIndex === existingProviderIndex &&
// If there is another provider to try and we have exceeded the
// number of retries try to send the message on that provider
this.currentProviderIndex + 1 < this.providerCreators.length &&
@@ -701,7 +717,7 @@ export default class SerialFallbackProvider extends JsonRpcProvider {
// If every other provider failed and we're on the alchemy provider,
// reconnect to the first provider once we've handled this request
// as we should limit relying on alchemy as a fallback
- if (isAlchemyFallback) {
+ if (isAlchemyFallback && this.currentProviderIndex !== 0) {
this.currentProviderIndex = 0
this.reconnectProvider()
}