Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AbortController break fetch after first timeout because aborted signal can't be re-used #861

Closed
lbeschastny opened this issue Jun 26, 2024 · 13 comments · Fixed by #892
Closed

Comments

@lbeschastny
Copy link

lbeschastny commented Jun 26, 2024

Node.js Version:

↪ node --version
v20.12.2

Operating System:

↪ uname -a
Darwin Mac-WP9QH543YX 23.5.0 Darwin Kernel Version 23.5.0: Wed May  1 20:12:58 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_T6000 arm64

The problem:

AbortSignal both stores it's state and provides an abort event to signal when abort is happening.

And AbortController newer updates it's signal, leaving it in aborted state forever:

> abortController = new AbortController()
AbortController { signal: AbortSignal { aborted: false } }
> abortController.abort()
undefined
> abortController
AbortController { signal: AbortSignal { aborted: true } }

It looks like undici fetch uses both aborted property and abort event to skip sending the request if it is already aborted.

So, once the action times out for the first time, it switches AbortController into aborted state and all subsequent fetch calls just abort immediately.

Minimal reproducible example:

const { describe, it, before, after, afterEach, mock } = require('node:test');
const assert = require('node:assert/strict');
const http = require('node:http');

const CircuitBreaker = require('opossum');

describe('Circuit breaker with abort controller', () => {
    const abortController = new AbortController();
    const get = mock.fn((timeout) => {
        const { port } = server.address();
        const { signal } = abortController;
        const headers = { 'x-timeout': timeout };
        return fetch(`http://localhost:${port}`, { signal, headers });
    });
    const breaker = new CircuitBreaker(get, {
        abortController,
        timeout: 1000,
        volumeThreshold: 3
    });

    const server = http.createServer((req, res) => {
        const timeout = Number(req.headers['x-timeout']);
        setTimeout(() => res.end(`delayed for ${timeout}ms`), timeout);
    });

    before(() => server.listen());
    after(() => server.close());
    afterEach(() => get.mock.resetCalls());

    it('Should work', async () => {
        const response = await breaker.fire(100);
        const text = await response.text();
        assert.equal(text, 'delayed for 100ms');
    });

    it('Should timeout and abort request', async () => {
        await assert.rejects(() => breaker.fire(2000), {
            message: 'Timed out after 1000ms'
        });
        await assert.rejects(() => get.mock.calls[0].result, {
            message: 'This operation was aborted'
        });
    });

    it('Should work again', async () => {
        try {
            await breaker.fire(100);
        } catch ({ name, message }) {
            throw Object.assign(new Error(message), { name });
        }
    });
});
↪ node --test
▶ Circuit breaker with abort controller
  ✔ Should work (153.779083ms)
  ✔ Should timeout (1008.148333ms)
  ✖ Should work again (1.391459ms)
    Error [AbortError]: This operation was aborted
        at TestContext.<anonymous> (/Users/leonbesc/Documents/viaplay/opossum-abort-controller-issue/test.js:42:33)
        at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
        at async Test.run (node:internal/test_runner/test:640:9)
        at async Suite.processPendingSubtests (node:internal/test_runner/test:382:7)

Possible ways to fix this:

It seems we cannot re-use the same AbortController and AbortSignal pair for all requests since AbortSignal preserves its aborted state.

I can only see one way to fix this problem - use new AbortSignal for every requests (see #780).

For example, opossum could pass the signal object as the first argument to the action handler.

@cedrick-ah
Copy link

Hello @lbeschastny, I would to work on this with you but it seem like it is sait that "The intent is that when a circuit times out, an assumption is made that all other ongoing requests to the same endpoint will time out as well". That is why the subsequent request failed. I think what we need is a reset timeout for the abort controller.

@lbeschastny
Copy link
Author

lbeschastny commented Jul 19, 2024

@cedrick-ah the problem here is not about other ongoing requests since this limitation is clearly stated in the readme.

The problem is that all new requests are immediately aborted by fetch after the first timeout due to the AbortController state which is changed to aborted the first time the abort signal is sent.

@cedrick-ah
Copy link

cedrick-ah commented Jul 19, 2024

Okay, maybe you can try this approach to change abort for each request.

const { describe, it, before, after } = require('node:test');
const assert = require('node:assert/strict');
const http = require('node:http');

const CircuitBreaker = require('opossum');

describe('Circuit breaker with abort controller', () => {
  let abortController;
  const get = timeout => {
    const { port } = server.address();
    abortController = new AbortController();
    const { signal } = abortController;
    const headers = { 'x-timeout': timeout };
    return fetch(`http://localhost:${port}`, { signal, headers });
  };
  const breaker = new CircuitBreaker(get, {
    abortController,
    timeout: 1000,
    volumeThreshold: 3
  });

  const server = http.createServer((req, res) => {
    const timeout = Number(req.headers['x-timeout']);
    setTimeout(() => res.end(`delayed for ${timeout}ms`), timeout);
  });

  before(() => server.listen());
  after(() => server.close());

  it('Should work', async () => {
    const response = await breaker.fire(100);
    const text = await response.text();
    assert.equal(text, 'delayed for 100ms');
  });

  it('Should timeout', async () => {
    await assert.rejects(() => breaker.fire(2000), {
      message: 'Timed out after 1000ms'
    });
  });

  it('Should work again', async () => {
    try {
      await breaker.fire(100);
    } catch ({ name, message }) {
      throw Object.assign(new Error(message), { name });
    }
  });
});
node --test
▶ Circuit breaker with abort controller
  ✔ Should work (286.944232ms)
  ✔ Should timeout (1006.156862ms)
  ✔ Should work again (117.404018ms)
▶ Circuit breaker with abort controller (1430.090547ms)

ℹ tests 3
ℹ suites 1
ℹ pass 3
ℹ fail 0
ℹ cancelled 0
ℹ skipped 0
ℹ todo 0
ℹ duration_ms 7338.956067

@lbeschastny
Copy link
Author

lbeschastny commented Jul 20, 2024

@cedrick-ah you create new AbortController for every request here, but for opossum the abortController option in your example is undefined.

So, requests just newer get aborted.

@cedrick-ah
Copy link

It will be defined after breaker.fire() is called. Also, abort controller and signal only come in pairs as you said earlier. I tested and the new requests was not aborted.

@lbeschastny
Copy link
Author

@cedrick-ah I updated my Minimal reproducible example with an additional assertion to verify that the operation was actually aborted.

@cedrick-ah
Copy link

Okay, I understand.

Copy link
Contributor

This issue is stale because it has been open 30 days with no activity.

@klippx
Copy link

klippx commented Sep 10, 2024

We are in the same situation, we use "static" CircuitBreaker instances that are living outside request scope (in the service scope to be precise) so that all requests can share the same instances (one per host). That's how it should work, this is how state is kept track so requests to a certain host can share information and be short circuited.

But the design choice here according to the documentation is to create an AbortController that lives in the same scope as CircuitBreaker (i.e. in the service scope) which makes no sense! That way seems quite unusable to me, who wants to create a new CircuitBreaker per request?

I can only see one way to fix this problem - use new AbortSignal for every requests (see #780).

I have an idea for a nice and easy to understand API:

  const ac = new AbortController();
  await breaker.withAbortController(ac).fire(100);

Boom 😄

...and consider getting rid of abortController as an option, or explain to me how it is working and what I am missing but it seems like it doesn't work for anyone?

Copy link
Contributor

This issue is stale because it has been open 30 days with no activity.

@WillianAgostini
Copy link
Contributor

WillianAgostini commented Oct 22, 2024

I'm suggesting a solution where we reuse the same AbortSignal for multiple requests, renewing the AbortController when the circuit enters the 'halfClose' or 'close' state.

I propose adding a new option to the CircuitBreakerautoRenewSignal. This flag would automatically handle the renewal of the AbortController.

Example:

const options = {
  timeout: 300,
  resetTimeout: 100,
  autoRenewSignal: true, // new flag
};
const breaker = new CircuitBreaker(asyncFunctionThatCouldFail, options);
const signal = breaker.getSignal(); // get the current signal

if the circuit is 'open', return an aborted signal.
If the state is 'halfOpen' or 'close', return a fresh, non-aborted signal.

@lholmquist
Copy link
Member

That could an interesting addition. would you like to send a PR for it?

WillianAgostini added a commit to WillianAgostini/opossum that referenced this issue Oct 23, 2024
…pecific states

When state changes to 'halfOpen' or 'close', the AbortController will be recreated to handle reuse.

nodeshift#861
WillianAgostini added a commit to WillianAgostini/opossum that referenced this issue Oct 23, 2024
…pecific states

When state changes to 'halfOpen' or 'close', the AbortController will be recreated to handle reuse.

nodeshift#861
WillianAgostini added a commit to WillianAgostini/opossum that referenced this issue Oct 23, 2024
@WillianAgostini
Copy link
Contributor

PR: #892

WillianAgostini added a commit to WillianAgostini/opossum that referenced this issue Oct 28, 2024
lholmquist pushed a commit that referenced this issue Oct 28, 2024
…lose' or 'close' state (#892)

* feat: add option to auto-recreate AbortController if aborted during specific states

When state changes to 'halfOpen' or 'close', the AbortController will be recreated to handle reuse.

#861

* feat(circuit-breaker): extract abort controller renewal logic into separate function

#861

* docs: update README to include autoRenewAbortController configuration options

#861
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants