
πŸ› Bug Report β€” Runtime APIs - Durable Objects with SQLite storage - calling deleteAll() in alarm handler causes internal error #2993

Open
lucidplot opened this issue Oct 24, 2024 · 5 comments
@lucidplot

lucidplot commented Oct 24, 2024

I think I've found a bug in the SQLite storage backend for Durable Objects. When I call deleteAll() from an alarm handler, it causes an internal error, which means the alarm handler fails and keeps getting retried.

However, if I do await this.ctx.storage.deleteAlarm(); before calling deleteAll(), it works fine.

(I'm using this as a workaround with no problem. But it seems like a bug.)

Use case: I want to use alarms to delete all data, e.g. for a verification link that auto-expires.

FWIW this problem didn't seem to happen when I was using the KV storage backend.
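
For readers landing here, a minimal sketch of the pattern being described (the class, key, and timing below are illustrative assumptions, not taken from the report): a SQLite-backed Durable Object stores a value, schedules an alarm, and calls deleteAll() from the alarm handler to auto-expire the data.

import { DurableObject } from "cloudflare:workers";

// Hypothetical auto-expiring object; names are illustrative only.
export class ExpiringLink extends DurableObject {
  async fetch(request: Request): Promise<Response> {
    // Store a token and schedule expiry one hour from now.
    await this.ctx.storage.put("token", crypto.randomUUID());
    await this.ctx.storage.setAlarm(Date.now() + 60 * 60 * 1000);
    return new Response("stored");
  }

  async alarm(): Promise<void> {
    // On the SQLite backend this call reportedly fails with an internal
    // error, so the handler keeps being retried.
    await this.ctx.storage.deleteAll();
  }
}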

@joshthoward
Collaborator

@lucidplot we've confirmed this and are working on a fix. Thank you for reporting this issue!

@stayallive

I think my issue is related to this one, or at least to the fix. I see the PR was merged, but I am assuming it is not live yet. I am wondering if this will also clear up the affected DOs, since I have quite a substantial number of them stuck in a retry loop. The constructor is called, but the alarm callback is never fired for these, and deleting the storage or the alarm for those DO IDs has no effect. It's starting to rack up quite a lot of requests / CPU time (I'm assuming).

[screenshot attached]

@jclee
Contributor

jclee commented Nov 5, 2024

I think my issue is related to this one, or at least to the fix. I see the PR was merged, but I am assuming it is not live yet.

Correct... we are in the process of rolling out a version of the runtime containing the fix (as well as other updates), but at the time of your comment, it was not yet present in production.

If you wanted to address the error prior to the fix being fully deployed, one possible workaround (as lucidplot pointed out above) is for the Durable Object to explicitly delete the alarm before calling deleteAll(). It turns out that the bug only manifests when deleteAll() is called while an alarm is in place -- which is always the case during alarm handlers, if the alarm is not explicitly deleted, since the handler-running code does not actually clear the alarm until the handler successfully completes.
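
As a sketch of that workaround, the alarm handler from the example above can be adjusted to clear the alarm before wiping storage, so deleteAll() runs with no alarm in place:

async alarm(): Promise<void> {
  // Clearing the alarm first avoids the behavior described here, since the
  // handler-running code has not yet cleared it on our behalf.
  await this.ctx.storage.deleteAlarm();
  await this.ctx.storage.deleteAll();
}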

@stayallive

I have been doing that: calling deleteAlarm() and then deleteAll(). I did that via an RPC call to a function that looks like this:

	public async destroy(): Promise<void> {
		await this.ctx.blockConcurrencyWhile(async () => {
			await this.ctx.storage.deleteAlarm();
			await this.ctx.storage.deleteAll();
		});
	}

But it is possible that I did that incorrectly or did something else wrong. Because of the volume of errors I have recreated my worker to work around the issue for now. But good to hear the runtime fix is rolling out. I'll report back if the problem starts to reappear, but I'm hopeful it's solved. Appreciate the update!
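
For context on the "RPC call" part, a caller-side sketch (the MY_DO binding name and the idFromName key are assumptions, not from the comment):

// In the Worker that owns the Durable Object binding:
const id = env.MY_DO.idFromName("some-key");
const stub = env.MY_DO.get(id);
await stub.destroy(); // invokes the public destroy() method shown above over RPC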

@jclee
Contributor

jclee commented Nov 6, 2024

@stayallive Thanks for the details. As far as this bug goes, I would expect that code to work OK, on both the new and old versions of the runtime, so maybe there's some additional wrinkle I'm missing.

I suspect that the awaits and the blockConcurrencyWhile() call are not strictly necessary, as I believe that with SQLite-backed DOs, the deleteAlarm() and deleteAll() calls are effectively synchronous from the viewpoint of the worker.

But I don't think they would hurt anything either... unless something is somehow causing either deleteAlarm() or deleteAll() to throw? In that case, it would break the DO's input lock and terminate it. Per the docs, it might be worth adding exception handling inside the blockConcurrencyWhile() lambda to see if any problems might be caused by something throwing there?

(Hmm... Looking through the docs again, I suppose another potential source of failure might be if the lambda takes longer than the 30s it is allowed? But I'd be surprised that those two calls would take an excessive amount of time.)
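
A sketch of the diagnostic being suggested, wrapping the body of the blockConcurrencyWhile() lambda so any throw is logged before it propagates (the destroy() name follows the snippet above; this is an assumption about where to add the handling, not a confirmed fix):

public async destroy(): Promise<void> {
  await this.ctx.blockConcurrencyWhile(async () => {
    try {
      await this.ctx.storage.deleteAlarm();
      await this.ctx.storage.deleteAll();
    } catch (err) {
      // Surface the failure before it breaks the DO's input gate.
      console.error("destroy() failed:", err);
      throw err;
    }
  });
}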
