Description
We are observing that approximately 50% of the time, our sub-orchestrations or activities become stuck in the Pending or TaskScheduled states, taking many hours to become unstuck and progress again.
Possibly a red herring, but the Application Insights logging within the Azure portal seems to suggest that the sub-orchestrator job "forks" itself, with one copy using the parent ID and the other using its own unique ID.
These two orchestrator run entries show what started as the same job with the same timestamp.
This behaviour has been observed both with a dedicated (newly provisioned) storage account and with the Netherite storage provider.
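For reference, the statuses above were taken from Application Insights and Durable Functions Monitor; the same runtime status can also be read directly from the durable client. A minimal, illustrative sketch (Python programming model; the route and function name here are placeholders, not code from our app):

```python
import azure.functions as func
import azure.durable_functions as df

app = df.DFApp(http_auth_level=func.AuthLevel.FUNCTION)

# Placeholder HTTP endpoint: reads an instance's runtime status
# (Pending, Running, Completed, ...) straight from the durable client.
@app.route(route="probe/{instanceId}")
@app.durable_client_input(client_name="client")
async def probe_status(req: func.HttpRequest, client: df.DurableOrchestrationClient):
    status = await client.get_status(req.route_params["instanceId"])
    if status is None:
        return func.HttpResponse("Unknown instance", status_code=404)
    return func.HttpResponse(
        f"{status.instance_id}: {status.runtime_status} "
        f"(created {status.created_time}, last updated {status.last_updated_time})"
    )
```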
Expected behavior
All activities and (sub)orchestrators should complete successfully in a reasonable timeframe.
Actual behavior
Oftentimes the jobs appear to get stuck around the same point, taking a very long time to continue.
There is no other activity within the Functions app or the storage/event resources during this period.
Relevant source code snippets
Proprietary codebase; however, relevant snippets can be prepared on request.
Known workarounds
None
App Details
All related dependencies:
Screenshots
Gantt chart showing issue from Durable Functions Monitor:
If deployed to Azure
Timeframe issue observed: 2025-01-27T17:15:00 to 2025-01-28T10:31:00 (UTC)
Azure region: UK South
Orchestration instance ID(s): c11225bfd6ed45c9b73e147a5e204626, 2786e9f0fadd479fa3975b99638443fa:11
Function names (see the sketch below for the overall shape):
Migration (top-level orchestrator)
Migration_contentTypes (sub-orchestrator, runs once)
Migration_contentTypes_sync (activity, runs sequentially in a loop)
The Functions app is on the Consumption tier, running on Windows 64-bit (although this was also observed on 32-bit). Scaling settings are left as default.
Resource names can be provided privately if required.
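A minimal sketch of the orchestration shape described above (Python programming model shown purely for illustration; the real codebase is proprietary, so the inputs and bodies here are placeholders):

```python
import azure.functions as func
import azure.durable_functions as df

app = df.DFApp(http_auth_level=func.AuthLevel.FUNCTION)

@app.orchestration_trigger(context_name="context")
def Migration(context: df.DurableOrchestrationContext):
    # Top-level orchestrator: runs the content-types sub-orchestration once.
    result = yield context.call_sub_orchestrator("Migration_contentTypes", context.get_input())
    return result

@app.orchestration_trigger(context_name="context")
def Migration_contentTypes(context: df.DurableOrchestrationContext):
    # Sub-orchestrator: calls the sync activity sequentially in a loop.
    results = []
    for content_type in context.get_input() or []:
        result = yield context.call_activity("Migration_contentTypes_sync", content_type)
        results.append(result)
    return results

@app.activity_trigger(input_name="contentType")
def Migration_contentTypes_sync(contentType):
    # Activity: performs the actual sync work (placeholder body).
    return {"synced": contentType}
```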
Looks like the issue is with another department. It turns out the jobs have been getting "stuck" because they block on a long-running operation from Graph, which I didn't spot due to a somewhat lazy implementation and the fact that it was handled by another library. I refactored to avoid blocking and to use a more appropriate polling pattern, which revealed that these jobs sometimes take 17 hours instead of the usual few seconds. Apologies for wasting your time, closing :)
It is interesting, though, that these jobs showed as TaskScheduled rather than Running, and that I was unable to get any meaningful logs out until the run had completed.
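For completeness, the refactor mentioned above follows the standard Durable Functions polling (monitor) pattern: rather than blocking inside an activity while the Graph operation runs, the orchestrator calls a short status-check activity and sleeps on a durable timer between checks. A rough sketch with hypothetical names (check_graph_operation, the orchestrator name, and the timings are illustrative, not our actual code):

```python
from datetime import timedelta

import azure.functions as func
import azure.durable_functions as df

app = df.DFApp(http_auth_level=func.AuthLevel.FUNCTION)

@app.orchestration_trigger(context_name="context")
def Poll_graph_operation(context: df.DurableOrchestrationContext):
    # Poll the long-running Graph operation instead of blocking a worker on it.
    operation_id = context.get_input()
    deadline = context.current_utc_datetime + timedelta(hours=24)

    while context.current_utc_datetime < deadline:
        # Short activity that only reads the operation's current status.
        state = yield context.call_activity("check_graph_operation", operation_id)
        if state in ("succeeded", "failed"):
            return state

        # Durable timer: the orchestrator is unloaded while it waits, so nothing
        # blocks for hours the way the original implementation did.
        yield context.create_timer(context.current_utc_datetime + timedelta(minutes=1))

    return "timedOut"

@app.activity_trigger(input_name="operationId")
def check_graph_operation(operationId):
    # Placeholder: would call Microsoft Graph and return the operation's status.
    return "running"
```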