Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
mm,page_alloc: PF_WQ_WORKER threads must sleep at should_reclaim_retry()
Tetsuo Handa has reported that it is possible to bypass the short sleep for PF_WQ_WORKER threads which was introduced by commit 373ccbe ("mm, vmstat: allow WQ concurrency to discover memory reclaim doesn't make any progress") and lock up the system if OOM. The primary reason is that WQ_MEM_RECLAIM WQs are not guaranteed to run even when they have a rescuer available. Those workers might be essential for reclaim to make a forward progress, however. If we are too unlucky all the allocations requests can get stuck waiting for a WQ_MEM_RECLAIM work item and the system is essentially stuck in an OOM condition without much hope to move on. Tetsuo has seen the reclaim stuck on drain_local_pages_wq or xlog_cil_push_work (xfs). There might be others. Since should_reclaim_retry() should be a natural reschedule point, let's do the short sleep for PF_WQ_WORKER threads unconditionally in order to guarantee that other pending work items are started. This will workaround this problem and it is less fragile than hunting down when the sleep is missed. Having a single sleeping point is more robust. [[email protected]: reflow comment to 80 cols to save a couple of lines] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Michal Hocko <[email protected]> Debugged-by: Tetsuo Handa <[email protected]> Reported-by: Tetsuo Handa <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Vladimir Davydov <[email protected]> Cc: David Rientjes <[email protected]> Cc: Tejun Heo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
- Loading branch information