On Tue, 2017-03-07 at 14:30 +0100, Michal Hocko wrote:
> From: Michal Hocko
>
> Tetsuo Handa has reported [1][2] that direct reclaimers might get stuck
> in too_many_isolated loop basically for ever because the last few pages
> on the LRU lists are isolated by the kswapd which is stuck on fs locks
> when doing the pageout or slab reclaim. This in turn means that there is
> nobody to actually trigger the oom killer and the system is basically
> unusable.
>
> too_many_isolated has been introduced by 35cd78156c49 ("vmscan: throttle
> direct reclaim when too many pages are isolated already") to prevent
> from pre-mature oom killer invocations because back then no reclaim
> progress could indeed trigger the OOM killer too early. But since the
> oom detection rework 0a0337e0d1d1 ("mm, oom: rework oom detection")
> the allocation/reclaim retry loop considers all the reclaimable pages
> and throttles the allocation at that layer so we can loosen the direct
> reclaim throttling.

It only does this to some extent.  If reclaim made no progress, for
example due to immediately bailing out because the number of already
isolated pages is too high (due to many parallel reclaimers), the code
could hit the "no_progress_loops > MAX_RECLAIM_RETRIES" test without
ever looking at the number of reclaimable pages.

Could that create problems if we have many concurrent reclaimers?

It may be OK, I just do not understand all the implications.

I like the general direction your patch takes the code in, but I would
like to understand it better...

-- 
All rights reversed
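
To make the concern above concrete, here is a rough userspace sketch of
the retry decision. It is not the kernel code: the struct, the function
names, the single-watermark comparison, and the value 16 for
MAX_RECLAIM_RETRIES are all assumptions made for illustration, and the
back-off that a real implementation might apply to the reclaimable
estimate is left out. The only point is the ordering of the two checks:
the no-progress counter is tested before the reclaimable pages are
looked at.

```c
#include <stdbool.h>

/* Assumed value, for illustration only. */
#define MAX_RECLAIM_RETRIES 16

/* Hypothetical per-allocation state; not a kernel structure. */
struct retry_state {
	int no_progress_loops;
};

/*
 * Simplified model of the allocation/reclaim retry decision.
 * Returns true if the allocator should retry reclaim, false if it
 * should give up and fall through to the OOM path.
 */
bool should_retry(struct retry_state *st, bool did_some_progress,
		  unsigned long reclaimable, unsigned long free_pages,
		  unsigned long watermark)
{
	if (did_some_progress)
		st->no_progress_loops = 0;
	else
		st->no_progress_loops++;

	/*
	 * This test comes first: once the counter runs out, the number
	 * of reclaimable pages is never consulted.
	 */
	if (st->no_progress_loops > MAX_RECLAIM_RETRIES)
		return false;

	/* Only reached while the counter is still within bounds. */
	return free_pages + reclaimable > watermark;
}

/*
 * Models a reclaimer that bails out immediately every time (for
 * example because too many pages are already isolated) and therefore
 * never reports progress. Returns how many retries it is granted.
 */
int retries_before_giving_up(unsigned long reclaimable,
			     unsigned long free_pages,
			     unsigned long watermark)
{
	struct retry_state st = { 0 };
	int retries = 0;

	while (should_retry(&st, false, reclaimable, free_pages, watermark))
		retries++;

	return retries;
}
```

In this model, retries_before_giving_up(1000000, 0, 100) returns 16 even
though a million pages are nominally reclaimable, because every pass
counts as "no progress" and the counter check ends the loop. Whether
the real code behaves this way with many concurrent reclaimers is
exactly the open question.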