From: mhocko@kernel.org
To: <linux-mm@kvack.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Mel Gorman <mgorman@suse.de>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Rik van Riel <riel@redhat.com>,
	David Rientjes <rientjes@google.com>,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	LKML <linux-kernel@vger.kernel.org>,
	Michal Hocko <mhocko@suse.com>
Subject: [RFC 2/3] mm: throttle on IO only when there are too many dirty and writeback pages
Date: Thu, 29 Oct 2015 16:17:14 +0100
Message-ID: <1446131835-3263-3-git-send-email-mhocko@kernel.org>
In-Reply-To: <1446131835-3263-1-git-send-email-mhocko@kernel.org>

From: Michal Hocko <mhocko@suse.com>

wait_iff_congested has been used to throttle the allocator before it
retried another round of direct reclaim, to allow the writeback to make
some progress and to prevent reclaim from looping over dirty/writeback
pages without making any progress. We used to do congestion_wait before
0e093d99763e ("writeback: do not sleep on the congestion queue if there
are no congested BDIs or if significant congestion is not being
encountered in the current zone") but that led to undesirable stalls
and sleeping for the full timeout even when the BDI wasn't congested.
Hence wait_iff_congested was used instead. But it seems that even
wait_iff_congested doesn't work as expected. We might have a small file
LRU list with all pages dirty/writeback and yet the bdi is not
congested, so this is just a cond_resched in the end and can end up
triggering a premature OOM.

This patch replaces the unconditional wait_iff_congested by
congestion_wait, which is executed only if we _know_ that the last round
of direct reclaim didn't make any progress and dirty+writeback pages are
more than half of the reclaimable pages on the zone which might be
usable for our target allocation. This shouldn't reintroduce the stalls
fixed by 0e093d99763e because congestion_wait is called only when we are
getting hopeless and sleeping is a better choice than OOM with many
pages under IO.

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 mm/page_alloc.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9c0abb75ad53..0518ca6a9776 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3191,8 +3191,23 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 		 */
 		if (__zone_watermark_ok(zone, order, min_wmark_pages(zone),
 				ac->high_zoneidx, alloc_flags, target)) {
-			/* Wait for some write requests to complete then retry */
-			wait_iff_congested(zone, BLK_RW_ASYNC, HZ/50);
+			unsigned long writeback = zone_page_state(zone, NR_WRITEBACK),
+				      dirty = zone_page_state(zone, NR_FILE_DIRTY);
+
+			if (did_some_progress)
+				goto retry;
+
+			/*
+			 * If we didn't make any progress and have a lot of
+			 * dirty + writeback pages then we should wait for
+			 * an IO to complete to slow down the reclaim and
+			 * prevent from pre mature OOM
+			 */
+			if (2*(writeback + dirty) > reclaimable)
+				congestion_wait(BLK_RW_ASYNC, HZ/10);
+			else
+				cond_resched();
+
 			goto retry;
 		}
 	}
-- 
2.6.1
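
For readers who want to experiment with the throttling rule outside the
kernel, the decision made by the hunk above can be modelled as a small
pure function. This is only a userspace sketch, not code from the patch:
the enum and the function name pick_throttle_action are hypothetical,
and the counters are passed in as plain arguments instead of being read
with zone_page_state(). The value of reclaimable is used by the hunk but
not defined in it, so here it is simply assumed to be supplied by the
caller.

```c
#include <stdbool.h>

/* Hypothetical names for illustration only; none of this is kernel API. */
enum throttle_action {
	THROTTLE_RETRY_NOW,	/* reclaim made progress: retry without sleeping */
	THROTTLE_WAIT_IO,	/* where the patch calls congestion_wait(BLK_RW_ASYNC, HZ/10) */
	THROTTLE_YIELD,		/* where the patch calls cond_resched() */
};

/*
 * Model of the decision in the hunk above. The caller supplies the
 * counters; how "reclaimable" is computed is an assumption left to
 * the caller, since the hunk does not show it.
 */
static enum throttle_action
pick_throttle_action(bool did_some_progress, unsigned long writeback,
		     unsigned long dirty, unsigned long reclaimable)
{
	if (did_some_progress)
		return THROTTLE_RETRY_NOW;

	/* Same comparison as the patch: dirty + writeback exceed half of reclaimable. */
	if (2 * (writeback + dirty) > reclaimable)
		return THROTTLE_WAIT_IO;

	return THROTTLE_YIELD;
}
```

Note that the comparison is strict: with 25 writeback and 25 dirty pages
out of 100 reclaimable, exactly half of the pages are under IO and the
model yields instead of waiting.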