From: Dave Chinner <david@fromorbit.com>
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: [PATCH] mm: disallow direct reclaim page writeback
Date: Tue, 13 Apr 2010 10:17:58 +1000
Message-ID: <1271117878-19274-1-git-send-email-david@fromorbit.com>

From: Dave Chinner <dchinner@redhat.com>

When we enter direct reclaim we may have used an arbitrary amount of
stack space, and hence entering the filesystem to do writeback can
then lead to stack overruns. This problem was recently encountered on
x86_64 systems with 8k stacks running XFS with simple storage
configurations.

Writeback from direct reclaim also adversely affects background
writeback. The background flusher threads should already be taking
care of cleaning dirty pages, and direct reclaim will kick them if
they aren't already doing work. If direct reclaim is also calling
->writepage, it will cause the IO patterns from the background flusher
threads to be upset by LRU-order writeback from pageout(), which can be
effectively random IO. Having competing sources of IO trying to clean
pages on the same backing device reduces throughput by increasing the
number of seeks that the backing device has to do to write back the
pages.

Hence for direct reclaim we should not allow ->writepage to be entered
at all. Set up the relevant scan_control structures to enforce this,
and prevent sc->may_writepage from being set in other places in the
direct reclaim path in response to other events.

Reported-by: John Berthels <john@humyo.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 mm/vmscan.c |   13 ++++++-------
 1 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index e0e5f15..5321ac4 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1826,10 +1826,8 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
 		 * writeout. So in laptop mode, write out the whole world.
 		 */
 		writeback_threshold = sc->nr_to_reclaim + sc->nr_to_reclaim / 2;
-		if (total_scanned > writeback_threshold) {
+		if (total_scanned > writeback_threshold)
 			wakeup_flusher_threads(laptop_mode ? 0 : total_scanned);
-			sc->may_writepage = 1;
-		}
 
 		/* Take a nap, wait for some writeback to complete */
 		if (!sc->hibernation_mode && sc->nr_scanned &&
@@ -1871,7 +1869,7 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
 {
 	struct scan_control sc = {
 		.gfp_mask = gfp_mask,
-		.may_writepage = !laptop_mode,
+		.may_writepage = 0,
 		.nr_to_reclaim = SWAP_CLUSTER_MAX,
 		.may_unmap = 1,
 		.may_swap = 1,
@@ -1893,7 +1891,7 @@ unsigned long mem_cgroup_shrink_node_zone(struct mem_cgroup *mem,
 						struct zone *zone, int nid)
 {
 	struct scan_control sc = {
-		.may_writepage = !laptop_mode,
+		.may_writepage = 0,
 		.may_unmap = 1,
 		.may_swap = !noswap,
 		.swappiness = swappiness,
@@ -1926,7 +1924,7 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem_cont,
 {
 	struct zonelist *zonelist;
 	struct scan_control sc = {
-		.may_writepage = !laptop_mode,
+		.may_writepage = 0,
 		.may_unmap = 1,
 		.may_swap = !noswap,
 		.nr_to_reclaim = SWAP_CLUSTER_MAX,
@@ -2567,7 +2565,8 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
 	struct reclaim_state reclaim_state;
 	int priority;
 	struct scan_control sc = {
-		.may_writepage = !!(zone_reclaim_mode & RECLAIM_WRITE),
+		.may_writepage = (current_is_kswapd() &&
+					(zone_reclaim_mode & RECLAIM_WRITE)),
 		.may_unmap = !!(zone_reclaim_mode & RECLAIM_SWAP),
 		.may_swap = 1,
 		.nr_to_reclaim = max_t(unsigned long, nr_pages,
-- 
1.6.5
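
As a rough illustration of the policy the patch above sets up, here is a minimal userspace sketch in C. It is not kernel code and not part of the patch: `scan_control_model`, `reclaim_context`, `RECLAIM_WRITE_MODEL`, and the three helper functions are hypothetical names invented for this example, and the flag value is only a stand-in for the kernel's `RECLAIM_WRITE`. The sketch assumes the three behaviours visible in the diff: direct reclaim starts with `may_writepage` cleared, crossing the writeback threshold only wakes the flusher threads without setting `may_writepage`, and zone reclaim may write pages only when running as kswapd with the write bit set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of who is running reclaim; not a kernel type. */
enum reclaim_context { RECLAIM_DIRECT, RECLAIM_KSWAPD };

/* Stand-in for the kernel's RECLAIM_WRITE bit; the value is an assumption. */
#define RECLAIM_WRITE_MODEL (1 << 1)

/* Reduced model of struct scan_control with only the fields used here. */
struct scan_control_model {
	bool may_writepage;
	unsigned long nr_to_reclaim;
};

/* Direct reclaim entry points: writeback is never permitted after the patch. */
static struct scan_control_model direct_reclaim_sc(unsigned long nr_to_reclaim)
{
	struct scan_control_model sc = {
		.may_writepage = false,
		.nr_to_reclaim = nr_to_reclaim,
	};
	return sc;
}

/* Zone reclaim: writeback only from kswapd and only if the write bit is set. */
static struct scan_control_model zone_reclaim_sc(enum reclaim_context ctx,
						 int zone_reclaim_mode,
						 unsigned long nr_to_reclaim)
{
	struct scan_control_model sc = {
		.may_writepage = (ctx == RECLAIM_KSWAPD) &&
				 (zone_reclaim_mode & RECLAIM_WRITE_MODEL),
		.nr_to_reclaim = nr_to_reclaim,
	};
	return sc;
}

/*
 * Threshold check from the first hunk: returns true when the flusher
 * threads would be woken. It takes a const pointer because, after the
 * patch, this step no longer modifies may_writepage.
 */
static bool over_writeback_threshold(const struct scan_control_model *sc,
				     unsigned long total_scanned)
{
	assert(sc != NULL);
	unsigned long writeback_threshold =
		sc->nr_to_reclaim + sc->nr_to_reclaim / 2;
	return total_scanned > writeback_threshold;
}
```

With `nr_to_reclaim` set to 32, the threshold in this model is 48 scanned pages, so scanning 49 pages would wake the flushers while `may_writepage` stays false for the direct reclaim case.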