From: Jens Axboe <axboe@kernel.dk>
To: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, clm@fb.com, jack@suse.cz, Jens Axboe <axboe@kernel.dk>
Subject: [PATCH 6/6] fs-writeback: only allow one inflight and pending !nr_pages flush
Date: Tue, 19 Sep 2017 13:53:07 -0600
Message-ID: <1505850787-18311-7-git-send-email-axboe@kernel.dk>
In-Reply-To: <1505850787-18311-1-git-send-email-axboe@kernel.dk>

A few callers pass in nr_pages == 0 when they wake up the flusher
threads, which means that the flusher should just flush everything
that is currently dirty. If we are tight on memory, we can get
tons of these queued from kswapd/vmscan. This causes (at least)
two problems:

1) We consume a ton of memory just allocating writeback work items.
2) We spend so much time processing these work items that we
   introduce a softlockup in writeback processing.

Fix this by adding a 'zero_pages' bit to the writeback structure,
and set that when someone queues a nr_pages==0 flusher thread
wakeup. The bit is cleared when we start writeback on that work
item. If the bit is already set when we attempt to queue !nr_pages
writeback, then we simply ignore it.

This provides us one full flush in flight, with one pending as
well, and makes for more efficient handling of this type of
writeback.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/fs-writeback.c                | 30 ++++++++++++++++++++++++++++--
 include/linux/backing-dev-defs.h |  1 +
 2 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index a9a86644cb9f..e0240110b36f 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -53,6 +53,7 @@ struct wb_writeback_work {
 	unsigned int for_background:1;
 	unsigned int for_sync:1;	/* sync(2) WB_SYNC_ALL writeback */
 	unsigned int auto_free:1;	/* free on completion */
+	unsigned int zero_pages:1;	/* nr_pages == 0 writeback */
 	enum wb_reason reason;		/* why was writeback initiated? */
 
 	struct list_head list;		/* pending work list */
@@ -948,15 +949,25 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
 			       bool range_cyclic, enum wb_reason reason)
 {
 	struct wb_writeback_work *work;
+	bool zero_pages = false;
 
 	if (!wb_has_dirty_io(wb))
 		return;
 
 	/*
-	 * If someone asked for zero pages, we write out the WORLD
+	 * If someone asked for zero pages, we write out the WORLD.
+	 * Places like vmscan and laptop mode want to queue a wakeup to
+	 * the flusher threads to clean out everything. To avoid potentially
+	 * having tons of these pending, ensure that we only allow one of
+	 * them pending and inflight at the time
 	 */
-	if (!nr_pages)
+	if (!nr_pages) {
+		if (test_bit(WB_zero_pages, &wb->state))
+			return;
+		set_bit(WB_zero_pages, &wb->state);
 		nr_pages = get_nr_dirty_pages();
+		zero_pages = true;
+	}
 
 	/*
 	 * This is WB_SYNC_NONE writeback, so if allocation fails just
@@ -975,6 +986,7 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
 	work->range_cyclic = range_cyclic;
 	work->reason	= reason;
 	work->auto_free	= 1;
+	work->zero_pages = zero_pages;
 
 	wb_queue_work(wb, work);
 }
@@ -1828,6 +1840,14 @@ static struct wb_writeback_work *get_next_work_item(struct bdi_writeback *wb)
 		list_del_init(&work->list);
 	}
 	spin_unlock_bh(&wb->work_lock);
+
+	/*
+	 * Once we start processing a work item that had !nr_pages,
+	 * clear the wb state bit for that so we can allow more.
+	 */
+	if (work && work->zero_pages && test_bit(WB_zero_pages, &wb->state))
+		clear_bit(WB_zero_pages, &wb->state);
+
 	return work;
 }
 
@@ -1896,6 +1916,12 @@ static long wb_do_writeback(struct bdi_writeback *wb)
 		trace_writeback_exec(wb, work);
 		wrote += wb_writeback(wb, work);
 		finish_writeback_work(wb, work);
+
+		/*
+		 * If we have a lot of pending work, make sure we take
+		 * an occasional breather, if needed.
+		 */
+		cond_resched();
 	}
 
 	/*
diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h
index 866c433e7d32..7494f6a75458 100644
--- a/include/linux/backing-dev-defs.h
+++ b/include/linux/backing-dev-defs.h
@@ -24,6 +24,7 @@ enum wb_state {
 	WB_shutting_down,	/* wb_shutdown() in progress */
 	WB_writeback_running,	/* Writeback is in progress */
 	WB_has_dirty_io,	/* Dirty inodes on ->b_{dirty|io|more_io} */
+	WB_zero_pages,		/* nr_pages == 0 flush pending */
 };
 
 enum wb_congested_state {
-- 
2.7.4
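
For readers who want to see the gating behaviour in isolation, here is a rough userspace model of the scheme described above: one flag marks a full flush as pending, requests are dropped while it is set, and the flag is cleared once that work item starts. This is only a sketch and not kernel code. The names struct flusher, struct flush_work, queue_full_flush(), start_work() and demo_accepted() are all made up for illustration, and the model uses a C11 atomic_exchange() (a test-and-set) where the patch does a separate test_bit() followed by set_bit().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct bdi_writeback. */
struct flusher {
	atomic_bool zero_pages_pending;	/* models the WB_zero_pages state bit */
	size_t queued;			/* items queued, not yet started (single-threaded model) */
};

/* Hypothetical stand-in for struct wb_writeback_work. */
struct flush_work {
	bool zero_pages;		/* this item is a "flush everything" request */
};

/*
 * Model of queueing a nr_pages == 0 flush. Returns true if the item was
 * queued, false if it was dropped because one is already pending.
 */
static bool queue_full_flush(struct flusher *f, struct flush_work *work)
{
	/* atomic_exchange() returns the old value: true means already pending */
	if (atomic_exchange(&f->zero_pages_pending, true))
		return false;

	work->zero_pages = true;
	f->queued++;
	return true;
}

/* Model of picking up a work item: starting it clears the pending flag. */
static void start_work(struct flusher *f, struct flush_work *work)
{
	assert(f->queued > 0);
	f->queued--;
	if (work->zero_pages)
		atomic_store(&f->zero_pages_pending, false);
}

/* Burst of requests, start the queued item, then a second burst. */
static size_t demo_accepted(void)
{
	struct flusher f = { .queued = 0 };
	struct flush_work w1 = { false }, w2 = { false };
	size_t accepted = 0;

	atomic_init(&f.zero_pages_pending, false);

	for (int i = 0; i < 1000; i++)
		accepted += queue_full_flush(&f, &w1);	/* only the first is queued */

	start_work(&f, &w1);				/* w1 is now in flight */

	for (int i = 0; i < 1000; i++)
		accepted += queue_full_flush(&f, &w2);	/* one more queues behind it */

	return accepted;
}
```

In this model, 2000 requests produce only two work items: one in flight and one pending behind it. Doing the test and the set as one atomic step means two concurrent callers cannot both see the flag clear; whether that matters for the kernel code paths involved is not something this sketch tries to answer.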