[PATCH 0/7 v2] More graceful flusher thread memory reclaim wakeup

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 0/7 v2] More graceful flusher thread memory reclaim wakeup
@ 2017-09-20 15:32 Jens Axboe
  2017-09-20 15:32 ` [PATCH 1/7] buffer: cleanup free_more_memory() flusher wakeup Jens Axboe
                   ` (6 more replies)
  0 siblings, 7 replies; 28+ messages in thread
From: Jens Axboe @ 2017-09-20 15:32 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-mm; +Cc: hannes, clm, jack

We've had some issues with writeback in presence of memory reclaim
at Facebook, and this patch set attempts to fix it up. The real
functional change is the last patch in the series, the first 5 are
prep and cleanup patches.

The basic idea is that we have callers that call
wakeup_flusher_threads() with nr_pages == 0. This means 'writeback
everything'. For memory reclaim situations, we can end up queuing
a TON of these kinds of writeback units. This can cause softlockups
and further memory issues, since we allocate huge amounts of
struct wb_writeback_work to handle this writeback. Handle this
situation more gracefully.

Changes since v1:

- Rename WB_zero_pages to WB_start_all (Amir).
- Remove a test_bit() for a condition where we always expect the bit
  to be set.
- Remove 'nr_pages' from the wakeup flusher threads helpers, since
  everybody now passes in zero. Enables further cleanups in later
  patches too (Jan).
- Fix a case where I forgot to clear WB_start_all if 'work' allocation
  failed.
- Get rid of cond_resched() in the wb_do_writeback() loop.

-- 
Jens Axboe

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/7] buffer: cleanup free_more_memory() flusher wakeup
  2017-09-20 15:32 [PATCH 0/7 v2] More graceful flusher thread memory reclaim wakeup Jens Axboe
@ 2017-09-20 15:32 ` Jens Axboe
  2017-09-21 14:55   ` Christoph Hellwig
  2017-09-20 15:32 ` [PATCH 2/7] fs: kill 'nr_pages' argument from wakeup_flusher_threads() Jens Axboe
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 28+ messages in thread
From: Jens Axboe @ 2017-09-20 15:32 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-mm; +Cc: hannes, clm, jack, Jens Axboe

This whole function is... interesting. Change the wakeup call
to the flusher threads to pass in nr_pages == 0, instead of
some random number of pages. This matches more closely what
similar cases do for memory shortage/reclaim.

Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Chris Mason <clm@fb.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/buffer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 170df856bdb9..9471a445e370 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -260,7 +260,7 @@ static void free_more_memory(void)
 	struct zoneref *z;
 	int nid;
 
-	wakeup_flusher_threads(1024, WB_REASON_FREE_MORE_MEM);
+	wakeup_flusher_threads(0, WB_REASON_FREE_MORE_MEM);
 	yield();
 
 	for_each_online_node(nid) {
-- 
2.7.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 2/7] fs: kill 'nr_pages' argument from wakeup_flusher_threads()
  2017-09-20 15:32 [PATCH 0/7 v2] More graceful flusher thread memory reclaim wakeup Jens Axboe
  2017-09-20 15:32 ` [PATCH 1/7] buffer: cleanup free_more_memory() flusher wakeup Jens Axboe
@ 2017-09-20 15:32 ` Jens Axboe
  2017-09-21 14:55   ` Christoph Hellwig
  2017-09-22 13:12   ` Jan Kara
  2017-09-20 15:32 ` [PATCH 3/7] fs-writeback: provide a wakeup_flusher_threads_bdi() Jens Axboe
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 28+ messages in thread
From: Jens Axboe @ 2017-09-20 15:32 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-mm; +Cc: hannes, clm, jack, Jens Axboe

Everybody is passing in 0 now, let's get rid of the argument.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/buffer.c               | 2 +-
 fs/fs-writeback.c         | 9 ++++-----
 fs/sync.c                 | 2 +-
 include/linux/writeback.h | 2 +-
 mm/vmscan.c               | 2 +-
 5 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 9471a445e370..cf71926797d3 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -260,7 +260,7 @@ static void free_more_memory(void)
 	struct zoneref *z;
 	int nid;
 
-	wakeup_flusher_threads(0, WB_REASON_FREE_MORE_MEM);
+	wakeup_flusher_threads(WB_REASON_FREE_MORE_MEM);
 	yield();
 
 	for_each_online_node(nid) {
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 245c430a2e41..bb6148dc6d24 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1947,12 +1947,12 @@ void wb_workfn(struct work_struct *work)
 }
 
 /*
- * Start writeback of `nr_pages' pages.  If `nr_pages' is zero, write back
- * the whole world.
+ * Wakeup the flusher threads to start writeback of all currently dirty pages
  */
-void wakeup_flusher_threads(long nr_pages, enum wb_reason reason)
+void wakeup_flusher_threads(enum wb_reason reason)
 {
 	struct backing_dev_info *bdi;
+	long nr_pages;
 
 	/*
 	 * If we are expecting writeback progress we must submit plugged IO.
@@ -1960,8 +1960,7 @@ void wakeup_flusher_threads(long nr_pages, enum wb_reason reason)
 	if (blk_needs_flush_plug(current))
 		blk_schedule_flush_plug(current);
 
-	if (!nr_pages)
-		nr_pages = get_nr_dirty_pages();
+	nr_pages = get_nr_dirty_pages();
 
 	rcu_read_lock();
 	list_for_each_entry_rcu(bdi, &bdi_list, bdi_list) {
diff --git a/fs/sync.c b/fs/sync.c
index a576aa2e6b09..09f96a18dd93 100644
--- a/fs/sync.c
+++ b/fs/sync.c
@@ -108,7 +108,7 @@ SYSCALL_DEFINE0(sync)
 {
 	int nowait = 0, wait = 1;
 
-	wakeup_flusher_threads(0, WB_REASON_SYNC);
+	wakeup_flusher_threads(WB_REASON_SYNC);
 	iterate_supers(sync_inodes_one_sb, NULL);
 	iterate_supers(sync_fs_one_sb, &nowait);
 	iterate_supers(sync_fs_one_sb, &wait);
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index d5815794416c..1f9c6db5e29a 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -189,7 +189,7 @@ bool try_to_writeback_inodes_sb(struct super_block *, enum wb_reason reason);
 bool try_to_writeback_inodes_sb_nr(struct super_block *, unsigned long nr,
 				   enum wb_reason reason);
 void sync_inodes_sb(struct super_block *);
-void wakeup_flusher_threads(long nr_pages, enum wb_reason reason);
+void wakeup_flusher_threads(enum wb_reason reason);
 void inode_wait_for_writeback(struct inode *inode);
 
 /* writeback.h requires fs.h; it, too, is not included from here. */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 13d711dd8776..42a7fdd52d87 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1867,7 +1867,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 		 * also allow kswapd to start writing pages during reclaim.
 		 */
 		if (stat.nr_unqueued_dirty == nr_taken) {
-			wakeup_flusher_threads(0, WB_REASON_VMSCAN);
+			wakeup_flusher_threads(WB_REASON_VMSCAN);
 			set_bit(PGDAT_DIRTY, &pgdat->flags);
 		}
 
-- 
2.7.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 3/7] fs-writeback: provide a wakeup_flusher_threads_bdi()
  2017-09-20 15:32 [PATCH 0/7 v2] More graceful flusher thread memory reclaim wakeup Jens Axboe
  2017-09-20 15:32 ` [PATCH 1/7] buffer: cleanup free_more_memory() flusher wakeup Jens Axboe
  2017-09-20 15:32 ` [PATCH 2/7] fs: kill 'nr_pages' argument from wakeup_flusher_threads() Jens Axboe
@ 2017-09-20 15:32 ` Jens Axboe
  2017-09-21 14:56   ` Christoph Hellwig
  2017-09-20 15:32 ` [PATCH 4/7] page-writeback: pass in '0' for nr_pages writeback in laptop mode Jens Axboe
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 28+ messages in thread
From: Jens Axboe @ 2017-09-20 15:32 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-mm; +Cc: hannes, clm, jack, Jens Axboe

Similar to wakeup_flusher_threads(), except that we only wake
up the flusher threads on the specified backing device.

No functional changes in this patch.

Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Chris Mason <clm@fb.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/fs-writeback.c         | 39 +++++++++++++++++++++++++++++----------
 include/linux/writeback.h |  2 ++
 2 files changed, 31 insertions(+), 10 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index bb6148dc6d24..c7f99fd2c7f0 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1947,6 +1947,33 @@ void wb_workfn(struct work_struct *work)
 }
 
 /*
+ * Start writeback of `nr_pages' pages on this bdi. If `nr_pages' is zero,
+ * write back the whole world.
+ */
+static void __wakeup_flusher_threads_bdi(struct backing_dev_info *bdi,
+					 long nr_pages, enum wb_reason reason)
+{
+	struct bdi_writeback *wb;
+
+	if (!bdi_has_dirty_io(bdi))
+		return;
+
+	list_for_each_entry_rcu(wb, &bdi->wb_list, bdi_node)
+		wb_start_writeback(wb, wb_split_bdi_pages(wb, nr_pages),
+					   false, reason);
+}
+
+void wakeup_flusher_threads_bdi(struct backing_dev_info *bdi,
+				enum wb_reason reason)
+{
+	long nr_pages = get_nr_dirty_pages();
+
+	rcu_read_lock();
+	__wakeup_flusher_threads_bdi(bdi, nr_pages, reason);
+	rcu_read_unlock();
+}
+
+/*
  * Wakeup the flusher threads to start writeback of all currently dirty pages
  */
 void wakeup_flusher_threads(enum wb_reason reason)
@@ -1963,16 +1990,8 @@ void wakeup_flusher_threads(enum wb_reason reason)
 	nr_pages = get_nr_dirty_pages();
 
 	rcu_read_lock();
-	list_for_each_entry_rcu(bdi, &bdi_list, bdi_list) {
-		struct bdi_writeback *wb;
-
-		if (!bdi_has_dirty_io(bdi))
-			continue;
-
-		list_for_each_entry_rcu(wb, &bdi->wb_list, bdi_node)
-			wb_start_writeback(wb, wb_split_bdi_pages(wb, nr_pages),
-					   false, reason);
-	}
+	list_for_each_entry_rcu(bdi, &bdi_list, bdi_list)
+		__wakeup_flusher_threads_bdi(bdi, nr_pages, reason);
 	rcu_read_unlock();
 }
 
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index 1f9c6db5e29a..9c0091678af4 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -190,6 +190,8 @@ bool try_to_writeback_inodes_sb_nr(struct super_block *, unsigned long nr,
 				   enum wb_reason reason);
 void sync_inodes_sb(struct super_block *);
 void wakeup_flusher_threads(enum wb_reason reason);
+void wakeup_flusher_threads_bdi(struct backing_dev_info *bdi,
+				enum wb_reason reason);
 void inode_wait_for_writeback(struct inode *inode);
 
 /* writeback.h requires fs.h; it, too, is not included from here. */
-- 
2.7.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 4/7] page-writeback: pass in '0' for nr_pages writeback in laptop mode
  2017-09-20 15:32 [PATCH 0/7 v2] More graceful flusher thread memory reclaim wakeup Jens Axboe
                   ` (2 preceding siblings ...)
  2017-09-20 15:32 ` [PATCH 3/7] fs-writeback: provide a wakeup_flusher_threads_bdi() Jens Axboe
@ 2017-09-20 15:32 ` Jens Axboe
  2017-09-21 14:59   ` Christoph Hellwig
                     ` (2 more replies)
  2017-09-20 15:33 ` [PATCH 5/7] fs-writeback: make wb_start_writeback() static Jens Axboe
                   ` (2 subsequent siblings)
  6 siblings, 3 replies; 28+ messages in thread
From: Jens Axboe @ 2017-09-20 15:32 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-mm; +Cc: hannes, clm, jack, Jens Axboe

Laptop mode really wants to writeback the number of dirty
pages and inodes. Instead of calculating this in the caller,
just pass in 0 and let wakeup_flusher_threads() handle it.

Use the new wakeup_flusher_threads_bdi() instead of rolling
our own. This changes the writeback to not be range cyclic,
but that should not matter for laptop mode flush-all
semantics.

Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Chris Mason <clm@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 mm/page-writeback.c | 17 +----------------
 1 file changed, 1 insertion(+), 16 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 0b9c5cbe8eba..8d1fc593bce8 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1980,23 +1980,8 @@ int dirty_writeback_centisecs_handler(struct ctl_table *table, int write,
 void laptop_mode_timer_fn(unsigned long data)
 {
 	struct request_queue *q = (struct request_queue *)data;
-	int nr_pages = global_node_page_state(NR_FILE_DIRTY) +
-		global_node_page_state(NR_UNSTABLE_NFS);
-	struct bdi_writeback *wb;
 
-	/*
-	 * We want to write everything out, not just down to the dirty
-	 * threshold
-	 */
-	if (!bdi_has_dirty_io(q->backing_dev_info))
-		return;
-
-	rcu_read_lock();
-	list_for_each_entry_rcu(wb, &q->backing_dev_info->wb_list, bdi_node)
-		if (wb_has_dirty_io(wb))
-			wb_start_writeback(wb, nr_pages, true,
-					   WB_REASON_LAPTOP_TIMER);
-	rcu_read_unlock();
+	wakeup_flusher_threads_bdi(q->backing_dev_info, WB_REASON_LAPTOP_TIMER);
 }
 
 /*
-- 
2.7.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 5/7] fs-writeback: make wb_start_writeback() static
  2017-09-20 15:32 [PATCH 0/7 v2] More graceful flusher thread memory reclaim wakeup Jens Axboe
                   ` (3 preceding siblings ...)
  2017-09-20 15:32 ` [PATCH 4/7] page-writeback: pass in '0' for nr_pages writeback in laptop mode Jens Axboe
@ 2017-09-20 15:33 ` Jens Axboe
  2017-09-21 15:01   ` Christoph Hellwig
  2017-09-20 15:33 ` [PATCH 6/7] fs-writeback: move nr_pages == 0 logic to one location Jens Axboe
  2017-09-20 15:33 ` [PATCH 7/7] fs-writeback: only allow one inflight and pending full flush Jens Axboe
  6 siblings, 1 reply; 28+ messages in thread
From: Jens Axboe @ 2017-09-20 15:33 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-mm; +Cc: hannes, clm, jack, Jens Axboe

We don't have any callers outside of fs-writeback.c anymore,
make it private.

Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Chris Mason <clm@fb.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/fs-writeback.c           | 4 ++--
 include/linux/backing-dev.h | 2 --
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index c7f99fd2c7f0..ecbd26d1121d 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -933,8 +933,8 @@ static void bdi_split_work_to_wbs(struct backing_dev_info *bdi,
 
 #endif	/* CONFIG_CGROUP_WRITEBACK */
 
-void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
-			bool range_cyclic, enum wb_reason reason)
+static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
+			       bool range_cyclic, enum wb_reason reason)
 {
 	struct wb_writeback_work *work;
 
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index 854e1bdd0b2a..157e950a70dc 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -38,8 +38,6 @@ static inline struct backing_dev_info *bdi_alloc(gfp_t gfp_mask)
 	return bdi_alloc_node(gfp_mask, NUMA_NO_NODE);
 }
 
-void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
-			bool range_cyclic, enum wb_reason reason);
 void wb_start_background_writeback(struct bdi_writeback *wb);
 void wb_workfn(struct work_struct *work);
 void wb_wakeup_delayed(struct bdi_writeback *wb);
-- 
2.7.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 6/7] fs-writeback: move nr_pages == 0 logic to one location
  2017-09-20 15:32 [PATCH 0/7 v2] More graceful flusher thread memory reclaim wakeup Jens Axboe
                   ` (4 preceding siblings ...)
  2017-09-20 15:33 ` [PATCH 5/7] fs-writeback: make wb_start_writeback() static Jens Axboe
@ 2017-09-20 15:33 ` Jens Axboe
  2017-09-21 15:02   ` Christoph Hellwig
  2017-09-22 13:17   ` Jan Kara
  2017-09-20 15:33 ` [PATCH 7/7] fs-writeback: only allow one inflight and pending full flush Jens Axboe
  6 siblings, 2 replies; 28+ messages in thread
From: Jens Axboe @ 2017-09-20 15:33 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-mm; +Cc: hannes, clm, jack, Jens Axboe

Now that we have no external callers of wb_start_writeback(), we
can shuffle the passing in of 'nr_pages'. Everybody passes in 0
at this point, so just kill the argument and move the dirty
count retrieval to that function.

Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Chris Mason <clm@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/fs-writeback.c | 42 ++++++++++++++++++------------------------
 1 file changed, 18 insertions(+), 24 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index ecbd26d1121d..3916ea2484ae 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -933,8 +933,19 @@ static void bdi_split_work_to_wbs(struct backing_dev_info *bdi,
 
 #endif	/* CONFIG_CGROUP_WRITEBACK */
 
-static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
-			       bool range_cyclic, enum wb_reason reason)
+/*
+ * Add in the number of potentially dirty inodes, because each inode
+ * write can dirty pagecache in the underlying blockdev.
+ */
+static unsigned long get_nr_dirty_pages(void)
+{
+	return global_node_page_state(NR_FILE_DIRTY) +
+		global_node_page_state(NR_UNSTABLE_NFS) +
+		get_nr_dirty_inodes();
+}
+
+static void wb_start_writeback(struct bdi_writeback *wb, bool range_cyclic,
+			       enum wb_reason reason)
 {
 	struct wb_writeback_work *work;
 
@@ -954,7 +965,7 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
 	}
 
 	work->sync_mode	= WB_SYNC_NONE;
-	work->nr_pages	= nr_pages;
+	work->nr_pages	= wb_split_bdi_pages(wb, get_nr_dirty_pages());
 	work->range_cyclic = range_cyclic;
 	work->reason	= reason;
 	work->auto_free	= 1;
@@ -1814,17 +1825,6 @@ static struct wb_writeback_work *get_next_work_item(struct bdi_writeback *wb)
 	return work;
 }
 
-/*
- * Add in the number of potentially dirty inodes, because each inode
- * write can dirty pagecache in the underlying blockdev.
- */
-static unsigned long get_nr_dirty_pages(void)
-{
-	return global_node_page_state(NR_FILE_DIRTY) +
-		global_node_page_state(NR_UNSTABLE_NFS) +
-		get_nr_dirty_inodes();
-}
-
 static long wb_check_background_flush(struct bdi_writeback *wb)
 {
 	if (wb_over_bg_thresh(wb)) {
@@ -1951,7 +1951,7 @@ void wb_workfn(struct work_struct *work)
  * write back the whole world.
  */
 static void __wakeup_flusher_threads_bdi(struct backing_dev_info *bdi,
-					 long nr_pages, enum wb_reason reason)
+					 enum wb_reason reason)
 {
 	struct bdi_writeback *wb;
 
@@ -1959,17 +1959,14 @@ static void __wakeup_flusher_threads_bdi(struct backing_dev_info *bdi,
 		return;
 
 	list_for_each_entry_rcu(wb, &bdi->wb_list, bdi_node)
-		wb_start_writeback(wb, wb_split_bdi_pages(wb, nr_pages),
-					   false, reason);
+		wb_start_writeback(wb, false, reason);
 }
 
 void wakeup_flusher_threads_bdi(struct backing_dev_info *bdi,
 				enum wb_reason reason)
 {
-	long nr_pages = get_nr_dirty_pages();
-
 	rcu_read_lock();
-	__wakeup_flusher_threads_bdi(bdi, nr_pages, reason);
+	__wakeup_flusher_threads_bdi(bdi, reason);
 	rcu_read_unlock();
 }
 
@@ -1979,7 +1976,6 @@ void wakeup_flusher_threads_bdi(struct backing_dev_info *bdi,
 void wakeup_flusher_threads(enum wb_reason reason)
 {
 	struct backing_dev_info *bdi;
-	long nr_pages;
 
 	/*
 	 * If we are expecting writeback progress we must submit plugged IO.
@@ -1987,11 +1983,9 @@ void wakeup_flusher_threads(enum wb_reason reason)
 	if (blk_needs_flush_plug(current))
 		blk_schedule_flush_plug(current);
 
-	nr_pages = get_nr_dirty_pages();
-
 	rcu_read_lock();
 	list_for_each_entry_rcu(bdi, &bdi_list, bdi_list)
-		__wakeup_flusher_threads_bdi(bdi, nr_pages, reason);
+		__wakeup_flusher_threads_bdi(bdi, reason);
 	rcu_read_unlock();
 }
 
-- 
2.7.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 7/7] fs-writeback: only allow one inflight and pending full flush
  2017-09-20 15:32 [PATCH 0/7 v2] More graceful flusher thread memory reclaim wakeup Jens Axboe
                   ` (5 preceding siblings ...)
  2017-09-20 15:33 ` [PATCH 6/7] fs-writeback: move nr_pages == 0 logic to one location Jens Axboe
@ 2017-09-20 15:33 ` Jens Axboe
  2017-09-21 15:05   ` Christoph Hellwig
  6 siblings, 1 reply; 28+ messages in thread
From: Jens Axboe @ 2017-09-20 15:33 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-mm; +Cc: hannes, clm, jack, Jens Axboe

When someone calls wakeup_flusher_threads() or
wakeup_flusher_threads_bdi(), they schedule writeback of all dirty
pages in the system (or on that bdi). If we are tight on memory, we
can get tons of these queued from kswapd/vmscan. This causes (at
least) two problems:

1) We consume a ton of memory just allocating writeback work items.
2) We spend so much time processing these work items, that we
   introduce a softlockup in writeback processing.

Fix this by adding a 'start_all' bit to the writeback structure, and
set that when someone attempts to flush all dirty page.  The bit is
cleared when we start writeback on that work item. If the bit is
already set when we attempt to queue !nr_pages writeback, then we
simply ignore it.

This provides us one full flush in flight, with one pending as well,
and makes for more efficient handling of this type of writeback.

Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Chris Mason <clm@fb.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/fs-writeback.c                | 24 ++++++++++++++++++++++++
 include/linux/backing-dev-defs.h |  1 +
 2 files changed, 25 insertions(+)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 3916ea2484ae..6205319d0c24 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -53,6 +53,7 @@ struct wb_writeback_work {
 	unsigned int for_background:1;
 	unsigned int for_sync:1;	/* sync(2) WB_SYNC_ALL writeback */
 	unsigned int auto_free:1;	/* free on completion */
+	unsigned int start_all:1;	/* nr_pages == 0 (all) writeback */
 	enum wb_reason reason;		/* why was writeback initiated? */
 
 	struct list_head list;		/* pending work list */
@@ -953,12 +954,26 @@ static void wb_start_writeback(struct bdi_writeback *wb, bool range_cyclic,
 		return;
 
 	/*
+	 * All callers of this function want to start writeback of all
+	 * dirty pages. Places like vmscan can call this at a very
+	 * high frequency, causing pointless allocations of tons of
+	 * work items and keeping the flusher threads busy retrieving
+	 * that work. Ensure that we only allow one of them pending and
+	 * inflight at the time
+	 */
+	if (test_bit(WB_start_all, &wb->state))
+		return;
+
+	set_bit(WB_start_all, &wb->state);
+
+	/*
 	 * This is WB_SYNC_NONE writeback, so if allocation fails just
 	 * wakeup the thread for old dirty data writeback
 	 */
 	work = kzalloc(sizeof(*work),
 		       GFP_NOWAIT | __GFP_NOMEMALLOC | __GFP_NOWARN);
 	if (!work) {
+		clear_bit(WB_start_all, &wb->state);
 		trace_writeback_nowork(wb);
 		wb_wakeup(wb);
 		return;
@@ -969,6 +984,7 @@ static void wb_start_writeback(struct bdi_writeback *wb, bool range_cyclic,
 	work->range_cyclic = range_cyclic;
 	work->reason	= reason;
 	work->auto_free	= 1;
+	work->start_all = 1;
 
 	wb_queue_work(wb, work);
 }
@@ -1822,6 +1838,14 @@ static struct wb_writeback_work *get_next_work_item(struct bdi_writeback *wb)
 		list_del_init(&work->list);
 	}
 	spin_unlock_bh(&wb->work_lock);
+
+	/*
+	 * Once we start processing a work item that had !nr_pages,
+	 * clear the wb state bit for that so we can allow more.
+	 */
+	if (work && work->start_all)
+		clear_bit(WB_start_all, &wb->state);
+
 	return work;
 }
 
diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h
index 866c433e7d32..420de5c7c7f9 100644
--- a/include/linux/backing-dev-defs.h
+++ b/include/linux/backing-dev-defs.h
@@ -24,6 +24,7 @@ enum wb_state {
 	WB_shutting_down,	/* wb_shutdown() in progress */
 	WB_writeback_running,	/* Writeback is in progress */
 	WB_has_dirty_io,	/* Dirty inodes on ->b_{dirty|io|more_io} */
+	WB_start_all,		/* nr_pages == 0 (all) work pending */
 };
 
 enum wb_congested_state {
-- 
2.7.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/7] buffer: cleanup free_more_memory() flusher wakeup
  2017-09-20 15:32 ` [PATCH 1/7] buffer: cleanup free_more_memory() flusher wakeup Jens Axboe
@ 2017-09-21 14:55   ` Christoph Hellwig
  0 siblings, 0 replies; 28+ messages in thread
From: Christoph Hellwig @ 2017-09-21 14:55 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm, jack

Looks fine,

Reviewed-by: Christoph Hellwig <hch@lst.de>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/7] fs: kill 'nr_pages' argument from wakeup_flusher_threads()
  2017-09-20 15:32 ` [PATCH 2/7] fs: kill 'nr_pages' argument from wakeup_flusher_threads() Jens Axboe
@ 2017-09-21 14:55   ` Christoph Hellwig
  2017-09-22 13:12   ` Jan Kara
  1 sibling, 0 replies; 28+ messages in thread
From: Christoph Hellwig @ 2017-09-21 14:55 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm, jack

Looks fine,

Reviewed-by: Christoph Hellwig <hch@lst.de>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/7] fs-writeback: provide a wakeup_flusher_threads_bdi()
  2017-09-20 15:32 ` [PATCH 3/7] fs-writeback: provide a wakeup_flusher_threads_bdi() Jens Axboe
@ 2017-09-21 14:56   ` Christoph Hellwig
  0 siblings, 0 replies; 28+ messages in thread
From: Christoph Hellwig @ 2017-09-21 14:56 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm, jack

Looks fine,

Reviewed-by: Christoph Hellwig <hch@lst.de>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] page-writeback: pass in '0' for nr_pages writeback in laptop mode
  2017-09-20 15:32 ` [PATCH 4/7] page-writeback: pass in '0' for nr_pages writeback in laptop mode Jens Axboe
@ 2017-09-21 14:59   ` Christoph Hellwig
  2017-09-25 14:57     ` Jens Axboe
  2017-09-21 15:00   ` Christoph Hellwig
  2017-09-22 13:14   ` Jan Kara
  2 siblings, 1 reply; 28+ messages in thread
From: Christoph Hellwig @ 2017-09-21 14:59 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm, jack

On Wed, Sep 20, 2017 at 09:32:59AM -0600, Jens Axboe wrote:
> Laptop mode really wants to writeback the number of dirty
> pages and inodes. Instead of calculating this in the caller,
> just pass in 0 and let wakeup_flusher_threads() handle it.
> 
> Use the new wakeup_flusher_threads_bdi() instead of rolling
> our own. This changes the writeback to not be range cyclic,
> but that should not matter for laptop mode flush-all
> semantics.

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

While we're at sorting out the laptop_mode_wb_timer mess:
can we move initializing and deleting it from the block code
to the backing-dev code given that it now doesn't assume anything
about block devices any more?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] page-writeback: pass in '0' for nr_pages writeback in laptop mode
  2017-09-20 15:32 ` [PATCH 4/7] page-writeback: pass in '0' for nr_pages writeback in laptop mode Jens Axboe
  2017-09-21 14:59   ` Christoph Hellwig
@ 2017-09-21 15:00   ` Christoph Hellwig
  2017-09-22 13:14   ` Jan Kara
  2 siblings, 0 replies; 28+ messages in thread
From: Christoph Hellwig @ 2017-09-21 15:00 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm, jack

On Wed, Sep 20, 2017 at 09:32:59AM -0600, Jens Axboe wrote:
> Use the new wakeup_flusher_threads_bdi() instead of rolling
> our own. This changes the writeback to not be range cyclic,
> but that should not matter for laptop mode flush-all
> semantics.

Oh btw - I think this actually is the more important change and
should probably be in the subject line.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 5/7] fs-writeback: make wb_start_writeback() static
  2017-09-20 15:33 ` [PATCH 5/7] fs-writeback: make wb_start_writeback() static Jens Axboe
@ 2017-09-21 15:01   ` Christoph Hellwig
  0 siblings, 0 replies; 28+ messages in thread
From: Christoph Hellwig @ 2017-09-21 15:01 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm, jack

Looks fine,

Reviewed-by: Christoph Hellwig <hch@lst.de>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 6/7] fs-writeback: move nr_pages == 0 logic to one location
  2017-09-20 15:33 ` [PATCH 6/7] fs-writeback: move nr_pages == 0 logic to one location Jens Axboe
@ 2017-09-21 15:02   ` Christoph Hellwig
  2017-09-22 13:17   ` Jan Kara
  1 sibling, 0 replies; 28+ messages in thread
From: Christoph Hellwig @ 2017-09-21 15:02 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm, jack

Looks fine,

Reviewed-by: Christoph Hellwig <hch@lst.de>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] fs-writeback: only allow one inflight and pending full flush
  2017-09-20 15:33 ` [PATCH 7/7] fs-writeback: only allow one inflight and pending full flush Jens Axboe
@ 2017-09-21 15:05   ` Christoph Hellwig
  2017-09-21 15:36     ` Jens Axboe
  0 siblings, 1 reply; 28+ messages in thread
From: Christoph Hellwig @ 2017-09-21 15:05 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm, jack

On Wed, Sep 20, 2017 at 09:33:02AM -0600, Jens Axboe wrote:
> When someone calls wakeup_flusher_threads() or
> wakeup_flusher_threads_bdi(), they schedule writeback of all dirty
> pages in the system (or on that bdi). If we are tight on memory, we
> can get tons of these queued from kswapd/vmscan. This causes (at
> least) two problems:
> 
> 1) We consume a ton of memory just allocating writeback work items.
> 2) We spend so much time processing these work items, that we
>    introduce a softlockup in writeback processing.
> 
> Fix this by adding a 'start_all' bit to the writeback structure, and
> set that when someone attempts to flush all dirty page.  The bit is
> cleared when we start writeback on that work item. If the bit is
> already set when we attempt to queue !nr_pages writeback, then we
> simply ignore it.
> 
> This provides us one full flush in flight, with one pending as well,
> and makes for more efficient handling of this type of writeback.
> 
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> Tested-by: Chris Mason <clm@fb.com>
> Reviewed-by: Jan Kara <jack@suse.cz>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>  fs/fs-writeback.c                | 24 ++++++++++++++++++++++++
>  include/linux/backing-dev-defs.h |  1 +
>  2 files changed, 25 insertions(+)
> 
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 3916ea2484ae..6205319d0c24 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -53,6 +53,7 @@ struct wb_writeback_work {
>  	unsigned int for_background:1;
>  	unsigned int for_sync:1;	/* sync(2) WB_SYNC_ALL writeback */
>  	unsigned int auto_free:1;	/* free on completion */
> +	unsigned int start_all:1;	/* nr_pages == 0 (all) writeback */
>  	enum wb_reason reason;		/* why was writeback initiated? */
>  
>  	struct list_head list;		/* pending work list */
> @@ -953,12 +954,26 @@ static void wb_start_writeback(struct bdi_writeback *wb, bool range_cyclic,
>  		return;
>  
>  	/*
> +	 * All callers of this function want to start writeback of all
> +	 * dirty pages. Places like vmscan can call this at a very
> +	 * high frequency, causing pointless allocations of tons of
> +	 * work items and keeping the flusher threads busy retrieving
> +	 * that work. Ensure that we only allow one of them pending and
> +	 * inflight at the time
> +	 */
> +	if (test_bit(WB_start_all, &wb->state))
> +		return;
> +
> +	set_bit(WB_start_all, &wb->state);

This should be test_and_set_bit here..

But more importantly once we are not guaranteed that we only have
a single global wb_writeback_work per bdi_writeback we should just
embedd that into struct bdi_writeback instead of dynamically
allocating it.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] fs-writeback: only allow one inflight and pending full flush
  2017-09-21 15:05   ` Christoph Hellwig
@ 2017-09-21 15:36     ` Jens Axboe
  2017-09-21 16:00       ` Jens Axboe
  0 siblings, 1 reply; 28+ messages in thread
From: Jens Axboe @ 2017-09-21 15:36 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm, jack

On 09/21/2017 09:05 AM, Christoph Hellwig wrote:
> On Wed, Sep 20, 2017 at 09:33:02AM -0600, Jens Axboe wrote:
>> When someone calls wakeup_flusher_threads() or
>> wakeup_flusher_threads_bdi(), they schedule writeback of all dirty
>> pages in the system (or on that bdi). If we are tight on memory, we
>> can get tons of these queued from kswapd/vmscan. This causes (at
>> least) two problems:
>>
>> 1) We consume a ton of memory just allocating writeback work items.
>> 2) We spend so much time processing these work items, that we
>>    introduce a softlockup in writeback processing.
>>
>> Fix this by adding a 'start_all' bit to the writeback structure, and
>> set that when someone attempts to flush all dirty page.  The bit is
>> cleared when we start writeback on that work item. If the bit is
>> already set when we attempt to queue !nr_pages writeback, then we
>> simply ignore it.
>>
>> This provides us one full flush in flight, with one pending as well,
>> and makes for more efficient handling of this type of writeback.
>>
>> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
>> Tested-by: Chris Mason <clm@fb.com>
>> Reviewed-by: Jan Kara <jack@suse.cz>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>> ---
>>  fs/fs-writeback.c                | 24 ++++++++++++++++++++++++
>>  include/linux/backing-dev-defs.h |  1 +
>>  2 files changed, 25 insertions(+)
>>
>> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
>> index 3916ea2484ae..6205319d0c24 100644
>> --- a/fs/fs-writeback.c
>> +++ b/fs/fs-writeback.c
>> @@ -53,6 +53,7 @@ struct wb_writeback_work {
>>  	unsigned int for_background:1;
>>  	unsigned int for_sync:1;	/* sync(2) WB_SYNC_ALL writeback */
>>  	unsigned int auto_free:1;	/* free on completion */
>> +	unsigned int start_all:1;	/* nr_pages == 0 (all) writeback */
>>  	enum wb_reason reason;		/* why was writeback initiated? */
>>  
>>  	struct list_head list;		/* pending work list */
>> @@ -953,12 +954,26 @@ static void wb_start_writeback(struct bdi_writeback *wb, bool range_cyclic,
>>  		return;
>>  
>>  	/*
>> +	 * All callers of this function want to start writeback of all
>> +	 * dirty pages. Places like vmscan can call this at a very
>> +	 * high frequency, causing pointless allocations of tons of
>> +	 * work items and keeping the flusher threads busy retrieving
>> +	 * that work. Ensure that we only allow one of them pending and
>> +	 * inflight at the time
>> +	 */
>> +	if (test_bit(WB_start_all, &wb->state))
>> +		return;
>> +
>> +	set_bit(WB_start_all, &wb->state);
> 
> This should be test_and_set_bit here..

That's on purpose, doesn't matter if we race here, and if we're
being hammered with flusher thread wakeups, then we don't want to
turn that unlocked test into a locked instruction.

> But more importantly once we are not guaranteed that we only have
> a single global wb_writeback_work per bdi_writeback we should just
> embedd that into struct bdi_writeback instead of dynamically
> allocating it.
We could do this as a followup. But right now the logic is that we
can have on started (inflight), and still have one new queued.

-- 
Jens Axboe

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] fs-writeback: only allow one inflight and pending full flush
  2017-09-21 15:36     ` Jens Axboe
@ 2017-09-21 16:00       ` Jens Axboe
  2017-09-21 17:33         ` Christoph Hellwig
  2017-09-25  9:35         ` Jan Kara
  0 siblings, 2 replies; 28+ messages in thread
From: Jens Axboe @ 2017-09-21 16:00 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm, jack

On 09/21/2017 09:36 AM, Jens Axboe wrote:
>> But more importantly once we are not guaranteed that we only have
>> a single global wb_writeback_work per bdi_writeback we should just
>> embedd that into struct bdi_writeback instead of dynamically
>> allocating it.
>
> We could do this as a followup. But right now the logic is that we
> can have on started (inflight), and still have one new queued.

Something like the below would fit on top to do that. Gets rid of the
allocation and embeds the work item for global start-all in the
bdi_writeback structure.

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 6205319d0c24..9f3872e28c3f 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -40,27 +40,6 @@ struct wb_completion {
 };
 
 /*
- * Passed into wb_writeback(), essentially a subset of writeback_control
- */
-struct wb_writeback_work {
-	long nr_pages;
-	struct super_block *sb;
-	unsigned long *older_than_this;
-	enum writeback_sync_modes sync_mode;
-	unsigned int tagged_writepages:1;
-	unsigned int for_kupdate:1;
-	unsigned int range_cyclic:1;
-	unsigned int for_background:1;
-	unsigned int for_sync:1;	/* sync(2) WB_SYNC_ALL writeback */
-	unsigned int auto_free:1;	/* free on completion */
-	unsigned int start_all:1;	/* nr_pages == 0 (all) writeback */
-	enum wb_reason reason;		/* why was writeback initiated? */
-
-	struct list_head list;		/* pending work list */
-	struct wb_completion *done;	/* set if the caller waits */
-};
-
-/*
  * If one wants to wait for one or more wb_writeback_works, each work's
  * ->done should be set to a wb_completion defined using the following
  * macro.  Once all work items are issued with wb_queue_work(), the caller
@@ -181,6 +160,8 @@ static void finish_writeback_work(struct bdi_writeback *wb,
 
 	if (work->auto_free)
 		kfree(work);
+	else if (work->start_all)
+		clear_bit(WB_start_all, &wb->state);
 	if (done && atomic_dec_and_test(&done->cnt))
 		wake_up_all(&wb->bdi->wb_waitq);
 }
@@ -945,8 +926,7 @@ static unsigned long get_nr_dirty_pages(void)
 		get_nr_dirty_inodes();
 }
 
-static void wb_start_writeback(struct bdi_writeback *wb, bool range_cyclic,
-			       enum wb_reason reason)
+static void wb_start_writeback(struct bdi_writeback *wb, enum wb_reason reason)
 {
 	struct wb_writeback_work *work;
 
@@ -961,29 +941,16 @@ static void wb_start_writeback(struct bdi_writeback *wb, bool range_cyclic,
 	 * that work. Ensure that we only allow one of them pending and
 	 * inflight at the time
 	 */
-	if (test_bit(WB_start_all, &wb->state))
-		return;
-
-	set_bit(WB_start_all, &wb->state);
-
-	/*
-	 * This is WB_SYNC_NONE writeback, so if allocation fails just
-	 * wakeup the thread for old dirty data writeback
-	 */
-	work = kzalloc(sizeof(*work),
-		       GFP_NOWAIT | __GFP_NOMEMALLOC | __GFP_NOWARN);
-	if (!work) {
-		clear_bit(WB_start_all, &wb->state);
-		trace_writeback_nowork(wb);
-		wb_wakeup(wb);
+	if (test_bit(WB_start_all, &wb->state) ||
+	    test_and_set_bit(WB_start_all, &wb->state))
 		return;
-	}
 
+	work = &wb->wb_all_work;
+	memset(work, 0, sizeof(*work));
 	work->sync_mode	= WB_SYNC_NONE;
 	work->nr_pages	= wb_split_bdi_pages(wb, get_nr_dirty_pages());
-	work->range_cyclic = range_cyclic;
+	work->range_cyclic = false;
 	work->reason	= reason;
-	work->auto_free	= 1;
 	work->start_all = 1;
 
 	wb_queue_work(wb, work);
@@ -1838,14 +1805,6 @@ static struct wb_writeback_work *get_next_work_item(struct bdi_writeback *wb)
 		list_del_init(&work->list);
 	}
 	spin_unlock_bh(&wb->work_lock);
-
-	/*
-	 * Once we start processing a work item that had !nr_pages,
-	 * clear the wb state bit for that so we can allow more.
-	 */
-	if (work && work->start_all)
-		clear_bit(WB_start_all, &wb->state);
-
 	return work;
 }
 
@@ -1983,7 +1942,7 @@ static void __wakeup_flusher_threads_bdi(struct backing_dev_info *bdi,
 		return;
 
 	list_for_each_entry_rcu(wb, &bdi->wb_list, bdi_node)
-		wb_start_writeback(wb, false, reason);
+		wb_start_writeback(wb, reason);
 }
 
 void wakeup_flusher_threads_bdi(struct backing_dev_info *bdi,
diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h
index 420de5c7c7f9..4e1146ce5584 100644
--- a/include/linux/backing-dev-defs.h
+++ b/include/linux/backing-dev-defs.h
@@ -24,7 +24,7 @@ enum wb_state {
 	WB_shutting_down,	/* wb_shutdown() in progress */
 	WB_writeback_running,	/* Writeback is in progress */
 	WB_has_dirty_io,	/* Dirty inodes on ->b_{dirty|io|more_io} */
-	WB_start_all,		/* nr_pages == 0 (all) work pending */
+	WB_start_all,		/* wb->wb_all_work queued/allocated */
 };
 
 enum wb_congested_state {
@@ -65,6 +65,58 @@ struct bdi_writeback_congested {
 };
 
 /*
+ * fs/fs-writeback.c
+ */
+enum writeback_sync_modes {
+	WB_SYNC_NONE,	/* Don't wait on anything */
+	WB_SYNC_ALL,	/* Wait on every mapping */
+};
+
+/*
+ * why some writeback work was initiated
+ */
+enum wb_reason {
+	WB_REASON_BACKGROUND,
+	WB_REASON_VMSCAN,
+	WB_REASON_SYNC,
+	WB_REASON_PERIODIC,
+	WB_REASON_LAPTOP_TIMER,
+	WB_REASON_FREE_MORE_MEM,
+	WB_REASON_FS_FREE_SPACE,
+	/*
+	 * There is no bdi forker thread any more and works are done
+	 * by emergency worker, however, this is TPs userland visible
+	 * and we'll be exposing exactly the same information,
+	 * so it has a mismatch name.
+	 */
+	WB_REASON_FORKER_THREAD,
+
+	WB_REASON_MAX,
+};
+
+/*
+ * Passed into wb_writeback(), essentially a subset of writeback_control
+ */
+struct wb_completion;
+struct wb_writeback_work {
+	long nr_pages;
+	struct super_block *sb;
+	unsigned long *older_than_this;
+	enum writeback_sync_modes sync_mode;
+	unsigned int tagged_writepages:1;
+	unsigned int for_kupdate:1;
+	unsigned int range_cyclic:1;
+	unsigned int for_background:1;
+	unsigned int for_sync:1;	/* sync(2) WB_SYNC_ALL writeback */
+	unsigned int auto_free:1;	/* free on completion */
+	unsigned int start_all:1;	/* nr_pages == 0 (all) writeback */
+	enum wb_reason reason;		/* why was writeback initiated? */
+
+	struct list_head list;		/* pending work list */
+	struct wb_completion *done;	/* set if the caller waits */
+};
+
+/*
  * Each wb (bdi_writeback) can perform writeback operations, is measured
  * and throttled, independently.  Without cgroup writeback, each bdi
  * (bdi_writeback) is served by its embedded bdi->wb.
@@ -125,6 +177,8 @@ struct bdi_writeback {
 
 	struct list_head bdi_node;	/* anchored at bdi->wb_list */
 
+	struct wb_writeback_work wb_all_work;
+
 #ifdef CONFIG_CGROUP_WRITEBACK
 	struct percpu_ref refcnt;	/* used only for !root wb's */
 	struct fprop_local_percpu memcg_completions;
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index 9c0091678af4..f4b52ab328b2 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -34,36 +34,6 @@ DECLARE_PER_CPU(int, dirty_throttle_leaks);
 struct backing_dev_info;
 
 /*
- * fs/fs-writeback.c
- */
-enum writeback_sync_modes {
-	WB_SYNC_NONE,	/* Don't wait on anything */
-	WB_SYNC_ALL,	/* Wait on every mapping */
-};
-
-/*
- * why some writeback work was initiated
- */
-enum wb_reason {
-	WB_REASON_BACKGROUND,
-	WB_REASON_VMSCAN,
-	WB_REASON_SYNC,
-	WB_REASON_PERIODIC,
-	WB_REASON_LAPTOP_TIMER,
-	WB_REASON_FREE_MORE_MEM,
-	WB_REASON_FS_FREE_SPACE,
-	/*
-	 * There is no bdi forker thread any more and works are done
-	 * by emergency worker, however, this is TPs userland visible
-	 * and we'll be exposing exactly the same information,
-	 * so it has a mismatch name.
-	 */
-	WB_REASON_FORKER_THREAD,
-
-	WB_REASON_MAX,
-};
-
-/*
  * A control structure which tells the writeback code what to do.  These are
  * always on the stack, and hence need no locking.  They are always initialised
  * in a manner such that unspecified fields are set to zero.

-- 
Jens Axboe

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] fs-writeback: only allow one inflight and pending full flush
  2017-09-21 16:00       ` Jens Axboe
@ 2017-09-21 17:33         ` Christoph Hellwig
  2017-09-25  9:35         ` Jan Kara
  1 sibling, 0 replies; 28+ messages in thread
From: Christoph Hellwig @ 2017-09-21 17:33 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Christoph Hellwig, linux-kernel, linux-fsdevel, linux-mm, hannes,
	clm, jack

On Thu, Sep 21, 2017 at 10:00:25AM -0600, Jens Axboe wrote:
> Something like the below would fit on top to do that. Gets rid of the
> allocation and embeds the work item for global start-all in the
> bdi_writeback structure.

Something like that.  Although if we still kalloc the global
wb we wouldn't have to expose all the details in the header.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/7] fs: kill 'nr_pages' argument from wakeup_flusher_threads()
  2017-09-20 15:32 ` [PATCH 2/7] fs: kill 'nr_pages' argument from wakeup_flusher_threads() Jens Axboe
  2017-09-21 14:55   ` Christoph Hellwig
@ 2017-09-22 13:12   ` Jan Kara
  1 sibling, 0 replies; 28+ messages in thread
From: Jan Kara @ 2017-09-22 13:12 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm, jack

On Wed 20-09-17 09:32:57, Jens Axboe wrote:
> Everybody is passing in 0 now, let's get rid of the argument.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Looks good. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

									Honza


> ---
>  fs/buffer.c               | 2 +-
>  fs/fs-writeback.c         | 9 ++++-----
>  fs/sync.c                 | 2 +-
>  include/linux/writeback.h | 2 +-
>  mm/vmscan.c               | 2 +-
>  5 files changed, 8 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/buffer.c b/fs/buffer.c
> index 9471a445e370..cf71926797d3 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -260,7 +260,7 @@ static void free_more_memory(void)
>  	struct zoneref *z;
>  	int nid;
>  
> -	wakeup_flusher_threads(0, WB_REASON_FREE_MORE_MEM);
> +	wakeup_flusher_threads(WB_REASON_FREE_MORE_MEM);
>  	yield();
>  
>  	for_each_online_node(nid) {
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 245c430a2e41..bb6148dc6d24 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -1947,12 +1947,12 @@ void wb_workfn(struct work_struct *work)
>  }
>  
>  /*
> - * Start writeback of `nr_pages' pages.  If `nr_pages' is zero, write back
> - * the whole world.
> + * Wakeup the flusher threads to start writeback of all currently dirty pages
>   */
> -void wakeup_flusher_threads(long nr_pages, enum wb_reason reason)
> +void wakeup_flusher_threads(enum wb_reason reason)
>  {
>  	struct backing_dev_info *bdi;
> +	long nr_pages;
>  
>  	/*
>  	 * If we are expecting writeback progress we must submit plugged IO.
> @@ -1960,8 +1960,7 @@ void wakeup_flusher_threads(long nr_pages, enum wb_reason reason)
>  	if (blk_needs_flush_plug(current))
>  		blk_schedule_flush_plug(current);
>  
> -	if (!nr_pages)
> -		nr_pages = get_nr_dirty_pages();
> +	nr_pages = get_nr_dirty_pages();
>  
>  	rcu_read_lock();
>  	list_for_each_entry_rcu(bdi, &bdi_list, bdi_list) {
> diff --git a/fs/sync.c b/fs/sync.c
> index a576aa2e6b09..09f96a18dd93 100644
> --- a/fs/sync.c
> +++ b/fs/sync.c
> @@ -108,7 +108,7 @@ SYSCALL_DEFINE0(sync)
>  {
>  	int nowait = 0, wait = 1;
>  
> -	wakeup_flusher_threads(0, WB_REASON_SYNC);
> +	wakeup_flusher_threads(WB_REASON_SYNC);
>  	iterate_supers(sync_inodes_one_sb, NULL);
>  	iterate_supers(sync_fs_one_sb, &nowait);
>  	iterate_supers(sync_fs_one_sb, &wait);
> diff --git a/include/linux/writeback.h b/include/linux/writeback.h
> index d5815794416c..1f9c6db5e29a 100644
> --- a/include/linux/writeback.h
> +++ b/include/linux/writeback.h
> @@ -189,7 +189,7 @@ bool try_to_writeback_inodes_sb(struct super_block *, enum wb_reason reason);
>  bool try_to_writeback_inodes_sb_nr(struct super_block *, unsigned long nr,
>  				   enum wb_reason reason);
>  void sync_inodes_sb(struct super_block *);
> -void wakeup_flusher_threads(long nr_pages, enum wb_reason reason);
> +void wakeup_flusher_threads(enum wb_reason reason);
>  void inode_wait_for_writeback(struct inode *inode);
>  
>  /* writeback.h requires fs.h; it, too, is not included from here. */
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 13d711dd8776..42a7fdd52d87 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1867,7 +1867,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
>  		 * also allow kswapd to start writing pages during reclaim.
>  		 */
>  		if (stat.nr_unqueued_dirty == nr_taken) {
> -			wakeup_flusher_threads(0, WB_REASON_VMSCAN);
> +			wakeup_flusher_threads(WB_REASON_VMSCAN);
>  			set_bit(PGDAT_DIRTY, &pgdat->flags);
>  		}
>  
> -- 
> 2.7.4
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] page-writeback: pass in '0' for nr_pages writeback in laptop mode
  2017-09-20 15:32 ` [PATCH 4/7] page-writeback: pass in '0' for nr_pages writeback in laptop mode Jens Axboe
  2017-09-21 14:59   ` Christoph Hellwig
  2017-09-21 15:00   ` Christoph Hellwig
@ 2017-09-22 13:14   ` Jan Kara
  2 siblings, 0 replies; 28+ messages in thread
From: Jan Kara @ 2017-09-22 13:14 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm, jack

On Wed 20-09-17 09:32:59, Jens Axboe wrote:
> Laptop mode really wants to writeback the number of dirty
> pages and inodes. Instead of calculating this in the caller,
> just pass in 0 and let wakeup_flusher_threads() handle it.
> 
> Use the new wakeup_flusher_threads_bdi() instead of rolling
> our own. This changes the writeback to not be range cyclic,
> but that should not matter for laptop mode flush-all
> semantics.
> 
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> Tested-by: Chris Mason <clm@fb.com>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> Reviewed-by: Jan Kara <jack@suse.cz>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

The subject of the patch looks stale.

									Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 6/7] fs-writeback: move nr_pages == 0 logic to one location
  2017-09-20 15:33 ` [PATCH 6/7] fs-writeback: move nr_pages == 0 logic to one location Jens Axboe
  2017-09-21 15:02   ` Christoph Hellwig
@ 2017-09-22 13:17   ` Jan Kara
  1 sibling, 0 replies; 28+ messages in thread
From: Jan Kara @ 2017-09-22 13:17 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm, jack

On Wed 20-09-17 09:33:01, Jens Axboe wrote:
> Now that we have no external callers of wb_start_writeback(), we
> can shuffle the passing in of 'nr_pages'. Everybody passes in 0
> at this point, so just kill the argument and move the dirty
> count retrieval to that function.
> 
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> Tested-by: Chris Mason <clm@fb.com>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Looks good. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

										Honza

> ---
>  fs/fs-writeback.c | 42 ++++++++++++++++++------------------------
>  1 file changed, 18 insertions(+), 24 deletions(-)
> 
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index ecbd26d1121d..3916ea2484ae 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -933,8 +933,19 @@ static void bdi_split_work_to_wbs(struct backing_dev_info *bdi,
>  
>  #endif	/* CONFIG_CGROUP_WRITEBACK */
>  
> -static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
> -			       bool range_cyclic, enum wb_reason reason)
> +/*
> + * Add in the number of potentially dirty inodes, because each inode
> + * write can dirty pagecache in the underlying blockdev.
> + */
> +static unsigned long get_nr_dirty_pages(void)
> +{
> +	return global_node_page_state(NR_FILE_DIRTY) +
> +		global_node_page_state(NR_UNSTABLE_NFS) +
> +		get_nr_dirty_inodes();
> +}
> +
> +static void wb_start_writeback(struct bdi_writeback *wb, bool range_cyclic,
> +			       enum wb_reason reason)
>  {
>  	struct wb_writeback_work *work;
>  
> @@ -954,7 +965,7 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
>  	}
>  
>  	work->sync_mode	= WB_SYNC_NONE;
> -	work->nr_pages	= nr_pages;
> +	work->nr_pages	= wb_split_bdi_pages(wb, get_nr_dirty_pages());
>  	work->range_cyclic = range_cyclic;
>  	work->reason	= reason;
>  	work->auto_free	= 1;
> @@ -1814,17 +1825,6 @@ static struct wb_writeback_work *get_next_work_item(struct bdi_writeback *wb)
>  	return work;
>  }
>  
> -/*
> - * Add in the number of potentially dirty inodes, because each inode
> - * write can dirty pagecache in the underlying blockdev.
> - */
> -static unsigned long get_nr_dirty_pages(void)
> -{
> -	return global_node_page_state(NR_FILE_DIRTY) +
> -		global_node_page_state(NR_UNSTABLE_NFS) +
> -		get_nr_dirty_inodes();
> -}
> -
>  static long wb_check_background_flush(struct bdi_writeback *wb)
>  {
>  	if (wb_over_bg_thresh(wb)) {
> @@ -1951,7 +1951,7 @@ void wb_workfn(struct work_struct *work)
>   * write back the whole world.
>   */
>  static void __wakeup_flusher_threads_bdi(struct backing_dev_info *bdi,
> -					 long nr_pages, enum wb_reason reason)
> +					 enum wb_reason reason)
>  {
>  	struct bdi_writeback *wb;
>  
> @@ -1959,17 +1959,14 @@ static void __wakeup_flusher_threads_bdi(struct backing_dev_info *bdi,
>  		return;
>  
>  	list_for_each_entry_rcu(wb, &bdi->wb_list, bdi_node)
> -		wb_start_writeback(wb, wb_split_bdi_pages(wb, nr_pages),
> -					   false, reason);
> +		wb_start_writeback(wb, false, reason);
>  }
>  
>  void wakeup_flusher_threads_bdi(struct backing_dev_info *bdi,
>  				enum wb_reason reason)
>  {
> -	long nr_pages = get_nr_dirty_pages();
> -
>  	rcu_read_lock();
> -	__wakeup_flusher_threads_bdi(bdi, nr_pages, reason);
> +	__wakeup_flusher_threads_bdi(bdi, reason);
>  	rcu_read_unlock();
>  }
>  
> @@ -1979,7 +1976,6 @@ void wakeup_flusher_threads_bdi(struct backing_dev_info *bdi,
>  void wakeup_flusher_threads(enum wb_reason reason)
>  {
>  	struct backing_dev_info *bdi;
> -	long nr_pages;
>  
>  	/*
>  	 * If we are expecting writeback progress we must submit plugged IO.
> @@ -1987,11 +1983,9 @@ void wakeup_flusher_threads(enum wb_reason reason)
>  	if (blk_needs_flush_plug(current))
>  		blk_schedule_flush_plug(current);
>  
> -	nr_pages = get_nr_dirty_pages();
> -
>  	rcu_read_lock();
>  	list_for_each_entry_rcu(bdi, &bdi_list, bdi_list)
> -		__wakeup_flusher_threads_bdi(bdi, nr_pages, reason);
> +		__wakeup_flusher_threads_bdi(bdi, reason);
>  	rcu_read_unlock();
>  }
>  
> -- 
> 2.7.4
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] fs-writeback: only allow one inflight and pending full flush
  2017-09-21 16:00       ` Jens Axboe
  2017-09-21 17:33         ` Christoph Hellwig
@ 2017-09-25  9:35         ` Jan Kara
  2017-09-25 14:48           ` Jens Axboe
  2017-09-28 18:09           ` Jens Axboe
  1 sibling, 2 replies; 28+ messages in thread
From: Jan Kara @ 2017-09-25  9:35 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Christoph Hellwig, linux-kernel, linux-fsdevel, linux-mm, hannes,
	clm, jack

On Thu 21-09-17 10:00:25, Jens Axboe wrote:
> On 09/21/2017 09:36 AM, Jens Axboe wrote:
> >> But more importantly once we are not guaranteed that we only have
> >> a single global wb_writeback_work per bdi_writeback we should just
> >> embedd that into struct bdi_writeback instead of dynamically
> >> allocating it.
> >
> > We could do this as a followup. But right now the logic is that we
> > can have on started (inflight), and still have one new queued.
> 
> Something like the below would fit on top to do that. Gets rid of the
> allocation and embeds the work item for global start-all in the
> bdi_writeback structure.

Hum, so when we consider stuff like embedded work item, I would somewhat
prefer to handle this like we do for for_background and for_kupdate style
writeback so that we don't have another special case. For these don't queue
any item, we just queue writeback work into the workqueue (via
wb_wakeup()). When flusher work gets processed wb_do_writeback() checks
(after processing all normal writeback requests) whether conditions for
these special writeback styles are met and if yes, it creates on-stack work
item and processes it (see wb_check_old_data_flush() and
wb_check_background_flush()).

So in this case we would just set some flag in bdi_writeback when memory
reclaim needs help and wb_do_writeback() would check for this flag and
create and process writeback-all style writeback work. Granted this does
not preserve ordering of requests (basically any specific request gets
priority over writeback-whole-world request) but memory gets cleaned in
either case so flusher should be doing what is needed.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] fs-writeback: only allow one inflight and pending full flush
  2017-09-25  9:35         ` Jan Kara
@ 2017-09-25 14:48           ` Jens Axboe
  2017-09-28 18:09           ` Jens Axboe
  1 sibling, 0 replies; 28+ messages in thread
From: Jens Axboe @ 2017-09-25 14:48 UTC (permalink / raw)
  To: Jan Kara
  Cc: Christoph Hellwig, linux-kernel, linux-fsdevel, linux-mm, hannes, clm

On 09/25/2017 03:35 AM, Jan Kara wrote:
> On Thu 21-09-17 10:00:25, Jens Axboe wrote:
>> On 09/21/2017 09:36 AM, Jens Axboe wrote:
>>>> But more importantly once we are not guaranteed that we only have
>>>> a single global wb_writeback_work per bdi_writeback we should just
>>>> embedd that into struct bdi_writeback instead of dynamically
>>>> allocating it.
>>>
>>> We could do this as a followup. But right now the logic is that we
>>> can have on started (inflight), and still have one new queued.
>>
>> Something like the below would fit on top to do that. Gets rid of the
>> allocation and embeds the work item for global start-all in the
>> bdi_writeback structure.
> 
> Hum, so when we consider stuff like embedded work item, I would somewhat
> prefer to handle this like we do for for_background and for_kupdate style
> writeback so that we don't have another special case. For these don't queue
> any item, we just queue writeback work into the workqueue (via
> wb_wakeup()). When flusher work gets processed wb_do_writeback() checks
> (after processing all normal writeback requests) whether conditions for
> these special writeback styles are met and if yes, it creates on-stack work
> item and processes it (see wb_check_old_data_flush() and
> wb_check_background_flush()).

Thanks Jan, I think that's a really good suggestion and kills the
special case completely. I'll rework the patch as a small series
for 4.15.

> So in this case we would just set some flag in bdi_writeback when memory
> reclaim needs help and wb_do_writeback() would check for this flag and
> create and process writeback-all style writeback work. Granted this does
> not preserve ordering of requests (basically any specific request gets
> priority over writeback-whole-world request) but memory gets cleaned in
> either case so flusher should be doing what is needed.

I don't think that matters, and we're already mostly there since we
reject a request if one is pending. And at this point they are all
identical "start all writeback" requests.

-- 
Jens Axboe

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] page-writeback: pass in '0' for nr_pages writeback in laptop mode
  2017-09-21 14:59   ` Christoph Hellwig
@ 2017-09-25 14:57     ` Jens Axboe
  0 siblings, 0 replies; 28+ messages in thread
From: Jens Axboe @ 2017-09-25 14:57 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm, jack

On 09/21/2017 08:59 AM, Christoph Hellwig wrote:
> On Wed, Sep 20, 2017 at 09:32:59AM -0600, Jens Axboe wrote:
>> Laptop mode really wants to writeback the number of dirty
>> pages and inodes. Instead of calculating this in the caller,
>> just pass in 0 and let wakeup_flusher_threads() handle it.
>>
>> Use the new wakeup_flusher_threads_bdi() instead of rolling
>> our own. This changes the writeback to not be range cyclic,
>> but that should not matter for laptop mode flush-all
>> semantics.
> 
> Looks good,
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> 
> While we're at sorting out the laptop_mode_wb_timer mess:
> can we move initializing and deleting it from the block code
> to the backing-dev code given that it now doesn't assume anything
> about block devices any more?

Good point, I'll include that in a followup for 4.15.

-- 
Jens Axboe

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] fs-writeback: only allow one inflight and pending full flush
  2017-09-25  9:35         ` Jan Kara
  2017-09-25 14:48           ` Jens Axboe
@ 2017-09-28 18:09           ` Jens Axboe
  2017-09-29 23:20             ` Jens Axboe
  2017-10-02 14:53             ` Jan Kara
  1 sibling, 2 replies; 28+ messages in thread
From: Jens Axboe @ 2017-09-28 18:09 UTC (permalink / raw)
  To: Jan Kara
  Cc: Christoph Hellwig, linux-kernel, linux-fsdevel, linux-mm, hannes, clm

On 09/25/2017 11:35 AM, Jan Kara wrote:
> On Thu 21-09-17 10:00:25, Jens Axboe wrote:
>> On 09/21/2017 09:36 AM, Jens Axboe wrote:
>>>> But more importantly once we are not guaranteed that we only have
>>>> a single global wb_writeback_work per bdi_writeback we should just
>>>> embedd that into struct bdi_writeback instead of dynamically
>>>> allocating it.
>>>
>>> We could do this as a followup. But right now the logic is that we
>>> can have on started (inflight), and still have one new queued.
>>
>> Something like the below would fit on top to do that. Gets rid of the
>> allocation and embeds the work item for global start-all in the
>> bdi_writeback structure.
> 
> Hum, so when we consider stuff like embedded work item, I would somewhat
> prefer to handle this like we do for for_background and for_kupdate style
> writeback so that we don't have another special case. For these don't queue
> any item, we just queue writeback work into the workqueue (via
> wb_wakeup()). When flusher work gets processed wb_do_writeback() checks
> (after processing all normal writeback requests) whether conditions for
> these special writeback styles are met and if yes, it creates on-stack work
> item and processes it (see wb_check_old_data_flush() and
> wb_check_background_flush()).
> 
> So in this case we would just set some flag in bdi_writeback when memory
> reclaim needs help and wb_do_writeback() would check for this flag and
> create and process writeback-all style writeback work. Granted this does
> not preserve ordering of requests (basically any specific request gets
> priority over writeback-whole-world request) but memory gets cleaned in
> either case so flusher should be doing what is needed.

How about something like the below? It's on top of the latest series,
which is in my wb-start-all branch. It handles start_all like the
background/kupdate style writeback, reusing the WB_start_all bit for
that.

On a plane, so untested, but it seems pretty straight forward. It
changes the logic a little bit, as the WB_start_all bit isn't cleared
until after we're done with a flush-all request. At this point it's
truly on inflight at any point in time, not one inflight and one
potentially queued.

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 399619c97567..9e24d604c59c 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -53,7 +53,6 @@ struct wb_writeback_work {
 	unsigned int for_background:1;
 	unsigned int for_sync:1;	/* sync(2) WB_SYNC_ALL writeback */
 	unsigned int auto_free:1;	/* free on completion */
-	unsigned int start_all:1;	/* nr_pages == 0 (all) writeback */
 	enum wb_reason reason;		/* why was writeback initiated? */
 
 	struct list_head list;		/* pending work list */
@@ -947,8 +946,6 @@ static unsigned long get_nr_dirty_pages(void)
 
 static void wb_start_writeback(struct bdi_writeback *wb, enum wb_reason reason)
 {
-	struct wb_writeback_work *work;
-
 	if (!wb_has_dirty_io(wb))
 		return;
 
@@ -958,35 +955,14 @@ static void wb_start_writeback(struct bdi_writeback *wb, enum wb_reason reason)
 	 * high frequency, causing pointless allocations of tons of
 	 * work items and keeping the flusher threads busy retrieving
 	 * that work. Ensure that we only allow one of them pending and
-	 * inflight at the time. It doesn't matter if we race a little
-	 * bit on this, so use the faster separate test/set bit variants.
+	 * inflight at the time.
 	 */
-	if (test_bit(WB_start_all, &wb->state))
+	if (test_bit(WB_start_all, &wb->state) ||
+	    test_and_set_bit(WB_start_all, &wb->state))
 		return;
 
-	set_bit(WB_start_all, &wb->state);
-
-	/*
-	 * This is WB_SYNC_NONE writeback, so if allocation fails just
-	 * wakeup the thread for old dirty data writeback
-	 */
-	work = kzalloc(sizeof(*work),
-		       GFP_NOWAIT | __GFP_NOMEMALLOC | __GFP_NOWARN);
-	if (!work) {
-		clear_bit(WB_start_all, &wb->state);
-		trace_writeback_nowork(wb);
-		wb_wakeup(wb);
-		return;
-	}
-
-	work->sync_mode	= WB_SYNC_NONE;
-	work->nr_pages	= wb_split_bdi_pages(wb, get_nr_dirty_pages());
-	work->range_cyclic = 1;
-	work->reason	= reason;
-	work->auto_free	= 1;
-	work->start_all = 1;
-
-	wb_queue_work(wb, work);
+	wb->start_all_reason = reason;
+	wb_wakeup(wb);
 }
 
 /**
@@ -1838,14 +1814,6 @@ static struct wb_writeback_work *get_next_work_item(struct bdi_writeback *wb)
 		list_del_init(&work->list);
 	}
 	spin_unlock_bh(&wb->work_lock);
-
-	/*
-	 * Once we start processing a work item that had !nr_pages,
-	 * clear the wb state bit for that so we can allow more.
-	 */
-	if (work && work->start_all)
-		clear_bit(WB_start_all, &wb->state);
-
 	return work;
 }
 
@@ -1901,6 +1869,30 @@ static long wb_check_old_data_flush(struct bdi_writeback *wb)
 	return 0;
 }
 
+static long wb_check_start_all(struct bdi_writeback *wb)
+{
+	long nr_pages;
+
+	if (!test_bit(WB_start_all, &wb->state))
+		return 0;
+
+	nr_pages = get_nr_dirty_pages();
+	if (nr_pages) {
+		struct wb_writeback_work work = {
+			.nr_pages	= wb_split_bdi_pages(wb, nr_pages),
+			.sync_mode	= WB_SYNC_NONE,
+			.range_cyclic	= 1,
+			.reason		= wb->start_all_reason,
+		};
+
+		nr_pages = wb_writeback(wb, &work);
+	}
+
+	clear_bit(WB_start_all, &wb->state);
+	return nr_pages;
+}
+
+
 /*
  * Retrieve work items and do the writeback they describe
  */
@@ -1917,6 +1909,11 @@ static long wb_do_writeback(struct bdi_writeback *wb)
 	}
 
 	/*
+	 * Check for a flush-everything request
+	 */
+	wrote += wb_check_start_all(wb);
+
+	/*
 	 * Check for periodic writeback, kupdated() style
 	 */
 	wrote += wb_check_old_data_flush(wb);
diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h
index 420de5c7c7f9..f0f1df29d6b8 100644
--- a/include/linux/backing-dev-defs.h
+++ b/include/linux/backing-dev-defs.h
@@ -116,6 +116,7 @@ struct bdi_writeback {
 
 	struct fprop_local_percpu completions;
 	int dirty_exceeded;
+	int start_all_reason;
 
 	spinlock_t work_lock;		/* protects work_list & dwork scheduling */
 	struct list_head work_list;
diff --git a/include/trace/events/writeback.h b/include/trace/events/writeback.h
index 9b57f014d79d..19a0ea08e098 100644
--- a/include/trace/events/writeback.h
+++ b/include/trace/events/writeback.h
@@ -286,7 +286,6 @@ DEFINE_EVENT(writeback_class, name, \
 	TP_PROTO(struct bdi_writeback *wb), \
 	TP_ARGS(wb))
 
-DEFINE_WRITEBACK_EVENT(writeback_nowork);
 DEFINE_WRITEBACK_EVENT(writeback_wake_background);
 
 TRACE_EVENT(writeback_bdi_register,

-- 
Jens Axboe

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] fs-writeback: only allow one inflight and pending full flush
  2017-09-28 18:09           ` Jens Axboe
@ 2017-09-29 23:20             ` Jens Axboe
  2017-10-02 14:53             ` Jan Kara
  1 sibling, 0 replies; 28+ messages in thread
From: Jens Axboe @ 2017-09-29 23:20 UTC (permalink / raw)
  To: Jan Kara
  Cc: Christoph Hellwig, linux-kernel, linux-fsdevel, linux-mm, hannes, clm

On 09/28/2017 08:09 PM, Jens Axboe wrote:
> On 09/25/2017 11:35 AM, Jan Kara wrote:
>> On Thu 21-09-17 10:00:25, Jens Axboe wrote:
>>> On 09/21/2017 09:36 AM, Jens Axboe wrote:
>>>>> But more importantly once we are not guaranteed that we only have
>>>>> a single global wb_writeback_work per bdi_writeback we should just
>>>>> embedd that into struct bdi_writeback instead of dynamically
>>>>> allocating it.
>>>>
>>>> We could do this as a followup. But right now the logic is that we
>>>> can have on started (inflight), and still have one new queued.
>>>
>>> Something like the below would fit on top to do that. Gets rid of the
>>> allocation and embeds the work item for global start-all in the
>>> bdi_writeback structure.
>>
>> Hum, so when we consider stuff like embedded work item, I would somewhat
>> prefer to handle this like we do for for_background and for_kupdate style
>> writeback so that we don't have another special case. For these don't queue
>> any item, we just queue writeback work into the workqueue (via
>> wb_wakeup()). When flusher work gets processed wb_do_writeback() checks
>> (after processing all normal writeback requests) whether conditions for
>> these special writeback styles are met and if yes, it creates on-stack work
>> item and processes it (see wb_check_old_data_flush() and
>> wb_check_background_flush()).
>>
>> So in this case we would just set some flag in bdi_writeback when memory
>> reclaim needs help and wb_do_writeback() would check for this flag and
>> create and process writeback-all style writeback work. Granted this does
>> not preserve ordering of requests (basically any specific request gets
>> priority over writeback-whole-world request) but memory gets cleaned in
>> either case so flusher should be doing what is needed.
> 
> How about something like the below? It's on top of the latest series,
> which is in my wb-start-all branch. It handles start_all like the
> background/kupdate style writeback, reusing the WB_start_all bit for
> that.
> 
> On a plane, so untested, but it seems pretty straight forward. It
> changes the logic a little bit, as the WB_start_all bit isn't cleared
> until after we're done with a flush-all request. At this point it's
> truly on inflight at any point in time, not one inflight and one
> potentially queued.

I tested it, with adding a patch that actually enables laptop completion
triggers on blk-mq (not there before, an oversight, will send that out
separately). It works fine for me, verified with tracing that we do
trigger flushes with completions from laptop mode.

-- 
Jens Axboe

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] fs-writeback: only allow one inflight and pending full flush
  2017-09-28 18:09           ` Jens Axboe
  2017-09-29 23:20             ` Jens Axboe
@ 2017-10-02 14:53             ` Jan Kara
  1 sibling, 0 replies; 28+ messages in thread
From: Jan Kara @ 2017-10-02 14:53 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Jan Kara, Christoph Hellwig, linux-kernel, linux-fsdevel,
	linux-mm, hannes, clm

On Thu 28-09-17 20:09:50, Jens Axboe wrote:
> On 09/25/2017 11:35 AM, Jan Kara wrote:
> > On Thu 21-09-17 10:00:25, Jens Axboe wrote:
> >> On 09/21/2017 09:36 AM, Jens Axboe wrote:
> >>>> But more importantly once we are not guaranteed that we only have
> >>>> a single global wb_writeback_work per bdi_writeback we should just
> >>>> embedd that into struct bdi_writeback instead of dynamically
> >>>> allocating it.
> >>>
> >>> We could do this as a followup. But right now the logic is that we
> >>> can have on started (inflight), and still have one new queued.
> >>
> >> Something like the below would fit on top to do that. Gets rid of the
> >> allocation and embeds the work item for global start-all in the
> >> bdi_writeback structure.
> > 
> > Hum, so when we consider stuff like embedded work item, I would somewhat
> > prefer to handle this like we do for for_background and for_kupdate style
> > writeback so that we don't have another special case. For these don't queue
> > any item, we just queue writeback work into the workqueue (via
> > wb_wakeup()). When flusher work gets processed wb_do_writeback() checks
> > (after processing all normal writeback requests) whether conditions for
> > these special writeback styles are met and if yes, it creates on-stack work
> > item and processes it (see wb_check_old_data_flush() and
> > wb_check_background_flush()).
> > 
> > So in this case we would just set some flag in bdi_writeback when memory
> > reclaim needs help and wb_do_writeback() would check for this flag and
> > create and process writeback-all style writeback work. Granted this does
> > not preserve ordering of requests (basically any specific request gets
> > priority over writeback-whole-world request) but memory gets cleaned in
> > either case so flusher should be doing what is needed.
> 
> How about something like the below? It's on top of the latest series,
> which is in my wb-start-all branch. It handles start_all like the
> background/kupdate style writeback, reusing the WB_start_all bit for
> that.
> 
> On a plane, so untested, but it seems pretty straight forward. It
> changes the logic a little bit, as the WB_start_all bit isn't cleared
> until after we're done with a flush-all request. At this point it's
> truly on inflight at any point in time, not one inflight and one
> potentially queued.

Thanks for looking into this! I like the change.

								Honza

> 
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 399619c97567..9e24d604c59c 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -53,7 +53,6 @@ struct wb_writeback_work {
>  	unsigned int for_background:1;
>  	unsigned int for_sync:1;	/* sync(2) WB_SYNC_ALL writeback */
>  	unsigned int auto_free:1;	/* free on completion */
> -	unsigned int start_all:1;	/* nr_pages == 0 (all) writeback */
>  	enum wb_reason reason;		/* why was writeback initiated? */
>  
>  	struct list_head list;		/* pending work list */
> @@ -947,8 +946,6 @@ static unsigned long get_nr_dirty_pages(void)
>  
>  static void wb_start_writeback(struct bdi_writeback *wb, enum wb_reason reason)
>  {
> -	struct wb_writeback_work *work;
> -
>  	if (!wb_has_dirty_io(wb))
>  		return;
>  
> @@ -958,35 +955,14 @@ static void wb_start_writeback(struct bdi_writeback *wb, enum wb_reason reason)
>  	 * high frequency, causing pointless allocations of tons of
>  	 * work items and keeping the flusher threads busy retrieving
>  	 * that work. Ensure that we only allow one of them pending and
> -	 * inflight at the time. It doesn't matter if we race a little
> -	 * bit on this, so use the faster separate test/set bit variants.
> +	 * inflight at the time.
>  	 */
> -	if (test_bit(WB_start_all, &wb->state))
> +	if (test_bit(WB_start_all, &wb->state) ||
> +	    test_and_set_bit(WB_start_all, &wb->state))
>  		return;
>  
> -	set_bit(WB_start_all, &wb->state);
> -
> -	/*
> -	 * This is WB_SYNC_NONE writeback, so if allocation fails just
> -	 * wakeup the thread for old dirty data writeback
> -	 */
> -	work = kzalloc(sizeof(*work),
> -		       GFP_NOWAIT | __GFP_NOMEMALLOC | __GFP_NOWARN);
> -	if (!work) {
> -		clear_bit(WB_start_all, &wb->state);
> -		trace_writeback_nowork(wb);
> -		wb_wakeup(wb);
> -		return;
> -	}
> -
> -	work->sync_mode	= WB_SYNC_NONE;
> -	work->nr_pages	= wb_split_bdi_pages(wb, get_nr_dirty_pages());
> -	work->range_cyclic = 1;
> -	work->reason	= reason;
> -	work->auto_free	= 1;
> -	work->start_all = 1;
> -
> -	wb_queue_work(wb, work);
> +	wb->start_all_reason = reason;
> +	wb_wakeup(wb);
>  }
>  
>  /**
> @@ -1838,14 +1814,6 @@ static struct wb_writeback_work *get_next_work_item(struct bdi_writeback *wb)
>  		list_del_init(&work->list);
>  	}
>  	spin_unlock_bh(&wb->work_lock);
> -
> -	/*
> -	 * Once we start processing a work item that had !nr_pages,
> -	 * clear the wb state bit for that so we can allow more.
> -	 */
> -	if (work && work->start_all)
> -		clear_bit(WB_start_all, &wb->state);
> -
>  	return work;
>  }
>  
> @@ -1901,6 +1869,30 @@ static long wb_check_old_data_flush(struct bdi_writeback *wb)
>  	return 0;
>  }
>  
> +static long wb_check_start_all(struct bdi_writeback *wb)
> +{
> +	long nr_pages;
> +
> +	if (!test_bit(WB_start_all, &wb->state))
> +		return 0;
> +
> +	nr_pages = get_nr_dirty_pages();
> +	if (nr_pages) {
> +		struct wb_writeback_work work = {
> +			.nr_pages	= wb_split_bdi_pages(wb, nr_pages),
> +			.sync_mode	= WB_SYNC_NONE,
> +			.range_cyclic	= 1,
> +			.reason		= wb->start_all_reason,
> +		};
> +
> +		nr_pages = wb_writeback(wb, &work);
> +	}
> +
> +	clear_bit(WB_start_all, &wb->state);
> +	return nr_pages;
> +}
> +
> +
>  /*
>   * Retrieve work items and do the writeback they describe
>   */
> @@ -1917,6 +1909,11 @@ static long wb_do_writeback(struct bdi_writeback *wb)
>  	}
>  
>  	/*
> +	 * Check for a flush-everything request
> +	 */
> +	wrote += wb_check_start_all(wb);
> +
> +	/*
>  	 * Check for periodic writeback, kupdated() style
>  	 */
>  	wrote += wb_check_old_data_flush(wb);
> diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h
> index 420de5c7c7f9..f0f1df29d6b8 100644
> --- a/include/linux/backing-dev-defs.h
> +++ b/include/linux/backing-dev-defs.h
> @@ -116,6 +116,7 @@ struct bdi_writeback {
>  
>  	struct fprop_local_percpu completions;
>  	int dirty_exceeded;
> +	int start_all_reason;
>  
>  	spinlock_t work_lock;		/* protects work_list & dwork scheduling */
>  	struct list_head work_list;
> diff --git a/include/trace/events/writeback.h b/include/trace/events/writeback.h
> index 9b57f014d79d..19a0ea08e098 100644
> --- a/include/trace/events/writeback.h
> +++ b/include/trace/events/writeback.h
> @@ -286,7 +286,6 @@ DEFINE_EVENT(writeback_class, name, \
>  	TP_PROTO(struct bdi_writeback *wb), \
>  	TP_ARGS(wb))
>  
> -DEFINE_WRITEBACK_EVENT(writeback_nowork);
>  DEFINE_WRITEBACK_EVENT(writeback_wake_background);
>  
>  TRACE_EVENT(writeback_bdi_register,
> 
> -- 
> Jens Axboe
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2017-10-02 14:53 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-20 15:32 [PATCH 0/7 v2] More graceful flusher thread memory reclaim wakeup Jens Axboe
2017-09-20 15:32 ` [PATCH 1/7] buffer: cleanup free_more_memory() flusher wakeup Jens Axboe
2017-09-21 14:55   ` Christoph Hellwig
2017-09-20 15:32 ` [PATCH 2/7] fs: kill 'nr_pages' argument from wakeup_flusher_threads() Jens Axboe
2017-09-21 14:55   ` Christoph Hellwig
2017-09-22 13:12   ` Jan Kara
2017-09-20 15:32 ` [PATCH 3/7] fs-writeback: provide a wakeup_flusher_threads_bdi() Jens Axboe
2017-09-21 14:56   ` Christoph Hellwig
2017-09-20 15:32 ` [PATCH 4/7] page-writeback: pass in '0' for nr_pages writeback in laptop mode Jens Axboe
2017-09-21 14:59   ` Christoph Hellwig
2017-09-25 14:57     ` Jens Axboe
2017-09-21 15:00   ` Christoph Hellwig
2017-09-22 13:14   ` Jan Kara
2017-09-20 15:33 ` [PATCH 5/7] fs-writeback: make wb_start_writeback() static Jens Axboe
2017-09-21 15:01   ` Christoph Hellwig
2017-09-20 15:33 ` [PATCH 6/7] fs-writeback: move nr_pages == 0 logic to one location Jens Axboe
2017-09-21 15:02   ` Christoph Hellwig
2017-09-22 13:17   ` Jan Kara
2017-09-20 15:33 ` [PATCH 7/7] fs-writeback: only allow one inflight and pending full flush Jens Axboe
2017-09-21 15:05   ` Christoph Hellwig
2017-09-21 15:36     ` Jens Axboe
2017-09-21 16:00       ` Jens Axboe
2017-09-21 17:33         ` Christoph Hellwig
2017-09-25  9:35         ` Jan Kara
2017-09-25 14:48           ` Jens Axboe
2017-09-28 18:09           ` Jens Axboe
2017-09-29 23:20             ` Jens Axboe
2017-10-02 14:53             ` Jan Kara

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).