[PATCH 0/6] More graceful flusher thread memory reclaim wakeup

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH 0/6] More graceful flusher thread memory reclaim wakeup
@ 2017-09-19 19:53 ` Jens Axboe
  0 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-19 19:53 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-mm; +Cc: hannes, clm, jack

We've had some issues with writeback in presence of memory reclaim
at Facebook, and this patch set attempts to fix it up. The real
functional change is the last patch in the series, the first 5 are
prep and cleanup patches.

The basic idea is that we have callers that call
wakeup_flusher_threads() with nr_pages == 0. This means 'writeback
everything'. For memory reclaim situations, we can end up queuing
a TON of these kinds of writeback units. This can cause softlockups
and further memory issues, since we allocate huge amounts of
struct wb_writeback_work to handle this writeback. Handle this
situation more gracefully.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 66+ messages in thread

* [PATCH 0/6] More graceful flusher thread memory reclaim wakeup
@ 2017-09-19 19:53 ` Jens Axboe
  0 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-19 19:53 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-mm; +Cc: hannes, clm, jack

We've had some issues with writeback in presence of memory reclaim
at Facebook, and this patch set attempts to fix it up. The real
functional change is the last patch in the series, the first 5 are
prep and cleanup patches.

The basic idea is that we have callers that call
wakeup_flusher_threads() with nr_pages == 0. This means 'writeback
everything'. For memory reclaim situations, we can end up queuing
a TON of these kinds of writeback units. This can cause softlockups
and further memory issues, since we allocate huge amounts of
struct wb_writeback_work to handle this writeback. Handle this
situation more gracefully.

-- 
Jens Axboe

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* [PATCH 1/6] buffer: cleanup free_more_memory() flusher wakeup
  2017-09-19 19:53 ` Jens Axboe
@ 2017-09-19 19:53   ` Jens Axboe
  -1 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-19 19:53 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-mm; +Cc: hannes, clm, jack, Jens Axboe

This whole function is... interesting. Change the wakeup call
to the flusher threads to pass in nr_pages == 0, instead of
some random number of pages. This matches more closely what
similar cases do for memory shortage/reclaim.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/buffer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 170df856bdb9..9471a445e370 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -260,7 +260,7 @@ static void free_more_memory(void)
 	struct zoneref *z;
 	int nid;
 
-	wakeup_flusher_threads(1024, WB_REASON_FREE_MORE_MEM);
+	wakeup_flusher_threads(0, WB_REASON_FREE_MORE_MEM);
 	yield();
 
 	for_each_online_node(nid) {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH 1/6] buffer: cleanup free_more_memory() flusher wakeup
@ 2017-09-19 19:53   ` Jens Axboe
  0 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-19 19:53 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-mm; +Cc: hannes, clm, jack, Jens Axboe

This whole function is... interesting. Change the wakeup call
to the flusher threads to pass in nr_pages == 0, instead of
some random number of pages. This matches more closely what
similar cases do for memory shortage/reclaim.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/buffer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 170df856bdb9..9471a445e370 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -260,7 +260,7 @@ static void free_more_memory(void)
 	struct zoneref *z;
 	int nid;
 
-	wakeup_flusher_threads(1024, WB_REASON_FREE_MORE_MEM);
+	wakeup_flusher_threads(0, WB_REASON_FREE_MORE_MEM);
 	yield();
 
 	for_each_online_node(nid) {
-- 
2.7.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH 2/6] fs-writeback: provide a wakeup_flusher_threads_bdi()
  2017-09-19 19:53 ` Jens Axboe
@ 2017-09-19 19:53   ` Jens Axboe
  -1 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-19 19:53 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-mm; +Cc: hannes, clm, jack, Jens Axboe

Similar to wakeup_flusher_threads(), except that we only wake
up the flusher threads on the specified backing device.

No functional changes in this patch.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/fs-writeback.c         | 40 ++++++++++++++++++++++++++++++----------
 include/linux/writeback.h |  2 ++
 2 files changed, 32 insertions(+), 10 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 245c430a2e41..03fda0830bf8 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1947,6 +1947,34 @@ void wb_workfn(struct work_struct *work)
 }
 
 /*
+ * Start writeback of `nr_pages' pages on this bdi. If `nr_pages' is zero,
+ * write back the whole world.
+ */
+static void __wakeup_flusher_threads_bdi(struct backing_dev_info *bdi,
+					 long nr_pages, enum wb_reason reason)
+{
+	struct bdi_writeback *wb;
+
+	if (!bdi_has_dirty_io(bdi))
+		return;
+
+	list_for_each_entry_rcu(wb, &bdi->wb_list, bdi_node)
+		wb_start_writeback(wb, wb_split_bdi_pages(wb, nr_pages),
+					   false, reason);
+}
+
+void wakeup_flusher_threads_bdi(struct backing_dev_info *bdi, long nr_pages,
+				enum wb_reason reason)
+{
+	if (!nr_pages)
+		nr_pages = get_nr_dirty_pages();
+
+	rcu_read_lock();
+	__wakeup_flusher_threads_bdi(bdi, nr_pages, reason);
+	rcu_read_unlock();
+}
+
+/*
  * Start writeback of `nr_pages' pages.  If `nr_pages' is zero, write back
  * the whole world.
  */
@@ -1964,16 +1992,8 @@ void wakeup_flusher_threads(long nr_pages, enum wb_reason reason)
 		nr_pages = get_nr_dirty_pages();
 
 	rcu_read_lock();
-	list_for_each_entry_rcu(bdi, &bdi_list, bdi_list) {
-		struct bdi_writeback *wb;
-
-		if (!bdi_has_dirty_io(bdi))
-			continue;
-
-		list_for_each_entry_rcu(wb, &bdi->wb_list, bdi_node)
-			wb_start_writeback(wb, wb_split_bdi_pages(wb, nr_pages),
-					   false, reason);
-	}
+	list_for_each_entry_rcu(bdi, &bdi_list, bdi_list)
+		__wakeup_flusher_threads_bdi(bdi, nr_pages, reason);
 	rcu_read_unlock();
 }
 
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index d5815794416c..5a7ed74d1f6f 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -190,6 +190,8 @@ bool try_to_writeback_inodes_sb_nr(struct super_block *, unsigned long nr,
 				   enum wb_reason reason);
 void sync_inodes_sb(struct super_block *);
 void wakeup_flusher_threads(long nr_pages, enum wb_reason reason);
+void wakeup_flusher_threads_bdi(struct backing_dev_info *bdi, long nr_pages,
+				enum wb_reason reason);
 void inode_wait_for_writeback(struct inode *inode);
 
 /* writeback.h requires fs.h; it, too, is not included from here. */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH 2/6] fs-writeback: provide a wakeup_flusher_threads_bdi()
@ 2017-09-19 19:53   ` Jens Axboe
  0 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-19 19:53 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-mm; +Cc: hannes, clm, jack, Jens Axboe

Similar to wakeup_flusher_threads(), except that we only wake
up the flusher threads on the specified backing device.

No functional changes in this patch.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/fs-writeback.c         | 40 ++++++++++++++++++++++++++++++----------
 include/linux/writeback.h |  2 ++
 2 files changed, 32 insertions(+), 10 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 245c430a2e41..03fda0830bf8 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1947,6 +1947,34 @@ void wb_workfn(struct work_struct *work)
 }
 
 /*
+ * Start writeback of `nr_pages' pages on this bdi. If `nr_pages' is zero,
+ * write back the whole world.
+ */
+static void __wakeup_flusher_threads_bdi(struct backing_dev_info *bdi,
+					 long nr_pages, enum wb_reason reason)
+{
+	struct bdi_writeback *wb;
+
+	if (!bdi_has_dirty_io(bdi))
+		return;
+
+	list_for_each_entry_rcu(wb, &bdi->wb_list, bdi_node)
+		wb_start_writeback(wb, wb_split_bdi_pages(wb, nr_pages),
+					   false, reason);
+}
+
+void wakeup_flusher_threads_bdi(struct backing_dev_info *bdi, long nr_pages,
+				enum wb_reason reason)
+{
+	if (!nr_pages)
+		nr_pages = get_nr_dirty_pages();
+
+	rcu_read_lock();
+	__wakeup_flusher_threads_bdi(bdi, nr_pages, reason);
+	rcu_read_unlock();
+}
+
+/*
  * Start writeback of `nr_pages' pages.  If `nr_pages' is zero, write back
  * the whole world.
  */
@@ -1964,16 +1992,8 @@ void wakeup_flusher_threads(long nr_pages, enum wb_reason reason)
 		nr_pages = get_nr_dirty_pages();
 
 	rcu_read_lock();
-	list_for_each_entry_rcu(bdi, &bdi_list, bdi_list) {
-		struct bdi_writeback *wb;
-
-		if (!bdi_has_dirty_io(bdi))
-			continue;
-
-		list_for_each_entry_rcu(wb, &bdi->wb_list, bdi_node)
-			wb_start_writeback(wb, wb_split_bdi_pages(wb, nr_pages),
-					   false, reason);
-	}
+	list_for_each_entry_rcu(bdi, &bdi_list, bdi_list)
+		__wakeup_flusher_threads_bdi(bdi, nr_pages, reason);
 	rcu_read_unlock();
 }
 
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index d5815794416c..5a7ed74d1f6f 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -190,6 +190,8 @@ bool try_to_writeback_inodes_sb_nr(struct super_block *, unsigned long nr,
 				   enum wb_reason reason);
 void sync_inodes_sb(struct super_block *);
 void wakeup_flusher_threads(long nr_pages, enum wb_reason reason);
+void wakeup_flusher_threads_bdi(struct backing_dev_info *bdi, long nr_pages,
+				enum wb_reason reason);
 void inode_wait_for_writeback(struct inode *inode);
 
 /* writeback.h requires fs.h; it, too, is not included from here. */
-- 
2.7.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH 3/6] page-writeback: pass in '0' for nr_pages writeback in laptop mode
  2017-09-19 19:53 ` Jens Axboe
@ 2017-09-19 19:53   ` Jens Axboe
  -1 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-19 19:53 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-mm; +Cc: hannes, clm, jack, Jens Axboe

Laptop mode really wants to writeback the number of dirty
pages and inodes. Instead of calculating this in the caller,
just pass in 0 and let wakeup_flusher_threads() handle it.

Use the new wakeup_flusher_threads_bdi() instead of rolling
our own.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 mm/page-writeback.c | 18 ++----------------
 1 file changed, 2 insertions(+), 16 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 0b9c5cbe8eba..1933778c52c4 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1980,23 +1980,9 @@ int dirty_writeback_centisecs_handler(struct ctl_table *table, int write,
 void laptop_mode_timer_fn(unsigned long data)
 {
 	struct request_queue *q = (struct request_queue *)data;
-	int nr_pages = global_node_page_state(NR_FILE_DIRTY) +
-		global_node_page_state(NR_UNSTABLE_NFS);
-	struct bdi_writeback *wb;
 
-	/*
-	 * We want to write everything out, not just down to the dirty
-	 * threshold
-	 */
-	if (!bdi_has_dirty_io(q->backing_dev_info))
-		return;
-
-	rcu_read_lock();
-	list_for_each_entry_rcu(wb, &q->backing_dev_info->wb_list, bdi_node)
-		if (wb_has_dirty_io(wb))
-			wb_start_writeback(wb, nr_pages, true,
-					   WB_REASON_LAPTOP_TIMER);
-	rcu_read_unlock();
+	wakeup_flusher_threads_bdi(q->backing_dev_info, 0,
+					WB_REASON_LAPTOP_TIMER);
 }
 
 /*
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH 3/6] page-writeback: pass in '0' for nr_pages writeback in laptop mode
@ 2017-09-19 19:53   ` Jens Axboe
  0 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-19 19:53 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-mm; +Cc: hannes, clm, jack, Jens Axboe

Laptop mode really wants to writeback the number of dirty
pages and inodes. Instead of calculating this in the caller,
just pass in 0 and let wakeup_flusher_threads() handle it.

Use the new wakeup_flusher_threads_bdi() instead of rolling
our own.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 mm/page-writeback.c | 18 ++----------------
 1 file changed, 2 insertions(+), 16 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 0b9c5cbe8eba..1933778c52c4 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1980,23 +1980,9 @@ int dirty_writeback_centisecs_handler(struct ctl_table *table, int write,
 void laptop_mode_timer_fn(unsigned long data)
 {
 	struct request_queue *q = (struct request_queue *)data;
-	int nr_pages = global_node_page_state(NR_FILE_DIRTY) +
-		global_node_page_state(NR_UNSTABLE_NFS);
-	struct bdi_writeback *wb;
 
-	/*
-	 * We want to write everything out, not just down to the dirty
-	 * threshold
-	 */
-	if (!bdi_has_dirty_io(q->backing_dev_info))
-		return;
-
-	rcu_read_lock();
-	list_for_each_entry_rcu(wb, &q->backing_dev_info->wb_list, bdi_node)
-		if (wb_has_dirty_io(wb))
-			wb_start_writeback(wb, nr_pages, true,
-					   WB_REASON_LAPTOP_TIMER);
-	rcu_read_unlock();
+	wakeup_flusher_threads_bdi(q->backing_dev_info, 0,
+					WB_REASON_LAPTOP_TIMER);
 }
 
 /*
-- 
2.7.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH 4/6] fs-writeback: make wb_start_writeback() static
  2017-09-19 19:53 ` Jens Axboe
@ 2017-09-19 19:53   ` Jens Axboe
  -1 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-19 19:53 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-mm; +Cc: hannes, clm, jack, Jens Axboe

We don't have any callers outside of fs-writeback.c anymore,
make it private.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/fs-writeback.c           | 4 ++--
 include/linux/backing-dev.h | 2 --
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 03fda0830bf8..7564347914f8 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -933,8 +933,8 @@ static void bdi_split_work_to_wbs(struct backing_dev_info *bdi,
 
 #endif	/* CONFIG_CGROUP_WRITEBACK */
 
-void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
-			bool range_cyclic, enum wb_reason reason)
+static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
+			       bool range_cyclic, enum wb_reason reason)
 {
 	struct wb_writeback_work *work;
 
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index 854e1bdd0b2a..157e950a70dc 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -38,8 +38,6 @@ static inline struct backing_dev_info *bdi_alloc(gfp_t gfp_mask)
 	return bdi_alloc_node(gfp_mask, NUMA_NO_NODE);
 }
 
-void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
-			bool range_cyclic, enum wb_reason reason);
 void wb_start_background_writeback(struct bdi_writeback *wb);
 void wb_workfn(struct work_struct *work);
 void wb_wakeup_delayed(struct bdi_writeback *wb);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH 4/6] fs-writeback: make wb_start_writeback() static
@ 2017-09-19 19:53   ` Jens Axboe
  0 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-19 19:53 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-mm; +Cc: hannes, clm, jack, Jens Axboe

We don't have any callers outside of fs-writeback.c anymore,
make it private.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/fs-writeback.c           | 4 ++--
 include/linux/backing-dev.h | 2 --
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 03fda0830bf8..7564347914f8 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -933,8 +933,8 @@ static void bdi_split_work_to_wbs(struct backing_dev_info *bdi,
 
 #endif	/* CONFIG_CGROUP_WRITEBACK */
 
-void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
-			bool range_cyclic, enum wb_reason reason)
+static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
+			       bool range_cyclic, enum wb_reason reason)
 {
 	struct wb_writeback_work *work;
 
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index 854e1bdd0b2a..157e950a70dc 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -38,8 +38,6 @@ static inline struct backing_dev_info *bdi_alloc(gfp_t gfp_mask)
 	return bdi_alloc_node(gfp_mask, NUMA_NO_NODE);
 }
 
-void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
-			bool range_cyclic, enum wb_reason reason);
 void wb_start_background_writeback(struct bdi_writeback *wb);
 void wb_workfn(struct work_struct *work);
 void wb_wakeup_delayed(struct bdi_writeback *wb);
-- 
2.7.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH 5/6] fs-writeback: move nr_pages == 0 logic to one location
  2017-09-19 19:53 ` Jens Axboe
@ 2017-09-19 19:53   ` Jens Axboe
  -1 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-19 19:53 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-mm; +Cc: hannes, clm, jack, Jens Axboe

Now that we have no external callers of wb_start_writeback(),
we can move the nr_pages == 0 logic into that function.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/fs-writeback.c | 34 +++++++++++++++++-----------------
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 7564347914f8..a9a86644cb9f 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -933,6 +933,17 @@ static void bdi_split_work_to_wbs(struct backing_dev_info *bdi,
 
 #endif	/* CONFIG_CGROUP_WRITEBACK */
 
+/*
+ * Add in the number of potentially dirty inodes, because each inode
+ * write can dirty pagecache in the underlying blockdev.
+ */
+static unsigned long get_nr_dirty_pages(void)
+{
+	return global_node_page_state(NR_FILE_DIRTY) +
+		global_node_page_state(NR_UNSTABLE_NFS) +
+		get_nr_dirty_inodes();
+}
+
 static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
 			       bool range_cyclic, enum wb_reason reason)
 {
@@ -942,6 +953,12 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
 		return;
 
 	/*
+	 * If someone asked for zero pages, we write out the WORLD
+	 */
+	if (!nr_pages)
+		nr_pages = get_nr_dirty_pages();
+
+	/*
 	 * This is WB_SYNC_NONE writeback, so if allocation fails just
 	 * wakeup the thread for old dirty data writeback
 	 */
@@ -1814,17 +1831,6 @@ static struct wb_writeback_work *get_next_work_item(struct bdi_writeback *wb)
 	return work;
 }
 
-/*
- * Add in the number of potentially dirty inodes, because each inode
- * write can dirty pagecache in the underlying blockdev.
- */
-static unsigned long get_nr_dirty_pages(void)
-{
-	return global_node_page_state(NR_FILE_DIRTY) +
-		global_node_page_state(NR_UNSTABLE_NFS) +
-		get_nr_dirty_inodes();
-}
-
 static long wb_check_background_flush(struct bdi_writeback *wb)
 {
 	if (wb_over_bg_thresh(wb)) {
@@ -1966,9 +1972,6 @@ static void __wakeup_flusher_threads_bdi(struct backing_dev_info *bdi,
 void wakeup_flusher_threads_bdi(struct backing_dev_info *bdi, long nr_pages,
 				enum wb_reason reason)
 {
-	if (!nr_pages)
-		nr_pages = get_nr_dirty_pages();
-
 	rcu_read_lock();
 	__wakeup_flusher_threads_bdi(bdi, nr_pages, reason);
 	rcu_read_unlock();
@@ -1988,9 +1991,6 @@ void wakeup_flusher_threads(long nr_pages, enum wb_reason reason)
 	if (blk_needs_flush_plug(current))
 		blk_schedule_flush_plug(current);
 
-	if (!nr_pages)
-		nr_pages = get_nr_dirty_pages();
-
 	rcu_read_lock();
 	list_for_each_entry_rcu(bdi, &bdi_list, bdi_list)
 		__wakeup_flusher_threads_bdi(bdi, nr_pages, reason);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH 5/6] fs-writeback: move nr_pages == 0 logic to one location
@ 2017-09-19 19:53   ` Jens Axboe
  0 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-19 19:53 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-mm; +Cc: hannes, clm, jack, Jens Axboe

Now that we have no external callers of wb_start_writeback(),
we can move the nr_pages == 0 logic into that function.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/fs-writeback.c | 34 +++++++++++++++++-----------------
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 7564347914f8..a9a86644cb9f 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -933,6 +933,17 @@ static void bdi_split_work_to_wbs(struct backing_dev_info *bdi,
 
 #endif	/* CONFIG_CGROUP_WRITEBACK */
 
+/*
+ * Add in the number of potentially dirty inodes, because each inode
+ * write can dirty pagecache in the underlying blockdev.
+ */
+static unsigned long get_nr_dirty_pages(void)
+{
+	return global_node_page_state(NR_FILE_DIRTY) +
+		global_node_page_state(NR_UNSTABLE_NFS) +
+		get_nr_dirty_inodes();
+}
+
 static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
 			       bool range_cyclic, enum wb_reason reason)
 {
@@ -942,6 +953,12 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
 		return;
 
 	/*
+	 * If someone asked for zero pages, we write out the WORLD
+	 */
+	if (!nr_pages)
+		nr_pages = get_nr_dirty_pages();
+
+	/*
 	 * This is WB_SYNC_NONE writeback, so if allocation fails just
 	 * wakeup the thread for old dirty data writeback
 	 */
@@ -1814,17 +1831,6 @@ static struct wb_writeback_work *get_next_work_item(struct bdi_writeback *wb)
 	return work;
 }
 
-/*
- * Add in the number of potentially dirty inodes, because each inode
- * write can dirty pagecache in the underlying blockdev.
- */
-static unsigned long get_nr_dirty_pages(void)
-{
-	return global_node_page_state(NR_FILE_DIRTY) +
-		global_node_page_state(NR_UNSTABLE_NFS) +
-		get_nr_dirty_inodes();
-}
-
 static long wb_check_background_flush(struct bdi_writeback *wb)
 {
 	if (wb_over_bg_thresh(wb)) {
@@ -1966,9 +1972,6 @@ static void __wakeup_flusher_threads_bdi(struct backing_dev_info *bdi,
 void wakeup_flusher_threads_bdi(struct backing_dev_info *bdi, long nr_pages,
 				enum wb_reason reason)
 {
-	if (!nr_pages)
-		nr_pages = get_nr_dirty_pages();
-
 	rcu_read_lock();
 	__wakeup_flusher_threads_bdi(bdi, nr_pages, reason);
 	rcu_read_unlock();
@@ -1988,9 +1991,6 @@ void wakeup_flusher_threads(long nr_pages, enum wb_reason reason)
 	if (blk_needs_flush_plug(current))
 		blk_schedule_flush_plug(current);
 
-	if (!nr_pages)
-		nr_pages = get_nr_dirty_pages();
-
 	rcu_read_lock();
 	list_for_each_entry_rcu(bdi, &bdi_list, bdi_list)
 		__wakeup_flusher_threads_bdi(bdi, nr_pages, reason);
-- 
2.7.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH 6/6] fs-writeback: only allow one inflight and pending !nr_pages flush
  2017-09-19 19:53 ` Jens Axboe
@ 2017-09-19 19:53   ` Jens Axboe
  -1 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-19 19:53 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-mm; +Cc: hannes, clm, jack, Jens Axboe

A few callers pass in nr_pages == 0 when they wakeup the flusher
threads, which means that the flusher should just flush everything
that was currently dirty. If we are tight on memory, we can get
tons of these queued from kswapd/vmscan. This causes (at least)
two problems:

1) We consume a ton of memory just allocating writeback work items.
2) We spend so much time processing these work items, that we
   introduce a softlockup in writeback processing.

Fix this by adding a 'zero_pages' bit to the writeback structure,
and set that when someone queues a nr_pages==0 flusher thread
wakeup. The bit is cleared when we start writeback on that work
item. If the bit is already set when we attempt to queue !nr_pages
writeback, then we simply ignore it.

This provides us one of full flush in flight, with one pending as
well, and makes for more efficient handling of this type of
writeback.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/fs-writeback.c                | 30 ++++++++++++++++++++++++++++--
 include/linux/backing-dev-defs.h |  1 +
 2 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index a9a86644cb9f..e0240110b36f 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -53,6 +53,7 @@ struct wb_writeback_work {
 	unsigned int for_background:1;
 	unsigned int for_sync:1;	/* sync(2) WB_SYNC_ALL writeback */
 	unsigned int auto_free:1;	/* free on completion */
+	unsigned int zero_pages:1;	/* nr_pages == 0 writeback */
 	enum wb_reason reason;		/* why was writeback initiated? */
 
 	struct list_head list;		/* pending work list */
@@ -948,15 +949,25 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
 			       bool range_cyclic, enum wb_reason reason)
 {
 	struct wb_writeback_work *work;
+	bool zero_pages = false;
 
 	if (!wb_has_dirty_io(wb))
 		return;
 
 	/*
-	 * If someone asked for zero pages, we write out the WORLD
+	 * If someone asked for zero pages, we write out the WORLD.
+	 * Places like vmscan and laptop mode want to queue a wakeup to
+	 * the flusher threads to clean out everything. To avoid potentially
+	 * having tons of these pending, ensure that we only allow one of
+	 * them pending and inflight at the time
 	 */
-	if (!nr_pages)
+	if (!nr_pages) {
+		if (test_bit(WB_zero_pages, &wb->state))
+			return;
+		set_bit(WB_zero_pages, &wb->state);
 		nr_pages = get_nr_dirty_pages();
+		zero_pages = true;
+	}
 
 	/*
 	 * This is WB_SYNC_NONE writeback, so if allocation fails just
@@ -975,6 +986,7 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
 	work->range_cyclic = range_cyclic;
 	work->reason	= reason;
 	work->auto_free	= 1;
+	work->zero_pages = zero_pages;
 
 	wb_queue_work(wb, work);
 }
@@ -1828,6 +1840,14 @@ static struct wb_writeback_work *get_next_work_item(struct bdi_writeback *wb)
 		list_del_init(&work->list);
 	}
 	spin_unlock_bh(&wb->work_lock);
+
+	/*
+	 * Once we start processing a work item that had !nr_pages,
+	 * clear the wb state bit for that so we can allow more.
+	 */
+	if (work && work->zero_pages && test_bit(WB_zero_pages, &wb->state))
+		clear_bit(WB_zero_pages, &wb->state);
+
 	return work;
 }
 
@@ -1896,6 +1916,12 @@ static long wb_do_writeback(struct bdi_writeback *wb)
 		trace_writeback_exec(wb, work);
 		wrote += wb_writeback(wb, work);
 		finish_writeback_work(wb, work);
+
+		/*
+		 * If we have a lot of pending work, make sure we take
+		 * an occasional breather, if needed.
+		 */
+		cond_resched();
 	}
 
 	/*
diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h
index 866c433e7d32..7494f6a75458 100644
--- a/include/linux/backing-dev-defs.h
+++ b/include/linux/backing-dev-defs.h
@@ -24,6 +24,7 @@ enum wb_state {
 	WB_shutting_down,	/* wb_shutdown() in progress */
 	WB_writeback_running,	/* Writeback is in progress */
 	WB_has_dirty_io,	/* Dirty inodes on ->b_{dirty|io|more_io} */
+	WB_zero_pages,		/* nr_pages == 0 flush pending */
 };
 
 enum wb_congested_state {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH 6/6] fs-writeback: only allow one inflight and pending !nr_pages flush
@ 2017-09-19 19:53   ` Jens Axboe
  0 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-19 19:53 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-mm; +Cc: hannes, clm, jack, Jens Axboe

A few callers pass in nr_pages == 0 when they wakeup the flusher
threads, which means that the flusher should just flush everything
that was currently dirty. If we are tight on memory, we can get
tons of these queued from kswapd/vmscan. This causes (at least)
two problems:

1) We consume a ton of memory just allocating writeback work items.
2) We spend so much time processing these work items, that we
   introduce a softlockup in writeback processing.

Fix this by adding a 'zero_pages' bit to the writeback structure,
and set that when someone queues a nr_pages==0 flusher thread
wakeup. The bit is cleared when we start writeback on that work
item. If the bit is already set when we attempt to queue !nr_pages
writeback, then we simply ignore it.

This provides us one of full flush in flight, with one pending as
well, and makes for more efficient handling of this type of
writeback.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/fs-writeback.c                | 30 ++++++++++++++++++++++++++++--
 include/linux/backing-dev-defs.h |  1 +
 2 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index a9a86644cb9f..e0240110b36f 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -53,6 +53,7 @@ struct wb_writeback_work {
 	unsigned int for_background:1;
 	unsigned int for_sync:1;	/* sync(2) WB_SYNC_ALL writeback */
 	unsigned int auto_free:1;	/* free on completion */
+	unsigned int zero_pages:1;	/* nr_pages == 0 writeback */
 	enum wb_reason reason;		/* why was writeback initiated? */
 
 	struct list_head list;		/* pending work list */
@@ -948,15 +949,25 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
 			       bool range_cyclic, enum wb_reason reason)
 {
 	struct wb_writeback_work *work;
+	bool zero_pages = false;
 
 	if (!wb_has_dirty_io(wb))
 		return;
 
 	/*
-	 * If someone asked for zero pages, we write out the WORLD
+	 * If someone asked for zero pages, we write out the WORLD.
+	 * Places like vmscan and laptop mode want to queue a wakeup to
+	 * the flusher threads to clean out everything. To avoid potentially
+	 * having tons of these pending, ensure that we only allow one of
+	 * them pending and inflight at the time
 	 */
-	if (!nr_pages)
+	if (!nr_pages) {
+		if (test_bit(WB_zero_pages, &wb->state))
+			return;
+		set_bit(WB_zero_pages, &wb->state);
 		nr_pages = get_nr_dirty_pages();
+		zero_pages = true;
+	}
 
 	/*
 	 * This is WB_SYNC_NONE writeback, so if allocation fails just
@@ -975,6 +986,7 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
 	work->range_cyclic = range_cyclic;
 	work->reason	= reason;
 	work->auto_free	= 1;
+	work->zero_pages = zero_pages;
 
 	wb_queue_work(wb, work);
 }
@@ -1828,6 +1840,14 @@ static struct wb_writeback_work *get_next_work_item(struct bdi_writeback *wb)
 		list_del_init(&work->list);
 	}
 	spin_unlock_bh(&wb->work_lock);
+
+	/*
+	 * Once we start processing a work item that had !nr_pages,
+	 * clear the wb state bit for that so we can allow more.
+	 */
+	if (work && work->zero_pages && test_bit(WB_zero_pages, &wb->state))
+		clear_bit(WB_zero_pages, &wb->state);
+
 	return work;
 }
 
@@ -1896,6 +1916,12 @@ static long wb_do_writeback(struct bdi_writeback *wb)
 		trace_writeback_exec(wb, work);
 		wrote += wb_writeback(wb, work);
 		finish_writeback_work(wb, work);
+
+		/*
+		 * If we have a lot of pending work, make sure we take
+		 * an occasional breather, if needed.
+		 */
+		cond_resched();
 	}
 
 	/*
diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h
index 866c433e7d32..7494f6a75458 100644
--- a/include/linux/backing-dev-defs.h
+++ b/include/linux/backing-dev-defs.h
@@ -24,6 +24,7 @@ enum wb_state {
 	WB_shutting_down,	/* wb_shutdown() in progress */
 	WB_writeback_running,	/* Writeback is in progress */
 	WB_has_dirty_io,	/* Dirty inodes on ->b_{dirty|io|more_io} */
+	WB_zero_pages,		/* nr_pages == 0 flush pending */
 };
 
 enum wb_congested_state {
-- 
2.7.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 66+ messages in thread

* Re: [PATCH 1/6] buffer: cleanup free_more_memory() flusher wakeup
  2017-09-19 19:53   ` Jens Axboe
@ 2017-09-19 20:05     ` Johannes Weiner
  -1 siblings, 0 replies; 66+ messages in thread
From: Johannes Weiner @ 2017-09-19 20:05 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, clm, jack

On Tue, Sep 19, 2017 at 01:53:02PM -0600, Jens Axboe wrote:
> This whole function is... interesting. Change the wakeup call
> to the flusher threads to pass in nr_pages == 0, instead of
> some random number of pages. This matches more closely what
> similar cases do for memory shortage/reclaim.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 1/6] buffer: cleanup free_more_memory() flusher wakeup
@ 2017-09-19 20:05     ` Johannes Weiner
  0 siblings, 0 replies; 66+ messages in thread
From: Johannes Weiner @ 2017-09-19 20:05 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, clm, jack

On Tue, Sep 19, 2017 at 01:53:02PM -0600, Jens Axboe wrote:
> This whole function is... interesting. Change the wakeup call
> to the flusher threads to pass in nr_pages == 0, instead of
> some random number of pages. This matches more closely what
> similar cases do for memory shortage/reclaim.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/6] fs-writeback: provide a wakeup_flusher_threads_bdi()
  2017-09-19 19:53   ` Jens Axboe
@ 2017-09-19 20:05     ` Johannes Weiner
  -1 siblings, 0 replies; 66+ messages in thread
From: Johannes Weiner @ 2017-09-19 20:05 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, clm, jack

On Tue, Sep 19, 2017 at 01:53:03PM -0600, Jens Axboe wrote:
> Similar to wakeup_flusher_threads(), except that we only wake
> up the flusher threads on the specified backing device.
> 
> No functional changes in this patch.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/6] fs-writeback: provide a wakeup_flusher_threads_bdi()
@ 2017-09-19 20:05     ` Johannes Weiner
  0 siblings, 0 replies; 66+ messages in thread
From: Johannes Weiner @ 2017-09-19 20:05 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, clm, jack

On Tue, Sep 19, 2017 at 01:53:03PM -0600, Jens Axboe wrote:
> Similar to wakeup_flusher_threads(), except that we only wake
> up the flusher threads on the specified backing device.
> 
> No functional changes in this patch.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 3/6] page-writeback: pass in '0' for nr_pages writeback in laptop mode
  2017-09-19 19:53   ` Jens Axboe
@ 2017-09-19 20:06     ` Johannes Weiner
  -1 siblings, 0 replies; 66+ messages in thread
From: Johannes Weiner @ 2017-09-19 20:06 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, clm, jack

On Tue, Sep 19, 2017 at 01:53:04PM -0600, Jens Axboe wrote:
> Laptop mode really wants to writeback the number of dirty
> pages and inodes. Instead of calculating this in the caller,
> just pass in 0 and let wakeup_flusher_threads() handle it.
> 
> Use the new wakeup_flusher_threads_bdi() instead of rolling
> our own.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 3/6] page-writeback: pass in '0' for nr_pages writeback in laptop mode
@ 2017-09-19 20:06     ` Johannes Weiner
  0 siblings, 0 replies; 66+ messages in thread
From: Johannes Weiner @ 2017-09-19 20:06 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, clm, jack

On Tue, Sep 19, 2017 at 01:53:04PM -0600, Jens Axboe wrote:
> Laptop mode really wants to writeback the number of dirty
> pages and inodes. Instead of calculating this in the caller,
> just pass in 0 and let wakeup_flusher_threads() handle it.
> 
> Use the new wakeup_flusher_threads_bdi() instead of rolling
> our own.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 4/6] fs-writeback: make wb_start_writeback() static
  2017-09-19 19:53   ` Jens Axboe
@ 2017-09-19 20:07     ` Johannes Weiner
  -1 siblings, 0 replies; 66+ messages in thread
From: Johannes Weiner @ 2017-09-19 20:07 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, clm, jack

On Tue, Sep 19, 2017 at 01:53:05PM -0600, Jens Axboe wrote:
> We don't have any callers outside of fs-writeback.c anymore,
> make it private.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 4/6] fs-writeback: make wb_start_writeback() static
@ 2017-09-19 20:07     ` Johannes Weiner
  0 siblings, 0 replies; 66+ messages in thread
From: Johannes Weiner @ 2017-09-19 20:07 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, clm, jack

On Tue, Sep 19, 2017 at 01:53:05PM -0600, Jens Axboe wrote:
> We don't have any callers outside of fs-writeback.c anymore,
> make it private.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 5/6] fs-writeback: move nr_pages == 0 logic to one location
  2017-09-19 19:53   ` Jens Axboe
@ 2017-09-19 20:07     ` Johannes Weiner
  -1 siblings, 0 replies; 66+ messages in thread
From: Johannes Weiner @ 2017-09-19 20:07 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, clm, jack

On Tue, Sep 19, 2017 at 01:53:06PM -0600, Jens Axboe wrote:
> Now that we have no external callers of wb_start_writeback(),
> we can move the nr_pages == 0 logic into that function.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 5/6] fs-writeback: move nr_pages == 0 logic to one location
@ 2017-09-19 20:07     ` Johannes Weiner
  0 siblings, 0 replies; 66+ messages in thread
From: Johannes Weiner @ 2017-09-19 20:07 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, clm, jack

On Tue, Sep 19, 2017 at 01:53:06PM -0600, Jens Axboe wrote:
> Now that we have no external callers of wb_start_writeback(),
> we can move the nr_pages == 0 logic into that function.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 6/6] fs-writeback: only allow one inflight and pending !nr_pages flush
  2017-09-19 19:53   ` Jens Axboe
@ 2017-09-19 20:18     ` Johannes Weiner
  -1 siblings, 0 replies; 66+ messages in thread
From: Johannes Weiner @ 2017-09-19 20:18 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, clm, jack

On Tue, Sep 19, 2017 at 01:53:07PM -0600, Jens Axboe wrote:
> A few callers pass in nr_pages == 0 when they wakeup the flusher
> threads, which means that the flusher should just flush everything
> that was currently dirty. If we are tight on memory, we can get
> tons of these queued from kswapd/vmscan. This causes (at least)
> two problems:
> 
> 1) We consume a ton of memory just allocating writeback work items.
> 2) We spend so much time processing these work items, that we
>    introduce a softlockup in writeback processing.
> 
> Fix this by adding a 'zero_pages' bit to the writeback structure,
> and set that when someone queues a nr_pages==0 flusher thread
> wakeup. The bit is cleared when we start writeback on that work
> item. If the bit is already set when we attempt to queue !nr_pages
> writeback, then we simply ignore it.
> 
> This provides us one of full flush in flight, with one pending as
> well, and makes for more efficient handling of this type of
> writeback.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

Just a nitpick:

> @@ -948,15 +949,25 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
>  			       bool range_cyclic, enum wb_reason reason)
>  {
>  	struct wb_writeback_work *work;
> +	bool zero_pages = false;
>  
>  	if (!wb_has_dirty_io(wb))
>  		return;
>  
>  	/*
> -	 * If someone asked for zero pages, we write out the WORLD
> +	 * If someone asked for zero pages, we write out the WORLD.
> +	 * Places like vmscan and laptop mode want to queue a wakeup to
> +	 * the flusher threads to clean out everything. To avoid potentially
> +	 * having tons of these pending, ensure that we only allow one of
> +	 * them pending and inflight at the time
>  	 */
> -	if (!nr_pages)
> +	if (!nr_pages) {
> +		if (test_bit(WB_zero_pages, &wb->state))
> +			return;
> +		set_bit(WB_zero_pages, &wb->state);
>  		nr_pages = get_nr_dirty_pages();

We could rely on the work->older_than_this and pass LONG_MAX here
instead to write out the world as it was at the time wb commences.

get_nr_dirty_pages() is somewhat clearer on intent, but on the other
hand it returns global state and is used here in a split-bdi context,
and we can end up in sum requesting the system-wide dirty pages
several times over. It'll work fine, relying on work->older_than_this
to contain it also, it just seems a little ugly and subtle.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 6/6] fs-writeback: only allow one inflight and pending !nr_pages flush
@ 2017-09-19 20:18     ` Johannes Weiner
  0 siblings, 0 replies; 66+ messages in thread
From: Johannes Weiner @ 2017-09-19 20:18 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, clm, jack

On Tue, Sep 19, 2017 at 01:53:07PM -0600, Jens Axboe wrote:
> A few callers pass in nr_pages == 0 when they wakeup the flusher
> threads, which means that the flusher should just flush everything
> that was currently dirty. If we are tight on memory, we can get
> tons of these queued from kswapd/vmscan. This causes (at least)
> two problems:
> 
> 1) We consume a ton of memory just allocating writeback work items.
> 2) We spend so much time processing these work items, that we
>    introduce a softlockup in writeback processing.
> 
> Fix this by adding a 'zero_pages' bit to the writeback structure,
> and set that when someone queues a nr_pages==0 flusher thread
> wakeup. The bit is cleared when we start writeback on that work
> item. If the bit is already set when we attempt to queue !nr_pages
> writeback, then we simply ignore it.
> 
> This provides us one of full flush in flight, with one pending as
> well, and makes for more efficient handling of this type of
> writeback.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

Just a nitpick:

> @@ -948,15 +949,25 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
>  			       bool range_cyclic, enum wb_reason reason)
>  {
>  	struct wb_writeback_work *work;
> +	bool zero_pages = false;
>  
>  	if (!wb_has_dirty_io(wb))
>  		return;
>  
>  	/*
> -	 * If someone asked for zero pages, we write out the WORLD
> +	 * If someone asked for zero pages, we write out the WORLD.
> +	 * Places like vmscan and laptop mode want to queue a wakeup to
> +	 * the flusher threads to clean out everything. To avoid potentially
> +	 * having tons of these pending, ensure that we only allow one of
> +	 * them pending and inflight at the time
>  	 */
> -	if (!nr_pages)
> +	if (!nr_pages) {
> +		if (test_bit(WB_zero_pages, &wb->state))
> +			return;
> +		set_bit(WB_zero_pages, &wb->state);
>  		nr_pages = get_nr_dirty_pages();

We could rely on the work->older_than_this and pass LONG_MAX here
instead to write out the world as it was at the time wb commences.

get_nr_dirty_pages() is somewhat clearer on intent, but on the other
hand it returns global state and is used here in a split-bdi context,
and we can end up in sum requesting the system-wide dirty pages
several times over. It'll work fine, relying on work->older_than_this
to contain it also, it just seems a little ugly and subtle.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 6/6] fs-writeback: only allow one inflight and pending !nr_pages flush
  2017-09-19 20:18     ` Johannes Weiner
@ 2017-09-19 20:39       ` Jens Axboe
  -1 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-19 20:39 UTC (permalink / raw)
  To: Johannes Weiner; +Cc: linux-kernel, linux-fsdevel, linux-mm, clm, jack

On 09/19/2017 02:18 PM, Johannes Weiner wrote:
> On Tue, Sep 19, 2017 at 01:53:07PM -0600, Jens Axboe wrote:
>> A few callers pass in nr_pages == 0 when they wakeup the flusher
>> threads, which means that the flusher should just flush everything
>> that was currently dirty. If we are tight on memory, we can get
>> tons of these queued from kswapd/vmscan. This causes (at least)
>> two problems:
>>
>> 1) We consume a ton of memory just allocating writeback work items.
>> 2) We spend so much time processing these work items, that we
>>    introduce a softlockup in writeback processing.
>>
>> Fix this by adding a 'zero_pages' bit to the writeback structure,
>> and set that when someone queues a nr_pages==0 flusher thread
>> wakeup. The bit is cleared when we start writeback on that work
>> item. If the bit is already set when we attempt to queue !nr_pages
>> writeback, then we simply ignore it.
>>
>> This provides us one of full flush in flight, with one pending as
>> well, and makes for more efficient handling of this type of
>> writeback.
>>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> 
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> 
> Just a nitpick:
> 
>> @@ -948,15 +949,25 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
>>  			       bool range_cyclic, enum wb_reason reason)
>>  {
>>  	struct wb_writeback_work *work;
>> +	bool zero_pages = false;
>>  
>>  	if (!wb_has_dirty_io(wb))
>>  		return;
>>  
>>  	/*
>> -	 * If someone asked for zero pages, we write out the WORLD
>> +	 * If someone asked for zero pages, we write out the WORLD.
>> +	 * Places like vmscan and laptop mode want to queue a wakeup to
>> +	 * the flusher threads to clean out everything. To avoid potentially
>> +	 * having tons of these pending, ensure that we only allow one of
>> +	 * them pending and inflight at the time
>>  	 */
>> -	if (!nr_pages)
>> +	if (!nr_pages) {
>> +		if (test_bit(WB_zero_pages, &wb->state))
>> +			return;
>> +		set_bit(WB_zero_pages, &wb->state);
>>  		nr_pages = get_nr_dirty_pages();
> 
> We could rely on the work->older_than_this and pass LONG_MAX here
> instead to write out the world as it was at the time wb commences.
> 
> get_nr_dirty_pages() is somewhat clearer on intent, but on the other
> hand it returns global state and is used here in a split-bdi context,
> and we can end up in sum requesting the system-wide dirty pages
> several times over. It'll work fine, relying on work->older_than_this
> to contain it also, it just seems a little ugly and subtle.

Not disagreeing with that at all. I just carried the !nr_pages forward
as the way to do this. I think any further cleanup or work should just
be based on this patchset, I'd definitely welcome a change in that
direction.

Thanks for your reviews!

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 6/6] fs-writeback: only allow one inflight and pending !nr_pages flush
@ 2017-09-19 20:39       ` Jens Axboe
  0 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-19 20:39 UTC (permalink / raw)
  To: Johannes Weiner; +Cc: linux-kernel, linux-fsdevel, linux-mm, clm, jack

On 09/19/2017 02:18 PM, Johannes Weiner wrote:
> On Tue, Sep 19, 2017 at 01:53:07PM -0600, Jens Axboe wrote:
>> A few callers pass in nr_pages == 0 when they wakeup the flusher
>> threads, which means that the flusher should just flush everything
>> that was currently dirty. If we are tight on memory, we can get
>> tons of these queued from kswapd/vmscan. This causes (at least)
>> two problems:
>>
>> 1) We consume a ton of memory just allocating writeback work items.
>> 2) We spend so much time processing these work items, that we
>>    introduce a softlockup in writeback processing.
>>
>> Fix this by adding a 'zero_pages' bit to the writeback structure,
>> and set that when someone queues a nr_pages==0 flusher thread
>> wakeup. The bit is cleared when we start writeback on that work
>> item. If the bit is already set when we attempt to queue !nr_pages
>> writeback, then we simply ignore it.
>>
>> This provides us one of full flush in flight, with one pending as
>> well, and makes for more efficient handling of this type of
>> writeback.
>>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> 
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> 
> Just a nitpick:
> 
>> @@ -948,15 +949,25 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
>>  			       bool range_cyclic, enum wb_reason reason)
>>  {
>>  	struct wb_writeback_work *work;
>> +	bool zero_pages = false;
>>  
>>  	if (!wb_has_dirty_io(wb))
>>  		return;
>>  
>>  	/*
>> -	 * If someone asked for zero pages, we write out the WORLD
>> +	 * If someone asked for zero pages, we write out the WORLD.
>> +	 * Places like vmscan and laptop mode want to queue a wakeup to
>> +	 * the flusher threads to clean out everything. To avoid potentially
>> +	 * having tons of these pending, ensure that we only allow one of
>> +	 * them pending and inflight at the time
>>  	 */
>> -	if (!nr_pages)
>> +	if (!nr_pages) {
>> +		if (test_bit(WB_zero_pages, &wb->state))
>> +			return;
>> +		set_bit(WB_zero_pages, &wb->state);
>>  		nr_pages = get_nr_dirty_pages();
> 
> We could rely on the work->older_than_this and pass LONG_MAX here
> instead to write out the world as it was at the time wb commences.
> 
> get_nr_dirty_pages() is somewhat clearer on intent, but on the other
> hand it returns global state and is used here in a split-bdi context,
> and we can end up in sum requesting the system-wide dirty pages
> several times over. It'll work fine, relying on work->older_than_this
> to contain it also, it just seems a little ugly and subtle.

Not disagreeing with that at all. I just carried the !nr_pages forward
as the way to do this. I think any further cleanup or work should just
be based on this patchset, I'd definitely welcome a change in that
direction.

Thanks for your reviews!

-- 
Jens Axboe

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 6/6] fs-writeback: only allow one inflight and pending !nr_pages flush
  2017-09-19 19:53   ` Jens Axboe
@ 2017-09-20  1:57     ` Jens Axboe
  -1 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-20  1:57 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-mm; +Cc: hannes, clm, jack

On 09/19/2017 01:53 PM, Jens Axboe wrote:
> @@ -948,15 +949,25 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
>  			       bool range_cyclic, enum wb_reason reason)
>  {
>  	struct wb_writeback_work *work;
> +	bool zero_pages = false;
>  
>  	if (!wb_has_dirty_io(wb))
>  		return;
>  
>  	/*
> -	 * If someone asked for zero pages, we write out the WORLD
> +	 * If someone asked for zero pages, we write out the WORLD.
> +	 * Places like vmscan and laptop mode want to queue a wakeup to
> +	 * the flusher threads to clean out everything. To avoid potentially
> +	 * having tons of these pending, ensure that we only allow one of
> +	 * them pending and inflight at the time
>  	 */
> -	if (!nr_pages)
> +	if (!nr_pages) {
> +		if (test_bit(WB_zero_pages, &wb->state))
> +			return;
> +		set_bit(WB_zero_pages, &wb->state);
>  		nr_pages = get_nr_dirty_pages();
> +		zero_pages = true;
> +	}

Later fix added here to ensure we clear WB_zero_pages, if work
allocation fails:

work = kzalloc(sizeof(*work),                                           
                GFP_NOWAIT | __GFP_NOMEMALLOC | __GFP_NOWARN);           
if (!work) {                                                            
        if (zero_pages)                                                 
                clear_bit(WB_zero_pages, &wb->state);
	[...]

Updated patch here:

http://git.kernel.dk/cgit/linux-block/commit/?h=writeback-fixup&id=21ea70657894fda9fccf257543cbec112b2813ef

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 6/6] fs-writeback: only allow one inflight and pending !nr_pages flush
@ 2017-09-20  1:57     ` Jens Axboe
  0 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-20  1:57 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-mm; +Cc: hannes, clm, jack

On 09/19/2017 01:53 PM, Jens Axboe wrote:
> @@ -948,15 +949,25 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
>  			       bool range_cyclic, enum wb_reason reason)
>  {
>  	struct wb_writeback_work *work;
> +	bool zero_pages = false;
>  
>  	if (!wb_has_dirty_io(wb))
>  		return;
>  
>  	/*
> -	 * If someone asked for zero pages, we write out the WORLD
> +	 * If someone asked for zero pages, we write out the WORLD.
> +	 * Places like vmscan and laptop mode want to queue a wakeup to
> +	 * the flusher threads to clean out everything. To avoid potentially
> +	 * having tons of these pending, ensure that we only allow one of
> +	 * them pending and inflight at the time
>  	 */
> -	if (!nr_pages)
> +	if (!nr_pages) {
> +		if (test_bit(WB_zero_pages, &wb->state))
> +			return;
> +		set_bit(WB_zero_pages, &wb->state);
>  		nr_pages = get_nr_dirty_pages();
> +		zero_pages = true;
> +	}

Later fix added here to ensure we clear WB_zero_pages, if work
allocation fails:

work = kzalloc(sizeof(*work),                                           
                GFP_NOWAIT | __GFP_NOMEMALLOC | __GFP_NOWARN);           
if (!work) {                                                            
        if (zero_pages)                                                 
                clear_bit(WB_zero_pages, &wb->state);
	[...]

Updated patch here:

http://git.kernel.dk/cgit/linux-block/commit/?h=writeback-fixup&id=21ea70657894fda9fccf257543cbec112b2813ef

-- 
Jens Axboe

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 6/6] fs-writeback: only allow one inflight and pending !nr_pages flush
  2017-09-19 19:53   ` Jens Axboe
@ 2017-09-20  3:10     ` Amir Goldstein
  -1 siblings, 0 replies; 66+ messages in thread
From: Amir Goldstein @ 2017-09-20  3:10 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, Chris Mason, Jan Kara

On Tue, Sep 19, 2017 at 10:53 PM, Jens Axboe <axboe@kernel.dk> wrote:
> A few callers pass in nr_pages == 0 when they wakeup the flusher
> threads, which means that the flusher should just flush everything
> that was currently dirty. If we are tight on memory, we can get
> tons of these queued from kswapd/vmscan. This causes (at least)
> two problems:
>
> 1) We consume a ton of memory just allocating writeback work items.
> 2) We spend so much time processing these work items, that we
>    introduce a softlockup in writeback processing.
>
> Fix this by adding a 'zero_pages' bit to the writeback structure,
> and set that when someone queues a nr_pages==0 flusher thread
> wakeup. The bit is cleared when we start writeback on that work
> item. If the bit is already set when we attempt to queue !nr_pages
> writeback, then we simply ignore it.
>
> This provides us one of full flush in flight, with one pending as
> well, and makes for more efficient handling of this type of
> writeback.
>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>  fs/fs-writeback.c                | 30 ++++++++++++++++++++++++++++--
>  include/linux/backing-dev-defs.h |  1 +
>  2 files changed, 29 insertions(+), 2 deletions(-)
>
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index a9a86644cb9f..e0240110b36f 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -53,6 +53,7 @@ struct wb_writeback_work {
>         unsigned int for_background:1;
>         unsigned int for_sync:1;        /* sync(2) WB_SYNC_ALL writeback */
>         unsigned int auto_free:1;       /* free on completion */
> +       unsigned int zero_pages:1;      /* nr_pages == 0 writeback */

Suggest: use a name that describes the intention (e.g. WB_everything)

>         enum wb_reason reason;          /* why was writeback initiated? */
>
>         struct list_head list;          /* pending work list */
> @@ -948,15 +949,25 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
>                                bool range_cyclic, enum wb_reason reason)
>  {
>         struct wb_writeback_work *work;
> +       bool zero_pages = false;
>
>         if (!wb_has_dirty_io(wb))
>                 return;
>
>         /*
> -        * If someone asked for zero pages, we write out the WORLD
> +        * If someone asked for zero pages, we write out the WORLD.
> +        * Places like vmscan and laptop mode want to queue a wakeup to
> +        * the flusher threads to clean out everything. To avoid potentially
> +        * having tons of these pending, ensure that we only allow one of
> +        * them pending and inflight at the time
>          */
> -       if (!nr_pages)
> +       if (!nr_pages) {
> +               if (test_bit(WB_zero_pages, &wb->state))
> +                       return;
> +               set_bit(WB_zero_pages, &wb->state);

Shouldn't this be test_and_set? not the worst outcome if you have more
than one pending work item, but still.

>                 nr_pages = get_nr_dirty_pages();
> +               zero_pages = true;
> +       }
>
>         /*
>          * This is WB_SYNC_NONE writeback, so if allocation fails just
> @@ -975,6 +986,7 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
>         work->range_cyclic = range_cyclic;
>         work->reason    = reason;
>         work->auto_free = 1;
> +       work->zero_pages = zero_pages;
>
>         wb_queue_work(wb, work);
>  }
> @@ -1828,6 +1840,14 @@ static struct wb_writeback_work *get_next_work_item(struct bdi_writeback *wb)
>                 list_del_init(&work->list);
>         }
>         spin_unlock_bh(&wb->work_lock);
> +
> +       /*
> +        * Once we start processing a work item that had !nr_pages,
> +        * clear the wb state bit for that so we can allow more.
> +        */
> +       if (work && work->zero_pages && test_bit(WB_zero_pages, &wb->state))
> +               clear_bit(WB_zero_pages, &wb->state);

nit: should not need to test_bit

> +
>         return work;
>  }
>
> @@ -1896,6 +1916,12 @@ static long wb_do_writeback(struct bdi_writeback *wb)
>                 trace_writeback_exec(wb, work);
>                 wrote += wb_writeback(wb, work);
>                 finish_writeback_work(wb, work);
> +
> +               /*
> +                * If we have a lot of pending work, make sure we take
> +                * an occasional breather, if needed.
> +                */
> +               cond_resched();

Probably ought to be in a separate patch.

>         }
>
>         /*
> diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h
> index 866c433e7d32..7494f6a75458 100644
> --- a/include/linux/backing-dev-defs.h
> +++ b/include/linux/backing-dev-defs.h
> @@ -24,6 +24,7 @@ enum wb_state {
>         WB_shutting_down,       /* wb_shutdown() in progress */
>         WB_writeback_running,   /* Writeback is in progress */
>         WB_has_dirty_io,        /* Dirty inodes on ->b_{dirty|io|more_io} */
> +       WB_zero_pages,          /* nr_pages == 0 flush pending */

same suggestion: WB_everything

Cheers,
Amir.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 6/6] fs-writeback: only allow one inflight and pending !nr_pages flush
@ 2017-09-20  3:10     ` Amir Goldstein
  0 siblings, 0 replies; 66+ messages in thread
From: Amir Goldstein @ 2017-09-20  3:10 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, Chris Mason, Jan Kara

On Tue, Sep 19, 2017 at 10:53 PM, Jens Axboe <axboe@kernel.dk> wrote:
> A few callers pass in nr_pages == 0 when they wakeup the flusher
> threads, which means that the flusher should just flush everything
> that was currently dirty. If we are tight on memory, we can get
> tons of these queued from kswapd/vmscan. This causes (at least)
> two problems:
>
> 1) We consume a ton of memory just allocating writeback work items.
> 2) We spend so much time processing these work items, that we
>    introduce a softlockup in writeback processing.
>
> Fix this by adding a 'zero_pages' bit to the writeback structure,
> and set that when someone queues a nr_pages==0 flusher thread
> wakeup. The bit is cleared when we start writeback on that work
> item. If the bit is already set when we attempt to queue !nr_pages
> writeback, then we simply ignore it.
>
> This provides us one of full flush in flight, with one pending as
> well, and makes for more efficient handling of this type of
> writeback.
>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>  fs/fs-writeback.c                | 30 ++++++++++++++++++++++++++++--
>  include/linux/backing-dev-defs.h |  1 +
>  2 files changed, 29 insertions(+), 2 deletions(-)
>
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index a9a86644cb9f..e0240110b36f 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -53,6 +53,7 @@ struct wb_writeback_work {
>         unsigned int for_background:1;
>         unsigned int for_sync:1;        /* sync(2) WB_SYNC_ALL writeback */
>         unsigned int auto_free:1;       /* free on completion */
> +       unsigned int zero_pages:1;      /* nr_pages == 0 writeback */

Suggest: use a name that describes the intention (e.g. WB_everything)

>         enum wb_reason reason;          /* why was writeback initiated? */
>
>         struct list_head list;          /* pending work list */
> @@ -948,15 +949,25 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
>                                bool range_cyclic, enum wb_reason reason)
>  {
>         struct wb_writeback_work *work;
> +       bool zero_pages = false;
>
>         if (!wb_has_dirty_io(wb))
>                 return;
>
>         /*
> -        * If someone asked for zero pages, we write out the WORLD
> +        * If someone asked for zero pages, we write out the WORLD.
> +        * Places like vmscan and laptop mode want to queue a wakeup to
> +        * the flusher threads to clean out everything. To avoid potentially
> +        * having tons of these pending, ensure that we only allow one of
> +        * them pending and inflight at the time
>          */
> -       if (!nr_pages)
> +       if (!nr_pages) {
> +               if (test_bit(WB_zero_pages, &wb->state))
> +                       return;
> +               set_bit(WB_zero_pages, &wb->state);

Shouldn't this be test_and_set? not the worst outcome if you have more
than one pending work item, but still.

>                 nr_pages = get_nr_dirty_pages();
> +               zero_pages = true;
> +       }
>
>         /*
>          * This is WB_SYNC_NONE writeback, so if allocation fails just
> @@ -975,6 +986,7 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
>         work->range_cyclic = range_cyclic;
>         work->reason    = reason;
>         work->auto_free = 1;
> +       work->zero_pages = zero_pages;
>
>         wb_queue_work(wb, work);
>  }
> @@ -1828,6 +1840,14 @@ static struct wb_writeback_work *get_next_work_item(struct bdi_writeback *wb)
>                 list_del_init(&work->list);
>         }
>         spin_unlock_bh(&wb->work_lock);
> +
> +       /*
> +        * Once we start processing a work item that had !nr_pages,
> +        * clear the wb state bit for that so we can allow more.
> +        */
> +       if (work && work->zero_pages && test_bit(WB_zero_pages, &wb->state))
> +               clear_bit(WB_zero_pages, &wb->state);

nit: should not need to test_bit

> +
>         return work;
>  }
>
> @@ -1896,6 +1916,12 @@ static long wb_do_writeback(struct bdi_writeback *wb)
>                 trace_writeback_exec(wb, work);
>                 wrote += wb_writeback(wb, work);
>                 finish_writeback_work(wb, work);
> +
> +               /*
> +                * If we have a lot of pending work, make sure we take
> +                * an occasional breather, if needed.
> +                */
> +               cond_resched();

Probably ought to be in a separate patch.

>         }
>
>         /*
> diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h
> index 866c433e7d32..7494f6a75458 100644
> --- a/include/linux/backing-dev-defs.h
> +++ b/include/linux/backing-dev-defs.h
> @@ -24,6 +24,7 @@ enum wb_state {
>         WB_shutting_down,       /* wb_shutdown() in progress */
>         WB_writeback_running,   /* Writeback is in progress */
>         WB_has_dirty_io,        /* Dirty inodes on ->b_{dirty|io|more_io} */
> +       WB_zero_pages,          /* nr_pages == 0 flush pending */

same suggestion: WB_everything

Cheers,
Amir.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 6/6] fs-writeback: only allow one inflight and pending !nr_pages flush
  2017-09-20  3:10     ` Amir Goldstein
@ 2017-09-20  4:13       ` Jens Axboe
  -1 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-20  4:13 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, Chris Mason, Jan Kara

On 09/19/2017 09:10 PM, Amir Goldstein wrote:
> On Tue, Sep 19, 2017 at 10:53 PM, Jens Axboe <axboe@kernel.dk> wrote:
>> A few callers pass in nr_pages == 0 when they wakeup the flusher
>> threads, which means that the flusher should just flush everything
>> that was currently dirty. If we are tight on memory, we can get
>> tons of these queued from kswapd/vmscan. This causes (at least)
>> two problems:
>>
>> 1) We consume a ton of memory just allocating writeback work items.
>> 2) We spend so much time processing these work items, that we
>>    introduce a softlockup in writeback processing.
>>
>> Fix this by adding a 'zero_pages' bit to the writeback structure,
>> and set that when someone queues a nr_pages==0 flusher thread
>> wakeup. The bit is cleared when we start writeback on that work
>> item. If the bit is already set when we attempt to queue !nr_pages
>> writeback, then we simply ignore it.
>>
>> This provides us one of full flush in flight, with one pending as
>> well, and makes for more efficient handling of this type of
>> writeback.
>>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>> ---
>>  fs/fs-writeback.c                | 30 ++++++++++++++++++++++++++++--
>>  include/linux/backing-dev-defs.h |  1 +
>>  2 files changed, 29 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
>> index a9a86644cb9f..e0240110b36f 100644
>> --- a/fs/fs-writeback.c
>> +++ b/fs/fs-writeback.c
>> @@ -53,6 +53,7 @@ struct wb_writeback_work {
>>         unsigned int for_background:1;
>>         unsigned int for_sync:1;        /* sync(2) WB_SYNC_ALL writeback */
>>         unsigned int auto_free:1;       /* free on completion */
>> +       unsigned int zero_pages:1;      /* nr_pages == 0 writeback */
> 
> Suggest: use a name that describes the intention (e.g. WB_everything)

Agree, the name isn't the best. WB_everything isn't great either, though,
since this isn't an integrity write. WB_start_all would be better,
I'll make that change.

>>         enum wb_reason reason;          /* why was writeback initiated? */
>>
>>         struct list_head list;          /* pending work list */
>> @@ -948,15 +949,25 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
>>                                bool range_cyclic, enum wb_reason reason)
>>  {
>>         struct wb_writeback_work *work;
>> +       bool zero_pages = false;
>>
>>         if (!wb_has_dirty_io(wb))
>>                 return;
>>
>>         /*
>> -        * If someone asked for zero pages, we write out the WORLD
>> +        * If someone asked for zero pages, we write out the WORLD.
>> +        * Places like vmscan and laptop mode want to queue a wakeup to
>> +        * the flusher threads to clean out everything. To avoid potentially
>> +        * having tons of these pending, ensure that we only allow one of
>> +        * them pending and inflight at the time
>>          */
>> -       if (!nr_pages)
>> +       if (!nr_pages) {
>> +               if (test_bit(WB_zero_pages, &wb->state))
>> +                       return;
>> +               set_bit(WB_zero_pages, &wb->state);
> 
> Shouldn't this be test_and_set? not the worst outcome if you have more
> than one pending work item, but still.

If the frequency of these is high, and they were to trigger the bad
conditions we saw, then a split test + set is faster as it won't
keep re-dirtying the same cacheline from multiple locations. It's
better to leave it a little racy, but faster.

>> @@ -1828,6 +1840,14 @@ static struct wb_writeback_work *get_next_work_item(struct bdi_writeback *wb)
>>                 list_del_init(&work->list);
>>         }
>>         spin_unlock_bh(&wb->work_lock);
>> +
>> +       /*
>> +        * Once we start processing a work item that had !nr_pages,
>> +        * clear the wb state bit for that so we can allow more.
>> +        */
>> +       if (work && work->zero_pages && test_bit(WB_zero_pages, &wb->state))
>> +               clear_bit(WB_zero_pages, &wb->state);
> 
> nit: should not need to test_bit

True, we can drop it for this case, as it'll be the common condition
anyway. I'll make that change.

>> @@ -1896,6 +1916,12 @@ static long wb_do_writeback(struct bdi_writeback *wb)
>>                 trace_writeback_exec(wb, work);
>>                 wrote += wb_writeback(wb, work);
>>                 finish_writeback_work(wb, work);
>> +
>> +               /*
>> +                * If we have a lot of pending work, make sure we take
>> +                * an occasional breather, if needed.
>> +                */
>> +               cond_resched();
> 
> Probably ought to be in a separate patch.

Yeah, it probably should be. It's not strictly needed with the other
change anyway, I will just drop it.

New version:

http://git.kernel.dk/cgit/linux-block/commit/?h=writeback-fixup&id=338a69c217cdaaffda93f3cc9a364a347f782adb

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 6/6] fs-writeback: only allow one inflight and pending !nr_pages flush
@ 2017-09-20  4:13       ` Jens Axboe
  0 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-20  4:13 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, Chris Mason, Jan Kara

On 09/19/2017 09:10 PM, Amir Goldstein wrote:
> On Tue, Sep 19, 2017 at 10:53 PM, Jens Axboe <axboe@kernel.dk> wrote:
>> A few callers pass in nr_pages == 0 when they wakeup the flusher
>> threads, which means that the flusher should just flush everything
>> that was currently dirty. If we are tight on memory, we can get
>> tons of these queued from kswapd/vmscan. This causes (at least)
>> two problems:
>>
>> 1) We consume a ton of memory just allocating writeback work items.
>> 2) We spend so much time processing these work items, that we
>>    introduce a softlockup in writeback processing.
>>
>> Fix this by adding a 'zero_pages' bit to the writeback structure,
>> and set that when someone queues a nr_pages==0 flusher thread
>> wakeup. The bit is cleared when we start writeback on that work
>> item. If the bit is already set when we attempt to queue !nr_pages
>> writeback, then we simply ignore it.
>>
>> This provides us one of full flush in flight, with one pending as
>> well, and makes for more efficient handling of this type of
>> writeback.
>>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>> ---
>>  fs/fs-writeback.c                | 30 ++++++++++++++++++++++++++++--
>>  include/linux/backing-dev-defs.h |  1 +
>>  2 files changed, 29 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
>> index a9a86644cb9f..e0240110b36f 100644
>> --- a/fs/fs-writeback.c
>> +++ b/fs/fs-writeback.c
>> @@ -53,6 +53,7 @@ struct wb_writeback_work {
>>         unsigned int for_background:1;
>>         unsigned int for_sync:1;        /* sync(2) WB_SYNC_ALL writeback */
>>         unsigned int auto_free:1;       /* free on completion */
>> +       unsigned int zero_pages:1;      /* nr_pages == 0 writeback */
> 
> Suggest: use a name that describes the intention (e.g. WB_everything)

Agree, the name isn't the best. WB_everything isn't great either, though,
since this isn't an integrity write. WB_start_all would be better,
I'll make that change.

>>         enum wb_reason reason;          /* why was writeback initiated? */
>>
>>         struct list_head list;          /* pending work list */
>> @@ -948,15 +949,25 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
>>                                bool range_cyclic, enum wb_reason reason)
>>  {
>>         struct wb_writeback_work *work;
>> +       bool zero_pages = false;
>>
>>         if (!wb_has_dirty_io(wb))
>>                 return;
>>
>>         /*
>> -        * If someone asked for zero pages, we write out the WORLD
>> +        * If someone asked for zero pages, we write out the WORLD.
>> +        * Places like vmscan and laptop mode want to queue a wakeup to
>> +        * the flusher threads to clean out everything. To avoid potentially
>> +        * having tons of these pending, ensure that we only allow one of
>> +        * them pending and inflight at the time
>>          */
>> -       if (!nr_pages)
>> +       if (!nr_pages) {
>> +               if (test_bit(WB_zero_pages, &wb->state))
>> +                       return;
>> +               set_bit(WB_zero_pages, &wb->state);
> 
> Shouldn't this be test_and_set? not the worst outcome if you have more
> than one pending work item, but still.

If the frequency of these is high, and they were to trigger the bad
conditions we saw, then a split test + set is faster as it won't
keep re-dirtying the same cacheline from multiple locations. It's
better to leave it a little racy, but faster.

>> @@ -1828,6 +1840,14 @@ static struct wb_writeback_work *get_next_work_item(struct bdi_writeback *wb)
>>                 list_del_init(&work->list);
>>         }
>>         spin_unlock_bh(&wb->work_lock);
>> +
>> +       /*
>> +        * Once we start processing a work item that had !nr_pages,
>> +        * clear the wb state bit for that so we can allow more.
>> +        */
>> +       if (work && work->zero_pages && test_bit(WB_zero_pages, &wb->state))
>> +               clear_bit(WB_zero_pages, &wb->state);
> 
> nit: should not need to test_bit

True, we can drop it for this case, as it'll be the common condition
anyway. I'll make that change.

>> @@ -1896,6 +1916,12 @@ static long wb_do_writeback(struct bdi_writeback *wb)
>>                 trace_writeback_exec(wb, work);
>>                 wrote += wb_writeback(wb, work);
>>                 finish_writeback_work(wb, work);
>> +
>> +               /*
>> +                * If we have a lot of pending work, make sure we take
>> +                * an occasional breather, if needed.
>> +                */
>> +               cond_resched();
> 
> Probably ought to be in a separate patch.

Yeah, it probably should be. It's not strictly needed with the other
change anyway, I will just drop it.

New version:

http://git.kernel.dk/cgit/linux-block/commit/?h=writeback-fixup&id=338a69c217cdaaffda93f3cc9a364a347f782adb

-- 
Jens Axboe

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 6/6] fs-writeback: only allow one inflight and pending !nr_pages flush
  2017-09-20  4:13       ` Jens Axboe
@ 2017-09-20  6:05         ` Amir Goldstein
  -1 siblings, 0 replies; 66+ messages in thread
From: Amir Goldstein @ 2017-09-20  6:05 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, Chris Mason, Jan Kara

On Wed, Sep 20, 2017 at 7:13 AM, Jens Axboe <axboe@kernel.dk> wrote:
> On 09/19/2017 09:10 PM, Amir Goldstein wrote:
>> On Tue, Sep 19, 2017 at 10:53 PM, Jens Axboe <axboe@kernel.dk> wrote:
>>> A few callers pass in nr_pages == 0 when they wakeup the flusher
>>> threads, which means that the flusher should just flush everything
>>> that was currently dirty. If we are tight on memory, we can get
>>> tons of these queued from kswapd/vmscan. This causes (at least)
>>> two problems:
>>>
>>> 1) We consume a ton of memory just allocating writeback work items.
>>> 2) We spend so much time processing these work items, that we
>>>    introduce a softlockup in writeback processing.
>>>
>>> Fix this by adding a 'zero_pages' bit to the writeback structure,
>>> and set that when someone queues a nr_pages==0 flusher thread
>>> wakeup. The bit is cleared when we start writeback on that work
>>> item. If the bit is already set when we attempt to queue !nr_pages
>>> writeback, then we simply ignore it.
>>>
>>> This provides us one of full flush in flight, with one pending as
>>> well, and makes for more efficient handling of this type of
>>> writeback.
>>>
>>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>>> ---
>>>  fs/fs-writeback.c                | 30 ++++++++++++++++++++++++++++--
>>>  include/linux/backing-dev-defs.h |  1 +
>>>  2 files changed, 29 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
>>> index a9a86644cb9f..e0240110b36f 100644
>>> --- a/fs/fs-writeback.c
>>> +++ b/fs/fs-writeback.c
>>> @@ -53,6 +53,7 @@ struct wb_writeback_work {
>>>         unsigned int for_background:1;
>>>         unsigned int for_sync:1;        /* sync(2) WB_SYNC_ALL writeback */
>>>         unsigned int auto_free:1;       /* free on completion */
>>> +       unsigned int zero_pages:1;      /* nr_pages == 0 writeback */
>>
>> Suggest: use a name that describes the intention (e.g. WB_everything)
>
> Agree, the name isn't the best. WB_everything isn't great either, though,
> since this isn't an integrity write. WB_start_all would be better,
> I'll make that change.
>
>>>         enum wb_reason reason;          /* why was writeback initiated? */
>>>
>>>         struct list_head list;          /* pending work list */
>>> @@ -948,15 +949,25 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
>>>                                bool range_cyclic, enum wb_reason reason)
>>>  {
>>>         struct wb_writeback_work *work;
>>> +       bool zero_pages = false;
>>>
>>>         if (!wb_has_dirty_io(wb))
>>>                 return;
>>>
>>>         /*
>>> -        * If someone asked for zero pages, we write out the WORLD
>>> +        * If someone asked for zero pages, we write out the WORLD.
>>> +        * Places like vmscan and laptop mode want to queue a wakeup to
>>> +        * the flusher threads to clean out everything. To avoid potentially
>>> +        * having tons of these pending, ensure that we only allow one of
>>> +        * them pending and inflight at the time
>>>          */
>>> -       if (!nr_pages)
>>> +       if (!nr_pages) {
>>> +               if (test_bit(WB_zero_pages, &wb->state))
>>> +                       return;
>>> +               set_bit(WB_zero_pages, &wb->state);
>>
>> Shouldn't this be test_and_set? not the worst outcome if you have more
>> than one pending work item, but still.
>
> If the frequency of these is high, and they were to trigger the bad
> conditions we saw, then a split test + set is faster as it won't
> keep re-dirtying the same cacheline from multiple locations. It's
> better to leave it a little racy, but faster.
>

Fare enough, but then better change the language of the commit message and
comment above not to claim that there can be only one pending work item.

Amir.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 6/6] fs-writeback: only allow one inflight and pending !nr_pages flush
@ 2017-09-20  6:05         ` Amir Goldstein
  0 siblings, 0 replies; 66+ messages in thread
From: Amir Goldstein @ 2017-09-20  6:05 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, Chris Mason, Jan Kara

On Wed, Sep 20, 2017 at 7:13 AM, Jens Axboe <axboe@kernel.dk> wrote:
> On 09/19/2017 09:10 PM, Amir Goldstein wrote:
>> On Tue, Sep 19, 2017 at 10:53 PM, Jens Axboe <axboe@kernel.dk> wrote:
>>> A few callers pass in nr_pages == 0 when they wakeup the flusher
>>> threads, which means that the flusher should just flush everything
>>> that was currently dirty. If we are tight on memory, we can get
>>> tons of these queued from kswapd/vmscan. This causes (at least)
>>> two problems:
>>>
>>> 1) We consume a ton of memory just allocating writeback work items.
>>> 2) We spend so much time processing these work items, that we
>>>    introduce a softlockup in writeback processing.
>>>
>>> Fix this by adding a 'zero_pages' bit to the writeback structure,
>>> and set that when someone queues a nr_pages==0 flusher thread
>>> wakeup. The bit is cleared when we start writeback on that work
>>> item. If the bit is already set when we attempt to queue !nr_pages
>>> writeback, then we simply ignore it.
>>>
>>> This provides us one of full flush in flight, with one pending as
>>> well, and makes for more efficient handling of this type of
>>> writeback.
>>>
>>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>>> ---
>>>  fs/fs-writeback.c                | 30 ++++++++++++++++++++++++++++--
>>>  include/linux/backing-dev-defs.h |  1 +
>>>  2 files changed, 29 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
>>> index a9a86644cb9f..e0240110b36f 100644
>>> --- a/fs/fs-writeback.c
>>> +++ b/fs/fs-writeback.c
>>> @@ -53,6 +53,7 @@ struct wb_writeback_work {
>>>         unsigned int for_background:1;
>>>         unsigned int for_sync:1;        /* sync(2) WB_SYNC_ALL writeback */
>>>         unsigned int auto_free:1;       /* free on completion */
>>> +       unsigned int zero_pages:1;      /* nr_pages == 0 writeback */
>>
>> Suggest: use a name that describes the intention (e.g. WB_everything)
>
> Agree, the name isn't the best. WB_everything isn't great either, though,
> since this isn't an integrity write. WB_start_all would be better,
> I'll make that change.
>
>>>         enum wb_reason reason;          /* why was writeback initiated? */
>>>
>>>         struct list_head list;          /* pending work list */
>>> @@ -948,15 +949,25 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
>>>                                bool range_cyclic, enum wb_reason reason)
>>>  {
>>>         struct wb_writeback_work *work;
>>> +       bool zero_pages = false;
>>>
>>>         if (!wb_has_dirty_io(wb))
>>>                 return;
>>>
>>>         /*
>>> -        * If someone asked for zero pages, we write out the WORLD
>>> +        * If someone asked for zero pages, we write out the WORLD.
>>> +        * Places like vmscan and laptop mode want to queue a wakeup to
>>> +        * the flusher threads to clean out everything. To avoid potentially
>>> +        * having tons of these pending, ensure that we only allow one of
>>> +        * them pending and inflight at the time
>>>          */
>>> -       if (!nr_pages)
>>> +       if (!nr_pages) {
>>> +               if (test_bit(WB_zero_pages, &wb->state))
>>> +                       return;
>>> +               set_bit(WB_zero_pages, &wb->state);
>>
>> Shouldn't this be test_and_set? not the worst outcome if you have more
>> than one pending work item, but still.
>
> If the frequency of these is high, and they were to trigger the bad
> conditions we saw, then a split test + set is faster as it won't
> keep re-dirtying the same cacheline from multiple locations. It's
> better to leave it a little racy, but faster.
>

Fare enough, but then better change the language of the commit message and
comment above not to claim that there can be only one pending work item.

Amir.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 6/6] fs-writeback: only allow one inflight and pending !nr_pages flush
  2017-09-20  6:05         ` Amir Goldstein
@ 2017-09-20 12:35           ` Jens Axboe
  -1 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-20 12:35 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, Chris Mason, Jan Kara

On 09/20/2017 12:05 AM, Amir Goldstein wrote:
> On Wed, Sep 20, 2017 at 7:13 AM, Jens Axboe <axboe@kernel.dk> wrote:
>> On 09/19/2017 09:10 PM, Amir Goldstein wrote:
>>> On Tue, Sep 19, 2017 at 10:53 PM, Jens Axboe <axboe@kernel.dk> wrote:
>>>> A few callers pass in nr_pages == 0 when they wakeup the flusher
>>>> threads, which means that the flusher should just flush everything
>>>> that was currently dirty. If we are tight on memory, we can get
>>>> tons of these queued from kswapd/vmscan. This causes (at least)
>>>> two problems:
>>>>
>>>> 1) We consume a ton of memory just allocating writeback work items.
>>>> 2) We spend so much time processing these work items, that we
>>>>    introduce a softlockup in writeback processing.
>>>>
>>>> Fix this by adding a 'zero_pages' bit to the writeback structure,
>>>> and set that when someone queues a nr_pages==0 flusher thread
>>>> wakeup. The bit is cleared when we start writeback on that work
>>>> item. If the bit is already set when we attempt to queue !nr_pages
>>>> writeback, then we simply ignore it.
>>>>
>>>> This provides us one of full flush in flight, with one pending as
>>>> well, and makes for more efficient handling of this type of
>>>> writeback.
>>>>
>>>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>>>> ---
>>>>  fs/fs-writeback.c                | 30 ++++++++++++++++++++++++++++--
>>>>  include/linux/backing-dev-defs.h |  1 +
>>>>  2 files changed, 29 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
>>>> index a9a86644cb9f..e0240110b36f 100644
>>>> --- a/fs/fs-writeback.c
>>>> +++ b/fs/fs-writeback.c
>>>> @@ -53,6 +53,7 @@ struct wb_writeback_work {
>>>>         unsigned int for_background:1;
>>>>         unsigned int for_sync:1;        /* sync(2) WB_SYNC_ALL writeback */
>>>>         unsigned int auto_free:1;       /* free on completion */
>>>> +       unsigned int zero_pages:1;      /* nr_pages == 0 writeback */
>>>
>>> Suggest: use a name that describes the intention (e.g. WB_everything)
>>
>> Agree, the name isn't the best. WB_everything isn't great either, though,
>> since this isn't an integrity write. WB_start_all would be better,
>> I'll make that change.
>>
>>>>         enum wb_reason reason;          /* why was writeback initiated? */
>>>>
>>>>         struct list_head list;          /* pending work list */
>>>> @@ -948,15 +949,25 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
>>>>                                bool range_cyclic, enum wb_reason reason)
>>>>  {
>>>>         struct wb_writeback_work *work;
>>>> +       bool zero_pages = false;
>>>>
>>>>         if (!wb_has_dirty_io(wb))
>>>>                 return;
>>>>
>>>>         /*
>>>> -        * If someone asked for zero pages, we write out the WORLD
>>>> +        * If someone asked for zero pages, we write out the WORLD.
>>>> +        * Places like vmscan and laptop mode want to queue a wakeup to
>>>> +        * the flusher threads to clean out everything. To avoid potentially
>>>> +        * having tons of these pending, ensure that we only allow one of
>>>> +        * them pending and inflight at the time
>>>>          */
>>>> -       if (!nr_pages)
>>>> +       if (!nr_pages) {
>>>> +               if (test_bit(WB_zero_pages, &wb->state))
>>>> +                       return;
>>>> +               set_bit(WB_zero_pages, &wb->state);
>>>
>>> Shouldn't this be test_and_set? not the worst outcome if you have more
>>> than one pending work item, but still.
>>
>> If the frequency of these is high, and they were to trigger the bad
>> conditions we saw, then a split test + set is faster as it won't
>> keep re-dirtying the same cacheline from multiple locations. It's
>> better to leave it a little racy, but faster.
>>
> 
> Fare enough, but then better change the language of the commit message and
> comment above not to claim that there can be only one pending work item.

That's unchanged, the commit message should be fine. We clear the
bit when we start the work item, so we can have one in flight and
one pending.

But it does reference 'zero_pages', I'll update that.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 6/6] fs-writeback: only allow one inflight and pending !nr_pages flush
@ 2017-09-20 12:35           ` Jens Axboe
  0 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-20 12:35 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, Chris Mason, Jan Kara

On 09/20/2017 12:05 AM, Amir Goldstein wrote:
> On Wed, Sep 20, 2017 at 7:13 AM, Jens Axboe <axboe@kernel.dk> wrote:
>> On 09/19/2017 09:10 PM, Amir Goldstein wrote:
>>> On Tue, Sep 19, 2017 at 10:53 PM, Jens Axboe <axboe@kernel.dk> wrote:
>>>> A few callers pass in nr_pages == 0 when they wakeup the flusher
>>>> threads, which means that the flusher should just flush everything
>>>> that was currently dirty. If we are tight on memory, we can get
>>>> tons of these queued from kswapd/vmscan. This causes (at least)
>>>> two problems:
>>>>
>>>> 1) We consume a ton of memory just allocating writeback work items.
>>>> 2) We spend so much time processing these work items, that we
>>>>    introduce a softlockup in writeback processing.
>>>>
>>>> Fix this by adding a 'zero_pages' bit to the writeback structure,
>>>> and set that when someone queues a nr_pages==0 flusher thread
>>>> wakeup. The bit is cleared when we start writeback on that work
>>>> item. If the bit is already set when we attempt to queue !nr_pages
>>>> writeback, then we simply ignore it.
>>>>
>>>> This provides us one of full flush in flight, with one pending as
>>>> well, and makes for more efficient handling of this type of
>>>> writeback.
>>>>
>>>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>>>> ---
>>>>  fs/fs-writeback.c                | 30 ++++++++++++++++++++++++++++--
>>>>  include/linux/backing-dev-defs.h |  1 +
>>>>  2 files changed, 29 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
>>>> index a9a86644cb9f..e0240110b36f 100644
>>>> --- a/fs/fs-writeback.c
>>>> +++ b/fs/fs-writeback.c
>>>> @@ -53,6 +53,7 @@ struct wb_writeback_work {
>>>>         unsigned int for_background:1;
>>>>         unsigned int for_sync:1;        /* sync(2) WB_SYNC_ALL writeback */
>>>>         unsigned int auto_free:1;       /* free on completion */
>>>> +       unsigned int zero_pages:1;      /* nr_pages == 0 writeback */
>>>
>>> Suggest: use a name that describes the intention (e.g. WB_everything)
>>
>> Agree, the name isn't the best. WB_everything isn't great either, though,
>> since this isn't an integrity write. WB_start_all would be better,
>> I'll make that change.
>>
>>>>         enum wb_reason reason;          /* why was writeback initiated? */
>>>>
>>>>         struct list_head list;          /* pending work list */
>>>> @@ -948,15 +949,25 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
>>>>                                bool range_cyclic, enum wb_reason reason)
>>>>  {
>>>>         struct wb_writeback_work *work;
>>>> +       bool zero_pages = false;
>>>>
>>>>         if (!wb_has_dirty_io(wb))
>>>>                 return;
>>>>
>>>>         /*
>>>> -        * If someone asked for zero pages, we write out the WORLD
>>>> +        * If someone asked for zero pages, we write out the WORLD.
>>>> +        * Places like vmscan and laptop mode want to queue a wakeup to
>>>> +        * the flusher threads to clean out everything. To avoid potentially
>>>> +        * having tons of these pending, ensure that we only allow one of
>>>> +        * them pending and inflight at the time
>>>>          */
>>>> -       if (!nr_pages)
>>>> +       if (!nr_pages) {
>>>> +               if (test_bit(WB_zero_pages, &wb->state))
>>>> +                       return;
>>>> +               set_bit(WB_zero_pages, &wb->state);
>>>
>>> Shouldn't this be test_and_set? not the worst outcome if you have more
>>> than one pending work item, but still.
>>
>> If the frequency of these is high, and they were to trigger the bad
>> conditions we saw, then a split test + set is faster as it won't
>> keep re-dirtying the same cacheline from multiple locations. It's
>> better to leave it a little racy, but faster.
>>
> 
> Fare enough, but then better change the language of the commit message and
> comment above not to claim that there can be only one pending work item.

That's unchanged, the commit message should be fine. We clear the
bit when we start the work item, so we can have one in flight and
one pending.

But it does reference 'zero_pages', I'll update that.

-- 
Jens Axboe

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 1/6] buffer: cleanup free_more_memory() flusher wakeup
  2017-09-19 19:53   ` Jens Axboe
@ 2017-09-20 14:17     ` Jan Kara
  -1 siblings, 0 replies; 66+ messages in thread
From: Jan Kara @ 2017-09-20 14:17 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm, jack

On Tue 19-09-17 13:53:02, Jens Axboe wrote:
> This whole function is... interesting. Change the wakeup call
> to the flusher threads to pass in nr_pages == 0, instead of
> some random number of pages. This matches more closely what
> similar cases do for memory shortage/reclaim.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Ok, probably makes sense. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

BTW, after this nobody seems to use the number of pages for
wakeup_flusher_threads() so can you just delete the argument for the
function? After all system-wide wakeup is useful only for system wide
sync(2) or memory reclaim so number of pages isn't very useful...

								Honza

> ---
>  fs/buffer.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/buffer.c b/fs/buffer.c
> index 170df856bdb9..9471a445e370 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -260,7 +260,7 @@ static void free_more_memory(void)
>  	struct zoneref *z;
>  	int nid;
>  
> -	wakeup_flusher_threads(1024, WB_REASON_FREE_MORE_MEM);
> +	wakeup_flusher_threads(0, WB_REASON_FREE_MORE_MEM);
>  	yield();
>  
>  	for_each_online_node(nid) {
> -- 
> 2.7.4
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 1/6] buffer: cleanup free_more_memory() flusher wakeup
@ 2017-09-20 14:17     ` Jan Kara
  0 siblings, 0 replies; 66+ messages in thread
From: Jan Kara @ 2017-09-20 14:17 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm, jack

On Tue 19-09-17 13:53:02, Jens Axboe wrote:
> This whole function is... interesting. Change the wakeup call
> to the flusher threads to pass in nr_pages == 0, instead of
> some random number of pages. This matches more closely what
> similar cases do for memory shortage/reclaim.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Ok, probably makes sense. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

BTW, after this nobody seems to use the number of pages for
wakeup_flusher_threads() so can you just delete the argument for the
function? After all system-wide wakeup is useful only for system wide
sync(2) or memory reclaim so number of pages isn't very useful...

								Honza

> ---
>  fs/buffer.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/buffer.c b/fs/buffer.c
> index 170df856bdb9..9471a445e370 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -260,7 +260,7 @@ static void free_more_memory(void)
>  	struct zoneref *z;
>  	int nid;
>  
> -	wakeup_flusher_threads(1024, WB_REASON_FREE_MORE_MEM);
> +	wakeup_flusher_threads(0, WB_REASON_FREE_MORE_MEM);
>  	yield();
>  
>  	for_each_online_node(nid) {
> -- 
> 2.7.4
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/6] fs-writeback: provide a wakeup_flusher_threads_bdi()
  2017-09-19 19:53   ` Jens Axboe
@ 2017-09-20 14:19     ` Jan Kara
  -1 siblings, 0 replies; 66+ messages in thread
From: Jan Kara @ 2017-09-20 14:19 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm, jack

On Tue 19-09-17 13:53:03, Jens Axboe wrote:
> Similar to wakeup_flusher_threads(), except that we only wake
> up the flusher threads on the specified backing device.
> 
> No functional changes in this patch.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Looks good. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/fs-writeback.c         | 40 ++++++++++++++++++++++++++++++----------
>  include/linux/writeback.h |  2 ++
>  2 files changed, 32 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 245c430a2e41..03fda0830bf8 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -1947,6 +1947,34 @@ void wb_workfn(struct work_struct *work)
>  }
>  
>  /*
> + * Start writeback of `nr_pages' pages on this bdi. If `nr_pages' is zero,
> + * write back the whole world.
> + */
> +static void __wakeup_flusher_threads_bdi(struct backing_dev_info *bdi,
> +					 long nr_pages, enum wb_reason reason)
> +{
> +	struct bdi_writeback *wb;
> +
> +	if (!bdi_has_dirty_io(bdi))
> +		return;
> +
> +	list_for_each_entry_rcu(wb, &bdi->wb_list, bdi_node)
> +		wb_start_writeback(wb, wb_split_bdi_pages(wb, nr_pages),
> +					   false, reason);
> +}
> +
> +void wakeup_flusher_threads_bdi(struct backing_dev_info *bdi, long nr_pages,
> +				enum wb_reason reason)
> +{
> +	if (!nr_pages)
> +		nr_pages = get_nr_dirty_pages();
> +
> +	rcu_read_lock();
> +	__wakeup_flusher_threads_bdi(bdi, nr_pages, reason);
> +	rcu_read_unlock();
> +}
> +
> +/*
>   * Start writeback of `nr_pages' pages.  If `nr_pages' is zero, write back
>   * the whole world.
>   */
> @@ -1964,16 +1992,8 @@ void wakeup_flusher_threads(long nr_pages, enum wb_reason reason)
>  		nr_pages = get_nr_dirty_pages();
>  
>  	rcu_read_lock();
> -	list_for_each_entry_rcu(bdi, &bdi_list, bdi_list) {
> -		struct bdi_writeback *wb;
> -
> -		if (!bdi_has_dirty_io(bdi))
> -			continue;
> -
> -		list_for_each_entry_rcu(wb, &bdi->wb_list, bdi_node)
> -			wb_start_writeback(wb, wb_split_bdi_pages(wb, nr_pages),
> -					   false, reason);
> -	}
> +	list_for_each_entry_rcu(bdi, &bdi_list, bdi_list)
> +		__wakeup_flusher_threads_bdi(bdi, nr_pages, reason);
>  	rcu_read_unlock();
>  }
>  
> diff --git a/include/linux/writeback.h b/include/linux/writeback.h
> index d5815794416c..5a7ed74d1f6f 100644
> --- a/include/linux/writeback.h
> +++ b/include/linux/writeback.h
> @@ -190,6 +190,8 @@ bool try_to_writeback_inodes_sb_nr(struct super_block *, unsigned long nr,
>  				   enum wb_reason reason);
>  void sync_inodes_sb(struct super_block *);
>  void wakeup_flusher_threads(long nr_pages, enum wb_reason reason);
> +void wakeup_flusher_threads_bdi(struct backing_dev_info *bdi, long nr_pages,
> +				enum wb_reason reason);
>  void inode_wait_for_writeback(struct inode *inode);
>  
>  /* writeback.h requires fs.h; it, too, is not included from here. */
> -- 
> 2.7.4
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 2/6] fs-writeback: provide a wakeup_flusher_threads_bdi()
@ 2017-09-20 14:19     ` Jan Kara
  0 siblings, 0 replies; 66+ messages in thread
From: Jan Kara @ 2017-09-20 14:19 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm, jack

On Tue 19-09-17 13:53:03, Jens Axboe wrote:
> Similar to wakeup_flusher_threads(), except that we only wake
> up the flusher threads on the specified backing device.
> 
> No functional changes in this patch.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Looks good. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/fs-writeback.c         | 40 ++++++++++++++++++++++++++++++----------
>  include/linux/writeback.h |  2 ++
>  2 files changed, 32 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 245c430a2e41..03fda0830bf8 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -1947,6 +1947,34 @@ void wb_workfn(struct work_struct *work)
>  }
>  
>  /*
> + * Start writeback of `nr_pages' pages on this bdi. If `nr_pages' is zero,
> + * write back the whole world.
> + */
> +static void __wakeup_flusher_threads_bdi(struct backing_dev_info *bdi,
> +					 long nr_pages, enum wb_reason reason)
> +{
> +	struct bdi_writeback *wb;
> +
> +	if (!bdi_has_dirty_io(bdi))
> +		return;
> +
> +	list_for_each_entry_rcu(wb, &bdi->wb_list, bdi_node)
> +		wb_start_writeback(wb, wb_split_bdi_pages(wb, nr_pages),
> +					   false, reason);
> +}
> +
> +void wakeup_flusher_threads_bdi(struct backing_dev_info *bdi, long nr_pages,
> +				enum wb_reason reason)
> +{
> +	if (!nr_pages)
> +		nr_pages = get_nr_dirty_pages();
> +
> +	rcu_read_lock();
> +	__wakeup_flusher_threads_bdi(bdi, nr_pages, reason);
> +	rcu_read_unlock();
> +}
> +
> +/*
>   * Start writeback of `nr_pages' pages.  If `nr_pages' is zero, write back
>   * the whole world.
>   */
> @@ -1964,16 +1992,8 @@ void wakeup_flusher_threads(long nr_pages, enum wb_reason reason)
>  		nr_pages = get_nr_dirty_pages();
>  
>  	rcu_read_lock();
> -	list_for_each_entry_rcu(bdi, &bdi_list, bdi_list) {
> -		struct bdi_writeback *wb;
> -
> -		if (!bdi_has_dirty_io(bdi))
> -			continue;
> -
> -		list_for_each_entry_rcu(wb, &bdi->wb_list, bdi_node)
> -			wb_start_writeback(wb, wb_split_bdi_pages(wb, nr_pages),
> -					   false, reason);
> -	}
> +	list_for_each_entry_rcu(bdi, &bdi_list, bdi_list)
> +		__wakeup_flusher_threads_bdi(bdi, nr_pages, reason);
>  	rcu_read_unlock();
>  }
>  
> diff --git a/include/linux/writeback.h b/include/linux/writeback.h
> index d5815794416c..5a7ed74d1f6f 100644
> --- a/include/linux/writeback.h
> +++ b/include/linux/writeback.h
> @@ -190,6 +190,8 @@ bool try_to_writeback_inodes_sb_nr(struct super_block *, unsigned long nr,
>  				   enum wb_reason reason);
>  void sync_inodes_sb(struct super_block *);
>  void wakeup_flusher_threads(long nr_pages, enum wb_reason reason);
> +void wakeup_flusher_threads_bdi(struct backing_dev_info *bdi, long nr_pages,
> +				enum wb_reason reason);
>  void inode_wait_for_writeback(struct inode *inode);
>  
>  /* writeback.h requires fs.h; it, too, is not included from here. */
> -- 
> 2.7.4
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 3/6] page-writeback: pass in '0' for nr_pages writeback in laptop mode
  2017-09-19 19:53   ` Jens Axboe
@ 2017-09-20 14:35     ` Jan Kara
  -1 siblings, 0 replies; 66+ messages in thread
From: Jan Kara @ 2017-09-20 14:35 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm, jack

On Tue 19-09-17 13:53:04, Jens Axboe wrote:
> Laptop mode really wants to writeback the number of dirty
> pages and inodes. Instead of calculating this in the caller,
> just pass in 0 and let wakeup_flusher_threads() handle it.
> 
> Use the new wakeup_flusher_threads_bdi() instead of rolling
> our own.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
...
> -	rcu_read_lock();
> -	list_for_each_entry_rcu(wb, &q->backing_dev_info->wb_list, bdi_node)
> -		if (wb_has_dirty_io(wb))
> -			wb_start_writeback(wb, nr_pages, true,
> -					   WB_REASON_LAPTOP_TIMER);
> -	rcu_read_unlock();
> +	wakeup_flusher_threads_bdi(q->backing_dev_info, 0,
> +					WB_REASON_LAPTOP_TIMER);
>  }

So this slightly changes the semantics since previously we were doing
range_cyclic writeback and now we don't. I don't think this matters in
practice but please mention that in the changelog. With that you can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 3/6] page-writeback: pass in '0' for nr_pages writeback in laptop mode
@ 2017-09-20 14:35     ` Jan Kara
  0 siblings, 0 replies; 66+ messages in thread
From: Jan Kara @ 2017-09-20 14:35 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm, jack

On Tue 19-09-17 13:53:04, Jens Axboe wrote:
> Laptop mode really wants to writeback the number of dirty
> pages and inodes. Instead of calculating this in the caller,
> just pass in 0 and let wakeup_flusher_threads() handle it.
> 
> Use the new wakeup_flusher_threads_bdi() instead of rolling
> our own.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
...
> -	rcu_read_lock();
> -	list_for_each_entry_rcu(wb, &q->backing_dev_info->wb_list, bdi_node)
> -		if (wb_has_dirty_io(wb))
> -			wb_start_writeback(wb, nr_pages, true,
> -					   WB_REASON_LAPTOP_TIMER);
> -	rcu_read_unlock();
> +	wakeup_flusher_threads_bdi(q->backing_dev_info, 0,
> +					WB_REASON_LAPTOP_TIMER);
>  }

So this slightly changes the semantics since previously we were doing
range_cyclic writeback and now we don't. I don't think this matters in
practice but please mention that in the changelog. With that you can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 4/6] fs-writeback: make wb_start_writeback() static
  2017-09-19 19:53   ` Jens Axboe
@ 2017-09-20 14:35     ` Jan Kara
  -1 siblings, 0 replies; 66+ messages in thread
From: Jan Kara @ 2017-09-20 14:35 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm, jack

On Tue 19-09-17 13:53:05, Jens Axboe wrote:
> We don't have any callers outside of fs-writeback.c anymore,
> make it private.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/fs-writeback.c           | 4 ++--
>  include/linux/backing-dev.h | 2 --
>  2 files changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 03fda0830bf8..7564347914f8 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -933,8 +933,8 @@ static void bdi_split_work_to_wbs(struct backing_dev_info *bdi,
>  
>  #endif	/* CONFIG_CGROUP_WRITEBACK */
>  
> -void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
> -			bool range_cyclic, enum wb_reason reason)
> +static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
> +			       bool range_cyclic, enum wb_reason reason)
>  {
>  	struct wb_writeback_work *work;
>  
> diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
> index 854e1bdd0b2a..157e950a70dc 100644
> --- a/include/linux/backing-dev.h
> +++ b/include/linux/backing-dev.h
> @@ -38,8 +38,6 @@ static inline struct backing_dev_info *bdi_alloc(gfp_t gfp_mask)
>  	return bdi_alloc_node(gfp_mask, NUMA_NO_NODE);
>  }
>  
> -void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
> -			bool range_cyclic, enum wb_reason reason);
>  void wb_start_background_writeback(struct bdi_writeback *wb);
>  void wb_workfn(struct work_struct *work);
>  void wb_wakeup_delayed(struct bdi_writeback *wb);
> -- 
> 2.7.4
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 4/6] fs-writeback: make wb_start_writeback() static
@ 2017-09-20 14:35     ` Jan Kara
  0 siblings, 0 replies; 66+ messages in thread
From: Jan Kara @ 2017-09-20 14:35 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm, jack

On Tue 19-09-17 13:53:05, Jens Axboe wrote:
> We don't have any callers outside of fs-writeback.c anymore,
> make it private.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/fs-writeback.c           | 4 ++--
>  include/linux/backing-dev.h | 2 --
>  2 files changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 03fda0830bf8..7564347914f8 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -933,8 +933,8 @@ static void bdi_split_work_to_wbs(struct backing_dev_info *bdi,
>  
>  #endif	/* CONFIG_CGROUP_WRITEBACK */
>  
> -void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
> -			bool range_cyclic, enum wb_reason reason)
> +static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
> +			       bool range_cyclic, enum wb_reason reason)
>  {
>  	struct wb_writeback_work *work;
>  
> diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
> index 854e1bdd0b2a..157e950a70dc 100644
> --- a/include/linux/backing-dev.h
> +++ b/include/linux/backing-dev.h
> @@ -38,8 +38,6 @@ static inline struct backing_dev_info *bdi_alloc(gfp_t gfp_mask)
>  	return bdi_alloc_node(gfp_mask, NUMA_NO_NODE);
>  }
>  
> -void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
> -			bool range_cyclic, enum wb_reason reason);
>  void wb_start_background_writeback(struct bdi_writeback *wb);
>  void wb_workfn(struct work_struct *work);
>  void wb_wakeup_delayed(struct bdi_writeback *wb);
> -- 
> 2.7.4
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 5/6] fs-writeback: move nr_pages == 0 logic to one location
  2017-09-19 19:53   ` Jens Axboe
@ 2017-09-20 14:41     ` Jan Kara
  -1 siblings, 0 replies; 66+ messages in thread
From: Jan Kara @ 2017-09-20 14:41 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm, jack

On Tue 19-09-17 13:53:06, Jens Axboe wrote:
> Now that we have no external callers of wb_start_writeback(),
> we can move the nr_pages == 0 logic into that function.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

...

> +static unsigned long get_nr_dirty_pages(void)
> +{
> +	return global_node_page_state(NR_FILE_DIRTY) +
> +		global_node_page_state(NR_UNSTABLE_NFS) +
> +		get_nr_dirty_inodes();
> +}
> +
>  static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
>  			       bool range_cyclic, enum wb_reason reason)
>  {
> @@ -942,6 +953,12 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
>  		return;
>  
>  	/*
> +	 * If someone asked for zero pages, we write out the WORLD
> +	 */
> +	if (!nr_pages)
> +		nr_pages = get_nr_dirty_pages();
> +

So for 'wb' we have a better estimate of the amount we should write - use
wb_stat_sum(wb, WB_RECLAIMABLE) statistics - that is essentially dirty +
unstable_nfs broken down to bdi_writeback.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 5/6] fs-writeback: move nr_pages == 0 logic to one location
@ 2017-09-20 14:41     ` Jan Kara
  0 siblings, 0 replies; 66+ messages in thread
From: Jan Kara @ 2017-09-20 14:41 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm, jack

On Tue 19-09-17 13:53:06, Jens Axboe wrote:
> Now that we have no external callers of wb_start_writeback(),
> we can move the nr_pages == 0 logic into that function.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

...

> +static unsigned long get_nr_dirty_pages(void)
> +{
> +	return global_node_page_state(NR_FILE_DIRTY) +
> +		global_node_page_state(NR_UNSTABLE_NFS) +
> +		get_nr_dirty_inodes();
> +}
> +
>  static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
>  			       bool range_cyclic, enum wb_reason reason)
>  {
> @@ -942,6 +953,12 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
>  		return;
>  
>  	/*
> +	 * If someone asked for zero pages, we write out the WORLD
> +	 */
> +	if (!nr_pages)
> +		nr_pages = get_nr_dirty_pages();
> +

So for 'wb' we have a better estimate of the amount we should write - use
wb_stat_sum(wb, WB_RECLAIMABLE) statistics - that is essentially dirty +
unstable_nfs broken down to bdi_writeback.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 6/6] fs-writeback: only allow one inflight and pending !nr_pages flush
  2017-09-20  4:13       ` Jens Axboe
@ 2017-09-20 14:43         ` Jan Kara
  -1 siblings, 0 replies; 66+ messages in thread
From: Jan Kara @ 2017-09-20 14:43 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Amir Goldstein, linux-kernel, linux-fsdevel, linux-mm, hannes,
	Chris Mason, Jan Kara

On Tue 19-09-17 22:13:25, Jens Axboe wrote:
> On 09/19/2017 09:10 PM, Amir Goldstein wrote:
> New version:
> 
> http://git.kernel.dk/cgit/linux-block/commit/?h=writeback-fixup&id=338a69c217cdaaffda93f3cc9a364a347f782adb

This looks good to me. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

									Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 6/6] fs-writeback: only allow one inflight and pending !nr_pages flush
@ 2017-09-20 14:43         ` Jan Kara
  0 siblings, 0 replies; 66+ messages in thread
From: Jan Kara @ 2017-09-20 14:43 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Amir Goldstein, linux-kernel, linux-fsdevel, linux-mm, hannes,
	Chris Mason, Jan Kara

On Tue 19-09-17 22:13:25, Jens Axboe wrote:
> On 09/19/2017 09:10 PM, Amir Goldstein wrote:
> New version:
> 
> http://git.kernel.dk/cgit/linux-block/commit/?h=writeback-fixup&id=338a69c217cdaaffda93f3cc9a364a347f782adb

This looks good to me. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

									Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 5/6] fs-writeback: move nr_pages == 0 logic to one location
  2017-09-20 14:41     ` Jan Kara
@ 2017-09-20 15:05       ` Jens Axboe
  -1 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-20 15:05 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm

On 09/20/2017 08:41 AM, Jan Kara wrote:
> On Tue 19-09-17 13:53:06, Jens Axboe wrote:
>> Now that we have no external callers of wb_start_writeback(),
>> we can move the nr_pages == 0 logic into that function.
>>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> 
> ...
> 
>> +static unsigned long get_nr_dirty_pages(void)
>> +{
>> +	return global_node_page_state(NR_FILE_DIRTY) +
>> +		global_node_page_state(NR_UNSTABLE_NFS) +
>> +		get_nr_dirty_inodes();
>> +}
>> +
>>  static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
>>  			       bool range_cyclic, enum wb_reason reason)
>>  {
>> @@ -942,6 +953,12 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
>>  		return;
>>  
>>  	/*
>> +	 * If someone asked for zero pages, we write out the WORLD
>> +	 */
>> +	if (!nr_pages)
>> +		nr_pages = get_nr_dirty_pages();
>> +
> 
> So for 'wb' we have a better estimate of the amount we should write - use
> wb_stat_sum(wb, WB_RECLAIMABLE) statistics - that is essentially dirty +
> unstable_nfs broken down to bdi_writeback.

I don't mind making that change, but I think that should be a separate
patch. We're using get_nr_dirty_pages() in existing locations where
we have the 'wb', like in wb_check_old_data_flush().

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 5/6] fs-writeback: move nr_pages == 0 logic to one location
@ 2017-09-20 15:05       ` Jens Axboe
  0 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-20 15:05 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm

On 09/20/2017 08:41 AM, Jan Kara wrote:
> On Tue 19-09-17 13:53:06, Jens Axboe wrote:
>> Now that we have no external callers of wb_start_writeback(),
>> we can move the nr_pages == 0 logic into that function.
>>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> 
> ...
> 
>> +static unsigned long get_nr_dirty_pages(void)
>> +{
>> +	return global_node_page_state(NR_FILE_DIRTY) +
>> +		global_node_page_state(NR_UNSTABLE_NFS) +
>> +		get_nr_dirty_inodes();
>> +}
>> +
>>  static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
>>  			       bool range_cyclic, enum wb_reason reason)
>>  {
>> @@ -942,6 +953,12 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
>>  		return;
>>  
>>  	/*
>> +	 * If someone asked for zero pages, we write out the WORLD
>> +	 */
>> +	if (!nr_pages)
>> +		nr_pages = get_nr_dirty_pages();
>> +
> 
> So for 'wb' we have a better estimate of the amount we should write - use
> wb_stat_sum(wb, WB_RECLAIMABLE) statistics - that is essentially dirty +
> unstable_nfs broken down to bdi_writeback.

I don't mind making that change, but I think that should be a separate
patch. We're using get_nr_dirty_pages() in existing locations where
we have the 'wb', like in wb_check_old_data_flush().

-- 
Jens Axboe

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 1/6] buffer: cleanup free_more_memory() flusher wakeup
  2017-09-20 14:17     ` Jan Kara
@ 2017-09-20 15:18       ` Jens Axboe
  -1 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-20 15:18 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm

On 09/20/2017 08:17 AM, Jan Kara wrote:
> On Tue 19-09-17 13:53:02, Jens Axboe wrote:
>> This whole function is... interesting. Change the wakeup call
>> to the flusher threads to pass in nr_pages == 0, instead of
>> some random number of pages. This matches more closely what
>> similar cases do for memory shortage/reclaim.
>>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> 
> Ok, probably makes sense. You can add:
> 
> Reviewed-by: Jan Kara <jack@suse.cz>
> 
> BTW, after this nobody seems to use the number of pages for
> wakeup_flusher_threads() so can you just delete the argument for the
> function? After all system-wide wakeup is useful only for system wide
> sync(2) or memory reclaim so number of pages isn't very useful...

Great observation! That's true, and if we kill that, it enables
further cleanups down the line for patch 5 and 6. I have
incorporated that.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 1/6] buffer: cleanup free_more_memory() flusher wakeup
@ 2017-09-20 15:18       ` Jens Axboe
  0 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-20 15:18 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm

On 09/20/2017 08:17 AM, Jan Kara wrote:
> On Tue 19-09-17 13:53:02, Jens Axboe wrote:
>> This whole function is... interesting. Change the wakeup call
>> to the flusher threads to pass in nr_pages == 0, instead of
>> some random number of pages. This matches more closely what
>> similar cases do for memory shortage/reclaim.
>>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> 
> Ok, probably makes sense. You can add:
> 
> Reviewed-by: Jan Kara <jack@suse.cz>
> 
> BTW, after this nobody seems to use the number of pages for
> wakeup_flusher_threads() so can you just delete the argument for the
> function? After all system-wide wakeup is useful only for system wide
> sync(2) or memory reclaim so number of pages isn't very useful...

Great observation! That's true, and if we kill that, it enables
further cleanups down the line for patch 5 and 6. I have
incorporated that.

-- 
Jens Axboe

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 3/6] page-writeback: pass in '0' for nr_pages writeback in laptop mode
  2017-09-20 14:35     ` Jan Kara
@ 2017-09-20 15:19       ` Jens Axboe
  -1 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-20 15:19 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm

On 09/20/2017 08:35 AM, Jan Kara wrote:
> On Tue 19-09-17 13:53:04, Jens Axboe wrote:
>> Laptop mode really wants to writeback the number of dirty
>> pages and inodes. Instead of calculating this in the caller,
>> just pass in 0 and let wakeup_flusher_threads() handle it.
>>
>> Use the new wakeup_flusher_threads_bdi() instead of rolling
>> our own.
>>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ...
>> -	rcu_read_lock();
>> -	list_for_each_entry_rcu(wb, &q->backing_dev_info->wb_list, bdi_node)
>> -		if (wb_has_dirty_io(wb))
>> -			wb_start_writeback(wb, nr_pages, true,
>> -					   WB_REASON_LAPTOP_TIMER);
>> -	rcu_read_unlock();
>> +	wakeup_flusher_threads_bdi(q->backing_dev_info, 0,
>> +					WB_REASON_LAPTOP_TIMER);
>>  }
> 
> So this slightly changes the semantics since previously we were doing
> range_cyclic writeback and now we don't. I don't think this matters in
> practice but please mention that in the changelog. With that you can add:

Thanks, I added a note about that in the commit message.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 3/6] page-writeback: pass in '0' for nr_pages writeback in laptop mode
@ 2017-09-20 15:19       ` Jens Axboe
  0 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-20 15:19 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm

On 09/20/2017 08:35 AM, Jan Kara wrote:
> On Tue 19-09-17 13:53:04, Jens Axboe wrote:
>> Laptop mode really wants to writeback the number of dirty
>> pages and inodes. Instead of calculating this in the caller,
>> just pass in 0 and let wakeup_flusher_threads() handle it.
>>
>> Use the new wakeup_flusher_threads_bdi() instead of rolling
>> our own.
>>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ...
>> -	rcu_read_lock();
>> -	list_for_each_entry_rcu(wb, &q->backing_dev_info->wb_list, bdi_node)
>> -		if (wb_has_dirty_io(wb))
>> -			wb_start_writeback(wb, nr_pages, true,
>> -					   WB_REASON_LAPTOP_TIMER);
>> -	rcu_read_unlock();
>> +	wakeup_flusher_threads_bdi(q->backing_dev_info, 0,
>> +					WB_REASON_LAPTOP_TIMER);
>>  }
> 
> So this slightly changes the semantics since previously we were doing
> range_cyclic writeback and now we don't. I don't think this matters in
> practice but please mention that in the changelog. With that you can add:

Thanks, I added a note about that in the commit message.

-- 
Jens Axboe

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 5/6] fs-writeback: move nr_pages == 0 logic to one location
  2017-09-20 15:05       ` Jens Axboe
@ 2017-09-20 15:36         ` Jan Kara
  -1 siblings, 0 replies; 66+ messages in thread
From: Jan Kara @ 2017-09-20 15:36 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Jan Kara, linux-kernel, linux-fsdevel, linux-mm, hannes, clm

On Wed 20-09-17 09:05:51, Jens Axboe wrote:
> On 09/20/2017 08:41 AM, Jan Kara wrote:
> > On Tue 19-09-17 13:53:06, Jens Axboe wrote:
> >> Now that we have no external callers of wb_start_writeback(),
> >> we can move the nr_pages == 0 logic into that function.
> >>
> >> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> > 
> > ...
> > 
> >> +static unsigned long get_nr_dirty_pages(void)
> >> +{
> >> +	return global_node_page_state(NR_FILE_DIRTY) +
> >> +		global_node_page_state(NR_UNSTABLE_NFS) +
> >> +		get_nr_dirty_inodes();
> >> +}
> >> +
> >>  static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
> >>  			       bool range_cyclic, enum wb_reason reason)
> >>  {
> >> @@ -942,6 +953,12 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
> >>  		return;
> >>  
> >>  	/*
> >> +	 * If someone asked for zero pages, we write out the WORLD
> >> +	 */
> >> +	if (!nr_pages)
> >> +		nr_pages = get_nr_dirty_pages();
> >> +
> > 
> > So for 'wb' we have a better estimate of the amount we should write - use
> > wb_stat_sum(wb, WB_RECLAIMABLE) statistics - that is essentially dirty +
> > unstable_nfs broken down to bdi_writeback.
> 
> I don't mind making that change, but I think that should be a separate
> patch. We're using get_nr_dirty_pages() in existing locations where
> we have the 'wb', like in wb_check_old_data_flush().

Good point and fully agreed. So you can add:

Reviewed-by: Jan Kara <jack@suse.cz>

for this patch.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 5/6] fs-writeback: move nr_pages == 0 logic to one location
@ 2017-09-20 15:36         ` Jan Kara
  0 siblings, 0 replies; 66+ messages in thread
From: Jan Kara @ 2017-09-20 15:36 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Jan Kara, linux-kernel, linux-fsdevel, linux-mm, hannes, clm

On Wed 20-09-17 09:05:51, Jens Axboe wrote:
> On 09/20/2017 08:41 AM, Jan Kara wrote:
> > On Tue 19-09-17 13:53:06, Jens Axboe wrote:
> >> Now that we have no external callers of wb_start_writeback(),
> >> we can move the nr_pages == 0 logic into that function.
> >>
> >> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> > 
> > ...
> > 
> >> +static unsigned long get_nr_dirty_pages(void)
> >> +{
> >> +	return global_node_page_state(NR_FILE_DIRTY) +
> >> +		global_node_page_state(NR_UNSTABLE_NFS) +
> >> +		get_nr_dirty_inodes();
> >> +}
> >> +
> >>  static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
> >>  			       bool range_cyclic, enum wb_reason reason)
> >>  {
> >> @@ -942,6 +953,12 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
> >>  		return;
> >>  
> >>  	/*
> >> +	 * If someone asked for zero pages, we write out the WORLD
> >> +	 */
> >> +	if (!nr_pages)
> >> +		nr_pages = get_nr_dirty_pages();
> >> +
> > 
> > So for 'wb' we have a better estimate of the amount we should write - use
> > wb_stat_sum(wb, WB_RECLAIMABLE) statistics - that is essentially dirty +
> > unstable_nfs broken down to bdi_writeback.
> 
> I don't mind making that change, but I think that should be a separate
> patch. We're using get_nr_dirty_pages() in existing locations where
> we have the 'wb', like in wb_check_old_data_flush().

Good point and fully agreed. So you can add:

Reviewed-by: Jan Kara <jack@suse.cz>

for this patch.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 5/6] fs-writeback: move nr_pages == 0 logic to one location
  2017-09-20 15:36         ` Jan Kara
@ 2017-09-20 15:40           ` Jens Axboe
  -1 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-20 15:40 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm

On 09/20/2017 09:36 AM, Jan Kara wrote:
> On Wed 20-09-17 09:05:51, Jens Axboe wrote:
>> On 09/20/2017 08:41 AM, Jan Kara wrote:
>>> On Tue 19-09-17 13:53:06, Jens Axboe wrote:
>>>> Now that we have no external callers of wb_start_writeback(),
>>>> we can move the nr_pages == 0 logic into that function.
>>>>
>>>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>>>
>>> ...
>>>
>>>> +static unsigned long get_nr_dirty_pages(void)
>>>> +{
>>>> +	return global_node_page_state(NR_FILE_DIRTY) +
>>>> +		global_node_page_state(NR_UNSTABLE_NFS) +
>>>> +		get_nr_dirty_inodes();
>>>> +}
>>>> +
>>>>  static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
>>>>  			       bool range_cyclic, enum wb_reason reason)
>>>>  {
>>>> @@ -942,6 +953,12 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
>>>>  		return;
>>>>  
>>>>  	/*
>>>> +	 * If someone asked for zero pages, we write out the WORLD
>>>> +	 */
>>>> +	if (!nr_pages)
>>>> +		nr_pages = get_nr_dirty_pages();
>>>> +
>>>
>>> So for 'wb' we have a better estimate of the amount we should write - use
>>> wb_stat_sum(wb, WB_RECLAIMABLE) statistics - that is essentially dirty +
>>> unstable_nfs broken down to bdi_writeback.
>>
>> I don't mind making that change, but I think that should be a separate
>> patch. We're using get_nr_dirty_pages() in existing locations where
>> we have the 'wb', like in wb_check_old_data_flush().
> 
> Good point and fully agreed. So you can add:
> 
> Reviewed-by: Jan Kara <jack@suse.cz>

Thanks Jan, added. I just sent out the new version, mainly because the
removal or 'nr_pages' changes the later patches a bit. All for the
better, in my opinion.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 5/6] fs-writeback: move nr_pages == 0 logic to one location
@ 2017-09-20 15:40           ` Jens Axboe
  0 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-20 15:40 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm

On 09/20/2017 09:36 AM, Jan Kara wrote:
> On Wed 20-09-17 09:05:51, Jens Axboe wrote:
>> On 09/20/2017 08:41 AM, Jan Kara wrote:
>>> On Tue 19-09-17 13:53:06, Jens Axboe wrote:
>>>> Now that we have no external callers of wb_start_writeback(),
>>>> we can move the nr_pages == 0 logic into that function.
>>>>
>>>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>>>
>>> ...
>>>
>>>> +static unsigned long get_nr_dirty_pages(void)
>>>> +{
>>>> +	return global_node_page_state(NR_FILE_DIRTY) +
>>>> +		global_node_page_state(NR_UNSTABLE_NFS) +
>>>> +		get_nr_dirty_inodes();
>>>> +}
>>>> +
>>>>  static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
>>>>  			       bool range_cyclic, enum wb_reason reason)
>>>>  {
>>>> @@ -942,6 +953,12 @@ static void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
>>>>  		return;
>>>>  
>>>>  	/*
>>>> +	 * If someone asked for zero pages, we write out the WORLD
>>>> +	 */
>>>> +	if (!nr_pages)
>>>> +		nr_pages = get_nr_dirty_pages();
>>>> +
>>>
>>> So for 'wb' we have a better estimate of the amount we should write - use
>>> wb_stat_sum(wb, WB_RECLAIMABLE) statistics - that is essentially dirty +
>>> unstable_nfs broken down to bdi_writeback.
>>
>> I don't mind making that change, but I think that should be a separate
>> patch. We're using get_nr_dirty_pages() in existing locations where
>> we have the 'wb', like in wb_check_old_data_flush().
> 
> Good point and fully agreed. So you can add:
> 
> Reviewed-by: Jan Kara <jack@suse.cz>

Thanks Jan, added. I just sent out the new version, mainly because the
removal or 'nr_pages' changes the later patches a bit. All for the
better, in my opinion.

-- 
Jens Axboe

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 0/6] More graceful flusher thread memory reclaim wakeup
  2017-09-19 19:53 ` Jens Axboe
@ 2017-09-20 19:29   ` John Stoffel
  -1 siblings, 0 replies; 66+ messages in thread
From: John Stoffel @ 2017-09-20 19:29 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm, jack

On Tue, Sep 19, 2017 at 01:53:01PM -0600, Jens Axboe wrote:
> We've had some issues with writeback in presence of memory reclaim
> at Facebook, and this patch set attempts to fix it up. The real
> functional change is the last patch in the series, the first 5 are
> prep and cleanup patches.
> 
> The basic idea is that we have callers that call
> wakeup_flusher_threads() with nr_pages == 0. This means 'writeback
> everything'. For memory reclaim situations, we can end up queuing
> a TON of these kinds of writeback units. This can cause softlockups
> and further memory issues, since we allocate huge amounts of
> struct wb_writeback_work to handle this writeback. Handle this
> situation more gracefully.

This looks nice, but do you have any numbers to show how this improves
things?  I read the patches, but I'm not strong enough to comment on
them at all.  But I am interested in how this improves writeback under
pressure, if at all.

John

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 0/6] More graceful flusher thread memory reclaim wakeup
@ 2017-09-20 19:29   ` John Stoffel
  0 siblings, 0 replies; 66+ messages in thread
From: John Stoffel @ 2017-09-20 19:29 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm, jack

On Tue, Sep 19, 2017 at 01:53:01PM -0600, Jens Axboe wrote:
> We've had some issues with writeback in presence of memory reclaim
> at Facebook, and this patch set attempts to fix it up. The real
> functional change is the last patch in the series, the first 5 are
> prep and cleanup patches.
> 
> The basic idea is that we have callers that call
> wakeup_flusher_threads() with nr_pages == 0. This means 'writeback
> everything'. For memory reclaim situations, we can end up queuing
> a TON of these kinds of writeback units. This can cause softlockups
> and further memory issues, since we allocate huge amounts of
> struct wb_writeback_work to handle this writeback. Handle this
> situation more gracefully.

This looks nice, but do you have any numbers to show how this improves
things?  I read the patches, but I'm not strong enough to comment on
them at all.  But I am interested in how this improves writeback under
pressure, if at all.

John

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 0/6] More graceful flusher thread memory reclaim wakeup
  2017-09-20 19:29   ` John Stoffel
@ 2017-09-20 19:32     ` Jens Axboe
  -1 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-20 19:32 UTC (permalink / raw)
  To: John Stoffel; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm, jack

On 09/20/2017 01:29 PM, John Stoffel wrote:
> On Tue, Sep 19, 2017 at 01:53:01PM -0600, Jens Axboe wrote:
>> We've had some issues with writeback in presence of memory reclaim
>> at Facebook, and this patch set attempts to fix it up. The real
>> functional change is the last patch in the series, the first 5 are
>> prep and cleanup patches.
>>
>> The basic idea is that we have callers that call
>> wakeup_flusher_threads() with nr_pages == 0. This means 'writeback
>> everything'. For memory reclaim situations, we can end up queuing
>> a TON of these kinds of writeback units. This can cause softlockups
>> and further memory issues, since we allocate huge amounts of
>> struct wb_writeback_work to handle this writeback. Handle this
>> situation more gracefully.
> 
> This looks nice, but do you have any numbers to show how this improves
> things?  I read the patches, but I'm not strong enough to comment on
> them at all.  But I am interested in how this improves writeback under
> pressure, if at all.

Writeback should be about the same, it's mostly about preventing
softlockups and excessive memory usage, under conditions where we are
actively trying to reclaim/clean memory. It was bad enough to cause
softlockups for writeback work processing, while the pending writeback
work units grew to insane lengths.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 0/6] More graceful flusher thread memory reclaim wakeup
@ 2017-09-20 19:32     ` Jens Axboe
  0 siblings, 0 replies; 66+ messages in thread
From: Jens Axboe @ 2017-09-20 19:32 UTC (permalink / raw)
  To: John Stoffel; +Cc: linux-kernel, linux-fsdevel, linux-mm, hannes, clm, jack

On 09/20/2017 01:29 PM, John Stoffel wrote:
> On Tue, Sep 19, 2017 at 01:53:01PM -0600, Jens Axboe wrote:
>> We've had some issues with writeback in presence of memory reclaim
>> at Facebook, and this patch set attempts to fix it up. The real
>> functional change is the last patch in the series, the first 5 are
>> prep and cleanup patches.
>>
>> The basic idea is that we have callers that call
>> wakeup_flusher_threads() with nr_pages == 0. This means 'writeback
>> everything'. For memory reclaim situations, we can end up queuing
>> a TON of these kinds of writeback units. This can cause softlockups
>> and further memory issues, since we allocate huge amounts of
>> struct wb_writeback_work to handle this writeback. Handle this
>> situation more gracefully.
> 
> This looks nice, but do you have any numbers to show how this improves
> things?  I read the patches, but I'm not strong enough to comment on
> them at all.  But I am interested in how this improves writeback under
> pressure, if at all.

Writeback should be about the same, it's mostly about preventing
softlockups and excessive memory usage, under conditions where we are
actively trying to reclaim/clean memory. It was bad enough to cause
softlockups for writeback work processing, while the pending writeback
work units grew to insane lengths.

-- 
Jens Axboe

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 0/6] More graceful flusher thread memory reclaim wakeup
  2017-09-20 19:32     ` Jens Axboe
@ 2017-09-20 23:11       ` Johannes Weiner
  -1 siblings, 0 replies; 66+ messages in thread
From: Johannes Weiner @ 2017-09-20 23:11 UTC (permalink / raw)
  To: Jens Axboe; +Cc: John Stoffel, linux-kernel, linux-fsdevel, linux-mm, clm, jack

[ Fixed up CC list. John, you're sending email with
  From: John Stoffel <john@quad.stoffel.home> ]

On Wed, Sep 20, 2017 at 01:32:25PM -0600, Jens Axboe wrote:
> On 09/20/2017 01:29 PM, John Stoffel wrote:
> > On Tue, Sep 19, 2017 at 01:53:01PM -0600, Jens Axboe wrote:
> >> We've had some issues with writeback in presence of memory reclaim
> >> at Facebook, and this patch set attempts to fix it up. The real
> >> functional change is the last patch in the series, the first 5 are
> >> prep and cleanup patches.
> >>
> >> The basic idea is that we have callers that call
> >> wakeup_flusher_threads() with nr_pages == 0. This means 'writeback
> >> everything'. For memory reclaim situations, we can end up queuing
> >> a TON of these kinds of writeback units. This can cause softlockups
> >> and further memory issues, since we allocate huge amounts of
> >> struct wb_writeback_work to handle this writeback. Handle this
> >> situation more gracefully.
> > 
> > This looks nice, but do you have any numbers to show how this improves
> > things?  I read the patches, but I'm not strong enough to comment on
> > them at all.  But I am interested in how this improves writeback under
> > pressure, if at all.
> 
> Writeback should be about the same, it's mostly about preventing
> softlockups and excessive memory usage, under conditions where we are
> actively trying to reclaim/clean memory. It was bad enough to cause
> softlockups for writeback work processing, while the pending writeback
> work units grew to insane lengths.

In numbers, we have seen situations where we had 600 million writeback
work items queued up from reclaim under pressure. That's 35G worth of
work descriptors, and the machine was struggling to remain responsive
due to a lack of memory.

Once writeback against all outstanding dirty pages has been requested,
there really isn't a need to queue even a second work item; the job is
already being performed. We can queue the next one when it completes.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 0/6] More graceful flusher thread memory reclaim wakeup
@ 2017-09-20 23:11       ` Johannes Weiner
  0 siblings, 0 replies; 66+ messages in thread
From: Johannes Weiner @ 2017-09-20 23:11 UTC (permalink / raw)
  To: Jens Axboe; +Cc: John Stoffel, linux-kernel, linux-fsdevel, linux-mm, clm, jack

[ Fixed up CC list. John, you're sending email with
  From: John Stoffel <john@quad.stoffel.home> ]

On Wed, Sep 20, 2017 at 01:32:25PM -0600, Jens Axboe wrote:
> On 09/20/2017 01:29 PM, John Stoffel wrote:
> > On Tue, Sep 19, 2017 at 01:53:01PM -0600, Jens Axboe wrote:
> >> We've had some issues with writeback in presence of memory reclaim
> >> at Facebook, and this patch set attempts to fix it up. The real
> >> functional change is the last patch in the series, the first 5 are
> >> prep and cleanup patches.
> >>
> >> The basic idea is that we have callers that call
> >> wakeup_flusher_threads() with nr_pages == 0. This means 'writeback
> >> everything'. For memory reclaim situations, we can end up queuing
> >> a TON of these kinds of writeback units. This can cause softlockups
> >> and further memory issues, since we allocate huge amounts of
> >> struct wb_writeback_work to handle this writeback. Handle this
> >> situation more gracefully.
> > 
> > This looks nice, but do you have any numbers to show how this improves
> > things?  I read the patches, but I'm not strong enough to comment on
> > them at all.  But I am interested in how this improves writeback under
> > pressure, if at all.
> 
> Writeback should be about the same, it's mostly about preventing
> softlockups and excessive memory usage, under conditions where we are
> actively trying to reclaim/clean memory. It was bad enough to cause
> softlockups for writeback work processing, while the pending writeback
> work units grew to insane lengths.

In numbers, we have seen situations where we had 600 million writeback
work items queued up from reclaim under pressure. That's 35G worth of
work descriptors, and the machine was struggling to remain responsive
due to a lack of memory.

Once writeback against all outstanding dirty pages has been requested,
there really isn't a need to queue even a second work item; the job is
already being performed. We can queue the next one when it completes.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 66+ messages in thread

end of thread, other threads:[~2017-09-20 23:11 UTC | newest]

Thread overview: 66+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-19 19:53 [PATCH 0/6] More graceful flusher thread memory reclaim wakeup Jens Axboe
2017-09-19 19:53 ` Jens Axboe
2017-09-19 19:53 ` [PATCH 1/6] buffer: cleanup free_more_memory() flusher wakeup Jens Axboe
2017-09-19 19:53   ` Jens Axboe
2017-09-19 20:05   ` Johannes Weiner
2017-09-19 20:05     ` Johannes Weiner
2017-09-20 14:17   ` Jan Kara
2017-09-20 14:17     ` Jan Kara
2017-09-20 15:18     ` Jens Axboe
2017-09-20 15:18       ` Jens Axboe
2017-09-19 19:53 ` [PATCH 2/6] fs-writeback: provide a wakeup_flusher_threads_bdi() Jens Axboe
2017-09-19 19:53   ` Jens Axboe
2017-09-19 20:05   ` Johannes Weiner
2017-09-19 20:05     ` Johannes Weiner
2017-09-20 14:19   ` Jan Kara
2017-09-20 14:19     ` Jan Kara
2017-09-19 19:53 ` [PATCH 3/6] page-writeback: pass in '0' for nr_pages writeback in laptop mode Jens Axboe
2017-09-19 19:53   ` Jens Axboe
2017-09-19 20:06   ` Johannes Weiner
2017-09-19 20:06     ` Johannes Weiner
2017-09-20 14:35   ` Jan Kara
2017-09-20 14:35     ` Jan Kara
2017-09-20 15:19     ` Jens Axboe
2017-09-20 15:19       ` Jens Axboe
2017-09-19 19:53 ` [PATCH 4/6] fs-writeback: make wb_start_writeback() static Jens Axboe
2017-09-19 19:53   ` Jens Axboe
2017-09-19 20:07   ` Johannes Weiner
2017-09-19 20:07     ` Johannes Weiner
2017-09-20 14:35   ` Jan Kara
2017-09-20 14:35     ` Jan Kara
2017-09-19 19:53 ` [PATCH 5/6] fs-writeback: move nr_pages == 0 logic to one location Jens Axboe
2017-09-19 19:53   ` Jens Axboe
2017-09-19 20:07   ` Johannes Weiner
2017-09-19 20:07     ` Johannes Weiner
2017-09-20 14:41   ` Jan Kara
2017-09-20 14:41     ` Jan Kara
2017-09-20 15:05     ` Jens Axboe
2017-09-20 15:05       ` Jens Axboe
2017-09-20 15:36       ` Jan Kara
2017-09-20 15:36         ` Jan Kara
2017-09-20 15:40         ` Jens Axboe
2017-09-20 15:40           ` Jens Axboe
2017-09-19 19:53 ` [PATCH 6/6] fs-writeback: only allow one inflight and pending !nr_pages flush Jens Axboe
2017-09-19 19:53   ` Jens Axboe
2017-09-19 20:18   ` Johannes Weiner
2017-09-19 20:18     ` Johannes Weiner
2017-09-19 20:39     ` Jens Axboe
2017-09-19 20:39       ` Jens Axboe
2017-09-20  1:57   ` Jens Axboe
2017-09-20  1:57     ` Jens Axboe
2017-09-20  3:10   ` Amir Goldstein
2017-09-20  3:10     ` Amir Goldstein
2017-09-20  4:13     ` Jens Axboe
2017-09-20  4:13       ` Jens Axboe
2017-09-20  6:05       ` Amir Goldstein
2017-09-20  6:05         ` Amir Goldstein
2017-09-20 12:35         ` Jens Axboe
2017-09-20 12:35           ` Jens Axboe
2017-09-20 14:43       ` Jan Kara
2017-09-20 14:43         ` Jan Kara
2017-09-20 19:29 ` [PATCH 0/6] More graceful flusher thread memory reclaim wakeup John Stoffel
2017-09-20 19:29   ` John Stoffel
2017-09-20 19:32   ` Jens Axboe
2017-09-20 19:32     ` Jens Axboe
2017-09-20 23:11     ` Johannes Weiner
2017-09-20 23:11       ` Johannes Weiner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.