All of lore.kernel.org
 help / color / mirror / Atom feed
* Convert write_cache_pages() to an iterator v2
@ 2023-12-14 13:25 Christoph Hellwig
  2023-12-14 13:25 ` [PATCH 01/11] writeback: Factor out writeback_finish() Christoph Hellwig
                   ` (10 more replies)
  0 siblings, 11 replies; 20+ messages in thread
From: Christoph Hellwig @ 2023-12-14 13:25 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-fsdevel, linux-kernel, Jan Kara, David Howells

Hi all,

this is basically a report of willy's series of the same name for June.
I picked the lastest version from his (now apparently defunct) git tree,
rebased it to mainline (no coflict, neither for linux-next), reordered
the new fields in struct writeback_control to document what is interface
vs internal, and temporarily dropped the iomap patch due to a conflict
in the VFS tree.

willy: let me know if me reposting it like this is fine, or if you
want me to stop.  I'd just really like to see it merged :)
Note that patch 4 is missing your signoff, so we'd need that before
proceeding anyway.

The original cover letter is below:

Dave Howells doesn't like the indirect function call imposed by
write_cache_pages(), so refactor it into an iterator.  I took the
opportunity to add the ability to iterate a folio_batch without having
an external variable.

This is against next-20230623.  If you try to apply it on top of a tree
which doesn't include the pagevec removal series, IT WILL CRASH because
it won't reinitialise folio_batch->i and the iteration will index out
of bounds.

I have a feeling the 'done' parameter could have a better name, but I
can't think what it might be.

Diffstat:
 include/linux/pagevec.h   |   18 ++
 include/linux/writeback.h |   24 +++
 mm/page-writeback.c       |  313 +++++++++++++++++++++++++---------------------
 3 files changed, 211 insertions(+), 144 deletions(-)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 01/11] writeback: Factor out writeback_finish()
  2023-12-14 13:25 Convert write_cache_pages() to an iterator v2 Christoph Hellwig
@ 2023-12-14 13:25 ` Christoph Hellwig
  2023-12-15 13:26   ` Jan Kara
  2023-12-14 13:25 ` [PATCH 02/11] writeback: Factor writeback_get_batch() out of write_cache_pages() Christoph Hellwig
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2023-12-14 13:25 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-fsdevel, linux-kernel, Jan Kara, David Howells

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Instead of having a 'done' variable that controls the nested loops,
have a writeback_finish() that can be returned directly.  This involves
keeping more things in writeback_control, but it's just moving stuff
allocated on the stack to being allocated slightly earlier on the stack.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
[hch: reorder and comment struct writeback_control]
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 include/linux/writeback.h |  8 +++++
 mm/page-writeback.c       | 72 +++++++++++++++++++++------------------
 2 files changed, 46 insertions(+), 34 deletions(-)

diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index 083387c00f0c8b..05e8add4b5ae3c 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -11,6 +11,7 @@
 #include <linux/flex_proportions.h>
 #include <linux/backing-dev-defs.h>
 #include <linux/blk_types.h>
+#include <linux/pagevec.h>
 
 struct bio;
 
@@ -40,6 +41,7 @@ enum writeback_sync_modes {
  * in a manner such that unspecified fields are set to zero.
  */
 struct writeback_control {
+	/* public fields that can be set and/or consumed by the caller: */
 	long nr_to_write;		/* Write this many pages, and decrement
 					   this for each page written */
 	long pages_skipped;		/* Pages which were not written */
@@ -77,6 +79,12 @@ struct writeback_control {
 	 */
 	struct swap_iocb **swap_plug;
 
+	/* internal fields used by the ->writepages implementation: */
+	struct folio_batch fbatch;
+	pgoff_t done_index;
+	int err;
+	unsigned range_whole:1;		/* entire file */
+
 #ifdef CONFIG_CGROUP_WRITEBACK
 	struct bdi_writeback *wb;	/* wb this writeback is issued under */
 	struct inode *inode;		/* inode being written out */
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index ee2fd6a6af4072..45309f3b8193f8 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2360,6 +2360,24 @@ void tag_pages_for_writeback(struct address_space *mapping,
 }
 EXPORT_SYMBOL(tag_pages_for_writeback);
 
+static int writeback_finish(struct address_space *mapping,
+		struct writeback_control *wbc, bool done)
+{
+	folio_batch_release(&wbc->fbatch);
+
+	/*
+	 * If we hit the last page and there is more work to be done:
+	 * wrap the index back to the start of the file for the next
+	 * time we are called.
+	 */
+	if (wbc->range_cyclic && !done)
+		wbc->done_index = 0;
+	if (wbc->range_cyclic || (wbc->range_whole && wbc->nr_to_write > 0))
+		mapping->writeback_index = wbc->done_index;
+
+	return wbc->err;
+}
+
 /**
  * write_cache_pages - walk the list of dirty pages of the given address space and write all of them.
  * @mapping: address space structure to write
@@ -2395,18 +2413,12 @@ int write_cache_pages(struct address_space *mapping,
 		      struct writeback_control *wbc, writepage_t writepage,
 		      void *data)
 {
-	int ret = 0;
-	int done = 0;
 	int error;
-	struct folio_batch fbatch;
 	int nr_folios;
 	pgoff_t index;
 	pgoff_t end;		/* Inclusive */
-	pgoff_t done_index;
-	int range_whole = 0;
 	xa_mark_t tag;
 
-	folio_batch_init(&fbatch);
 	if (wbc->range_cyclic) {
 		index = mapping->writeback_index; /* prev offset */
 		end = -1;
@@ -2414,7 +2426,7 @@ int write_cache_pages(struct address_space *mapping,
 		index = wbc->range_start >> PAGE_SHIFT;
 		end = wbc->range_end >> PAGE_SHIFT;
 		if (wbc->range_start == 0 && wbc->range_end == LLONG_MAX)
-			range_whole = 1;
+			wbc->range_whole = 1;
 	}
 	if (wbc->sync_mode == WB_SYNC_ALL || wbc->tagged_writepages) {
 		tag_pages_for_writeback(mapping, index, end);
@@ -2422,21 +2434,25 @@ int write_cache_pages(struct address_space *mapping,
 	} else {
 		tag = PAGECACHE_TAG_DIRTY;
 	}
-	done_index = index;
-	while (!done && (index <= end)) {
+
+	wbc->done_index = index;
+	folio_batch_init(&wbc->fbatch);
+	wbc->err = 0;
+
+	while (index <= end) {
 		int i;
 
 		nr_folios = filemap_get_folios_tag(mapping, &index, end,
-				tag, &fbatch);
+				tag, &wbc->fbatch);
 
 		if (nr_folios == 0)
 			break;
 
 		for (i = 0; i < nr_folios; i++) {
-			struct folio *folio = fbatch.folios[i];
+			struct folio *folio = wbc->fbatch.folios[i];
 			unsigned long nr;
 
-			done_index = folio->index;
+			wbc->done_index = folio->index;
 
 			folio_lock(folio);
 
@@ -2490,13 +2506,13 @@ int write_cache_pages(struct address_space *mapping,
 					folio_unlock(folio);
 					error = 0;
 				} else if (wbc->sync_mode != WB_SYNC_ALL) {
-					ret = error;
-					done_index = folio->index + nr;
-					done = 1;
-					break;
+					wbc->err = error;
+					wbc->done_index = folio->index + nr;
+					return writeback_finish(mapping,
+							wbc, true);
 				}
-				if (!ret)
-					ret = error;
+				if (!wbc->err)
+					wbc->err = error;
 			}
 
 			/*
@@ -2507,26 +2523,14 @@ int write_cache_pages(struct address_space *mapping,
 			 */
 			wbc->nr_to_write -= nr;
 			if (wbc->nr_to_write <= 0 &&
-			    wbc->sync_mode == WB_SYNC_NONE) {
-				done = 1;
-				break;
-			}
+			    wbc->sync_mode == WB_SYNC_NONE)
+				return writeback_finish(mapping, wbc, true);
 		}
-		folio_batch_release(&fbatch);
+		folio_batch_release(&wbc->fbatch);
 		cond_resched();
 	}
 
-	/*
-	 * If we hit the last page and there is more work to be done: wrap
-	 * back the index back to the start of the file for the next
-	 * time we are called.
-	 */
-	if (wbc->range_cyclic && !done)
-		done_index = 0;
-	if (wbc->range_cyclic || (range_whole && wbc->nr_to_write > 0))
-		mapping->writeback_index = done_index;
-
-	return ret;
+	return writeback_finish(mapping, wbc, false);
 }
 EXPORT_SYMBOL(write_cache_pages);
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 02/11] writeback: Factor writeback_get_batch() out of write_cache_pages()
  2023-12-14 13:25 Convert write_cache_pages() to an iterator v2 Christoph Hellwig
  2023-12-14 13:25 ` [PATCH 01/11] writeback: Factor out writeback_finish() Christoph Hellwig
@ 2023-12-14 13:25 ` Christoph Hellwig
  2023-12-15 14:14   ` Jan Kara
  2023-12-14 13:25 ` [PATCH 03/11] writeback: Factor should_writeback_folio() " Christoph Hellwig
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2023-12-14 13:25 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-fsdevel, linux-kernel, Jan Kara, David Howells

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

This simple helper will be the basis of the writeback iterator.
To make this work, we need to remember the current index
and end positions in writeback_control.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 include/linux/writeback.h |  2 ++
 mm/page-writeback.c       | 49 +++++++++++++++++++++------------------
 2 files changed, 29 insertions(+), 22 deletions(-)

diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index 05e8add4b5ae3c..be960f92ad9dbd 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -81,6 +81,8 @@ struct writeback_control {
 
 	/* internal fields used by the ->writepages implementation: */
 	struct folio_batch fbatch;
+	pgoff_t index;
+	pgoff_t end;			/* inclusive */
 	pgoff_t done_index;
 	int err;
 	unsigned range_whole:1;		/* entire file */
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 45309f3b8193f8..5d33e7b468e2cc 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2378,6 +2378,22 @@ static int writeback_finish(struct address_space *mapping,
 	return wbc->err;
 }
 
+static void writeback_get_batch(struct address_space *mapping,
+		struct writeback_control *wbc)
+{
+	xa_mark_t tag;
+
+	if (wbc->sync_mode == WB_SYNC_ALL || wbc->tagged_writepages)
+		tag = PAGECACHE_TAG_TOWRITE;
+	else
+		tag = PAGECACHE_TAG_DIRTY;
+
+	folio_batch_release(&wbc->fbatch);
+	cond_resched();
+	filemap_get_folios_tag(mapping, &wbc->index, wbc->end, tag,
+			&wbc->fbatch);
+}
+
 /**
  * write_cache_pages - walk the list of dirty pages of the given address space and write all of them.
  * @mapping: address space structure to write
@@ -2414,41 +2430,32 @@ int write_cache_pages(struct address_space *mapping,
 		      void *data)
 {
 	int error;
-	int nr_folios;
-	pgoff_t index;
-	pgoff_t end;		/* Inclusive */
-	xa_mark_t tag;
 
 	if (wbc->range_cyclic) {
-		index = mapping->writeback_index; /* prev offset */
-		end = -1;
+		wbc->index = mapping->writeback_index; /* prev offset */
+		wbc->end = -1;
 	} else {
-		index = wbc->range_start >> PAGE_SHIFT;
-		end = wbc->range_end >> PAGE_SHIFT;
+		wbc->index = wbc->range_start >> PAGE_SHIFT;
+		wbc->end = wbc->range_end >> PAGE_SHIFT;
 		if (wbc->range_start == 0 && wbc->range_end == LLONG_MAX)
 			wbc->range_whole = 1;
 	}
-	if (wbc->sync_mode == WB_SYNC_ALL || wbc->tagged_writepages) {
-		tag_pages_for_writeback(mapping, index, end);
-		tag = PAGECACHE_TAG_TOWRITE;
-	} else {
-		tag = PAGECACHE_TAG_DIRTY;
-	}
+	if (wbc->sync_mode == WB_SYNC_ALL || wbc->tagged_writepages)
+		tag_pages_for_writeback(mapping, wbc->index, wbc->end);
 
-	wbc->done_index = index;
+	wbc->done_index = wbc->index;
 	folio_batch_init(&wbc->fbatch);
 	wbc->err = 0;
 
-	while (index <= end) {
+	while (wbc->index <= wbc->end) {
 		int i;
 
-		nr_folios = filemap_get_folios_tag(mapping, &index, end,
-				tag, &wbc->fbatch);
+		writeback_get_batch(mapping, wbc);
 
-		if (nr_folios == 0)
+		if (wbc->fbatch.nr == 0)
 			break;
 
-		for (i = 0; i < nr_folios; i++) {
+		for (i = 0; i < wbc->fbatch.nr; i++) {
 			struct folio *folio = wbc->fbatch.folios[i];
 			unsigned long nr;
 
@@ -2526,8 +2533,6 @@ int write_cache_pages(struct address_space *mapping,
 			    wbc->sync_mode == WB_SYNC_NONE)
 				return writeback_finish(mapping, wbc, true);
 		}
-		folio_batch_release(&wbc->fbatch);
-		cond_resched();
 	}
 
 	return writeback_finish(mapping, wbc, false);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 03/11] writeback: Factor should_writeback_folio() out of write_cache_pages()
  2023-12-14 13:25 Convert write_cache_pages() to an iterator v2 Christoph Hellwig
  2023-12-14 13:25 ` [PATCH 01/11] writeback: Factor out writeback_finish() Christoph Hellwig
  2023-12-14 13:25 ` [PATCH 02/11] writeback: Factor writeback_get_batch() out of write_cache_pages() Christoph Hellwig
@ 2023-12-14 13:25 ` Christoph Hellwig
  2023-12-15 14:16   ` Jan Kara
  2023-12-14 13:25 ` [PATCH 04/11] writeback: Simplify the loops in write_cache_pages() Christoph Hellwig
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2023-12-14 13:25 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-fsdevel, linux-kernel, Jan Kara, David Howells

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Reduce write_cache_pages() by about 30 lines; much of it is commentary,
but it all bundles nicely into an obvious function.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 mm/page-writeback.c | 59 ++++++++++++++++++++++++---------------------
 1 file changed, 32 insertions(+), 27 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 5d33e7b468e2cc..5a3df8665ff4f9 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2394,6 +2394,36 @@ static void writeback_get_batch(struct address_space *mapping,
 			&wbc->fbatch);
 }
 
+static bool should_writeback_folio(struct address_space *mapping,
+		struct writeback_control *wbc, struct folio *folio)
+{
+	/*
+	 * Folio truncated or invalidated. We can freely skip it then,
+	 * even for data integrity operations: the folio has disappeared
+	 * concurrently, so there could be no real expectation of this
+	 * data integrity operation even if there is now a new, dirty
+	 * folio at the same pagecache index.
+	 */
+	if (unlikely(folio->mapping != mapping))
+		return false;
+
+	/* Did somebody write it for us? */
+	if (!folio_test_dirty(folio))
+		return false;
+
+	if (folio_test_writeback(folio)) {
+		if (wbc->sync_mode == WB_SYNC_NONE)
+			return false;
+		folio_wait_writeback(folio);
+	}
+
+	BUG_ON(folio_test_writeback(folio));
+	if (!folio_clear_dirty_for_io(folio))
+		return false;
+
+	return true;
+}
+
 /**
  * write_cache_pages - walk the list of dirty pages of the given address space and write all of them.
  * @mapping: address space structure to write
@@ -2462,38 +2492,13 @@ int write_cache_pages(struct address_space *mapping,
 			wbc->done_index = folio->index;
 
 			folio_lock(folio);
-
-			/*
-			 * Page truncated or invalidated. We can freely skip it
-			 * then, even for data integrity operations: the page
-			 * has disappeared concurrently, so there could be no
-			 * real expectation of this data integrity operation
-			 * even if there is now a new, dirty page at the same
-			 * pagecache address.
-			 */
-			if (unlikely(folio->mapping != mapping)) {
-continue_unlock:
+			if (!should_writeback_folio(mapping, wbc, folio)) {
 				folio_unlock(folio);
 				continue;
 			}
 
-			if (!folio_test_dirty(folio)) {
-				/* someone wrote it for us */
-				goto continue_unlock;
-			}
-
-			if (folio_test_writeback(folio)) {
-				if (wbc->sync_mode != WB_SYNC_NONE)
-					folio_wait_writeback(folio);
-				else
-					goto continue_unlock;
-			}
-
-			BUG_ON(folio_test_writeback(folio));
-			if (!folio_clear_dirty_for_io(folio))
-				goto continue_unlock;
-
 			trace_wbc_writepage(wbc, inode_to_bdi(mapping->host));
+
 			error = writepage(folio, wbc, data);
 			nr = folio_nr_pages(folio);
 			if (unlikely(error)) {
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 04/11] writeback: Simplify the loops in write_cache_pages()
  2023-12-14 13:25 Convert write_cache_pages() to an iterator v2 Christoph Hellwig
                   ` (2 preceding siblings ...)
  2023-12-14 13:25 ` [PATCH 03/11] writeback: Factor should_writeback_folio() " Christoph Hellwig
@ 2023-12-14 13:25 ` Christoph Hellwig
  2023-12-15 14:25   ` Jan Kara
  2023-12-16  6:16   ` Matthew Wilcox
  2023-12-14 13:25 ` [PATCH 05/11] pagevec: Add ability to iterate a queue Christoph Hellwig
                   ` (6 subsequent siblings)
  10 siblings, 2 replies; 20+ messages in thread
From: Christoph Hellwig @ 2023-12-14 13:25 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-fsdevel, linux-kernel, Jan Kara, David Howells

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Collapse the two nested loops into one.  This is needed as a step
towards turning this into an iterator.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 mm/page-writeback.c | 98 ++++++++++++++++++++++-----------------------
 1 file changed, 49 insertions(+), 49 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 5a3df8665ff4f9..2087d16115710e 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2460,6 +2460,7 @@ int write_cache_pages(struct address_space *mapping,
 		      void *data)
 {
 	int error;
+	int i = 0;
 
 	if (wbc->range_cyclic) {
 		wbc->index = mapping->writeback_index; /* prev offset */
@@ -2477,67 +2478,66 @@ int write_cache_pages(struct address_space *mapping,
 	folio_batch_init(&wbc->fbatch);
 	wbc->err = 0;
 
-	while (wbc->index <= wbc->end) {
-		int i;
-
-		writeback_get_batch(mapping, wbc);
+	for (;;) {
+		struct folio *folio;
+		unsigned long nr;
 
+		if (i == wbc->fbatch.nr) {
+			writeback_get_batch(mapping, wbc);
+			i = 0;
+		}
 		if (wbc->fbatch.nr == 0)
 			break;
 
-		for (i = 0; i < wbc->fbatch.nr; i++) {
-			struct folio *folio = wbc->fbatch.folios[i];
-			unsigned long nr;
+		folio = wbc->fbatch.folios[i++];
 
-			wbc->done_index = folio->index;
+		wbc->done_index = folio->index;
 
-			folio_lock(folio);
-			if (!should_writeback_folio(mapping, wbc, folio)) {
-				folio_unlock(folio);
-				continue;
-			}
+		folio_lock(folio);
+		if (!should_writeback_folio(mapping, wbc, folio)) {
+			folio_unlock(folio);
+			continue;
+		}
 
-			trace_wbc_writepage(wbc, inode_to_bdi(mapping->host));
-
-			error = writepage(folio, wbc, data);
-			nr = folio_nr_pages(folio);
-			if (unlikely(error)) {
-				/*
-				 * Handle errors according to the type of
-				 * writeback. There's no need to continue for
-				 * background writeback. Just push done_index
-				 * past this page so media errors won't choke
-				 * writeout for the entire file. For integrity
-				 * writeback, we must process the entire dirty
-				 * set regardless of errors because the fs may
-				 * still have state to clear for each page. In
-				 * that case we continue processing and return
-				 * the first error.
-				 */
-				if (error == AOP_WRITEPAGE_ACTIVATE) {
-					folio_unlock(folio);
-					error = 0;
-				} else if (wbc->sync_mode != WB_SYNC_ALL) {
-					wbc->err = error;
-					wbc->done_index = folio->index + nr;
-					return writeback_finish(mapping,
-							wbc, true);
-				}
-				if (!wbc->err)
-					wbc->err = error;
-			}
+		trace_wbc_writepage(wbc, inode_to_bdi(mapping->host));
 
+		error = writepage(folio, wbc, data);
+		nr = folio_nr_pages(folio);
+		if (unlikely(error)) {
 			/*
-			 * We stop writing back only if we are not doing
-			 * integrity sync. In case of integrity sync we have to
-			 * keep going until we have written all the pages
-			 * we tagged for writeback prior to entering this loop.
+			 * Handle errors according to the type of
+			 * writeback. There's no need to continue for
+			 * background writeback. Just push done_index
+			 * past this page so media errors won't choke
+			 * writeout for the entire file. For integrity
+			 * writeback, we must process the entire dirty
+			 * set regardless of errors because the fs may
+			 * still have state to clear for each page. In
+			 * that case we continue processing and return
+			 * the first error.
 			 */
-			wbc->nr_to_write -= nr;
-			if (wbc->nr_to_write <= 0 &&
-			    wbc->sync_mode == WB_SYNC_NONE)
+			if (error == AOP_WRITEPAGE_ACTIVATE) {
+				folio_unlock(folio);
+				error = 0;
+			} else if (wbc->sync_mode != WB_SYNC_ALL) {
+				wbc->err = error;
+				wbc->done_index = folio->index + nr;
 				return writeback_finish(mapping, wbc, true);
+			}
+			if (!wbc->err)
+				wbc->err = error;
 		}
+
+		/*
+		 * We stop writing back only if we are not doing
+		 * integrity sync. In case of integrity sync we have to
+		 * keep going until we have written all the pages
+		 * we tagged for writeback prior to entering this loop.
+		 */
+		wbc->nr_to_write -= nr;
+		if (wbc->nr_to_write <= 0 &&
+		    wbc->sync_mode == WB_SYNC_NONE)
+			return writeback_finish(mapping, wbc, true);
 	}
 
 	return writeback_finish(mapping, wbc, false);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 05/11] pagevec: Add ability to iterate a queue
  2023-12-14 13:25 Convert write_cache_pages() to an iterator v2 Christoph Hellwig
                   ` (3 preceding siblings ...)
  2023-12-14 13:25 ` [PATCH 04/11] writeback: Simplify the loops in write_cache_pages() Christoph Hellwig
@ 2023-12-14 13:25 ` Christoph Hellwig
  2023-12-14 13:25 ` [PATCH 06/11] writeback: Use the folio_batch queue iterator Christoph Hellwig
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2023-12-14 13:25 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-fsdevel, linux-kernel, Jan Kara, David Howells

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Add a loop counter inside the folio_batch to let us iterate from 0-nr
instead of decrementing nr and treating the batch as a stack.  It would
generate some very weird and suboptimal I/O patterns for page writeback
to iterate over the batch as a stack.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 include/linux/pagevec.h | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/include/linux/pagevec.h b/include/linux/pagevec.h
index 87cc678adc850b..fcc06c300a72c3 100644
--- a/include/linux/pagevec.h
+++ b/include/linux/pagevec.h
@@ -27,6 +27,7 @@ struct folio;
  */
 struct folio_batch {
 	unsigned char nr;
+	unsigned char i;
 	bool percpu_pvec_drained;
 	struct folio *folios[PAGEVEC_SIZE];
 };
@@ -40,12 +41,14 @@ struct folio_batch {
 static inline void folio_batch_init(struct folio_batch *fbatch)
 {
 	fbatch->nr = 0;
+	fbatch->i = 0;
 	fbatch->percpu_pvec_drained = false;
 }
 
 static inline void folio_batch_reinit(struct folio_batch *fbatch)
 {
 	fbatch->nr = 0;
+	fbatch->i = 0;
 }
 
 static inline unsigned int folio_batch_count(struct folio_batch *fbatch)
@@ -75,6 +78,21 @@ static inline unsigned folio_batch_add(struct folio_batch *fbatch,
 	return folio_batch_space(fbatch);
 }
 
+/**
+ * folio_batch_next - Return the next folio to process.
+ * @fbatch: The folio batch being processed.
+ *
+ * Use this function to implement a queue of folios.
+ *
+ * Return: The next folio in the queue, or NULL if the queue is empty.
+ */
+static inline struct folio *folio_batch_next(struct folio_batch *fbatch)
+{
+	if (fbatch->i == fbatch->nr)
+		return NULL;
+	return fbatch->folios[fbatch->i++];
+}
+
 void __folio_batch_release(struct folio_batch *pvec);
 
 static inline void folio_batch_release(struct folio_batch *fbatch)
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 06/11] writeback: Use the folio_batch queue iterator
  2023-12-14 13:25 Convert write_cache_pages() to an iterator v2 Christoph Hellwig
                   ` (4 preceding siblings ...)
  2023-12-14 13:25 ` [PATCH 05/11] pagevec: Add ability to iterate a queue Christoph Hellwig
@ 2023-12-14 13:25 ` Christoph Hellwig
  2023-12-14 13:25 ` [PATCH 07/11] writeback: Factor writeback_iter_init() out of write_cache_pages() Christoph Hellwig
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2023-12-14 13:25 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-fsdevel, linux-kernel, Jan Kara, David Howells

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Instead of keeping our own local iterator variable, use the one just
added to folio_batch.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 mm/page-writeback.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 2087d16115710e..2243a0d1b2d3c7 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2378,11 +2378,15 @@ static int writeback_finish(struct address_space *mapping,
 	return wbc->err;
 }
 
-static void writeback_get_batch(struct address_space *mapping,
+static struct folio *writeback_get_next(struct address_space *mapping,
 		struct writeback_control *wbc)
 {
+	struct folio *folio = folio_batch_next(&wbc->fbatch);
 	xa_mark_t tag;
 
+	if (folio)
+		return folio;
+
 	if (wbc->sync_mode == WB_SYNC_ALL || wbc->tagged_writepages)
 		tag = PAGECACHE_TAG_TOWRITE;
 	else
@@ -2392,6 +2396,7 @@ static void writeback_get_batch(struct address_space *mapping,
 	cond_resched();
 	filemap_get_folios_tag(mapping, &wbc->index, wbc->end, tag,
 			&wbc->fbatch);
+	return folio_batch_next(&wbc->fbatch);
 }
 
 static bool should_writeback_folio(struct address_space *mapping,
@@ -2460,7 +2465,6 @@ int write_cache_pages(struct address_space *mapping,
 		      void *data)
 {
 	int error;
-	int i = 0;
 
 	if (wbc->range_cyclic) {
 		wbc->index = mapping->writeback_index; /* prev offset */
@@ -2479,18 +2483,12 @@ int write_cache_pages(struct address_space *mapping,
 	wbc->err = 0;
 
 	for (;;) {
-		struct folio *folio;
+		struct folio *folio = writeback_get_next(mapping, wbc);
 		unsigned long nr;
 
-		if (i == wbc->fbatch.nr) {
-			writeback_get_batch(mapping, wbc);
-			i = 0;
-		}
-		if (wbc->fbatch.nr == 0)
+		if (!folio)
 			break;
 
-		folio = wbc->fbatch.folios[i++];
-
 		wbc->done_index = folio->index;
 
 		folio_lock(folio);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 07/11] writeback: Factor writeback_iter_init() out of write_cache_pages()
  2023-12-14 13:25 Convert write_cache_pages() to an iterator v2 Christoph Hellwig
                   ` (5 preceding siblings ...)
  2023-12-14 13:25 ` [PATCH 06/11] writeback: Use the folio_batch queue iterator Christoph Hellwig
@ 2023-12-14 13:25 ` Christoph Hellwig
  2023-12-14 13:25 ` [PATCH 08/11] writeback: Factor writeback_get_folio() " Christoph Hellwig
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2023-12-14 13:25 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-fsdevel, linux-kernel, Jan Kara, David Howells

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Make it return the first folio in the batch so that we can use it
in a typical for() pattern.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 mm/page-writeback.c | 47 +++++++++++++++++++++++++--------------------
 1 file changed, 26 insertions(+), 21 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 2243a0d1b2d3c7..8c220c6a7f824d 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2429,6 +2429,28 @@ static bool should_writeback_folio(struct address_space *mapping,
 	return true;
 }
 
+static struct folio *writeback_iter_init(struct address_space *mapping,
+		struct writeback_control *wbc)
+{
+	if (wbc->range_cyclic) {
+		wbc->index = mapping->writeback_index; /* prev offset */
+		wbc->end = -1;
+	} else {
+		wbc->index = wbc->range_start >> PAGE_SHIFT;
+		wbc->end = wbc->range_end >> PAGE_SHIFT;
+		if (wbc->range_start == 0 && wbc->range_end == LLONG_MAX)
+			wbc->range_whole = 1;
+	}
+	if (wbc->sync_mode == WB_SYNC_ALL || wbc->tagged_writepages)
+		tag_pages_for_writeback(mapping, wbc->index, wbc->end);
+
+	wbc->done_index = wbc->index;
+	folio_batch_init(&wbc->fbatch);
+	wbc->err = 0;
+
+	return writeback_get_next(mapping, wbc);
+}
+
 /**
  * write_cache_pages - walk the list of dirty pages of the given address space and write all of them.
  * @mapping: address space structure to write
@@ -2464,31 +2486,14 @@ int write_cache_pages(struct address_space *mapping,
 		      struct writeback_control *wbc, writepage_t writepage,
 		      void *data)
 {
+	struct folio *folio;
 	int error;
 
-	if (wbc->range_cyclic) {
-		wbc->index = mapping->writeback_index; /* prev offset */
-		wbc->end = -1;
-	} else {
-		wbc->index = wbc->range_start >> PAGE_SHIFT;
-		wbc->end = wbc->range_end >> PAGE_SHIFT;
-		if (wbc->range_start == 0 && wbc->range_end == LLONG_MAX)
-			wbc->range_whole = 1;
-	}
-	if (wbc->sync_mode == WB_SYNC_ALL || wbc->tagged_writepages)
-		tag_pages_for_writeback(mapping, wbc->index, wbc->end);
-
-	wbc->done_index = wbc->index;
-	folio_batch_init(&wbc->fbatch);
-	wbc->err = 0;
-
-	for (;;) {
-		struct folio *folio = writeback_get_next(mapping, wbc);
+	for (folio = writeback_iter_init(mapping, wbc);
+	     folio;
+	     folio = writeback_get_next(mapping, wbc)) {
 		unsigned long nr;
 
-		if (!folio)
-			break;
-
 		wbc->done_index = folio->index;
 
 		folio_lock(folio);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 08/11] writeback: Factor writeback_get_folio() out of write_cache_pages()
  2023-12-14 13:25 Convert write_cache_pages() to an iterator v2 Christoph Hellwig
                   ` (6 preceding siblings ...)
  2023-12-14 13:25 ` [PATCH 07/11] writeback: Factor writeback_iter_init() out of write_cache_pages() Christoph Hellwig
@ 2023-12-14 13:25 ` Christoph Hellwig
  2023-12-14 13:25 ` [PATCH 09/11] writeback: Factor writeback_iter_next() " Christoph Hellwig
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2023-12-14 13:25 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-fsdevel, linux-kernel, Jan Kara, David Howells

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Move the loop for should-we-write-this-folio to its own function.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 mm/page-writeback.c | 35 +++++++++++++++++++++++------------
 1 file changed, 23 insertions(+), 12 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 8c220c6a7f824d..b0accca1f4bfa7 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2429,6 +2429,27 @@ static bool should_writeback_folio(struct address_space *mapping,
 	return true;
 }
 
+static struct folio *writeback_get_folio(struct address_space *mapping,
+		struct writeback_control *wbc)
+{
+	struct folio *folio;
+
+	for (;;) {
+		folio = writeback_get_next(mapping, wbc);
+		if (!folio)
+			return NULL;
+		wbc->done_index = folio->index;
+
+		folio_lock(folio);
+		if (likely(should_writeback_folio(mapping, wbc, folio)))
+			break;
+		folio_unlock(folio);
+	}
+
+	trace_wbc_writepage(wbc, inode_to_bdi(mapping->host));
+	return folio;
+}
+
 static struct folio *writeback_iter_init(struct address_space *mapping,
 		struct writeback_control *wbc)
 {
@@ -2448,7 +2469,7 @@ static struct folio *writeback_iter_init(struct address_space *mapping,
 	folio_batch_init(&wbc->fbatch);
 	wbc->err = 0;
 
-	return writeback_get_next(mapping, wbc);
+	return writeback_get_folio(mapping, wbc);
 }
 
 /**
@@ -2491,19 +2512,9 @@ int write_cache_pages(struct address_space *mapping,
 
 	for (folio = writeback_iter_init(mapping, wbc);
 	     folio;
-	     folio = writeback_get_next(mapping, wbc)) {
+	     folio = writeback_get_folio(mapping, wbc)) {
 		unsigned long nr;
 
-		wbc->done_index = folio->index;
-
-		folio_lock(folio);
-		if (!should_writeback_folio(mapping, wbc, folio)) {
-			folio_unlock(folio);
-			continue;
-		}
-
-		trace_wbc_writepage(wbc, inode_to_bdi(mapping->host));
-
 		error = writepage(folio, wbc, data);
 		nr = folio_nr_pages(folio);
 		if (unlikely(error)) {
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 09/11] writeback: Factor writeback_iter_next() out of write_cache_pages()
  2023-12-14 13:25 Convert write_cache_pages() to an iterator v2 Christoph Hellwig
                   ` (7 preceding siblings ...)
  2023-12-14 13:25 ` [PATCH 08/11] writeback: Factor writeback_get_folio() " Christoph Hellwig
@ 2023-12-14 13:25 ` Christoph Hellwig
  2023-12-15 14:17   ` Brian Foster
  2023-12-14 13:25 ` [PATCH 10/11] writeback: Add for_each_writeback_folio() Christoph Hellwig
  2023-12-14 13:25 ` [PATCH 11/11] writeback: Remove a use of write_cache_pages() from do_writepages() Christoph Hellwig
  10 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2023-12-14 13:25 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-fsdevel, linux-kernel, Jan Kara, David Howells

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Pull the post-processing of the writepage_t callback into a
separate function.  That means changing writeback_finish() to
return NULL, and writeback_get_next() to call writeback_finish()
when we naturally run out of folios.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 mm/page-writeback.c | 89 +++++++++++++++++++++++----------------------
 1 file changed, 46 insertions(+), 43 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index b0accca1f4bfa7..4fae912f7a86e2 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2360,7 +2360,7 @@ void tag_pages_for_writeback(struct address_space *mapping,
 }
 EXPORT_SYMBOL(tag_pages_for_writeback);
 
-static int writeback_finish(struct address_space *mapping,
+static struct folio *writeback_finish(struct address_space *mapping,
 		struct writeback_control *wbc, bool done)
 {
 	folio_batch_release(&wbc->fbatch);
@@ -2375,7 +2375,7 @@ static int writeback_finish(struct address_space *mapping,
 	if (wbc->range_cyclic || (wbc->range_whole && wbc->nr_to_write > 0))
 		mapping->writeback_index = wbc->done_index;
 
-	return wbc->err;
+	return NULL;
 }
 
 static struct folio *writeback_get_next(struct address_space *mapping,
@@ -2437,7 +2437,7 @@ static struct folio *writeback_get_folio(struct address_space *mapping,
 	for (;;) {
 		folio = writeback_get_next(mapping, wbc);
 		if (!folio)
-			return NULL;
+			return writeback_finish(mapping, wbc, false);
 		wbc->done_index = folio->index;
 
 		folio_lock(folio);
@@ -2472,6 +2472,47 @@ static struct folio *writeback_iter_init(struct address_space *mapping,
 	return writeback_get_folio(mapping, wbc);
 }
 
+static struct folio *writeback_iter_next(struct address_space *mapping,
+		struct writeback_control *wbc, struct folio *folio, int error)
+{
+	unsigned long nr = folio_nr_pages(folio);
+
+	if (unlikely(error)) {
+		/*
+		 * Handle errors according to the type of writeback.
+		 * There's no need to continue for background writeback.
+		 * Just push done_index past this folio so media
+		 * errors won't choke writeout for the entire file.
+		 * For integrity writeback, we must process the entire
+		 * dirty set regardless of errors because the fs may
+		 * still have state to clear for each folio.  In that
+		 * case we continue processing and return the first error.
+		 */
+		if (error == AOP_WRITEPAGE_ACTIVATE) {
+			folio_unlock(folio);
+			error = 0;
+		} else if (wbc->sync_mode != WB_SYNC_ALL) {
+			wbc->err = error;
+			wbc->done_index = folio->index + nr;
+			return writeback_finish(mapping, wbc, true);
+		}
+		if (!wbc->err)
+			wbc->err = error;
+	}
+
+	/*
+	 * We stop writing back only if we are not doing integrity
+	 * sync. In case of integrity sync we have to keep going until
+	 * we have written all the folios we tagged for writeback prior
+	 * to entering this loop.
+	 */
+	wbc->nr_to_write -= nr;
+	if (wbc->nr_to_write <= 0 && wbc->sync_mode == WB_SYNC_NONE)
+		return writeback_finish(mapping, wbc, true);
+
+	return writeback_get_folio(mapping, wbc);
+}
+
 /**
  * write_cache_pages - walk the list of dirty pages of the given address space and write all of them.
  * @mapping: address space structure to write
@@ -2512,49 +2553,11 @@ int write_cache_pages(struct address_space *mapping,
 
 	for (folio = writeback_iter_init(mapping, wbc);
 	     folio;
-	     folio = writeback_get_folio(mapping, wbc)) {
-		unsigned long nr;
-
+	     folio = writeback_iter_next(mapping, wbc, folio, error)) {
 		error = writepage(folio, wbc, data);
-		nr = folio_nr_pages(folio);
-		if (unlikely(error)) {
-			/*
-			 * Handle errors according to the type of
-			 * writeback. There's no need to continue for
-			 * background writeback. Just push done_index
-			 * past this page so media errors won't choke
-			 * writeout for the entire file. For integrity
-			 * writeback, we must process the entire dirty
-			 * set regardless of errors because the fs may
-			 * still have state to clear for each page. In
-			 * that case we continue processing and return
-			 * the first error.
-			 */
-			if (error == AOP_WRITEPAGE_ACTIVATE) {
-				folio_unlock(folio);
-				error = 0;
-			} else if (wbc->sync_mode != WB_SYNC_ALL) {
-				wbc->err = error;
-				wbc->done_index = folio->index + nr;
-				return writeback_finish(mapping, wbc, true);
-			}
-			if (!wbc->err)
-				wbc->err = error;
-		}
-
-		/*
-		 * We stop writing back only if we are not doing
-		 * integrity sync. In case of integrity sync we have to
-		 * keep going until we have written all the pages
-		 * we tagged for writeback prior to entering this loop.
-		 */
-		wbc->nr_to_write -= nr;
-		if (wbc->nr_to_write <= 0 &&
-		    wbc->sync_mode == WB_SYNC_NONE)
-			return writeback_finish(mapping, wbc, true);
 	}
 
-	return writeback_finish(mapping, wbc, false);
+	return wbc->err;
 }
 EXPORT_SYMBOL(write_cache_pages);
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 10/11] writeback: Add for_each_writeback_folio()
  2023-12-14 13:25 Convert write_cache_pages() to an iterator v2 Christoph Hellwig
                   ` (8 preceding siblings ...)
  2023-12-14 13:25 ` [PATCH 09/11] writeback: Factor writeback_iter_next() " Christoph Hellwig
@ 2023-12-14 13:25 ` Christoph Hellwig
  2023-12-14 13:25 ` [PATCH 11/11] writeback: Remove a use of write_cache_pages() from do_writepages() Christoph Hellwig
  10 siblings, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2023-12-14 13:25 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-fsdevel, linux-kernel, Jan Kara, David Howells

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Wrap up the iterator with a nice bit of syntactic sugar.  Now the
caller doesn't need to know about wbc->err and can just return error,
not knowing that the iterator took care of storing errors correctly.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 include/linux/writeback.h | 14 +++++++++++---
 mm/page-writeback.c       | 11 ++++-------
 2 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index be960f92ad9dbd..b5fcf91cf18bdd 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -371,14 +371,22 @@ int balance_dirty_pages_ratelimited_flags(struct address_space *mapping,
 
 bool wb_over_bg_thresh(struct bdi_writeback *wb);
 
+struct folio *writeback_iter_init(struct address_space *mapping,
+		struct writeback_control *wbc);
+struct folio *writeback_iter_next(struct address_space *mapping,
+		struct writeback_control *wbc, struct folio *folio, int error);
+
+#define for_each_writeback_folio(mapping, wbc, folio, error)		\
+	for (folio = writeback_iter_init(mapping, wbc);			\
+	     folio || ((error = wbc->err), false);			\
+	     folio = writeback_iter_next(mapping, wbc, folio, error))
+
 typedef int (*writepage_t)(struct folio *folio, struct writeback_control *wbc,
 				void *data);
-
-void tag_pages_for_writeback(struct address_space *mapping,
-			     pgoff_t start, pgoff_t end);
 int write_cache_pages(struct address_space *mapping,
 		      struct writeback_control *wbc, writepage_t writepage,
 		      void *data);
+
 int do_writepages(struct address_space *mapping, struct writeback_control *wbc);
 void writeback_set_ratelimit(void);
 void tag_pages_for_writeback(struct address_space *mapping,
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 4fae912f7a86e2..e4a1444502ccd4 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2450,7 +2450,7 @@ static struct folio *writeback_get_folio(struct address_space *mapping,
 	return folio;
 }
 
-static struct folio *writeback_iter_init(struct address_space *mapping,
+struct folio *writeback_iter_init(struct address_space *mapping,
 		struct writeback_control *wbc)
 {
 	if (wbc->range_cyclic) {
@@ -2472,7 +2472,7 @@ static struct folio *writeback_iter_init(struct address_space *mapping,
 	return writeback_get_folio(mapping, wbc);
 }
 
-static struct folio *writeback_iter_next(struct address_space *mapping,
+struct folio *writeback_iter_next(struct address_space *mapping,
 		struct writeback_control *wbc, struct folio *folio, int error)
 {
 	unsigned long nr = folio_nr_pages(folio);
@@ -2551,13 +2551,10 @@ int write_cache_pages(struct address_space *mapping,
 	struct folio *folio;
 	int error;
 
-	for (folio = writeback_iter_init(mapping, wbc);
-	     folio;
-	     folio = writeback_iter_next(mapping, wbc, folio, error)) {
+	for_each_writeback_folio(mapping, wbc, folio, error)
 		error = writepage(folio, wbc, data);
-	}
 
-	return wbc->err;
+	return error;
 }
 EXPORT_SYMBOL(write_cache_pages);
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 11/11] writeback: Remove a use of write_cache_pages() from do_writepages()
  2023-12-14 13:25 Convert write_cache_pages() to an iterator v2 Christoph Hellwig
                   ` (9 preceding siblings ...)
  2023-12-14 13:25 ` [PATCH 10/11] writeback: Add for_each_writeback_folio() Christoph Hellwig
@ 2023-12-14 13:25 ` Christoph Hellwig
  10 siblings, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2023-12-14 13:25 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-fsdevel, linux-kernel, Jan Kara, David Howells

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Use the new for_each_writeback_folio() directly instead of indirecting
through a callback.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 mm/page-writeback.c | 27 +++++++++++++++------------
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index e4a1444502ccd4..338021bdd136b9 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2558,13 +2558,21 @@ int write_cache_pages(struct address_space *mapping,
 }
 EXPORT_SYMBOL(write_cache_pages);
 
-static int writepage_cb(struct folio *folio, struct writeback_control *wbc,
-		void *data)
+static int writeback_use_writepage(struct address_space *mapping,
+		struct writeback_control *wbc)
 {
-	struct address_space *mapping = data;
-	int ret = mapping->a_ops->writepage(&folio->page, wbc);
-	mapping_set_error(mapping, ret);
-	return ret;
+	struct blk_plug plug;
+	struct folio *folio;
+	int err;
+
+	blk_start_plug(&plug);
+	for_each_writeback_folio(mapping, wbc, folio, err) {
+		err = mapping->a_ops->writepage(&folio->page, wbc);
+		mapping_set_error(mapping, err);
+	}
+	blk_finish_plug(&plug);
+
+	return err;
 }
 
 int do_writepages(struct address_space *mapping, struct writeback_control *wbc)
@@ -2580,12 +2588,7 @@ int do_writepages(struct address_space *mapping, struct writeback_control *wbc)
 		if (mapping->a_ops->writepages) {
 			ret = mapping->a_ops->writepages(mapping, wbc);
 		} else if (mapping->a_ops->writepage) {
-			struct blk_plug plug;
-
-			blk_start_plug(&plug);
-			ret = write_cache_pages(mapping, wbc, writepage_cb,
-						mapping);
-			blk_finish_plug(&plug);
+			ret = writeback_use_writepage(mapping, wbc);
 		} else {
 			/* deal with chardevs and other special files */
 			ret = 0;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 01/11] writeback: Factor out writeback_finish()
  2023-12-14 13:25 ` [PATCH 01/11] writeback: Factor out writeback_finish() Christoph Hellwig
@ 2023-12-15 13:26   ` Jan Kara
  2023-12-15 16:54     ` Christoph Hellwig
  0 siblings, 1 reply; 20+ messages in thread
From: Jan Kara @ 2023-12-15 13:26 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-mm, Matthew Wilcox (Oracle),
	linux-fsdevel, linux-kernel, Jan Kara, David Howells

On Thu 14-12-23 14:25:34, Christoph Hellwig wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> Instead of having a 'done' variable that controls the nested loops,
> have a writeback_finish() that can be returned directly.  This involves
> keeping more things in writeback_control, but it's just moving stuff
> allocated on the stack to being allocated slightly earlier on the stack.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> [hch: reorder and comment struct writeback_control]
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  include/linux/writeback.h |  8 +++++
>  mm/page-writeback.c       | 72 +++++++++++++++++++++------------------
>  2 files changed, 46 insertions(+), 34 deletions(-)
> 
> diff --git a/include/linux/writeback.h b/include/linux/writeback.h
> index 083387c00f0c8b..05e8add4b5ae3c 100644
> --- a/include/linux/writeback.h
> +++ b/include/linux/writeback.h
> @@ -11,6 +11,7 @@
>  #include <linux/flex_proportions.h>
>  #include <linux/backing-dev-defs.h>
>  #include <linux/blk_types.h>
> +#include <linux/pagevec.h>
>  
>  struct bio;
>  
> @@ -40,6 +41,7 @@ enum writeback_sync_modes {
>   * in a manner such that unspecified fields are set to zero.
>   */
>  struct writeback_control {
> +	/* public fields that can be set and/or consumed by the caller: */
>  	long nr_to_write;		/* Write this many pages, and decrement
>  					   this for each page written */
>  	long pages_skipped;		/* Pages which were not written */
> @@ -77,6 +79,12 @@ struct writeback_control {
>  	 */
>  	struct swap_iocb **swap_plug;
>  
> +	/* internal fields used by the ->writepages implementation: */
> +	struct folio_batch fbatch;
> +	pgoff_t done_index;
> +	int err;
> +	unsigned range_whole:1;		/* entire file */

Do we really need the range_whole member here? It is trivially derived from
range_start && range_end and used only in one place in writeback_finish().

> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index ee2fd6a6af4072..45309f3b8193f8 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -2360,6 +2360,24 @@ void tag_pages_for_writeback(struct address_space *mapping,
>  }
>  EXPORT_SYMBOL(tag_pages_for_writeback);
>  
> +static int writeback_finish(struct address_space *mapping,
> +		struct writeback_control *wbc, bool done)
> +{
> +	folio_batch_release(&wbc->fbatch);
> +
> +	/*
> +	 * If we hit the last page and there is more work to be done:
> +	 * wrap the index back to the start of the file for the next
> +	 * time we are called.
> +	 */
> +	if (wbc->range_cyclic && !done)
> +		wbc->done_index = 0;
> +	if (wbc->range_cyclic || (wbc->range_whole && wbc->nr_to_write > 0))
> +		mapping->writeback_index = wbc->done_index;
> +
> +	return wbc->err;
> +}

Also I suspect we can get rid of the 'done' argument here. After all it
just controls whether we cycle back to index 0 which we could just
determine as:

	if (wbc->range_cyclic && !wbc->err && wbc->nr_to_write > 0) {
		WARN_ON_ONCE(wbc->sync_mode != WB_SYNC_NONE);
		wbc->done_index = 0;
	}

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 02/11] writeback: Factor writeback_get_batch() out of write_cache_pages()
  2023-12-14 13:25 ` [PATCH 02/11] writeback: Factor writeback_get_batch() out of write_cache_pages() Christoph Hellwig
@ 2023-12-15 14:14   ` Jan Kara
  0 siblings, 0 replies; 20+ messages in thread
From: Jan Kara @ 2023-12-15 14:14 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-mm, Matthew Wilcox (Oracle),
	linux-fsdevel, linux-kernel, Jan Kara, David Howells

On Thu 14-12-23 14:25:35, Christoph Hellwig wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> This simple helper will be the basis of the writeback iterator.
> To make this work, we need to remember the current index
> and end positions in writeback_control.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Just some nits:

> --- a/include/linux/writeback.h
> +++ b/include/linux/writeback.h
> @@ -81,6 +81,8 @@ struct writeback_control {
>  
>  	/* internal fields used by the ->writepages implementation: */
>  	struct folio_batch fbatch;
> +	pgoff_t index;
> +	pgoff_t end;			/* inclusive */
>  	pgoff_t done_index;

I don't think we need to cache 'end' since it isn't used that much. In
writeback_get_batch() we can just compute it locally as:

	if (wbc->range_cyclic)
		end = -1;
	else
		end = wbc->range_end >> PAGE_SHIFT;

and in the termination condition of the loop we can have it like:

	while (wbc->range_cyclic || index <= wbc->range_end >> PAGE_SHIFT)

Also I don't think we need both done_index and index since they are closely
related and when spread over several functions it gets a bit confusing
what's for what. So I'd just remove done_index, use index instead for
setting writeback_index and just reset 'index' to the desired value in
those two cases where we break out of the loop early and thus index !=
done_index.

I'm sorry for nitpicking about these state variables but IMO reducing their
amount actually makes things easier to verify (and thus maintain) when they
are spread over several functions.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 03/11] writeback: Factor should_writeback_folio() out of write_cache_pages()
  2023-12-14 13:25 ` [PATCH 03/11] writeback: Factor should_writeback_folio() " Christoph Hellwig
@ 2023-12-15 14:16   ` Jan Kara
  0 siblings, 0 replies; 20+ messages in thread
From: Jan Kara @ 2023-12-15 14:16 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-mm, Matthew Wilcox (Oracle),
	linux-fsdevel, linux-kernel, Jan Kara, David Howells

On Thu 14-12-23 14:25:36, Christoph Hellwig wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> Reduce write_cache_pages() by about 30 lines; much of it is commentary,
> but it all bundles nicely into an obvious function.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> Signed-off-by: Christoph Hellwig <hch@lst.de>

I like this! Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  mm/page-writeback.c | 59 ++++++++++++++++++++++++---------------------
>  1 file changed, 32 insertions(+), 27 deletions(-)
> 
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 5d33e7b468e2cc..5a3df8665ff4f9 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -2394,6 +2394,36 @@ static void writeback_get_batch(struct address_space *mapping,
>  			&wbc->fbatch);
>  }
>  
> +static bool should_writeback_folio(struct address_space *mapping,
> +		struct writeback_control *wbc, struct folio *folio)
> +{
> +	/*
> +	 * Folio truncated or invalidated. We can freely skip it then,
> +	 * even for data integrity operations: the folio has disappeared
> +	 * concurrently, so there could be no real expectation of this
> +	 * data integrity operation even if there is now a new, dirty
> +	 * folio at the same pagecache index.
> +	 */
> +	if (unlikely(folio->mapping != mapping))
> +		return false;
> +
> +	/* Did somebody write it for us? */
> +	if (!folio_test_dirty(folio))
> +		return false;
> +
> +	if (folio_test_writeback(folio)) {
> +		if (wbc->sync_mode == WB_SYNC_NONE)
> +			return false;
> +		folio_wait_writeback(folio);
> +	}
> +
> +	BUG_ON(folio_test_writeback(folio));
> +	if (!folio_clear_dirty_for_io(folio))
> +		return false;
> +
> +	return true;
> +}
> +
>  /**
>   * write_cache_pages - walk the list of dirty pages of the given address space and write all of them.
>   * @mapping: address space structure to write
> @@ -2462,38 +2492,13 @@ int write_cache_pages(struct address_space *mapping,
>  			wbc->done_index = folio->index;
>  
>  			folio_lock(folio);
> -
> -			/*
> -			 * Page truncated or invalidated. We can freely skip it
> -			 * then, even for data integrity operations: the page
> -			 * has disappeared concurrently, so there could be no
> -			 * real expectation of this data integrity operation
> -			 * even if there is now a new, dirty page at the same
> -			 * pagecache address.
> -			 */
> -			if (unlikely(folio->mapping != mapping)) {
> -continue_unlock:
> +			if (!should_writeback_folio(mapping, wbc, folio)) {
>  				folio_unlock(folio);
>  				continue;
>  			}
>  
> -			if (!folio_test_dirty(folio)) {
> -				/* someone wrote it for us */
> -				goto continue_unlock;
> -			}
> -
> -			if (folio_test_writeback(folio)) {
> -				if (wbc->sync_mode != WB_SYNC_NONE)
> -					folio_wait_writeback(folio);
> -				else
> -					goto continue_unlock;
> -			}
> -
> -			BUG_ON(folio_test_writeback(folio));
> -			if (!folio_clear_dirty_for_io(folio))
> -				goto continue_unlock;
> -
>  			trace_wbc_writepage(wbc, inode_to_bdi(mapping->host));
> +
>  			error = writepage(folio, wbc, data);
>  			nr = folio_nr_pages(folio);
>  			if (unlikely(error)) {
> -- 
> 2.39.2
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 09/11] writeback: Factor writeback_iter_next() out of write_cache_pages()
  2023-12-14 13:25 ` [PATCH 09/11] writeback: Factor writeback_iter_next() " Christoph Hellwig
@ 2023-12-15 14:17   ` Brian Foster
  2023-12-16  4:51     ` Christoph Hellwig
  0 siblings, 1 reply; 20+ messages in thread
From: Brian Foster @ 2023-12-15 14:17 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-mm, Matthew Wilcox (Oracle),
	linux-fsdevel, linux-kernel, Jan Kara, David Howells

On Thu, Dec 14, 2023 at 02:25:42PM +0100, Christoph Hellwig wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> Pull the post-processing of the writepage_t callback into a
> separate function.  That means changing writeback_finish() to
> return NULL, and writeback_get_next() to call writeback_finish()
> when we naturally run out of folios.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  mm/page-writeback.c | 89 +++++++++++++++++++++++----------------------
>  1 file changed, 46 insertions(+), 43 deletions(-)
> 
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index b0accca1f4bfa7..4fae912f7a86e2 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -2360,7 +2360,7 @@ void tag_pages_for_writeback(struct address_space *mapping,
>  }
>  EXPORT_SYMBOL(tag_pages_for_writeback);
>  
> -static int writeback_finish(struct address_space *mapping,
> +static struct folio *writeback_finish(struct address_space *mapping,
>  		struct writeback_control *wbc, bool done)
>  {
>  	folio_batch_release(&wbc->fbatch);
> @@ -2375,7 +2375,7 @@ static int writeback_finish(struct address_space *mapping,
>  	if (wbc->range_cyclic || (wbc->range_whole && wbc->nr_to_write > 0))
>  		mapping->writeback_index = wbc->done_index;
>  
> -	return wbc->err;
> +	return NULL;

The series looks reasonable to me on a first pass, but this stood out to
me as really odd. I was initially wondering if it made sense to use an
ERR_PTR() here or something, but on further staring it kind of seems
like this is better off being factored out of the internal iteration
paths. Untested and Probably Broken patch (based on this one) below as a
quick reference, but just a thought...

BTW it would be nicer to just drop the ->done field entirely as Jan has
already suggested, but I just stuffed it in wbc for simplicity.

Brian

--- 8< ---

diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index be960f92ad9d..0babca17a2c0 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -84,6 +84,7 @@ struct writeback_control {
 	pgoff_t index;
 	pgoff_t end;			/* inclusive */
 	pgoff_t done_index;
+	bool done;
 	int err;
 	unsigned range_whole:1;		/* entire file */
 
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 4fae912f7a86..3ee2058a2559 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2360,8 +2360,8 @@ void tag_pages_for_writeback(struct address_space *mapping,
 }
 EXPORT_SYMBOL(tag_pages_for_writeback);
 
-static struct folio *writeback_finish(struct address_space *mapping,
-		struct writeback_control *wbc, bool done)
+static int writeback_iter_finish(struct address_space *mapping,
+		struct writeback_control *wbc)
 {
 	folio_batch_release(&wbc->fbatch);
 
@@ -2370,12 +2370,12 @@ static struct folio *writeback_finish(struct address_space *mapping,
 	 * wrap the index back to the start of the file for the next
 	 * time we are called.
 	 */
-	if (wbc->range_cyclic && !done)
+	if (wbc->range_cyclic && !wbc->done)
 		wbc->done_index = 0;
 	if (wbc->range_cyclic || (wbc->range_whole && wbc->nr_to_write > 0))
 		mapping->writeback_index = wbc->done_index;
 
-	return NULL;
+	return wbc->err;
 }
 
 static struct folio *writeback_get_next(struct address_space *mapping,
@@ -2434,19 +2434,16 @@ static struct folio *writeback_get_folio(struct address_space *mapping,
 {
 	struct folio *folio;
 
-	for (;;) {
-		folio = writeback_get_next(mapping, wbc);
-		if (!folio)
-			return writeback_finish(mapping, wbc, false);
+	while ((folio = writeback_get_next(mapping, wbc)) != NULL) {
 		wbc->done_index = folio->index;
-
 		folio_lock(folio);
 		if (likely(should_writeback_folio(mapping, wbc, folio)))
 			break;
 		folio_unlock(folio);
 	}
 
-	trace_wbc_writepage(wbc, inode_to_bdi(mapping->host));
+	if (folio)
+		trace_wbc_writepage(wbc, inode_to_bdi(mapping->host));
 	return folio;
 }
 
@@ -2466,6 +2463,7 @@ static struct folio *writeback_iter_init(struct address_space *mapping,
 		tag_pages_for_writeback(mapping, wbc->index, wbc->end);
 
 	wbc->done_index = wbc->index;
+	wbc->done = false;
 	folio_batch_init(&wbc->fbatch);
 	wbc->err = 0;
 
@@ -2494,7 +2492,8 @@ static struct folio *writeback_iter_next(struct address_space *mapping,
 		} else if (wbc->sync_mode != WB_SYNC_ALL) {
 			wbc->err = error;
 			wbc->done_index = folio->index + nr;
-			return writeback_finish(mapping, wbc, true);
+			wbc->done = true;
+			return NULL;
 		}
 		if (!wbc->err)
 			wbc->err = error;
@@ -2507,8 +2506,10 @@ static struct folio *writeback_iter_next(struct address_space *mapping,
 	 * to entering this loop.
 	 */
 	wbc->nr_to_write -= nr;
-	if (wbc->nr_to_write <= 0 && wbc->sync_mode == WB_SYNC_NONE)
-		return writeback_finish(mapping, wbc, true);
+	if (wbc->nr_to_write <= 0 && wbc->sync_mode == WB_SYNC_NONE) {
+		wbc->done = true;
+		return NULL;
+	}
 
 	return writeback_get_folio(mapping, wbc);
 }
@@ -2557,7 +2558,7 @@ int write_cache_pages(struct address_space *mapping,
 		error = writepage(folio, wbc, data);
 	}
 
-	return wbc->err;
+	return writeback_iter_finish(mapping, wbc);
 }
 EXPORT_SYMBOL(write_cache_pages);
 


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 04/11] writeback: Simplify the loops in write_cache_pages()
  2023-12-14 13:25 ` [PATCH 04/11] writeback: Simplify the loops in write_cache_pages() Christoph Hellwig
@ 2023-12-15 14:25   ` Jan Kara
  2023-12-16  6:16   ` Matthew Wilcox
  1 sibling, 0 replies; 20+ messages in thread
From: Jan Kara @ 2023-12-15 14:25 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-mm, Matthew Wilcox (Oracle),
	linux-fsdevel, linux-kernel, Jan Kara, David Howells

On Thu 14-12-23 14:25:37, Christoph Hellwig wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> Collapse the two nested loops into one.  This is needed as a step
> towards turning this into an iterator.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

It would be good to mention in the changelog that we drop the condition
index <= end and just rely on filemap_get_folios_tag() to return 0 entries
when index > end. This actually has a subtle implication when end == -1
because then the returned index will be -1 as well and thus if there is
page present on index -1, we could be looping indefinitely. But I think
that's mostly a theoretical concern so I'd be fine with just mentioning
this subtlety in the changelog and possibly in a comment in the code.

								Honza

> ---
>  mm/page-writeback.c | 98 ++++++++++++++++++++++-----------------------
>  1 file changed, 49 insertions(+), 49 deletions(-)
> 
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 5a3df8665ff4f9..2087d16115710e 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -2460,6 +2460,7 @@ int write_cache_pages(struct address_space *mapping,
>  		      void *data)
>  {
>  	int error;
> +	int i = 0;
>  
>  	if (wbc->range_cyclic) {
>  		wbc->index = mapping->writeback_index; /* prev offset */
> @@ -2477,67 +2478,66 @@ int write_cache_pages(struct address_space *mapping,
>  	folio_batch_init(&wbc->fbatch);
>  	wbc->err = 0;
>  
> -	while (wbc->index <= wbc->end) {
> -		int i;
> -
> -		writeback_get_batch(mapping, wbc);
> +	for (;;) {
> +		struct folio *folio;
> +		unsigned long nr;
>  
> +		if (i == wbc->fbatch.nr) {
> +			writeback_get_batch(mapping, wbc);
> +			i = 0;
> +		}
>  		if (wbc->fbatch.nr == 0)
>  			break;
>  
> -		for (i = 0; i < wbc->fbatch.nr; i++) {
> -			struct folio *folio = wbc->fbatch.folios[i];
> -			unsigned long nr;
> +		folio = wbc->fbatch.folios[i++];
>  
> -			wbc->done_index = folio->index;
> +		wbc->done_index = folio->index;
>  
> -			folio_lock(folio);
> -			if (!should_writeback_folio(mapping, wbc, folio)) {
> -				folio_unlock(folio);
> -				continue;
> -			}
> +		folio_lock(folio);
> +		if (!should_writeback_folio(mapping, wbc, folio)) {
> +			folio_unlock(folio);
> +			continue;
> +		}
>  
> -			trace_wbc_writepage(wbc, inode_to_bdi(mapping->host));
> -
> -			error = writepage(folio, wbc, data);
> -			nr = folio_nr_pages(folio);
> -			if (unlikely(error)) {
> -				/*
> -				 * Handle errors according to the type of
> -				 * writeback. There's no need to continue for
> -				 * background writeback. Just push done_index
> -				 * past this page so media errors won't choke
> -				 * writeout for the entire file. For integrity
> -				 * writeback, we must process the entire dirty
> -				 * set regardless of errors because the fs may
> -				 * still have state to clear for each page. In
> -				 * that case we continue processing and return
> -				 * the first error.
> -				 */
> -				if (error == AOP_WRITEPAGE_ACTIVATE) {
> -					folio_unlock(folio);
> -					error = 0;
> -				} else if (wbc->sync_mode != WB_SYNC_ALL) {
> -					wbc->err = error;
> -					wbc->done_index = folio->index + nr;
> -					return writeback_finish(mapping,
> -							wbc, true);
> -				}
> -				if (!wbc->err)
> -					wbc->err = error;
> -			}
> +		trace_wbc_writepage(wbc, inode_to_bdi(mapping->host));
>  
> +		error = writepage(folio, wbc, data);
> +		nr = folio_nr_pages(folio);
> +		if (unlikely(error)) {
>  			/*
> -			 * We stop writing back only if we are not doing
> -			 * integrity sync. In case of integrity sync we have to
> -			 * keep going until we have written all the pages
> -			 * we tagged for writeback prior to entering this loop.
> +			 * Handle errors according to the type of
> +			 * writeback. There's no need to continue for
> +			 * background writeback. Just push done_index
> +			 * past this page so media errors won't choke
> +			 * writeout for the entire file. For integrity
> +			 * writeback, we must process the entire dirty
> +			 * set regardless of errors because the fs may
> +			 * still have state to clear for each page. In
> +			 * that case we continue processing and return
> +			 * the first error.
>  			 */
> -			wbc->nr_to_write -= nr;
> -			if (wbc->nr_to_write <= 0 &&
> -			    wbc->sync_mode == WB_SYNC_NONE)
> +			if (error == AOP_WRITEPAGE_ACTIVATE) {
> +				folio_unlock(folio);
> +				error = 0;
> +			} else if (wbc->sync_mode != WB_SYNC_ALL) {
> +				wbc->err = error;
> +				wbc->done_index = folio->index + nr;
>  				return writeback_finish(mapping, wbc, true);
> +			}
> +			if (!wbc->err)
> +				wbc->err = error;
>  		}
> +
> +		/*
> +		 * We stop writing back only if we are not doing
> +		 * integrity sync. In case of integrity sync we have to
> +		 * keep going until we have written all the pages
> +		 * we tagged for writeback prior to entering this loop.
> +		 */
> +		wbc->nr_to_write -= nr;
> +		if (wbc->nr_to_write <= 0 &&
> +		    wbc->sync_mode == WB_SYNC_NONE)
> +			return writeback_finish(mapping, wbc, true);
>  	}
>  
>  	return writeback_finish(mapping, wbc, false);
> -- 
> 2.39.2
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 01/11] writeback: Factor out writeback_finish()
  2023-12-15 13:26   ` Jan Kara
@ 2023-12-15 16:54     ` Christoph Hellwig
  0 siblings, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2023-12-15 16:54 UTC (permalink / raw)
  To: Jan Kara
  Cc: Christoph Hellwig, linux-mm, Matthew Wilcox (Oracle),
	linux-fsdevel, linux-kernel, Jan Kara, David Howells

On Fri, Dec 15, 2023 at 02:26:39PM +0100, Jan Kara wrote:
> > +	/* internal fields used by the ->writepages implementation: */
> > +	struct folio_batch fbatch;
> > +	pgoff_t done_index;
> > +	int err;
> > +	unsigned range_whole:1;		/* entire file */
> 
> Do we really need the range_whole member here? It is trivially derived from
> range_start && range_end and used only in one place in writeback_finish().

Yes, as nothing modified range_start and range_end this should be
easily doable.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 09/11] writeback: Factor writeback_iter_next() out of write_cache_pages()
  2023-12-15 14:17   ` Brian Foster
@ 2023-12-16  4:51     ` Christoph Hellwig
  0 siblings, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2023-12-16  4:51 UTC (permalink / raw)
  To: Brian Foster
  Cc: Christoph Hellwig, linux-mm, Matthew Wilcox (Oracle),
	linux-fsdevel, linux-kernel, Jan Kara, David Howells

On Fri, Dec 15, 2023 at 09:17:05AM -0500, Brian Foster wrote:
> +	while ((folio = writeback_get_next(mapping, wbc)) != NULL) {
>  		wbc->done_index = folio->index;
>  		folio_lock(folio);
>  		if (likely(should_writeback_folio(mapping, wbc, folio)))
>  			break;
>  		folio_unlock(folio);
>  	}
>  
> +	if (folio)
> +		trace_wbc_writepage(wbc, inode_to_bdi(mapping->host));
>  	return folio;

I posted a very similar transformation in reply to willy's first posting
of the series :)  I guess that's 2:1 now and I might as well go ahead
and do something like that now :)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 04/11] writeback: Simplify the loops in write_cache_pages()
  2023-12-14 13:25 ` [PATCH 04/11] writeback: Simplify the loops in write_cache_pages() Christoph Hellwig
  2023-12-15 14:25   ` Jan Kara
@ 2023-12-16  6:16   ` Matthew Wilcox
  1 sibling, 0 replies; 20+ messages in thread
From: Matthew Wilcox @ 2023-12-16  6:16 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-mm, linux-fsdevel, linux-kernel, Jan Kara, David Howells

On Thu, Dec 14, 2023 at 02:25:37PM +0100, Christoph Hellwig wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> Collapse the two nested loops into one.  This is needed as a step
> towards turning this into an iterator.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  mm/page-writeback.c | 98 ++++++++++++++++++++++-----------------------
>  1 file changed, 49 insertions(+), 49 deletions(-)
> 
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 5a3df8665ff4f9..2087d16115710e 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -2460,6 +2460,7 @@ int write_cache_pages(struct address_space *mapping,
>  		      void *data)
>  {
>  	int error;
> +	int i = 0;
>  
>  	if (wbc->range_cyclic) {
>  		wbc->index = mapping->writeback_index; /* prev offset */
> @@ -2477,67 +2478,66 @@ int write_cache_pages(struct address_space *mapping,
>  	folio_batch_init(&wbc->fbatch);
>  	wbc->err = 0;
>  
> -	while (wbc->index <= wbc->end) {
> -		int i;
> -
> -		writeback_get_batch(mapping, wbc);
> +	for (;;) {
> +		struct folio *folio;
> +		unsigned long nr;
>  
> +		if (i == wbc->fbatch.nr) {
> +			writeback_get_batch(mapping, wbc);
> +			i = 0;
> +		}
>  		if (wbc->fbatch.nr == 0)
>  			break;
>  
> -		for (i = 0; i < wbc->fbatch.nr; i++) {
> -			struct folio *folio = wbc->fbatch.folios[i];
> -			unsigned long nr;
> +		folio = wbc->fbatch.folios[i++];
>  
> -			wbc->done_index = folio->index;
> +		wbc->done_index = folio->index;
>  
> -			folio_lock(folio);
> -			if (!should_writeback_folio(mapping, wbc, folio)) {
> -				folio_unlock(folio);
> -				continue;
> -			}
> +		folio_lock(folio);
> +		if (!should_writeback_folio(mapping, wbc, folio)) {
> +			folio_unlock(folio);
> +			continue;
> +		}
>  
> -			trace_wbc_writepage(wbc, inode_to_bdi(mapping->host));
> -
> -			error = writepage(folio, wbc, data);
> -			nr = folio_nr_pages(folio);
> -			if (unlikely(error)) {
> -				/*
> -				 * Handle errors according to the type of
> -				 * writeback. There's no need to continue for
> -				 * background writeback. Just push done_index
> -				 * past this page so media errors won't choke
> -				 * writeout for the entire file. For integrity
> -				 * writeback, we must process the entire dirty
> -				 * set regardless of errors because the fs may
> -				 * still have state to clear for each page. In
> -				 * that case we continue processing and return
> -				 * the first error.
> -				 */
> -				if (error == AOP_WRITEPAGE_ACTIVATE) {
> -					folio_unlock(folio);
> -					error = 0;
> -				} else if (wbc->sync_mode != WB_SYNC_ALL) {
> -					wbc->err = error;
> -					wbc->done_index = folio->index + nr;
> -					return writeback_finish(mapping,
> -							wbc, true);
> -				}
> -				if (!wbc->err)
> -					wbc->err = error;
> -			}
> +		trace_wbc_writepage(wbc, inode_to_bdi(mapping->host));
>  
> +		error = writepage(folio, wbc, data);
> +		nr = folio_nr_pages(folio);
> +		if (unlikely(error)) {
>  			/*
> -			 * We stop writing back only if we are not doing
> -			 * integrity sync. In case of integrity sync we have to
> -			 * keep going until we have written all the pages
> -			 * we tagged for writeback prior to entering this loop.
> +			 * Handle errors according to the type of
> +			 * writeback. There's no need to continue for
> +			 * background writeback. Just push done_index
> +			 * past this page so media errors won't choke
> +			 * writeout for the entire file. For integrity
> +			 * writeback, we must process the entire dirty
> +			 * set regardless of errors because the fs may
> +			 * still have state to clear for each page. In
> +			 * that case we continue processing and return
> +			 * the first error.
>  			 */
> -			wbc->nr_to_write -= nr;
> -			if (wbc->nr_to_write <= 0 &&
> -			    wbc->sync_mode == WB_SYNC_NONE)
> +			if (error == AOP_WRITEPAGE_ACTIVATE) {
> +				folio_unlock(folio);
> +				error = 0;
> +			} else if (wbc->sync_mode != WB_SYNC_ALL) {
> +				wbc->err = error;
> +				wbc->done_index = folio->index + nr;
>  				return writeback_finish(mapping, wbc, true);
> +			}
> +			if (!wbc->err)
> +				wbc->err = error;
>  		}
> +
> +		/*
> +		 * We stop writing back only if we are not doing
> +		 * integrity sync. In case of integrity sync we have to
> +		 * keep going until we have written all the pages
> +		 * we tagged for writeback prior to entering this loop.
> +		 */
> +		wbc->nr_to_write -= nr;
> +		if (wbc->nr_to_write <= 0 &&
> +		    wbc->sync_mode == WB_SYNC_NONE)
> +			return writeback_finish(mapping, wbc, true);
>  	}
>  
>  	return writeback_finish(mapping, wbc, false);
> -- 
> 2.39.2
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2023-12-16  6:16 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-12-14 13:25 Convert write_cache_pages() to an iterator v2 Christoph Hellwig
2023-12-14 13:25 ` [PATCH 01/11] writeback: Factor out writeback_finish() Christoph Hellwig
2023-12-15 13:26   ` Jan Kara
2023-12-15 16:54     ` Christoph Hellwig
2023-12-14 13:25 ` [PATCH 02/11] writeback: Factor writeback_get_batch() out of write_cache_pages() Christoph Hellwig
2023-12-15 14:14   ` Jan Kara
2023-12-14 13:25 ` [PATCH 03/11] writeback: Factor should_writeback_folio() " Christoph Hellwig
2023-12-15 14:16   ` Jan Kara
2023-12-14 13:25 ` [PATCH 04/11] writeback: Simplify the loops in write_cache_pages() Christoph Hellwig
2023-12-15 14:25   ` Jan Kara
2023-12-16  6:16   ` Matthew Wilcox
2023-12-14 13:25 ` [PATCH 05/11] pagevec: Add ability to iterate a queue Christoph Hellwig
2023-12-14 13:25 ` [PATCH 06/11] writeback: Use the folio_batch queue iterator Christoph Hellwig
2023-12-14 13:25 ` [PATCH 07/11] writeback: Factor writeback_iter_init() out of write_cache_pages() Christoph Hellwig
2023-12-14 13:25 ` [PATCH 08/11] writeback: Factor writeback_get_folio() " Christoph Hellwig
2023-12-14 13:25 ` [PATCH 09/11] writeback: Factor writeback_iter_next() " Christoph Hellwig
2023-12-15 14:17   ` Brian Foster
2023-12-16  4:51     ` Christoph Hellwig
2023-12-14 13:25 ` [PATCH 10/11] writeback: Add for_each_writeback_folio() Christoph Hellwig
2023-12-14 13:25 ` [PATCH 11/11] writeback: Remove a use of write_cache_pages() from do_writepages() Christoph Hellwig

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.