* btrfs compressed writeback cleanups
@ 2023-06-28 15:31 Christoph Hellwig
  2023-06-28 15:31 ` [PATCH 01/23] btrfs: pass a flags argument to cow_file_range Christoph Hellwig
                   ` (24 more replies)
  0 siblings, 25 replies; 40+ messages in thread
From: Christoph Hellwig @ 2023-06-28 15:31 UTC
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

Hi all,

this series is a prequel to another big set of writeback patches, and
mostly deals with compressed writeback, which does a few things very
differently from other writeback code.  The biggest results are the
removal of the magic redirtying when handing off to the worker thread,
and a fix for the handling of running out of active zone resources when
using compressed writeback on a zoned file system, but mostly it just
removes a whole bunch of code.

Note that the first 5 patches have been out on the btrfs list as
standalone submissions for a while, but they are included for completeness
so that this series can be easily applied to the btrfs misc-next tree.

A git tree is also available here:

    git://git.infradead.org/users/hch/misc.git btrfs-compressed-writeback

Gitweb:

    http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/btrfs-compressed-writeback


Diffstat:
 fs/btrfs/btrfs_inode.h    |    6 
 fs/btrfs/extent_io.c      |  304 ++++++++-------------
 fs/btrfs/extent_io.h      |    9 
 fs/btrfs/inode.c          |  652 +++++++++++++++++-----------------------------
 fs/btrfs/ordered-data.c   |    4 
 include/linux/writeback.h |    5 
 mm/page-writeback.c       |   49 ---
 7 files changed, 383 insertions(+), 646 deletions(-)


* [PATCH 01/23] btrfs: pass a flags argument to cow_file_range
  2023-06-28 15:31 btrfs compressed writeback cleanups Christoph Hellwig
@ 2023-06-28 15:31 ` Christoph Hellwig
  2023-07-04  8:47   ` Johannes Thumshirn
  2023-07-20 11:22   ` David Sterba
  2023-06-28 15:31 ` [PATCH 02/23] btrfs: don't create inline extents in fallback_to_cow Christoph Hellwig
                   ` (23 subsequent siblings)
  24 siblings, 2 replies; 40+ messages in thread
From: Christoph Hellwig @ 2023-06-28 15:31 UTC
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

The int used as a bool for unlock is not a very good way to describe
the behavior, and the next patch will have to add another behavior
modifier.  Switch to passing a flags argument instead, with an initial
CFR_KEEP_LOCKED flag that specifies the pages should always be kept
locked.  This is the inverse of the old unlock argument, so that the
exceptional behavior is the one that requires a flag.
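
As a minimal sketch of the new convention (a made-up userspace example,
not the kernel code):

    #include <stdio.h>

    #define CFR_KEEP_LOCKED	(1u << 0)	/* keep all pages locked */
    #define CFR_NOINLINE	(1u << 1)	/* added by the next patch */

    /* unlocking stays the default; flags mark the exceptions */
    static void cow_sketch(unsigned int flags)
    {
    	printf("unlock=%d noinline=%d\n",
    	       !(flags & CFR_KEEP_LOCKED), !!(flags & CFR_NOINLINE));
    }

    int main(void)
    {
    	cow_sketch(0);			/* old: unlock == 1 */
    	cow_sketch(CFR_KEEP_LOCKED);	/* old: unlock == 0 */
    	return 0;
    }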

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/inode.c | 51 ++++++++++++++++++++++--------------------------
 1 file changed, 23 insertions(+), 28 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index dbbb67293e345c..92a78940991fcb 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -124,11 +124,13 @@ static struct kmem_cache *btrfs_inode_cachep;
 
 static int btrfs_setsize(struct inode *inode, struct iattr *attr);
 static int btrfs_truncate(struct btrfs_inode *inode, bool skip_writeback);
+
+#define CFR_KEEP_LOCKED		(1 << 0)
 static noinline int cow_file_range(struct btrfs_inode *inode,
 				   struct page *locked_page,
 				   u64 start, u64 end, int *page_started,
-				   unsigned long *nr_written, int unlock,
-				   u64 *done_offset);
+				   unsigned long *nr_written, u64 *done_offset,
+				   u32 flags);
 static struct extent_map *create_io_em(struct btrfs_inode *inode, u64 start,
 				       u64 len, u64 orig_start, u64 block_start,
 				       u64 block_len, u64 orig_block_len,
@@ -1148,7 +1150,7 @@ static int submit_uncompressed_range(struct btrfs_inode *inode,
 	 * can directly submit them without interruption.
 	 */
 	ret = cow_file_range(inode, locked_page, start, end, &page_started,
-			     &nr_written, 0, NULL);
+			     &nr_written, NULL, CFR_KEEP_LOCKED);
 	/* Inline extent inserted, page gets unlocked and everything is done */
 	if (page_started)
 		return 0;
@@ -1362,25 +1364,18 @@ static u64 get_extent_allocation_hint(struct btrfs_inode *inode, u64 start,
  * locked_page is the page that writepage had locked already.  We use
  * it to make sure we don't do extra locks or unlocks.
  *
- * *page_started is set to one if we unlock locked_page and do everything
- * required to start IO on it.  It may be clean and already done with
- * IO when we return.
- *
- * When unlock == 1, we unlock the pages in successfully allocated regions.
- * When unlock == 0, we leave them locked for writing them out.
+ * When this function fails, it unlocks all pages except @locked_page.
  *
- * However, we unlock all the pages except @locked_page in case of failure.
+ * When this function successfully creates an inline extent, it sets page_started
+ * to 1 and unlocks all pages including locked_page and starts I/O on them.
+ * (In reality inline extents are limited to a single page, so locked_page is
+ * the only page handled anyway).
  *
- * In summary, page locking state will be as follow:
+ * When this function succeeds and creates a normal extent, the page locking
+ * status depends on the passed in flags:
  *
- * - page_started == 1 (return value)
- *     - All the pages are unlocked. IO is started.
- *     - Note that this can happen only on success
- * - unlock == 1
- *     - All the pages except @locked_page are unlocked in any case
- * - unlock == 0
- *     - On success, all the pages are locked for writing out them
- *     - On failure, all the pages except @locked_page are unlocked
+ * - If CFR_KEEP_LOCKED is set, all pages are kept locked.
+ * - Else all pages except for @locked_page are unlocked.
  *
  * When a failure happens in the second or later iteration of the
  * while-loop, the ordered extents created in previous iterations are kept
@@ -1391,8 +1386,8 @@ static u64 get_extent_allocation_hint(struct btrfs_inode *inode, u64 start,
 static noinline int cow_file_range(struct btrfs_inode *inode,
 				   struct page *locked_page,
 				   u64 start, u64 end, int *page_started,
-				   unsigned long *nr_written, int unlock,
-				   u64 *done_offset)
+				   unsigned long *nr_written, u64 *done_offset,
+				   u32 flags)
 {
 	struct btrfs_root *root = inode->root;
 	struct btrfs_fs_info *fs_info = root->fs_info;
@@ -1558,7 +1553,7 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
 		 * Do set the Ordered (Private2) bit so we know this page was
 		 * properly setup for writepage.
 		 */
-		page_ops = unlock ? PAGE_UNLOCK : 0;
+		page_ops = (flags & CFR_KEEP_LOCKED) ? 0 : PAGE_UNLOCK;
 		page_ops |= PAGE_SET_ORDERED;
 
 		extent_clear_unlock_delalloc(inode, start, start + ram_size - 1,
@@ -1627,10 +1622,10 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
 	 * EXTENT_DEFRAG | EXTENT_CLEAR_META_RESV are handled by the cleanup
 	 * function.
 	 *
-	 * However, in case of unlock == 0, we still need to unlock the pages
-	 * (except @locked_page) to ensure all the pages are unlocked.
+	 * However, in case of CFR_KEEP_LOCKED, we still need to unlock the
+	 * pages (except @locked_page) to ensure all the pages are unlocked.
 	 */
-	if (!unlock && orig_start < start) {
+	if ((flags & CFR_KEEP_LOCKED) && orig_start < start) {
 		if (!locked_page)
 			mapping_set_error(inode->vfs_inode.i_mapping, ret);
 		extent_clear_unlock_delalloc(inode, orig_start, start - 1,
@@ -1836,7 +1831,7 @@ static noinline int run_delalloc_zoned(struct btrfs_inode *inode,
 
 	while (start <= end) {
 		ret = cow_file_range(inode, locked_page, start, end, page_started,
-				     nr_written, 0, &done_offset);
+				     nr_written, &done_offset, CFR_KEEP_LOCKED);
 		if (ret && ret != -EAGAIN)
 			return ret;
 
@@ -1956,7 +1951,7 @@ static int fallback_to_cow(struct btrfs_inode *inode, struct page *locked_page,
 	}
 
 	return cow_file_range(inode, locked_page, start, end, page_started,
-			      nr_written, 1, NULL);
+			      nr_written, NULL, 0);
 }
 
 struct can_nocow_file_extent_args {
@@ -2433,7 +2428,7 @@ int btrfs_run_delalloc_range(struct btrfs_inode *inode, struct page *locked_page
 					 page_started, nr_written, wbc);
 	else
 		ret = cow_file_range(inode, locked_page, start, end,
-				     page_started, nr_written, 1, NULL);
+				     page_started, nr_written, NULL, 0);
 
 out:
 	ASSERT(ret <= 0);
-- 
2.39.2



* [PATCH 02/23] btrfs: don't create inline extents in fallback_to_cow
  2023-06-28 15:31 btrfs compressed writeback cleanups Christoph Hellwig
  2023-06-28 15:31 ` [PATCH 01/23] btrfs: pass a flags argument to cow_file_range Christoph Hellwig
@ 2023-06-28 15:31 ` Christoph Hellwig
  2023-06-28 15:31 ` [PATCH 03/23] btrfs: split page locking out of __process_pages_contig Christoph Hellwig
                   ` (22 subsequent siblings)
  24 siblings, 0 replies; 40+ messages in thread
From: Christoph Hellwig @ 2023-06-28 15:31 UTC
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

For nodatacow files, run_delalloc_nocow can still fall back to COW
allocations when required and calls the fallback_to_cow helper for
that.  For such an allocation we can have multiple ordered_extents:
some for existing extents that NOCOW overwrites and some for new
allocations that fallback_to_cow creates.  If one of the new extents is
an inline extent, the writepages code would have to avoid normal page
writeback for it, as indicated by the page_started return argument,
which run_delalloc_nocow can't return.  Fix this by never creating
inline extents from fallback_to_cow.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/inode.c | 34 ++++++++++++++++++++--------------
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 92a78940991fcb..cddf54bc330c44 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -126,6 +126,7 @@ static int btrfs_setsize(struct inode *inode, struct iattr *attr);
 static int btrfs_truncate(struct btrfs_inode *inode, bool skip_writeback);
 
 #define CFR_KEEP_LOCKED		(1 << 0)
+#define CFR_NOINLINE		(1 << 1)
 static noinline int cow_file_range(struct btrfs_inode *inode,
 				   struct page *locked_page,
 				   u64 start, u64 end, int *page_started,
@@ -1426,7 +1427,8 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
 	 * This means we can trigger inline extent even if we didn't want to.
 	 * So here we skip inline extent creation completely.
 	 */
-	if (start == 0 && fs_info->sectorsize == PAGE_SIZE) {
+	if (start == 0 && fs_info->sectorsize == PAGE_SIZE &&
+	    !(flags & CFR_NOINLINE)) {
 		u64 actual_end = min_t(u64, i_size_read(&inode->vfs_inode),
 				       end + 1);
 
@@ -1889,15 +1891,17 @@ static noinline int csum_exist_in_range(struct btrfs_fs_info *fs_info,
 }
 
 static int fallback_to_cow(struct btrfs_inode *inode, struct page *locked_page,
-			   const u64 start, const u64 end,
-			   int *page_started, unsigned long *nr_written)
+			   const u64 start, const u64 end)
 {
 	const bool is_space_ino = btrfs_is_free_space_inode(inode);
 	const bool is_reloc_ino = btrfs_is_data_reloc_root(inode->root);
 	const u64 range_bytes = end + 1 - start;
 	struct extent_io_tree *io_tree = &inode->io_tree;
+	int page_started = 0;
+	unsigned long nr_written;
 	u64 range_start = start;
 	u64 count;
+	int ret;
 
 	/*
 	 * If EXTENT_NORESERVE is set it means that when the buffered write was
@@ -1950,8 +1954,15 @@ static int fallback_to_cow(struct btrfs_inode *inode, struct page *locked_page,
 					 NULL);
 	}
 
-	return cow_file_range(inode, locked_page, start, end, page_started,
-			      nr_written, NULL, 0);
+	/*
+	 * Don't try to create inline extents, as a mix of inline extent that
+	 * is written out and unlocked directly and a normal nocow extent
+	 * doesn't work.
+	 */
+	ret = cow_file_range(inode, locked_page, start, end, &page_started,
+			     &nr_written, NULL, CFR_NOINLINE);
+	ASSERT(!page_started);
+	return ret;
 }
 
 struct can_nocow_file_extent_args {
@@ -2100,9 +2111,7 @@ static int can_nocow_file_extent(struct btrfs_path *path,
  */
 static noinline int run_delalloc_nocow(struct btrfs_inode *inode,
 				       struct page *locked_page,
-				       const u64 start, const u64 end,
-				       int *page_started,
-				       unsigned long *nr_written)
+				       const u64 start, const u64 end)
 {
 	struct btrfs_fs_info *fs_info = inode->root->fs_info;
 	struct btrfs_root *root = inode->root;
@@ -2270,8 +2279,7 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode,
 		 */
 		if (cow_start != (u64)-1) {
 			ret = fallback_to_cow(inode, locked_page,
-					      cow_start, found_key.offset - 1,
-					      page_started, nr_written);
+					      cow_start, found_key.offset - 1);
 			if (ret)
 				goto error;
 			cow_start = (u64)-1;
@@ -2352,8 +2360,7 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode,
 
 	if (cow_start != (u64)-1) {
 		cur_offset = end;
-		ret = fallback_to_cow(inode, locked_page, cow_start, end,
-				      page_started, nr_written);
+		ret = fallback_to_cow(inode, locked_page, cow_start, end);
 		if (ret)
 			goto error;
 	}
@@ -2412,8 +2419,7 @@ int btrfs_run_delalloc_range(struct btrfs_inode *inode, struct page *locked_page
 		 * preallocated inodes.
 		 */
 		ASSERT(!zoned || btrfs_is_data_reloc_root(inode->root));
-		ret = run_delalloc_nocow(inode, locked_page, start, end,
-					 page_started, nr_written);
+		ret = run_delalloc_nocow(inode, locked_page, start, end);
 		goto out;
 	}
 
-- 
2.39.2



* [PATCH 03/23] btrfs: split page locking out of __process_pages_contig
  2023-06-28 15:31 btrfs compressed writeback cleanups Christoph Hellwig
  2023-06-28 15:31 ` [PATCH 01/23] btrfs: pass a flags argument to cow_file_range Christoph Hellwig
  2023-06-28 15:31 ` [PATCH 02/23] btrfs: don't create inline extents in fallback_to_cow Christoph Hellwig
@ 2023-06-28 15:31 ` Christoph Hellwig
  2023-06-28 15:31 ` [PATCH 04/23] btrfs: remove btrfs_writepage_endio_finish_ordered Christoph Hellwig
                   ` (21 subsequent siblings)
  24 siblings, 0 replies; 40+ messages in thread
From: Christoph Hellwig @ 2023-06-28 15:31 UTC
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

There is a lot of complexity in __process_pages_contig to deal with the
PAGE_LOCK case, which, unlike all the other actions, can return an
error.

Open code the page iteration for page locking in lock_delalloc_pages and
remove all the now unused code from __process_pages_contig.
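
The open coded loop follows the usual folio_batch pattern; roughly (a
sketch with the btrfs page locking details left out):

    	struct folio_batch fbatch;
    	unsigned int found, i;

    	folio_batch_init(&fbatch);
    	while (index <= end_index) {
    		found = filemap_get_folios_contig(mapping, &index,
    						  end_index, &fbatch);
    		if (!found)
    			break;		/* gap in the range, bail out */
    		for (i = 0; i < found; i++) {
    			/* lock and validate fbatch.folios[i] here */
    		}
    		folio_batch_release(&fbatch);
    		cond_resched();
    	}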

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/extent_io.c | 149 +++++++++++++++++--------------------------
 fs/btrfs/extent_io.h |   1 -
 2 files changed, 59 insertions(+), 91 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index a91d5ad2798428..36c3ae947ae8e0 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -197,18 +197,9 @@ void extent_range_redirty_for_io(struct inode *inode, u64 start, u64 end)
 	}
 }
 
-/*
- * Process one page for __process_pages_contig().
- *
- * Return >0 if we hit @page == @locked_page.
- * Return 0 if we updated the page status.
- * Return -EGAIN if the we need to try again.
- * (For PAGE_LOCK case but got dirty page or page not belong to mapping)
- */
-static int process_one_page(struct btrfs_fs_info *fs_info,
-			    struct address_space *mapping,
-			    struct page *page, struct page *locked_page,
-			    unsigned long page_ops, u64 start, u64 end)
+static void process_one_page(struct btrfs_fs_info *fs_info,
+			     struct page *page, struct page *locked_page,
+			     unsigned long page_ops, u64 start, u64 end)
 {
 	u32 len;
 
@@ -224,29 +215,13 @@ static int process_one_page(struct btrfs_fs_info *fs_info,
 	if (page_ops & PAGE_END_WRITEBACK)
 		btrfs_page_clamp_clear_writeback(fs_info, page, start, len);
 
-	if (page == locked_page)
-		return 1;
-
-	if (page_ops & PAGE_LOCK) {
-		int ret;
-
-		ret = btrfs_page_start_writer_lock(fs_info, page, start, len);
-		if (ret)
-			return ret;
-		if (!PageDirty(page) || page->mapping != mapping) {
-			btrfs_page_end_writer_lock(fs_info, page, start, len);
-			return -EAGAIN;
-		}
-	}
-	if (page_ops & PAGE_UNLOCK)
+	if (page != locked_page && (page_ops & PAGE_UNLOCK))
 		btrfs_page_end_writer_lock(fs_info, page, start, len);
-	return 0;
 }
 
-static int __process_pages_contig(struct address_space *mapping,
-				  struct page *locked_page,
-				  u64 start, u64 end, unsigned long page_ops,
-				  u64 *processed_end)
+static void __process_pages_contig(struct address_space *mapping,
+				   struct page *locked_page, u64 start, u64 end,
+				   unsigned long page_ops)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(mapping->host->i_sb);
 	pgoff_t start_index = start >> PAGE_SHIFT;
@@ -254,64 +229,24 @@ static int __process_pages_contig(struct address_space *mapping,
 	pgoff_t index = start_index;
 	unsigned long pages_processed = 0;
 	struct folio_batch fbatch;
-	int err = 0;
 	int i;
 
-	if (page_ops & PAGE_LOCK) {
-		ASSERT(page_ops == PAGE_LOCK);
-		ASSERT(processed_end && *processed_end == start);
-	}
-
 	folio_batch_init(&fbatch);
 	while (index <= end_index) {
 		int found_folios;
 
 		found_folios = filemap_get_folios_contig(mapping, &index,
 				end_index, &fbatch);
-
-		if (found_folios == 0) {
-			/*
-			 * Only if we're going to lock these pages, we can find
-			 * nothing at @index.
-			 */
-			ASSERT(page_ops & PAGE_LOCK);
-			err = -EAGAIN;
-			goto out;
-		}
-
 		for (i = 0; i < found_folios; i++) {
-			int process_ret;
 			struct folio *folio = fbatch.folios[i];
-			process_ret = process_one_page(fs_info, mapping,
-					&folio->page, locked_page, page_ops,
-					start, end);
-			if (process_ret < 0) {
-				err = -EAGAIN;
-				folio_batch_release(&fbatch);
-				goto out;
-			}
+
+			process_one_page(fs_info, &folio->page, locked_page,
+					 page_ops, start, end);
 			pages_processed += folio_nr_pages(folio);
 		}
 		folio_batch_release(&fbatch);
 		cond_resched();
 	}
-out:
-	if (err && processed_end) {
-		/*
-		 * Update @processed_end. I know this is awful since it has
-		 * two different return value patterns (inclusive vs exclusive).
-		 *
-		 * But the exclusive pattern is necessary if @start is 0, or we
-		 * underflow and check against processed_end won't work as
-		 * expected.
-		 */
-		if (pages_processed)
-			*processed_end = min(end,
-			((u64)(start_index + pages_processed) << PAGE_SHIFT) - 1);
-		else
-			*processed_end = start;
-	}
-	return err;
 }
 
 static noinline void __unlock_for_delalloc(struct inode *inode,
@@ -326,29 +261,63 @@ static noinline void __unlock_for_delalloc(struct inode *inode,
 		return;
 
 	__process_pages_contig(inode->i_mapping, locked_page, start, end,
-			       PAGE_UNLOCK, NULL);
+			       PAGE_UNLOCK);
 }
 
 static noinline int lock_delalloc_pages(struct inode *inode,
 					struct page *locked_page,
-					u64 delalloc_start,
-					u64 delalloc_end)
+					u64 start,
+					u64 end)
 {
-	unsigned long index = delalloc_start >> PAGE_SHIFT;
-	unsigned long end_index = delalloc_end >> PAGE_SHIFT;
-	u64 processed_end = delalloc_start;
-	int ret;
+	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
+	struct address_space *mapping = inode->i_mapping;
+	pgoff_t start_index = start >> PAGE_SHIFT;
+	pgoff_t end_index = end >> PAGE_SHIFT;
+	pgoff_t index = start_index;
+	u64 processed_end = start;
+	struct folio_batch fbatch;
 
-	ASSERT(locked_page);
 	if (index == locked_page->index && index == end_index)
 		return 0;
 
-	ret = __process_pages_contig(inode->i_mapping, locked_page, delalloc_start,
-				     delalloc_end, PAGE_LOCK, &processed_end);
-	if (ret == -EAGAIN && processed_end > delalloc_start)
-		__unlock_for_delalloc(inode, locked_page, delalloc_start,
-				      processed_end);
-	return ret;
+	folio_batch_init(&fbatch);
+	while (index <= end_index) {
+		unsigned int found_folios, i;
+
+		found_folios = filemap_get_folios_contig(mapping, &index,
+				end_index, &fbatch);
+		if (found_folios == 0)
+			goto out;
+
+		for (i = 0; i < found_folios; i++) {
+			struct page *page = &fbatch.folios[i]->page;
+			u32 len = end + 1 - start;
+
+			if (page == locked_page)
+				continue;
+
+			if (btrfs_page_start_writer_lock(fs_info, page, start,
+							 len))
+				goto out;
+
+			if (!PageDirty(page) || page->mapping != mapping) {
+				btrfs_page_end_writer_lock(fs_info, page, start,
+							   len);
+				goto out;
+			}
+
+			processed_end = page_offset(page) + PAGE_SIZE - 1;
+		}
+		folio_batch_release(&fbatch);
+		cond_resched();
+	}
+
+	return 0;
+out:
+	folio_batch_release(&fbatch);
+	if (processed_end > start)
+		__unlock_for_delalloc(inode, locked_page, start, processed_end);
+	return -EAGAIN;
 }
 
 /*
@@ -467,7 +436,7 @@ void extent_clear_unlock_delalloc(struct btrfs_inode *inode, u64 start, u64 end,
 	clear_extent_bit(&inode->io_tree, start, end, clear_bits, NULL);
 
 	__process_pages_contig(inode->vfs_inode.i_mapping, locked_page,
-			       start, end, page_ops, NULL);
+			       start, end, page_ops);
 }
 
 static bool btrfs_verify_page(struct page *page, u64 start)
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index c5fae3a7d911bf..285754154fdc5c 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -40,7 +40,6 @@ enum {
 	ENUM_BIT(PAGE_START_WRITEBACK),
 	ENUM_BIT(PAGE_END_WRITEBACK),
 	ENUM_BIT(PAGE_SET_ORDERED),
-	ENUM_BIT(PAGE_LOCK),
 };
 
 /*
-- 
2.39.2



* [PATCH 04/23] btrfs: remove btrfs_writepage_endio_finish_ordered
  2023-06-28 15:31 btrfs compressed writeback cleanups Christoph Hellwig
                   ` (2 preceding siblings ...)
  2023-06-28 15:31 ` [PATCH 03/23] btrfs: split page locking out of __process_pages_contig Christoph Hellwig
@ 2023-06-28 15:31 ` Christoph Hellwig
  2023-07-04  8:50   ` Johannes Thumshirn
  2023-06-28 15:31 ` [PATCH 05/23] btrfs: remove end_extent_writepage Christoph Hellwig
                   ` (20 subsequent siblings)
  24 siblings, 1 reply; 40+ messages in thread
From: Christoph Hellwig @ 2023-06-28 15:31 UTC
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

btrfs_writepage_endio_finish_ordered is a small wrapper around
btrfs_mark_ordered_io_finished that just changes the argument passing
slightly and adds a tracepoint.

Move the tracepoint to btrfs_mark_ordered_io_finished, which means it
now also covers the error handling in btrfs_cleanup_ordered_extent, and
switch all callers to just call btrfs_mark_ordered_io_finished
directly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/btrfs_inode.h  |  3 ---
 fs/btrfs/extent_io.c    | 17 ++++++++---------
 fs/btrfs/inode.c        |  9 ---------
 fs/btrfs/ordered-data.c |  4 ++++
 4 files changed, 12 insertions(+), 21 deletions(-)

diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h
index d47a927b3504d6..90e60ad9db6200 100644
--- a/fs/btrfs/btrfs_inode.h
+++ b/fs/btrfs/btrfs_inode.h
@@ -501,9 +501,6 @@ int btrfs_run_delalloc_range(struct btrfs_inode *inode, struct page *locked_page
 			     u64 start, u64 end, int *page_started,
 			     unsigned long *nr_written, struct writeback_control *wbc);
 int btrfs_writepage_cow_fixup(struct page *page);
-void btrfs_writepage_endio_finish_ordered(struct btrfs_inode *inode,
-					  struct page *page, u64 start,
-					  u64 end, bool uptodate);
 int btrfs_encoded_io_compression_from_extent(struct btrfs_fs_info *fs_info,
 					     int compress_type);
 int btrfs_encoded_read_regular_fill_pages(struct btrfs_inode *inode,
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 36c3ae947ae8e0..af05237dc2f186 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -473,17 +473,15 @@ void end_extent_writepage(struct page *page, int err, u64 start, u64 end)
 	struct btrfs_inode *inode;
 	const bool uptodate = (err == 0);
 	int ret = 0;
+	u32 len = end + 1 - start;
 
+	ASSERT(end + 1 - start <= U32_MAX);
 	ASSERT(page && page->mapping);
 	inode = BTRFS_I(page->mapping->host);
-	btrfs_writepage_endio_finish_ordered(inode, page, start, end, uptodate);
+	btrfs_mark_ordered_io_finished(inode, page, start, len, uptodate);
 
 	if (!uptodate) {
 		const struct btrfs_fs_info *fs_info = inode->root->fs_info;
-		u32 len;
-
-		ASSERT(end + 1 - start <= U32_MAX);
-		len = end + 1 - start;
 
 		btrfs_page_clear_uptodate(fs_info, page, start, len);
 		ret = err < 0 ? err : -EIO;
@@ -1328,6 +1326,7 @@ static noinline_for_stack int __extent_writepage_io(struct btrfs_inode *inode,
 
 	bio_ctrl->end_io_func = end_bio_extent_writepage;
 	while (cur <= end) {
+		u32 len = end - cur + 1;
 		u64 disk_bytenr;
 		u64 em_end;
 		u64 dirty_range_start = cur;
@@ -1335,8 +1334,8 @@ static noinline_for_stack int __extent_writepage_io(struct btrfs_inode *inode,
 		u32 iosize;
 
 		if (cur >= i_size) {
-			btrfs_writepage_endio_finish_ordered(inode, page, cur,
-							     end, true);
+			btrfs_mark_ordered_io_finished(inode, page, cur, len,
+						       true);
 			/*
 			 * This range is beyond i_size, thus we don't need to
 			 * bother writing back.
@@ -1345,7 +1344,7 @@ static noinline_for_stack int __extent_writepage_io(struct btrfs_inode *inode,
 			 * writeback the sectors with subpage dirty bits,
 			 * causing writeback without ordered extent.
 			 */
-			btrfs_page_clear_dirty(fs_info, page, cur, end + 1 - cur);
+			btrfs_page_clear_dirty(fs_info, page, cur, len);
 			break;
 		}
 
@@ -1356,7 +1355,7 @@ static noinline_for_stack int __extent_writepage_io(struct btrfs_inode *inode,
 			continue;
 		}
 
-		em = btrfs_get_extent(inode, NULL, 0, cur, end - cur + 1);
+		em = btrfs_get_extent(inode, NULL, 0, cur, len);
 		if (IS_ERR(em)) {
 			ret = PTR_ERR_OR_ZERO(em);
 			goto out_error;
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index cddf54bc330c44..b158db44b268a6 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -3385,15 +3385,6 @@ int btrfs_finish_ordered_io(struct btrfs_ordered_extent *ordered)
 	return btrfs_finish_one_ordered(ordered);
 }
 
-void btrfs_writepage_endio_finish_ordered(struct btrfs_inode *inode,
-					  struct page *page, u64 start,
-					  u64 end, bool uptodate)
-{
-	trace_btrfs_writepage_end_io_hook(inode, start, end, uptodate);
-
-	btrfs_mark_ordered_io_finished(inode, page, start, end + 1 - start, uptodate);
-}
-
 /*
  * Verify the checksum for a single sector without any extra action that depend
  * on the type of I/O.
diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index a629532283bc33..109e80ed25b669 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -410,6 +410,10 @@ void btrfs_mark_ordered_io_finished(struct btrfs_inode *inode,
 	unsigned long flags;
 	u64 cur = file_offset;
 
+	trace_btrfs_writepage_end_io_hook(inode, file_offset,
+					  file_offset + num_bytes - 1,
+					  uptodate);
+
 	spin_lock_irqsave(&tree->lock, flags);
 	while (cur < file_offset + num_bytes) {
 		u64 entry_end;
-- 
2.39.2



* [PATCH 05/23] btrfs: remove end_extent_writepage
  2023-06-28 15:31 btrfs compressed writeback cleanups Christoph Hellwig
                   ` (3 preceding siblings ...)
  2023-06-28 15:31 ` [PATCH 04/23] btrfs: remove btrfs_writepage_endio_finish_ordered Christoph Hellwig
@ 2023-06-28 15:31 ` Christoph Hellwig
  2023-06-28 15:31 ` [PATCH 06/23] btrfs: reduce debug spam from submit_compressed_extents Christoph Hellwig
                   ` (19 subsequent siblings)
  24 siblings, 0 replies; 40+ messages in thread
From: Christoph Hellwig @ 2023-06-28 15:31 UTC
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

end_extent_writepage is a small helper that combines a call to
btrfs_mark_ordered_io_finished with conditional error-only calls to
btrfs_page_clear_uptodate and mapping_set_error, using a somewhat
unfortunate calling convention that passes an inclusive end instead
of the len expected by the underlying functions.

Remove end_extent_writepage and open code it in the 4 callers.  Of
those, two already are error-only and thus don't need the extra
conditional, and one already has the mapping_set_error call, so a
duplicate call can be avoided.
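
With that, the open coded error path at each call site looks roughly
like this (sketch; len, fs_info and ret come from the surrounding
function):

    	if (ret) {
    		btrfs_mark_ordered_io_finished(inode, page, start, len,
    					       !ret);
    		btrfs_page_clear_uptodate(fs_info, page, start, len);
    		mapping_set_error(page->mapping, ret);
    	}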

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/extent_io.c | 44 +++++++++++++++-----------------------------
 fs/btrfs/extent_io.h |  2 --
 fs/btrfs/inode.c     | 42 ++++++++++++++++++++++--------------------
 3 files changed, 37 insertions(+), 51 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index af05237dc2f186..5a4f5fc09a2354 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -466,29 +466,6 @@ static void end_page_read(struct page *page, bool uptodate, u64 start, u32 len)
 		btrfs_subpage_end_reader(fs_info, page, start, len);
 }
 
-/* lots and lots of room for performance fixes in the end_bio funcs */
-
-void end_extent_writepage(struct page *page, int err, u64 start, u64 end)
-{
-	struct btrfs_inode *inode;
-	const bool uptodate = (err == 0);
-	int ret = 0;
-	u32 len = end + 1 - start;
-
-	ASSERT(end + 1 - start <= U32_MAX);
-	ASSERT(page && page->mapping);
-	inode = BTRFS_I(page->mapping->host);
-	btrfs_mark_ordered_io_finished(inode, page, start, len, uptodate);
-
-	if (!uptodate) {
-		const struct btrfs_fs_info *fs_info = inode->root->fs_info;
-
-		btrfs_page_clear_uptodate(fs_info, page, start, len);
-		ret = err < 0 ? err : -EIO;
-		mapping_set_error(page->mapping, ret);
-	}
-}
-
 /*
  * after a writepage IO is done, we need to:
  * clear the uptodate bits on error
@@ -1431,7 +1408,6 @@ static int __extent_writepage(struct page *page, struct btrfs_bio_ctrl *bio_ctrl
 	struct folio *folio = page_folio(page);
 	struct inode *inode = page->mapping->host;
 	const u64 page_start = page_offset(page);
-	const u64 page_end = page_start + PAGE_SIZE - 1;
 	int ret;
 	int nr = 0;
 	size_t pg_offset;
@@ -1475,8 +1451,13 @@ static int __extent_writepage(struct page *page, struct btrfs_bio_ctrl *bio_ctrl
 		set_page_writeback(page);
 		end_page_writeback(page);
 	}
-	if (ret)
-		end_extent_writepage(page, ret, page_start, page_end);
+	if (ret) {
+		btrfs_mark_ordered_io_finished(BTRFS_I(inode), page, page_start,
+					       PAGE_SIZE, !ret);
+		btrfs_page_clear_uptodate(btrfs_sb(inode->i_sb), page,
+					  page_start, PAGE_SIZE);
+		mapping_set_error(page->mapping, ret);
+	}
 	unlock_page(page);
 	ASSERT(ret <= 0);
 	return ret;
@@ -2194,6 +2175,7 @@ int extent_write_locked_range(struct inode *inode, u64 start, u64 end,
 
 	while (cur <= end) {
 		u64 cur_end = min(round_down(cur, PAGE_SIZE) + PAGE_SIZE - 1, end);
+		u32 cur_len = cur_end + 1 - cur;
 		struct page *page;
 		int nr = 0;
 
@@ -2217,9 +2199,13 @@ int extent_write_locked_range(struct inode *inode, u64 start, u64 end,
 			set_page_writeback(page);
 			end_page_writeback(page);
 		}
-		if (ret)
-			end_extent_writepage(page, ret, cur, cur_end);
-		btrfs_page_unlock_writer(fs_info, page, cur, cur_end + 1 - cur);
+		if (ret) {
+			btrfs_mark_ordered_io_finished(BTRFS_I(inode), page,
+						       cur, cur_len, !ret);
+			btrfs_page_clear_uptodate(fs_info, page, cur, cur_len);
+			mapping_set_error(page->mapping, ret);
+		}
+		btrfs_page_unlock_writer(fs_info, page, cur, cur_len);
 		if (ret < 0) {
 			found_error = true;
 			first_error = ret;
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 285754154fdc5c..8d11e17c0be9fa 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -276,8 +276,6 @@ void btrfs_clear_buffer_dirty(struct btrfs_trans_handle *trans,
 
 int btrfs_alloc_page_array(unsigned int nr_pages, struct page **page_array);
 
-void end_extent_writepage(struct page *page, int err, u64 start, u64 end);
-
 #ifdef CONFIG_BTRFS_FS_RUN_SANITY_TESTS
 bool find_lock_delalloc_range(struct inode *inode,
 			     struct page *locked_page, u64 *start,
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index b158db44b268a6..d746b0fe0f994b 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -426,11 +426,10 @@ static inline void btrfs_cleanup_ordered_extents(struct btrfs_inode *inode,
 
 	while (index <= end_index) {
 		/*
-		 * For locked page, we will call end_extent_writepage() on it
-		 * in run_delalloc_range() for the error handling.  That
-		 * end_extent_writepage() function will call
-		 * btrfs_mark_ordered_io_finished() to clear page Ordered and
-		 * run the ordered extent accounting.
+		 * For locked page, we will call
+		 * btrfs_mark_ordered_io_finished() on it in
+		 * run_delalloc_range() for the error handling, which will
+		 * clear page Ordered and run the ordered extent accounting.
 		 *
 		 * Here we can't just clear the Ordered bit, or
 		 * btrfs_mark_ordered_io_finished() would skip the accounting
@@ -1160,11 +1159,16 @@ static int submit_uncompressed_range(struct btrfs_inode *inode,
 		btrfs_cleanup_ordered_extents(inode, locked_page, start, end - start + 1);
 		if (locked_page) {
 			const u64 page_start = page_offset(locked_page);
-			const u64 page_end = page_start + PAGE_SIZE - 1;
 
 			set_page_writeback(locked_page);
 			end_page_writeback(locked_page);
-			end_extent_writepage(locked_page, ret, page_start, page_end);
+			btrfs_mark_ordered_io_finished(inode, locked_page,
+						       page_start, PAGE_SIZE,
+						       !ret);
+			btrfs_page_clear_uptodate(inode->root->fs_info,
+						  locked_page, page_start,
+						  PAGE_SIZE);
+			mapping_set_error(locked_page->mapping, ret);
 			unlock_page(locked_page);
 		}
 		return ret;
@@ -2841,23 +2845,19 @@ struct btrfs_writepage_fixup {
 
 static void btrfs_writepage_fixup_worker(struct btrfs_work *work)
 {
-	struct btrfs_writepage_fixup *fixup;
+	struct btrfs_writepage_fixup *fixup =
+		container_of(work, struct btrfs_writepage_fixup, work);
 	struct btrfs_ordered_extent *ordered;
 	struct extent_state *cached_state = NULL;
 	struct extent_changeset *data_reserved = NULL;
-	struct page *page;
-	struct btrfs_inode *inode;
-	u64 page_start;
-	u64 page_end;
+	struct page *page = fixup->page;
+	struct btrfs_inode *inode = fixup->inode;
+	struct btrfs_fs_info *fs_info = inode->root->fs_info;
+	u64 page_start = page_offset(page);
+	u64 page_end = page_offset(page) + PAGE_SIZE - 1;
 	int ret = 0;
 	bool free_delalloc_space = true;
 
-	fixup = container_of(work, struct btrfs_writepage_fixup, work);
-	page = fixup->page;
-	inode = fixup->inode;
-	page_start = page_offset(page);
-	page_end = page_offset(page) + PAGE_SIZE - 1;
-
 	/*
 	 * This is similar to page_mkwrite, we need to reserve the space before
 	 * we take the page lock.
@@ -2950,10 +2950,12 @@ static void btrfs_writepage_fixup_worker(struct btrfs_work *work)
 		 * to reflect the errors and clean the page.
 		 */
 		mapping_set_error(page->mapping, ret);
-		end_extent_writepage(page, ret, page_start, page_end);
+		btrfs_mark_ordered_io_finished(inode, page, page_start,
+					       PAGE_SIZE, !ret);
+		btrfs_page_clear_uptodate(fs_info, page, page_start, PAGE_SIZE);
 		clear_page_dirty_for_io(page);
 	}
-	btrfs_page_clear_checked(inode->root->fs_info, page, page_start, PAGE_SIZE);
+	btrfs_page_clear_checked(fs_info, page, page_start, PAGE_SIZE);
 	unlock_page(page);
 	put_page(page);
 	kfree(fixup);
-- 
2.39.2



* [PATCH 06/23] btrfs: reduce debug spam from submit_compressed_extents
  2023-06-28 15:31 btrfs compressed writeback cleanups Christoph Hellwig
                   ` (4 preceding siblings ...)
  2023-06-28 15:31 ` [PATCH 05/23] btrfs: remove end_extent_writepage Christoph Hellwig
@ 2023-06-28 15:31 ` Christoph Hellwig
  2023-07-04  8:54   ` Johannes Thumshirn
  2023-06-28 15:31 ` [PATCH 07/23] btrfs: remove the return value from submit_uncompressed_range Christoph Hellwig
                   ` (18 subsequent siblings)
  24 siblings, 1 reply; 40+ messages in thread
From: Christoph Hellwig @ 2023-06-28 15:31 UTC
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

Move the printk that is supposed to help debug failures in
submit_one_async_extent into submit_one_async_extent itself, and make
it conditional on actually having an error condition instead of
spamming the log unconditionally.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/inode.c | 33 +++++++++++++--------------------
 1 file changed, 13 insertions(+), 20 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index d746b0fe0f994b..0f709f766b6a94 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1181,11 +1181,11 @@ static int submit_uncompressed_range(struct btrfs_inode *inode,
 	return ret;
 }
 
-static int submit_one_async_extent(struct btrfs_inode *inode,
-				   struct async_chunk *async_chunk,
-				   struct async_extent *async_extent,
-				   u64 *alloc_hint)
+static void submit_one_async_extent(struct async_chunk *async_chunk,
+				    struct async_extent *async_extent,
+				    u64 *alloc_hint)
 {
+	struct btrfs_inode *inode = async_chunk->inode;
 	struct extent_io_tree *io_tree = &inode->io_tree;
 	struct btrfs_root *root = inode->root;
 	struct btrfs_fs_info *fs_info = root->fs_info;
@@ -1279,7 +1279,7 @@ static int submit_one_async_extent(struct btrfs_inode *inode,
 	if (async_chunk->blkcg_css)
 		kthread_associate_blkcg(NULL);
 	kfree(async_extent);
-	return ret;
+	return;
 
 out_free_reserve:
 	btrfs_dec_block_group_reservations(fs_info, ins.objectid);
@@ -1293,7 +1293,13 @@ static int submit_one_async_extent(struct btrfs_inode *inode,
 				     PAGE_UNLOCK | PAGE_START_WRITEBACK |
 				     PAGE_END_WRITEBACK);
 	free_async_extent_pages(async_extent);
-	goto done;
+	if (async_chunk->blkcg_css)
+		kthread_associate_blkcg(NULL);
+	btrfs_debug(fs_info,
+"async extent submission failed root=%lld inode=%llu start=%llu len=%llu ret=%d",
+		    root->root_key.objectid, btrfs_ino(inode), start,
+		    async_extent->ram_size, ret);
+	kfree(async_extent);
 }
 
 /*
@@ -1303,28 +1309,15 @@ static int submit_one_async_extent(struct btrfs_inode *inode,
  */
 static noinline void submit_compressed_extents(struct async_chunk *async_chunk)
 {
-	struct btrfs_inode *inode = async_chunk->inode;
-	struct btrfs_fs_info *fs_info = inode->root->fs_info;
 	struct async_extent *async_extent;
 	u64 alloc_hint = 0;
-	int ret = 0;
 
 	while (!list_empty(&async_chunk->extents)) {
-		u64 extent_start;
-		u64 ram_size;
-
 		async_extent = list_entry(async_chunk->extents.next,
 					  struct async_extent, list);
 		list_del(&async_extent->list);
-		extent_start = async_extent->start;
-		ram_size = async_extent->ram_size;
 
-		ret = submit_one_async_extent(inode, async_chunk, async_extent,
-					      &alloc_hint);
-		btrfs_debug(fs_info,
-"async extent submission failed root=%lld inode=%llu start=%llu len=%llu ret=%d",
-			    inode->root->root_key.objectid,
-			    btrfs_ino(inode), extent_start, ram_size, ret);
+		submit_one_async_extent(async_chunk, async_extent, &alloc_hint);
 	}
 }
 
-- 
2.39.2



* [PATCH 07/23] btrfs: remove the return value from submit_uncompressed_range
  2023-06-28 15:31 btrfs compressed writeback cleanups Christoph Hellwig
                   ` (5 preceding siblings ...)
  2023-06-28 15:31 ` [PATCH 06/23] btrfs: reduce debug spam from submit_compressed_extents Christoph Hellwig
@ 2023-06-28 15:31 ` Christoph Hellwig
  2023-07-04  8:56   ` Johannes Thumshirn
  2023-06-28 15:31 ` [PATCH 08/23] btrfs: remove the return value from extent_write_locked_range Christoph Hellwig
                   ` (17 subsequent siblings)
  24 siblings, 1 reply; 40+ messages in thread
From: Christoph Hellwig @ 2023-06-28 15:31 UTC
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

The return value from submit_uncompressed_range is ignored, and that's
fine because the error reporting happens through the mapping and
ordered_extent.
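
For reference, errors recorded with mapping_set_error() are not lost:
they are stored on the mapping and reported back to user space later,
for example at fsync time (a fragmentary sketch of the generic
mechanism, not btrfs-specific code):

    	mapping_set_error(mapping, -EIO);	/* record on the mapping */

    	/* later, e.g. on the fsync path: */
    	err = filemap_check_errors(mapping);	/* reports and clears -EIO */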

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/inode.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 0f709f766b6a94..c6845b0591b77e 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1126,9 +1126,9 @@ static void free_async_extent_pages(struct async_extent *async_extent)
 	async_extent->pages = NULL;
 }
 
-static int submit_uncompressed_range(struct btrfs_inode *inode,
-				     struct async_extent *async_extent,
-				     struct page *locked_page)
+static void submit_uncompressed_range(struct btrfs_inode *inode,
+				      struct async_extent *async_extent,
+				      struct page *locked_page)
 {
 	u64 start = async_extent->start;
 	u64 end = async_extent->start + async_extent->ram_size - 1;
@@ -1153,7 +1153,7 @@ static int submit_uncompressed_range(struct btrfs_inode *inode,
 			     &nr_written, NULL, CFR_KEEP_LOCKED);
 	/* Inline extent inserted, page gets unlocked and everything is done */
 	if (page_started)
-		return 0;
+		return;
 
 	if (ret < 0) {
 		btrfs_cleanup_ordered_extents(inode, locked_page, start, end - start + 1);
@@ -1171,14 +1171,13 @@ static int submit_uncompressed_range(struct btrfs_inode *inode,
 			mapping_set_error(locked_page->mapping, ret);
 			unlock_page(locked_page);
 		}
-		return ret;
+		return;
 	}
 
 	/* All pages will be unlocked, including @locked_page */
 	wbc_attach_fdatawrite_inode(&wbc, &inode->vfs_inode);
-	ret = extent_write_locked_range(&inode->vfs_inode, start, end, &wbc);
+	extent_write_locked_range(&inode->vfs_inode, start, end, &wbc);
 	wbc_detach_inode(&wbc);
-	return ret;
 }
 
 static void submit_one_async_extent(struct async_chunk *async_chunk,
@@ -1215,7 +1214,7 @@ static void submit_one_async_extent(struct async_chunk *async_chunk,
 
 	/* We have fall back to uncompressed write */
 	if (!async_extent->pages) {
-		ret = submit_uncompressed_range(inode, async_extent, locked_page);
+		submit_uncompressed_range(inode, async_extent, locked_page);
 		goto done;
 	}
 
-- 
2.39.2



* [PATCH 08/23] btrfs: remove the return value from extent_write_locked_range
  2023-06-28 15:31 btrfs compressed writeback cleanups Christoph Hellwig
                   ` (6 preceding siblings ...)
  2023-06-28 15:31 ` [PATCH 07/23] btrfs: remove the return value from submit_uncompressed_range Christoph Hellwig
@ 2023-06-28 15:31 ` Christoph Hellwig
  2023-07-04  8:56   ` Johannes Thumshirn
  2023-06-28 15:31 ` [PATCH 09/23] btrfs: improve the delalloc_to_write calculation in writepage_delalloc Christoph Hellwig
                   ` (16 subsequent siblings)
  24 siblings, 1 reply; 40+ messages in thread
From: Christoph Hellwig @ 2023-06-28 15:31 UTC
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

The return value from extent_write_locked_range is ignored, and that's
fine because the error reporting happens through the mapping and
ordered_extent.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/extent_io.c | 13 +++----------
 fs/btrfs/extent_io.h |  4 ++--
 2 files changed, 5 insertions(+), 12 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 5a4f5fc09a2354..e32ec41bade681 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2152,11 +2152,10 @@ static int extent_write_cache_pages(struct address_space *mapping,
  * already been ran (aka, ordered extent inserted) and all pages are still
  * locked.
  */
-int extent_write_locked_range(struct inode *inode, u64 start, u64 end,
-			      struct writeback_control *wbc)
+void extent_write_locked_range(struct inode *inode, u64 start, u64 end,
+			       struct writeback_control *wbc)
 {
 	bool found_error = false;
-	int first_error = 0;
 	int ret = 0;
 	struct address_space *mapping = inode->i_mapping;
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
@@ -2206,20 +2205,14 @@ int extent_write_locked_range(struct inode *inode, u64 start, u64 end,
 			mapping_set_error(page->mapping, ret);
 		}
 		btrfs_page_unlock_writer(fs_info, page, cur, cur_len);
-		if (ret < 0) {
+		if (ret < 0)
 			found_error = true;
-			first_error = ret;
-		}
 next_page:
 		put_page(page);
 		cur = cur_end + 1;
 	}
 
 	submit_write_bio(&bio_ctrl, found_error ? ret : 0);
-
-	if (found_error)
-		return first_error;
-	return ret;
 }
 
 int extent_writepages(struct address_space *mapping,
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 8d11e17c0be9fa..0312022bbf4b7a 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -177,8 +177,8 @@ int try_release_extent_mapping(struct page *page, gfp_t mask);
 int try_release_extent_buffer(struct page *page);
 
 int btrfs_read_folio(struct file *file, struct folio *folio);
-int extent_write_locked_range(struct inode *inode, u64 start, u64 end,
-			      struct writeback_control *wbc);
+void extent_write_locked_range(struct inode *inode, u64 start, u64 end,
+			       struct writeback_control *wbc);
 int extent_writepages(struct address_space *mapping,
 		      struct writeback_control *wbc);
 int btree_write_cache_pages(struct address_space *mapping,
-- 
2.39.2



* [PATCH 09/23] btrfs: improve the delalloc_to_write calculation in writepage_delalloc
  2023-06-28 15:31 btrfs compressed writeback cleanups Christoph Hellwig
                   ` (7 preceding siblings ...)
  2023-06-28 15:31 ` [PATCH 08/23] btrfs: remove the return value from extent_write_locked_range Christoph Hellwig
@ 2023-06-28 15:31 ` Christoph Hellwig
  2023-06-28 15:31 ` [PATCH 10/23] btrfs: reduce the number of arguments to btrfs_run_delalloc_range Christoph Hellwig
                   ` (15 subsequent siblings)
  24 siblings, 0 replies; 40+ messages in thread
From: Christoph Hellwig @ 2023-06-28 15:31 UTC
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

Currently writepage_delalloc adds to delalloc_to_write in every loop
iteration.  That is not only more work than doing it once after the
loop, but can also over-increment the counter due to rounding errors
when a new loop iteration starts with an offset into a page.

Add a new page_start variable instead of recalculating that value over
and over, move the delalloc_to_write calculation out of the loop, use
the DIV_ROUND_UP helper instead of open coding it, and remove the
pointless found local variable.
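
A concrete example of the over-increment (hypothetical numbers, 4K
pages): two delalloc ranges [0, 2047] and [2048, 4095] within the same
page.  The old per-iteration formula counts the page twice, the new one
counts it once:

    	/*
    	 * old, per iteration:
    	 * (2047 - 0 + 4096) >> 12 == 1 and (4095 - 2048 + 4096) >> 12 == 1,
    	 * so the single page is counted twice.
    	 *
    	 * new, once after the loop:
    	 * DIV_ROUND_UP(4095 + 1 - 0, 4096) == 1
    	 */
    	delalloc_to_write +=
    		DIV_ROUND_UP(delalloc_end + 1 - page_start, PAGE_SIZE);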

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/extent_io.c | 29 ++++++++++++++---------------
 1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index e32ec41bade681..aa2f88365ad05a 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -1164,8 +1164,10 @@ static inline void contiguous_readpages(struct page *pages[], int nr_pages,
 static noinline_for_stack int writepage_delalloc(struct btrfs_inode *inode,
 		struct page *page, struct writeback_control *wbc)
 {
-	const u64 page_end = page_offset(page) + PAGE_SIZE - 1;
-	u64 delalloc_start = page_offset(page);
+	const u64 page_start = page_offset(page);
+	const u64 page_end = page_start + PAGE_SIZE - 1;
+	u64 delalloc_start = page_start;
+	u64 delalloc_end = page_end;
 	u64 delalloc_to_write = 0;
 	/* How many pages are started by btrfs_run_delalloc_range() */
 	unsigned long nr_written = 0;
@@ -1173,13 +1175,9 @@ static noinline_for_stack int writepage_delalloc(struct btrfs_inode *inode,
 	int page_started = 0;
 
 	while (delalloc_start < page_end) {
-		u64 delalloc_end = page_end;
-		bool found;
-
-		found = find_lock_delalloc_range(&inode->vfs_inode, page,
-					       &delalloc_start,
-					       &delalloc_end);
-		if (!found) {
+		delalloc_end = page_end;
+		if (!find_lock_delalloc_range(&inode->vfs_inode, page,
+					      &delalloc_start, &delalloc_end)) {
 			delalloc_start = delalloc_end + 1;
 			continue;
 		}
@@ -1188,14 +1186,15 @@ static noinline_for_stack int writepage_delalloc(struct btrfs_inode *inode,
 		if (ret)
 			return ret;
 
-		/*
-		 * delalloc_end is already one less than the total length, so
-		 * we don't subtract one from PAGE_SIZE
-		 */
-		delalloc_to_write += (delalloc_end - delalloc_start +
-				      PAGE_SIZE) >> PAGE_SHIFT;
 		delalloc_start = delalloc_end + 1;
 	}
+
+	/*
+	 * delalloc_end is already one less than the total length, so
+	 * we don't subtract one from PAGE_SIZE
+	 */
+	delalloc_to_write +=
+		DIV_ROUND_UP(delalloc_end + 1 - page_start, PAGE_SIZE);
 	if (wbc->nr_to_write < delalloc_to_write) {
 		int thresh = 8192;
 
-- 
2.39.2



* [PATCH 10/23] btrfs: reduce the number of arguments to btrfs_run_delalloc_range
  2023-06-28 15:31 btrfs compressed writeback cleanups Christoph Hellwig
                   ` (8 preceding siblings ...)
  2023-06-28 15:31 ` [PATCH 09/23] btrfs: improve the delalloc_to_write calculation in writepage_delalloc Christoph Hellwig
@ 2023-06-28 15:31 ` Christoph Hellwig
  2023-07-04  9:00   ` Johannes Thumshirn
  2023-06-28 15:31 ` [PATCH 11/23] btrfs: clean up the check for uncompressed ranges in submit_one_async_extent Christoph Hellwig
                   ` (14 subsequent siblings)
  24 siblings, 1 reply; 40+ messages in thread
From: Christoph Hellwig @ 2023-06-28 15:31 UTC
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

Instead of a separate page_started argument that tells the callers that
btrfs_run_delalloc_range already started writeback by itself, overload
the return value with a positive 1 in addition to 0 and a negative
error code to indicate that it has already started writeback, and
remove the nr_written argument, as the caller can calculate it directly
based on the range, and in fact already does so for the case where
writeback wasn't started yet.
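
The resulting calling convention, as seen from writepage_delalloc
(sketch):

    	ret = btrfs_run_delalloc_range(inode, page, delalloc_start,
    				       delalloc_end, wbc);
    	if (ret < 0)
    		return ret;		/* error */
    	/*
    	 * ret == 1: writeback was already started and the pages were
    	 * unlocked, so only wbc->nr_to_write accounting is left to do.
    	 */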

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/btrfs_inode.h |  3 +-
 fs/btrfs/extent_io.c   | 30 +++++++--------
 fs/btrfs/inode.c       | 87 ++++++++++++++----------------------------
 3 files changed, 44 insertions(+), 76 deletions(-)

diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h
index 90e60ad9db6200..bda1fdbba666aa 100644
--- a/fs/btrfs/btrfs_inode.h
+++ b/fs/btrfs/btrfs_inode.h
@@ -498,8 +498,7 @@ int btrfs_prealloc_file_range_trans(struct inode *inode,
 				    u64 start, u64 num_bytes, u64 min_size,
 				    loff_t actual_len, u64 *alloc_hint);
 int btrfs_run_delalloc_range(struct btrfs_inode *inode, struct page *locked_page,
-			     u64 start, u64 end, int *page_started,
-			     unsigned long *nr_written, struct writeback_control *wbc);
+			     u64 start, u64 end, struct writeback_control *wbc);
 int btrfs_writepage_cow_fixup(struct page *page);
 int btrfs_encoded_io_compression_from_extent(struct btrfs_fs_info *fs_info,
 					     int compress_type);
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index aa2f88365ad05a..6befffd76e8808 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -1169,10 +1169,7 @@ static noinline_for_stack int writepage_delalloc(struct btrfs_inode *inode,
 	u64 delalloc_start = page_start;
 	u64 delalloc_end = page_end;
 	u64 delalloc_to_write = 0;
-	/* How many pages are started by btrfs_run_delalloc_range() */
-	unsigned long nr_written = 0;
-	int ret;
-	int page_started = 0;
+	int ret = 0;
 
 	while (delalloc_start < page_end) {
 		delalloc_end = page_end;
@@ -1181,9 +1178,10 @@ static noinline_for_stack int writepage_delalloc(struct btrfs_inode *inode,
 			delalloc_start = delalloc_end + 1;
 			continue;
 		}
+
 		ret = btrfs_run_delalloc_range(inode, page, delalloc_start,
-				delalloc_end, &page_started, &nr_written, wbc);
-		if (ret)
+					       delalloc_end, wbc);
+		if (ret < 0)
 			return ret;
 
 		delalloc_start = delalloc_end + 1;
@@ -1195,6 +1193,16 @@ static noinline_for_stack int writepage_delalloc(struct btrfs_inode *inode,
 	 */
 	delalloc_to_write +=
 		DIV_ROUND_UP(delalloc_end + 1 - page_start, PAGE_SIZE);
+
+	/*
+	 * If btrfs_run_dealloc_range() already started I/O and unlocked
+	 * the pages, we just need to account for them here.
+	 */
+	if (ret == 1) {
+		wbc->nr_to_write -= delalloc_to_write;
+		return 1;
+	}
+
 	if (wbc->nr_to_write < delalloc_to_write) {
 		int thresh = 8192;
 
@@ -1204,16 +1212,6 @@ static noinline_for_stack int writepage_delalloc(struct btrfs_inode *inode,
 					 thresh);
 	}
 
-	/* Did btrfs_run_dealloc_range() already unlock and start the IO? */
-	if (page_started) {
-		/*
-		 * We've unlocked the page, so we can't update the mapping's
-		 * writeback index, just update nr_to_write.
-		 */
-		wbc->nr_to_write -= nr_written;
-		return 1;
-	}
-
 	return 0;
 }
 
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index c6845b0591b77e..8185e95ad12a19 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -129,8 +129,7 @@ static int btrfs_truncate(struct btrfs_inode *inode, bool skip_writeback);
 #define CFR_NOINLINE		(1 << 1)
 static noinline int cow_file_range(struct btrfs_inode *inode,
 				   struct page *locked_page,
-				   u64 start, u64 end, int *page_started,
-				   unsigned long *nr_written, u64 *done_offset,
+				   u64 start, u64 end, u64 *done_offset,
 				   u32 flags);
 static struct extent_map *create_io_em(struct btrfs_inode *inode, u64 start,
 				       u64 len, u64 orig_start, u64 block_start,
@@ -1132,8 +1131,6 @@ static void submit_uncompressed_range(struct btrfs_inode *inode,
 {
 	u64 start = async_extent->start;
 	u64 end = async_extent->start + async_extent->ram_size - 1;
-	unsigned long nr_written = 0;
-	int page_started = 0;
 	int ret;
 	struct writeback_control wbc = {
 		.sync_mode		= WB_SYNC_ALL,
@@ -1149,10 +1146,10 @@ static void submit_uncompressed_range(struct btrfs_inode *inode,
 	 * Also we call cow_file_range() with @unlock_page == 0, so that we
 	 * can directly submit them without interruption.
 	 */
-	ret = cow_file_range(inode, locked_page, start, end, &page_started,
-			     &nr_written, NULL, CFR_KEEP_LOCKED);
+	ret = cow_file_range(inode, locked_page, start, end, NULL,
+			     CFR_KEEP_LOCKED);
 	/* Inline extent inserted, page gets unlocked and everything is done */
-	if (page_started)
+	if (ret == 1)
 		return;
 
 	if (ret < 0) {
@@ -1363,8 +1360,8 @@ static u64 get_extent_allocation_hint(struct btrfs_inode *inode, u64 start,
  *
  * When this function fails, it unlocks all pages except @locked_page.
  *
- * When this function successfully creates an inline extent, it sets page_started
- * to 1 and unlocks all pages including locked_page and starts I/O on them.
+ * When this function successfully creates an inline extent, it returns 1 and
+ * unlocks all pages including locked_page and starts I/O on them.
  * (In reality inline extents are limited to a single page, so locked_page is
  * the only page handled anyway).
  *
@@ -1381,10 +1378,8 @@ static u64 get_extent_allocation_hint(struct btrfs_inode *inode, u64 start,
  * example.
  */
 static noinline int cow_file_range(struct btrfs_inode *inode,
-				   struct page *locked_page,
-				   u64 start, u64 end, int *page_started,
-				   unsigned long *nr_written, u64 *done_offset,
-				   u32 flags)
+				   struct page *locked_page, u64 start, u64 end,
+				   u64 *done_offset, u32 flags)
 {
 	struct btrfs_root *root = inode->root;
 	struct btrfs_fs_info *fs_info = root->fs_info;
@@ -1444,9 +1439,6 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
 				     EXTENT_DELALLOC_NEW | EXTENT_DEFRAG |
 				     EXTENT_DO_ACCOUNTING, PAGE_UNLOCK |
 				     PAGE_START_WRITEBACK | PAGE_END_WRITEBACK);
-			*nr_written = *nr_written +
-			     (end - start + PAGE_SIZE) / PAGE_SIZE;
-			*page_started = 1;
 			/*
 			 * locked_page is locked by the caller of
 			 * writepage_delalloc(), not locked by
@@ -1456,11 +1448,11 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
 			 * as it doesn't have any subpage::writers recorded.
 			 *
 			 * Here we manually unlock the page, since the caller
-			 * can't use page_started to determine if it's an
-			 * inline extent or a compressed extent.
+			 * can't determine if it's an inline extent or a
+			 * compressed extent.
 			 */
 			unlock_page(locked_page);
-			goto out;
+			return 1;
 		} else if (ret < 0) {
 			goto out_unlock;
 		}
@@ -1574,7 +1566,6 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
 		if (ret)
 			goto out_unlock;
 	}
-out:
 	return ret;
 
 out_drop_extent_cache:
@@ -1725,10 +1716,8 @@ static noinline void async_cow_free(struct btrfs_work *work)
 }
 
 static bool run_delalloc_compressed(struct btrfs_inode *inode,
-				    struct writeback_control *wbc,
-				    struct page *locked_page,
-				    u64 start, u64 end, int *page_started,
-				    unsigned long *nr_written)
+				    struct page *locked_page, u64 start,
+				    u64 end, struct writeback_control *wbc)
 {
 	struct btrfs_fs_info *fs_info = inode->root->fs_info;
 	struct cgroup_subsys_state *blkcg_css = wbc_blkcg_css(wbc);
@@ -1810,34 +1799,25 @@ static bool run_delalloc_compressed(struct btrfs_inode *inode,
 
 		btrfs_queue_work(fs_info->delalloc_workers, &async_chunk[i].work);
 
-		*nr_written += nr_pages;
 		start = cur_end + 1;
 	}
-	*page_started = 1;
 	return true;
 }
 
 static noinline int run_delalloc_zoned(struct btrfs_inode *inode,
 				       struct page *locked_page, u64 start,
-				       u64 end, int *page_started,
-				       unsigned long *nr_written,
-				       struct writeback_control *wbc)
+				       u64 end, struct writeback_control *wbc)
 {
 	u64 done_offset = end;
 	int ret;
 	bool locked_page_done = false;
 
 	while (start <= end) {
-		ret = cow_file_range(inode, locked_page, start, end, page_started,
-				     nr_written, &done_offset, CFR_KEEP_LOCKED);
+		ret = cow_file_range(inode, locked_page, start, end,
+				     &done_offset, CFR_KEEP_LOCKED);
 		if (ret && ret != -EAGAIN)
 			return ret;
 
-		if (*page_started) {
-			ASSERT(ret == 0);
-			return 0;
-		}
-
 		if (ret == 0)
 			done_offset = end;
 
@@ -1858,9 +1838,7 @@ static noinline int run_delalloc_zoned(struct btrfs_inode *inode,
 		start = done_offset + 1;
 	}
 
-	*page_started = 1;
-
-	return 0;
+	return 1;
 }
 
 static noinline int csum_exist_in_range(struct btrfs_fs_info *fs_info,
@@ -1893,8 +1871,6 @@ static int fallback_to_cow(struct btrfs_inode *inode, struct page *locked_page,
 	const bool is_reloc_ino = btrfs_is_data_reloc_root(inode->root);
 	const u64 range_bytes = end + 1 - start;
 	struct extent_io_tree *io_tree = &inode->io_tree;
-	int page_started = 0;
-	unsigned long nr_written;
 	u64 range_start = start;
 	u64 count;
 	int ret;
@@ -1955,9 +1931,9 @@ static int fallback_to_cow(struct btrfs_inode *inode, struct page *locked_page,
 	 * is written out and unlocked directly and a normal nocow extent
 	 * doesn't work.
 	 */
-	ret = cow_file_range(inode, locked_page, start, end, &page_started,
-			     &nr_written, NULL, CFR_NOINLINE);
-	ASSERT(!page_started);
+	ret = cow_file_range(inode, locked_page, start, end, NULL,
+			     CFR_NOINLINE);
+	ASSERT(ret != 1);
 	return ret;
 }
 
@@ -2393,15 +2369,14 @@ static bool should_nocow(struct btrfs_inode *inode, u64 start, u64 end)
  * being touched for the first time.
  */
 int btrfs_run_delalloc_range(struct btrfs_inode *inode, struct page *locked_page,
-		u64 start, u64 end, int *page_started, unsigned long *nr_written,
-		struct writeback_control *wbc)
+			     u64 start, u64 end, struct writeback_control *wbc)
 {
-	int ret = 0;
 	const bool zoned = btrfs_is_zoned(inode->root->fs_info);
+	int ret;
 
 	/*
-	 * The range must cover part of the @locked_page, or the returned
-	 * @page_started can confuse the caller.
+	 * The range must cover part of the @locked_page, or a return of 1
+	 * can confuse the caller.
 	 */
 	ASSERT(!(end <= page_offset(locked_page) ||
 		 start >= page_offset(locked_page) + PAGE_SIZE));
@@ -2421,20 +2396,16 @@ int btrfs_run_delalloc_range(struct btrfs_inode *inode, struct page *locked_page
 
 	if (btrfs_inode_can_compress(inode) &&
 	    inode_need_compress(inode, start, end) &&
-	    run_delalloc_compressed(inode, wbc, locked_page, start,
-				    end, page_started, nr_written))
-		goto out;
+	    run_delalloc_compressed(inode, locked_page, start, end, wbc))
+		return 1;
 
 	if (zoned)
-		ret = run_delalloc_zoned(inode, locked_page, start, end,
-					 page_started, nr_written, wbc);
+		ret = run_delalloc_zoned(inode, locked_page, start, end, wbc);
 	else
-		ret = cow_file_range(inode, locked_page, start, end,
-				     page_started, nr_written, NULL, 0);
+		ret = cow_file_range(inode, locked_page, start, end, NULL, 0);
 
 out:
-	ASSERT(ret <= 0);
-	if (ret)
+	if (ret < 0)
 		btrfs_cleanup_ordered_extents(inode, locked_page, start,
 					      end - start + 1);
 	return ret;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 11/23] btrfs: clean up the check for uncompressed ranges in submit_one_async_extent
  2023-06-28 15:31 btrfs compressed writeback cleanups Christoph Hellwig
                   ` (9 preceding siblings ...)
  2023-06-28 15:31 ` [PATCH 10/23] btrfs: reduce the number of arguments to btrfs_run_delalloc_range Christoph Hellwig
@ 2023-06-28 15:31 ` Christoph Hellwig
  2023-07-04  9:07   ` Johannes Thumshirn
  2023-06-28 15:31 ` [PATCH 12/23] btrfs: don't clear async_chunk->inode in async_cow_start Christoph Hellwig
                   ` (13 subsequent siblings)
  24 siblings, 1 reply; 40+ messages in thread
From: Christoph Hellwig @ 2023-06-28 15:31 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

Instead of checking for a NULL pages array and explaining this with a
cryptic comment, just check whether the compression type is
BTRFS_COMPRESS_NONE to make the check self-explanatory.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/inode.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 8185e95ad12a19..6197b33fb0b23b 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1209,8 +1209,7 @@ static void submit_one_async_extent(struct async_chunk *async_chunk,
 	}
 	lock_extent(io_tree, start, end, NULL);
 
-	/* We have fall back to uncompressed write */
-	if (!async_extent->pages) {
+	if (async_extent->compress_type == BTRFS_COMPRESS_NONE) {
 		submit_uncompressed_range(inode, async_extent, locked_page);
 		goto done;
 	}
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 12/23] btrfs: don't clear async_chunk->inode in async_cow_start
  2023-06-28 15:31 btrfs compressed writeback cleanups Christoph Hellwig
                   ` (10 preceding siblings ...)
  2023-06-28 15:31 ` [PATCH 11/23] btrfs: clean up the check for uncompressed ranges in submit_one_async_extent Christoph Hellwig
@ 2023-06-28 15:31 ` Christoph Hellwig
  2023-06-28 15:31 ` [PATCH 13/23] btrfs: merge async_cow_start and compress_file_range Christoph Hellwig
                   ` (12 subsequent siblings)
  24 siblings, 0 replies; 40+ messages in thread
From: Christoph Hellwig @ 2023-06-28 15:31 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

Now that the ->inode check isn't needed in submit_compressed_extents
any more, there is no reason to clear the field early.  Always keep the
inode around until the work item is finished, and remove both the
special casing and the counting of compressed extents in
compress_file_range.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/inode.c | 23 +++++------------------
 1 file changed, 5 insertions(+), 18 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 6197b33fb0b23b..f8fbcd359a304d 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -832,7 +832,7 @@ static inline void inode_should_defrag(struct btrfs_inode *inode,
  * are written in the same order that the flusher thread sent them
  * down.
  */
-static noinline int compress_file_range(struct async_chunk *async_chunk)
+static noinline void compress_file_range(struct async_chunk *async_chunk)
 {
 	struct btrfs_inode *inode = async_chunk->inode;
 	struct btrfs_fs_info *fs_info = inode->root->fs_info;
@@ -850,7 +850,6 @@ static noinline int compress_file_range(struct async_chunk *async_chunk)
 	int i;
 	int will_compress;
 	int compress_type = fs_info->compress_type;
-	int compressed_extents = 0;
 	int redirty = 0;
 
 	inode_should_defrag(inode, start, end, end - start + 1, SZ_16K);
@@ -1027,7 +1026,7 @@ static noinline int compress_file_range(struct async_chunk *async_chunk)
 				}
 				kfree(pages);
 			}
-			return 0;
+			return;
 		}
 	}
 
@@ -1046,8 +1045,6 @@ static noinline int compress_file_range(struct async_chunk *async_chunk)
 		 */
 		total_in = round_up(total_in, fs_info->sectorsize);
 		if (total_compressed + blocksize <= total_in) {
-			compressed_extents++;
-
 			/*
 			 * The async work queues will take care of doing actual
 			 * allocation on disk for these compressed pages, and
@@ -1063,7 +1060,7 @@ static noinline int compress_file_range(struct async_chunk *async_chunk)
 				cond_resched();
 				goto again;
 			}
-			return compressed_extents;
+			return;
 		}
 	}
 	if (pages) {
@@ -1104,9 +1101,6 @@ static noinline int compress_file_range(struct async_chunk *async_chunk)
 		extent_range_redirty_for_io(&inode->vfs_inode, start, end);
 	add_async_extent(async_chunk, start, end - start + 1, 0, NULL, 0,
 			 BTRFS_COMPRESS_NONE);
-	compressed_extents++;
-
-	return compressed_extents;
 }
 
 static void free_async_extent_pages(struct async_extent *async_extent)
@@ -1659,15 +1653,9 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
 static noinline void async_cow_start(struct btrfs_work *work)
 {
 	struct async_chunk *async_chunk;
-	int compressed_extents;
 
 	async_chunk = container_of(work, struct async_chunk, work);
-
-	compressed_extents = compress_file_range(async_chunk);
-	if (compressed_extents == 0) {
-		btrfs_add_delayed_iput(async_chunk->inode);
-		async_chunk->inode = NULL;
-	}
+	compress_file_range(async_chunk);
 }
 
 /*
@@ -1704,8 +1692,7 @@ static noinline void async_cow_free(struct btrfs_work *work)
 	struct async_cow *async_cow;
 
 	async_chunk = container_of(work, struct async_chunk, work);
-	if (async_chunk->inode)
-		btrfs_add_delayed_iput(async_chunk->inode);
+	btrfs_add_delayed_iput(async_chunk->inode);
 	if (async_chunk->blkcg_css)
 		css_put(async_chunk->blkcg_css);
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 13/23] btrfs: merge async_cow_start and compress_file_range
  2023-06-28 15:31 btrfs compressed writeback cleanups Christoph Hellwig
                   ` (11 preceding siblings ...)
  2023-06-28 15:31 ` [PATCH 12/23] btrfs: don't clear async_chunk->inode in async_cow_start Christoph Hellwig
@ 2023-06-28 15:31 ` Christoph Hellwig
  2023-07-04  9:09   ` Johannes Thumshirn
  2023-06-28 15:31 ` [PATCH 14/23] btrfs: merge submit_compressed_extents and async_cow_submit Christoph Hellwig
                   ` (11 subsequent siblings)
  24 siblings, 1 reply; 40+ messages in thread
From: Christoph Hellwig @ 2023-06-28 15:31 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

There is no good reason to keep the simple async_cow_start wrapper;
merge the argument conversion into the main compress_file_range
function instead.
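
For reference, a self-contained userspace sketch of the container_of
conversion that the merged function now performs at its top; the macro
is the usual definition, everything else here is purely illustrative:

#include <stdio.h>
#include <stddef.h>

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct btrfs_work { int pending; };

struct async_chunk {
    long start;
    struct btrfs_work work;     /* embedded, as in the kernel */
};

/* The work queue hands the callback only the embedded member... */
static void compress_file_range(struct btrfs_work *work)
{
    /* ...and container_of recovers the enclosing async_chunk. */
    struct async_chunk *async_chunk =
        container_of(work, struct async_chunk, work);

    printf("chunk starts at %ld\n", async_chunk->start);
}

int main(void)
{
    struct async_chunk chunk = { .start = 4096 };

    compress_file_range(&chunk.work);
    return 0;
}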

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/inode.c | 43 ++++++++++++++++---------------------------
 1 file changed, 16 insertions(+), 27 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index f8fbcd359a304d..1e1d6584e1abaa 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -816,24 +816,22 @@ static inline void inode_should_defrag(struct btrfs_inode *inode,
 }
 
 /*
- * we create compressed extents in two phases.  The first
- * phase compresses a range of pages that have already been
- * locked (both pages and state bits are locked).
+ * Work queue callback to start compression on a file and pages.
  *
- * This is done inside an ordered work queue, and the compression
- * is spread across many cpus.  The actual IO submission is step
- * two, and the ordered work queue takes care of making sure that
- * happens in the same order things were put onto the queue by
- * writepages and friends.
+ * This is done inside an ordered work queue, and the compression is spread
+ * across many cpus.  The actual IO submission is step two, and the ordered work
+ * queue takes care of making sure that happens in the same order things were
+ * put onto the queue by writepages and friends.
  *
- * If this code finds it can't get good compression, it puts an
- * entry onto the work queue to write the uncompressed bytes.  This
- * makes sure that both compressed inodes and uncompressed inodes
- * are written in the same order that the flusher thread sent them
- * down.
+ * If this code finds it can't get good compression, it puts an entry onto the
+ * work queue to write the uncompressed bytes.  This makes sure that both
+ * compressed inodes and uncompressed inodes are written in the same order that
+ * the flusher thread sent them down.
  */
-static noinline void compress_file_range(struct async_chunk *async_chunk)
+static void compress_file_range(struct btrfs_work *work)
 {
+	struct async_chunk *async_chunk =
+		container_of(work, struct async_chunk, work);
 	struct btrfs_inode *inode = async_chunk->inode;
 	struct btrfs_fs_info *fs_info = inode->root->fs_info;
 	struct address_space *mapping = inode->vfs_inode.i_mapping;
@@ -1648,18 +1646,9 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
 }
 
 /*
- * work queue call back to started compression on a file and pages
- */
-static noinline void async_cow_start(struct btrfs_work *work)
-{
-	struct async_chunk *async_chunk;
-
-	async_chunk = container_of(work, struct async_chunk, work);
-	compress_file_range(async_chunk);
-}
-
-/*
- * work queue call back to submit previously compressed pages
+ * Phase two of compressed writeback.  This is the ordered portion of the code,
+ * which only gets called in the order the work was queued.  We walk all the
+ * async extents created by compress_file_range and send them down to the disk.
  */
 static noinline void async_cow_submit(struct btrfs_work *work)
 {
@@ -1777,7 +1766,7 @@ static bool run_delalloc_compressed(struct btrfs_inode *inode,
 			async_chunk[i].blkcg_css = NULL;
 		}
 
-		btrfs_init_work(&async_chunk[i].work, async_cow_start,
+		btrfs_init_work(&async_chunk[i].work, compress_file_range,
 				async_cow_submit, async_cow_free);
 
 		nr_pages = DIV_ROUND_UP(cur_end - start, PAGE_SIZE);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 14/23] btrfs: merge submit_compressed_extents and async_cow_submit
  2023-06-28 15:31 btrfs compressed writeback cleanups Christoph Hellwig
                   ` (12 preceding siblings ...)
  2023-06-28 15:31 ` [PATCH 13/23] btrfs: merge async_cow_start and compress_file_range Christoph Hellwig
@ 2023-06-28 15:31 ` Christoph Hellwig
  2023-06-28 15:31 ` [PATCH 15/23] btrfs: streamline compress_file_range Christoph Hellwig
                   ` (10 subsequent siblings)
  24 siblings, 0 replies; 40+ messages in thread
From: Christoph Hellwig @ 2023-06-28 15:31 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

The code in submit_compressed_extents just loops over the async_extents,
and doesn't need to be conditional on an inode being present, as there
won't be any async_extent in the list if we created an inline extent.
Merge the two functions to simplify the logic.
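
The key point, shown as a compilable userspace miniature with the list
helpers reimplemented just for the sketch: draining an empty extents
list is a no-op, so no inode check is needed around the loop:

#include <stdio.h>
#include <stddef.h>

#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

static void list_del(struct list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

struct async_extent { struct list_head list; int id; };

int main(void)
{
    struct list_head extents;
    struct async_extent a = { .id = 1 }, b = { .id = 2 };

    list_init(&extents);
    list_add_tail(&a.list, &extents);
    list_add_tail(&b.list, &extents);

    /* draining an empty list is simply a no-op */
    while (!list_empty(&extents)) {
        struct async_extent *ae =
            list_entry(extents.next, struct async_extent, list);

        list_del(&ae->list);
        printf("submit extent %d\n", ae->id);
    }
    return 0;
}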

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/inode.c | 39 ++++++++++-----------------------------
 1 file changed, 10 insertions(+), 29 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 1e1d6584e1abaa..09f8c6f2f4bf88 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1289,25 +1289,6 @@ static void submit_one_async_extent(struct async_chunk *async_chunk,
 	kfree(async_extent);
 }
 
-/*
- * Phase two of compressed writeback.  This is the ordered portion of the code,
- * which only gets called in the order the work was queued.  We walk all the
- * async extents created by compress_file_range and send them down to the disk.
- */
-static noinline void submit_compressed_extents(struct async_chunk *async_chunk)
-{
-	struct async_extent *async_extent;
-	u64 alloc_hint = 0;
-
-	while (!list_empty(&async_chunk->extents)) {
-		async_extent = list_entry(async_chunk->extents.next,
-					  struct async_extent, list);
-		list_del(&async_extent->list);
-
-		submit_one_async_extent(async_chunk, async_extent, &alloc_hint);
-	}
-}
-
 static u64 get_extent_allocation_hint(struct btrfs_inode *inode, u64 start,
 				      u64 num_bytes)
 {
@@ -1650,24 +1631,24 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
  * which only gets called in the order the work was queued.  We walk all the
  * async extents created by compress_file_range and send them down to the disk.
  */
-static noinline void async_cow_submit(struct btrfs_work *work)
+static noinline void submit_compressed_extents(struct btrfs_work *work)
 {
 	struct async_chunk *async_chunk = container_of(work, struct async_chunk,
 						     work);
 	struct btrfs_fs_info *fs_info = btrfs_work_owner(work);
+	struct async_extent *async_extent;
 	unsigned long nr_pages;
+	u64 alloc_hint = 0;
 
 	nr_pages = (async_chunk->end - async_chunk->start + PAGE_SIZE) >>
 		PAGE_SHIFT;
 
-	/*
-	 * ->inode could be NULL if async_chunk_start has failed to compress,
-	 * in which case we don't have anything to submit, yet we need to
-	 * always adjust ->async_delalloc_pages as its paired with the init
-	 * happening in run_delalloc_compressed
-	 */
-	if (async_chunk->inode)
-		submit_compressed_extents(async_chunk);
+	while (!list_empty(&async_chunk->extents)) {
+		async_extent = list_entry(async_chunk->extents.next,
+					  struct async_extent, list);
+		list_del(&async_extent->list);
+		submit_one_async_extent(async_chunk, async_extent, &alloc_hint);
+	}
 
 	/* atomic_sub_return implies a barrier */
 	if (atomic_sub_return(nr_pages, &fs_info->async_delalloc_pages) <
@@ -1767,7 +1748,7 @@ static bool run_delalloc_compressed(struct btrfs_inode *inode,
 		}
 
 		btrfs_init_work(&async_chunk[i].work, compress_file_range,
-				async_cow_submit, async_cow_free);
+				submit_compressed_extents, async_cow_free);
 
 		nr_pages = DIV_ROUND_UP(cur_end - start, PAGE_SIZE);
 		atomic_add(nr_pages, &fs_info->async_delalloc_pages);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 15/23] btrfs: streamline compress_file_range
  2023-06-28 15:31 btrfs compressed writeback cleanups Christoph Hellwig
                   ` (13 preceding siblings ...)
  2023-06-28 15:31 ` [PATCH 14/23] btrfs: merge submit_compressed_extents and async_cow_submit Christoph Hellwig
@ 2023-06-28 15:31 ` Christoph Hellwig
  2023-06-28 15:31 ` [PATCH 16/23] btrfs: further simplify the compress or not logic in compress_file_range Christoph Hellwig
                   ` (9 subsequent siblings)
  24 siblings, 0 replies; 40+ messages in thread
From: Christoph Hellwig @ 2023-06-28 15:31 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

Reorder compress_file_range so that the main compression flow happens
in a straight line and not in branches.  To do this, ensure that the
pages pointer is always set to NULL before a page allocation happens,
which allows the cleanup_and_bail_uncompressed label to clean up the
page allocations as needed.
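
A small standalone sketch of that invariant, with plain malloc/free
standing in for the kernel page allocations: because the array pointer
starts out NULL on every iteration, the bail-out label can clean up
unconditionally:

#include <stdlib.h>

static void one_range(int bail_early)
{
    char **pages = NULL;    /* always NULL before any allocation */
    int i, nr_pages = 4;

    if (bail_early)
        goto cleanup_and_bail_uncompressed;

    pages = calloc(nr_pages, sizeof(*pages));
    if (!pages)
        goto cleanup_and_bail_uncompressed;
    for (i = 0; i < nr_pages; i++)
        pages[i] = malloc(16);

    /* ...compression would happen here... */

cleanup_and_bail_uncompressed:
    if (pages) {
        for (i = 0; i < nr_pages; i++)
            free(pages[i]);  /* free(NULL) is a no-op */
        free(pages);
    }
}

int main(void)
{
    one_range(0);
    one_range(1);
    return 0;
}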

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/inode.c | 165 +++++++++++++++++++++++------------------------
 1 file changed, 81 insertions(+), 84 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 09f8c6f2f4bf88..e7c05d07ff50f8 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -841,10 +841,11 @@ static void compress_file_range(struct btrfs_work *work)
 	u64 actual_end;
 	u64 i_size;
 	int ret = 0;
-	struct page **pages = NULL;
+	struct page **pages;
 	unsigned long nr_pages;
 	unsigned long total_compressed = 0;
 	unsigned long total_in = 0;
+	unsigned int poff;
 	int i;
 	int will_compress;
 	int compress_type = fs_info->compress_type;
@@ -867,6 +868,7 @@ static void compress_file_range(struct btrfs_work *work)
 	actual_end = min_t(u64, i_size, end + 1);
 again:
 	will_compress = 0;
+	pages = NULL;
 	nr_pages = (end >> PAGE_SHIFT) - (start >> PAGE_SHIFT) + 1;
 	nr_pages = min_t(unsigned long, nr_pages, BTRFS_MAX_COMPRESSED_PAGES);
 
@@ -910,66 +912,62 @@ static void compress_file_range(struct btrfs_work *work)
 	ret = 0;
 
 	/*
-	 * we do compression for mount -o compress and when the
-	 * inode has not been flagged as nocompress.  This flag can
-	 * change at any time if we discover bad compression ratios.
+	 * We do compression for mount -o compress and when the inode has not
+	 * been flagged as nocompress.  This flag can change at any time if we
+	 * discover bad compression ratios.
 	 */
-	if (inode_need_compress(inode, start, end)) {
-		WARN_ON(pages);
-		pages = kcalloc(nr_pages, sizeof(struct page *), GFP_NOFS);
-		if (!pages) {
-			/* just bail out to the uncompressed code */
-			nr_pages = 0;
-			goto cont;
-		}
+	if (!inode_need_compress(inode, start, end))
+		goto cont;
 
-		if (inode->defrag_compress)
-			compress_type = inode->defrag_compress;
-		else if (inode->prop_compress)
-			compress_type = inode->prop_compress;
+	pages = kcalloc(nr_pages, sizeof(struct page *), GFP_NOFS);
+	if (!pages) {
+		/* just bail out to the uncompressed code */
+		nr_pages = 0;
+		goto cont;
+	}
 
-		/*
-		 * we need to call clear_page_dirty_for_io on each
-		 * page in the range.  Otherwise applications with the file
-		 * mmap'd can wander in and change the page contents while
-		 * we are compressing them.
-		 *
-		 * If the compression fails for any reason, we set the pages
-		 * dirty again later on.
-		 *
-		 * Note that the remaining part is redirtied, the start pointer
-		 * has moved, the end is the original one.
-		 */
-		if (!redirty) {
-			extent_range_clear_dirty_for_io(&inode->vfs_inode, start, end);
-			redirty = 1;
-		}
+	if (inode->defrag_compress)
+		compress_type = inode->defrag_compress;
+	else if (inode->prop_compress)
+		compress_type = inode->prop_compress;
+
+	/*
+	 * We need to call clear_page_dirty_for_io on each page in the range.
+	 * Otherwise applications with the file mmap'd can wander in and change
+	 * the page contents while we are compressing them.
+	 *
+	 * If the compression fails for any reason, we set the pages dirty again
+	 * later on.
+	 *
+	 * Note that the remaining part is redirtied, the start pointer has
+	 * moved, the end is the original one.
+	 */
+	if (!redirty) {
+		extent_range_clear_dirty_for_io(&inode->vfs_inode, start, end);
+		redirty = 1;
+	}
 
-		/* Compression level is applied here and only here */
-		ret = btrfs_compress_pages(
-			compress_type | (fs_info->compress_level << 4),
-					   mapping, start,
-					   pages,
-					   &nr_pages,
-					   &total_in,
-					   &total_compressed);
+	/* Compression level is applied here and only here */
+	ret = btrfs_compress_pages(compress_type |
+				   (fs_info->compress_level << 4),
+				   mapping, start, pages, &nr_pages, &total_in,
+				   &total_compressed);
+	if (ret)
+		goto cont;
 
-		if (!ret) {
-			unsigned long offset = offset_in_page(total_compressed);
-			struct page *page = pages[nr_pages - 1];
+	/*
+	 * Zero the tail end of the last page, as we might be sending it down
+	 * to disk.
+	 */
+	poff = offset_in_page(total_compressed);
+	if (poff)
+		memzero_page(pages[nr_pages - 1], poff, PAGE_SIZE - poff);
+	will_compress = 1;
 
-			/* zero the tail end of the last page, we might be
-			 * sending it down to disk
-			 */
-			if (offset)
-				memzero_page(page, offset, PAGE_SIZE - offset);
-			will_compress = 1;
-		}
-	}
 cont:
 	/*
 	 * Check cow_file_range() for why we don't even try to create inline
-	 * extent for subpage case.
+	 * extent for the subpage case.
 	 */
 	if (start == 0 && fs_info->sectorsize == PAGE_SIZE) {
 		/* lets try to make an inline extent */
@@ -1028,39 +1026,38 @@ static void compress_file_range(struct btrfs_work *work)
 		}
 	}
 
-	if (will_compress) {
-		/*
-		 * we aren't doing an inline extent round the compressed size
-		 * up to a block size boundary so the allocator does sane
-		 * things
-		 */
-		total_compressed = ALIGN(total_compressed, blocksize);
+	if (!will_compress)
+		goto cleanup_and_bail_uncompressed;
 
-		/*
-		 * one last check to make sure the compression is really a
-		 * win, compare the page count read with the blocks on disk,
-		 * compression must free at least one sector size
-		 */
-		total_in = round_up(total_in, fs_info->sectorsize);
-		if (total_compressed + blocksize <= total_in) {
-			/*
-			 * The async work queues will take care of doing actual
-			 * allocation on disk for these compressed pages, and
-			 * will submit them to the elevator.
-			 */
-			add_async_extent(async_chunk, start, total_in,
-					total_compressed, pages, nr_pages,
-					compress_type);
-
-			if (start + total_in < end) {
-				start += total_in;
-				pages = NULL;
-				cond_resched();
-				goto again;
-			}
-			return;
-		}
+	/*
+	 * We aren't doing an inline extent. Round the compressed size up to a
+	 * block size boundary so the allocator does sane things.
+	 */
+	total_compressed = ALIGN(total_compressed, blocksize);
+
+	/*
+	 * One last check to make sure the compression is really a win, compare
+	 * the page count read with the blocks on disk, compression must free at
+	 * least one sector.
+	 */
+	total_in = round_up(total_in, fs_info->sectorsize);
+	if (total_compressed + blocksize > total_in)
+		goto cleanup_and_bail_uncompressed;
+
+	/*
+	 * The async work queues will take care of doing actual allocation on
+	 * disk for these compressed pages, and will submit the bios.
+	 */
+	add_async_extent(async_chunk, start, total_in, total_compressed, pages,
+			 nr_pages, compress_type);
+	if (start + total_in < end) {
+		start += total_in;
+		cond_resched();
+		goto again;
 	}
+	return;
+
+cleanup_and_bail_uncompressed:
 	if (pages) {
 		/*
 		 * the compression code ran but failed to make things smaller,
@@ -1081,7 +1078,7 @@ static void compress_file_range(struct btrfs_work *work)
 			inode->flags |= BTRFS_INODE_NOCOMPRESS;
 		}
 	}
-cleanup_and_bail_uncompressed:
+
 	/*
 	 * No compression, but we still need to write the pages in the file
 	 * we've been given so far.  redirty the locked page if it corresponds
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 16/23] btrfs: further simplify the compress or not logic in compress_file_range
  2023-06-28 15:31 btrfs compressed writeback cleanups Christoph Hellwig
                   ` (14 preceding siblings ...)
  2023-06-28 15:31 ` [PATCH 15/23] btrfs: streamline compress_file_range Christoph Hellwig
@ 2023-06-28 15:31 ` Christoph Hellwig
  2023-07-14 13:47   ` Josef Bacik
  2023-06-28 15:31 ` [PATCH 17/23] btrfs: use a separate label for the incompressible case " Christoph Hellwig
                   ` (8 subsequent siblings)
  24 siblings, 1 reply; 40+ messages in thread
From: Christoph Hellwig @ 2023-06-28 15:31 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

Currently the logic that decides whether or not to compress in
compress_file_range is a bit convoluted, because it tries to share the
code for creating inline extents between the compressible [1] path and
the bail to uncompressed path.

But the latter isn't needed at all, because cow_file_range as called by
submit_uncompressed_range will already create inline extents as needed,
so there is no need for special handling if we can live with the fact
that it will be called a bit later in the ->ordered_func of the
workqueue instead of right now.

[1] there is undocumented logic that creates an uncompressed inline
extent outside of the do-not-compress logic if total_in is too small.
This logic isn't explained in comments or any commit log I could find,
so I've preserved it.  Documentation explaining it would be appreciated
if anyone understands this code.
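
Condensed into a pure function, the decision after this patch looks
roughly as follows (types simplified, inline_choice is a made-up name);
the total_in < actual_end branch is the undocumented logic from [1]:

#include <stdio.h>

enum inline_kind { NO_INLINE, INLINE_UNCOMPRESSED, INLINE_COMPRESSED };

/* The subpage case (sectorsize != PAGE_SIZE) never creates inline
 * extents, see cow_file_range(). */
static enum inline_kind inline_choice(unsigned long long start,
                                      unsigned long long total_in,
                                      unsigned long long actual_end,
                                      int sectorsize_is_page_size)
{
    if (start != 0 || !sectorsize_is_page_size)
        return NO_INLINE;
    if (total_in < actual_end)  /* didn't compress the whole range */
        return INLINE_UNCOMPRESSED;
    return INLINE_COMPRESSED;
}

int main(void)
{
    /* compressed only 4K of an 8K range vs. all of it */
    printf("%d %d\n", inline_choice(0, 4096, 8192, 1),
           inline_choice(0, 8192, 8192, 1));
    return 0;
}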

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/inode.c | 51 ++++++++++++++++--------------------------------
 1 file changed, 17 insertions(+), 34 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index e7c05d07ff50f8..560682a5d9d7aa 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -847,7 +847,6 @@ static void compress_file_range(struct btrfs_work *work)
 	unsigned long total_in = 0;
 	unsigned int poff;
 	int i;
-	int will_compress;
 	int compress_type = fs_info->compress_type;
 	int redirty = 0;
 
@@ -867,7 +866,6 @@ static void compress_file_range(struct btrfs_work *work)
 	barrier();
 	actual_end = min_t(u64, i_size, end + 1);
 again:
-	will_compress = 0;
 	pages = NULL;
 	nr_pages = (end >> PAGE_SHIFT) - (start >> PAGE_SHIFT) + 1;
 	nr_pages = min_t(unsigned long, nr_pages, BTRFS_MAX_COMPRESSED_PAGES);
@@ -917,14 +915,12 @@ static void compress_file_range(struct btrfs_work *work)
 	 * discover bad compression ratios.
 	 */
 	if (!inode_need_compress(inode, start, end))
-		goto cont;
+		goto cleanup_and_bail_uncompressed;
 
 	pages = kcalloc(nr_pages, sizeof(struct page *), GFP_NOFS);
-	if (!pages) {
+	if (!pages)
 		/* just bail out to the uncompressed code */
-		nr_pages = 0;
-		goto cont;
-	}
+		goto cleanup_and_bail_uncompressed;
 
 	if (inode->defrag_compress)
 		compress_type = inode->defrag_compress;
@@ -953,7 +949,7 @@ static void compress_file_range(struct btrfs_work *work)
 				   mapping, start, pages, &nr_pages, &total_in,
 				   &total_compressed);
 	if (ret)
-		goto cont;
+		goto cleanup_and_bail_uncompressed;
 
 	/*
 	 * Zero the tail end of the last page, as we might be sending it down
@@ -962,24 +958,22 @@ static void compress_file_range(struct btrfs_work *work)
 	poff = offset_in_page(total_compressed);
 	if (poff)
 		memzero_page(pages[nr_pages - 1], poff, PAGE_SIZE - poff);
-	will_compress = 1;
 
-cont:
 	/*
+	 * Try to create an inline extent.
+	 *
+	 * If we didn't compress the entire range, try to create an uncompressed
+	 * inline extent, else a compressed one.
+	 *
 	 * Check cow_file_range() for why we don't even try to create inline
 	 * extent for the subpage case.
 	 */
 	if (start == 0 && fs_info->sectorsize == PAGE_SIZE) {
-		/* lets try to make an inline extent */
-		if (ret || total_in < actual_end) {
-			/* we didn't compress the entire range, try
-			 * to make an uncompressed inline extent.
-			 */
-			ret = cow_file_range_inline(inode, actual_end,
-						    0, BTRFS_COMPRESS_NONE,
-						    NULL, false);
+		if (total_in < actual_end) {
+			ret = cow_file_range_inline(inode, actual_end, 0,
+						    BTRFS_COMPRESS_NONE, NULL,
+						    false);
 		} else {
-			/* try making a compressed inline extent */
 			ret = cow_file_range_inline(inode, actual_end,
 						    total_compressed,
 						    compress_type, pages,
@@ -1009,26 +1003,15 @@ static void compress_file_range(struct btrfs_work *work)
 						     PAGE_UNLOCK |
 						     PAGE_START_WRITEBACK |
 						     PAGE_END_WRITEBACK);
-
-			/*
-			 * Ensure we only free the compressed pages if we have
-			 * them allocated, as we can still reach here with
-			 * inode_need_compress() == false.
-			 */
-			if (pages) {
-				for (i = 0; i < nr_pages; i++) {
-					WARN_ON(pages[i]->mapping);
-					put_page(pages[i]);
-				}
-				kfree(pages);
+			for (i = 0; i < nr_pages; i++) {
+				WARN_ON(pages[i]->mapping);
+				put_page(pages[i]);
 			}
+			kfree(pages);
 			return;
 		}
 	}
 
-	if (!will_compress)
-		goto cleanup_and_bail_uncompressed;
-
 	/*
 	 * We aren't doing an inline extent. Round the compressed size up to a
 	 * block size boundary so the allocator does sane things.
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 17/23] btrfs: use a separate label for the incompressible case in compress_file_range
  2023-06-28 15:31 btrfs compressed writeback cleanups Christoph Hellwig
                   ` (15 preceding siblings ...)
  2023-06-28 15:31 ` [PATCH 16/23] btrfs: further simplify the compress or not logic in compress_file_range Christoph Hellwig
@ 2023-06-28 15:31 ` Christoph Hellwig
  2023-06-28 15:31 ` [PATCH 18/23] btrfs: share the code to free the page array " Christoph Hellwig
                   ` (7 subsequent siblings)
  24 siblings, 0 replies; 40+ messages in thread
From: Christoph Hellwig @ 2023-06-28 15:31 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

compress_file_range can fail to compress either because of resource or
alignment constraints or because the data is incompressible.  In the
latter case the inode is marked so that compression isn't tried again.
Currently that check is based on the condition that the pages array has
been allocated, which is rather cryptic.  Use a separate label to
clearly distinguish this case.
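
The resulting control flow is the classic pattern of one label falling
through into the next.  A trimmed-down, compilable illustration (the
label names mirror the patch, the bodies are stubs):

#include <stdio.h>

static void compress_one_range(int resource_failure, int incompressible)
{
    if (resource_failure)
        goto cleanup_and_bail_uncompressed;
    if (incompressible)
        goto mark_incompressible;

    printf("submit compressed extent\n");
    return;

mark_incompressible:
    printf("flag inode NOCOMPRESS\n");  /* only the incompressible case */
    /* fall through */
cleanup_and_bail_uncompressed:
    printf("add uncompressed async extent\n");  /* both bail cases */
}

int main(void)
{
    compress_one_range(0, 0);
    compress_one_range(0, 1);
    compress_one_range(1, 0);
    return 0;
}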

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/inode.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 560682a5d9d7aa..00aabc088a9deb 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -949,7 +949,7 @@ static void compress_file_range(struct btrfs_work *work)
 				   mapping, start, pages, &nr_pages, &total_in,
 				   &total_compressed);
 	if (ret)
-		goto cleanup_and_bail_uncompressed;
+		goto mark_incompressible;
 
 	/*
 	 * Zero the tail end of the last page, as we might be sending it down
@@ -1025,7 +1025,7 @@ static void compress_file_range(struct btrfs_work *work)
 	 */
 	total_in = round_up(total_in, fs_info->sectorsize);
 	if (total_compressed + blocksize > total_in)
-		goto cleanup_and_bail_uncompressed;
+		goto mark_incompressible;
 
 	/*
 	 * The async work queues will take care of doing actual allocation on
@@ -1040,6 +1040,9 @@ static void compress_file_range(struct btrfs_work *work)
 	}
 	return;
 
+mark_incompressible:
+	if (!btrfs_test_opt(fs_info, FORCE_COMPRESS) && !inode->prop_compress)
+		inode->flags |= BTRFS_INODE_NOCOMPRESS;
 cleanup_and_bail_uncompressed:
 	if (pages) {
 		/*
@@ -1054,12 +1057,6 @@ static void compress_file_range(struct btrfs_work *work)
 		pages = NULL;
 		total_compressed = 0;
 		nr_pages = 0;
-
-		/* flag the file so we don't compress in the future */
-		if (!btrfs_test_opt(fs_info, FORCE_COMPRESS) &&
-		    !(inode->prop_compress)) {
-			inode->flags |= BTRFS_INODE_NOCOMPRESS;
-		}
 	}
 
 	/*
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 18/23] btrfs: share the code to free the page array in compress_file_range
  2023-06-28 15:31 btrfs compressed writeback cleanups Christoph Hellwig
                   ` (16 preceding siblings ...)
  2023-06-28 15:31 ` [PATCH 17/23] btrfs: use a separate label for the incompressible case " Christoph Hellwig
@ 2023-06-28 15:31 ` Christoph Hellwig
  2023-06-28 15:31 ` [PATCH 19/23] btrfs: don't redirty pages " Christoph Hellwig
                   ` (6 subsequent siblings)
  24 siblings, 0 replies; 40+ messages in thread
From: Christoph Hellwig @ 2023-06-28 15:31 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

compress_file_range has two code blocks to free the page array for the
compressed data.  Share the code using a goto label.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/inode.c | 30 +++++++++---------------------
 1 file changed, 9 insertions(+), 21 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 00aabc088a9deb..8f3a72f3f897a1 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1003,12 +1003,7 @@ static void compress_file_range(struct btrfs_work *work)
 						     PAGE_UNLOCK |
 						     PAGE_START_WRITEBACK |
 						     PAGE_END_WRITEBACK);
-			for (i = 0; i < nr_pages; i++) {
-				WARN_ON(pages[i]->mapping);
-				put_page(pages[i]);
-			}
-			kfree(pages);
-			return;
+			goto free_pages;
 		}
 	}
 
@@ -1044,21 +1039,6 @@ static void compress_file_range(struct btrfs_work *work)
 	if (!btrfs_test_opt(fs_info, FORCE_COMPRESS) && !inode->prop_compress)
 		inode->flags |= BTRFS_INODE_NOCOMPRESS;
 cleanup_and_bail_uncompressed:
-	if (pages) {
-		/*
-		 * the compression code ran but failed to make things smaller,
-		 * free any pages it allocated and our page pointer array
-		 */
-		for (i = 0; i < nr_pages; i++) {
-			WARN_ON(pages[i]->mapping);
-			put_page(pages[i]);
-		}
-		kfree(pages);
-		pages = NULL;
-		total_compressed = 0;
-		nr_pages = 0;
-	}
-
 	/*
 	 * No compression, but we still need to write the pages in the file
 	 * we've been given so far.  redirty the locked page if it corresponds
@@ -1076,6 +1056,14 @@ static void compress_file_range(struct btrfs_work *work)
 		extent_range_redirty_for_io(&inode->vfs_inode, start, end);
 	add_async_extent(async_chunk, start, end - start + 1, 0, NULL, 0,
 			 BTRFS_COMPRESS_NONE);
+free_pages:
+	if (pages) {
+		for (i = 0; i < nr_pages; i++) {
+			WARN_ON(pages[i]->mapping);
+			put_page(pages[i]);
+		}
+		kfree(pages);
+	}
 }
 
 static void free_async_extent_pages(struct async_extent *async_extent)
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 19/23] btrfs: don't redirty pages in compress_file_range
  2023-06-28 15:31 btrfs compressed writeback cleanups Christoph Hellwig
                   ` (17 preceding siblings ...)
  2023-06-28 15:31 ` [PATCH 18/23] btrfs: share the code to free the page array " Christoph Hellwig
@ 2023-06-28 15:31 ` Christoph Hellwig
  2023-07-20 11:41   ` David Sterba
  2023-06-28 15:31 ` [PATCH 20/23] btrfs: refactor the zoned device handling in cow_file_range Christoph Hellwig
                   ` (5 subsequent siblings)
  24 siblings, 1 reply; 40+ messages in thread
From: Christoph Hellwig @ 2023-06-28 15:31 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

compress_file_range needs to clear the dirty bit before handing off work
to the compression worker threads to prevent processes coming in through
mmap and changing the file contents while the compression is accessing
the data (see commit 4adaa611020f ("Btrfs: fix race between mmap writes
and compression")).

But when compress_file_range decides to not compress the data, it falls
back to submit_uncompressed_range which uses extent_write_locked_range
to write the uncompressed data.  extent_write_locked_range currently
expects all pages to be marked dirty so that it can clear the dirty
bit itself, and thus compress_file_range has to redirty the page range.

Redirtying the page range is rather inefficient and also pointless,
so instead pass a pages_dirty parameter to extent_write_locked_range
and skip the redirty game entirely.

Note that compress_file_range was even redirtying the locked_page twice,
given that extent_range_redirty_for_io already redirties all pages in
the range, which must include locked_page if there is one.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/extent_io.c | 29 +++++------------------------
 fs/btrfs/extent_io.h |  3 +--
 fs/btrfs/inode.c     | 43 +++++++++----------------------------------
 3 files changed, 15 insertions(+), 60 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 6befffd76e8808..e74153c02d84c7 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -181,22 +181,6 @@ void extent_range_clear_dirty_for_io(struct inode *inode, u64 start, u64 end)
 	}
 }
 
-void extent_range_redirty_for_io(struct inode *inode, u64 start, u64 end)
-{
-	struct address_space *mapping = inode->i_mapping;
-	unsigned long index = start >> PAGE_SHIFT;
-	unsigned long end_index = end >> PAGE_SHIFT;
-	struct folio *folio;
-
-	while (index <= end_index) {
-		folio = filemap_get_folio(mapping, index);
-		filemap_dirty_folio(mapping, folio);
-		folio_account_redirty(folio);
-		index += folio_nr_pages(folio);
-		folio_put(folio);
-	}
-}
-
 static void process_one_page(struct btrfs_fs_info *fs_info,
 			     struct page *page, struct page *locked_page,
 			     unsigned long page_ops, u64 start, u64 end)
@@ -2150,7 +2134,7 @@ static int extent_write_cache_pages(struct address_space *mapping,
  * locked.
  */
 void extent_write_locked_range(struct inode *inode, u64 start, u64 end,
-			       struct writeback_control *wbc)
+			       struct writeback_control *wbc, bool pages_dirty)
 {
 	bool found_error = false;
 	int ret = 0;
@@ -2176,14 +2160,11 @@ void extent_write_locked_range(struct inode *inode, u64 start, u64 end,
 		int nr = 0;
 
 		page = find_get_page(mapping, cur >> PAGE_SHIFT);
-		/*
-		 * All pages in the range are locked since
-		 * btrfs_run_delalloc_range(), thus there is no way to clear
-		 * the page dirty flag.
-		 */
 		ASSERT(PageLocked(page));
-		ASSERT(PageDirty(page));
-		clear_page_dirty_for_io(page);
+		if (pages_dirty) {
+			ASSERT(PageDirty(page));
+			clear_page_dirty_for_io(page);
+		}
 
 		ret = __extent_writepage_io(BTRFS_I(inode), page, &bio_ctrl,
 					    i_size, &nr);
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 0312022bbf4b7a..2678906e87c506 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -178,7 +178,7 @@ int try_release_extent_buffer(struct page *page);
 
 int btrfs_read_folio(struct file *file, struct folio *folio);
 void extent_write_locked_range(struct inode *inode, u64 start, u64 end,
-			       struct writeback_control *wbc);
+			       struct writeback_control *wbc, bool pages_dirty);
 int extent_writepages(struct address_space *mapping,
 		      struct writeback_control *wbc);
 int btree_write_cache_pages(struct address_space *mapping,
@@ -265,7 +265,6 @@ void set_extent_buffer_dirty(struct extent_buffer *eb);
 void set_extent_buffer_uptodate(struct extent_buffer *eb);
 void clear_extent_buffer_uptodate(struct extent_buffer *eb);
 void extent_range_clear_dirty_for_io(struct inode *inode, u64 start, u64 end);
-void extent_range_redirty_for_io(struct inode *inode, u64 start, u64 end);
 void extent_clear_unlock_delalloc(struct btrfs_inode *inode, u64 start, u64 end,
 				  struct page *locked_page,
 				  u32 bits_to_clear, unsigned long page_ops);
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 8f3a72f3f897a1..556f63e8496ff8 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -848,10 +848,16 @@ static void compress_file_range(struct btrfs_work *work)
 	unsigned int poff;
 	int i;
 	int compress_type = fs_info->compress_type;
-	int redirty = 0;
 
 	inode_should_defrag(inode, start, end, end - start + 1, SZ_16K);
 
+	/*
+	 * We need to call clear_page_dirty_for_io on each page in the range.
+	 * Otherwise applications with the file mmap'd can wander in and change
+	 * the page contents while we are compressing them.
+	 */
+	extent_range_clear_dirty_for_io(&inode->vfs_inode, start, end);
+
 	/*
 	 * We need to save i_size before now because it could change in between
 	 * us evaluating the size and assigning it.  This is because we lock and
@@ -927,22 +933,6 @@ static void compress_file_range(struct btrfs_work *work)
 	else if (inode->prop_compress)
 		compress_type = inode->prop_compress;
 
-	/*
-	 * We need to call clear_page_dirty_for_io on each page in the range.
-	 * Otherwise applications with the file mmap'd can wander in and change
-	 * the page contents while we are compressing them.
-	 *
-	 * If the compression fails for any reason, we set the pages dirty again
-	 * later on.
-	 *
-	 * Note that the remaining part is redirtied, the start pointer has
-	 * moved, the end is the original one.
-	 */
-	if (!redirty) {
-		extent_range_clear_dirty_for_io(&inode->vfs_inode, start, end);
-		redirty = 1;
-	}
-
 	/* Compression level is applied here and only here */
 	ret = btrfs_compress_pages(compress_type |
 				   (fs_info->compress_level << 4),
@@ -1039,21 +1029,6 @@ static void compress_file_range(struct btrfs_work *work)
 	if (!btrfs_test_opt(fs_info, FORCE_COMPRESS) && !inode->prop_compress)
 		inode->flags |= BTRFS_INODE_NOCOMPRESS;
 cleanup_and_bail_uncompressed:
-	/*
-	 * No compression, but we still need to write the pages in the file
-	 * we've been given so far.  redirty the locked page if it corresponds
-	 * to our extent and set things up for the async work queue to run
-	 * cow_file_range to do the normal delalloc dance.
-	 */
-	if (async_chunk->locked_page &&
-	    (page_offset(async_chunk->locked_page) >= start &&
-	     page_offset(async_chunk->locked_page)) <= end) {
-		__set_page_dirty_nobuffers(async_chunk->locked_page);
-		/* unlocked later on in the async handlers */
-	}
-
-	if (redirty)
-		extent_range_redirty_for_io(&inode->vfs_inode, start, end);
 	add_async_extent(async_chunk, start, end - start + 1, 0, NULL, 0,
 			 BTRFS_COMPRESS_NONE);
 free_pages:
@@ -1130,7 +1105,7 @@ static void submit_uncompressed_range(struct btrfs_inode *inode,
 
 	/* All pages will be unlocked, including @locked_page */
 	wbc_attach_fdatawrite_inode(&wbc, &inode->vfs_inode);
-	extent_write_locked_range(&inode->vfs_inode, start, end, &wbc);
+	extent_write_locked_range(&inode->vfs_inode, start, end, &wbc, false);
 	wbc_detach_inode(&wbc);
 }
 
@@ -1755,7 +1730,7 @@ static noinline int run_delalloc_zoned(struct btrfs_inode *inode,
 		}
 		locked_page_done = true;
 		extent_write_locked_range(&inode->vfs_inode, start, done_offset,
-					  wbc);
+					  wbc, true);
 		start = done_offset + 1;
 	}
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 20/23] btrfs: refactor the zoned device handling in cow_file_range
  2023-06-28 15:31 btrfs compressed writeback cleanups Christoph Hellwig
                   ` (18 preceding siblings ...)
  2023-06-28 15:31 ` [PATCH 19/23] btrfs: don't redirty pages " Christoph Hellwig
@ 2023-06-28 15:31 ` Christoph Hellwig
  2023-06-28 15:31 ` [PATCH 21/23] btrfs: don't redirty locked_page in run_delalloc_zoned Christoph Hellwig
                   ` (4 subsequent siblings)
  24 siblings, 0 replies; 40+ messages in thread
From: Christoph Hellwig @ 2023-06-28 15:31 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

Handling of the done_offset argument to cow_file_range is a bit
confusing, as it is not updated at all when the function succeeds, and
the -EAGAIN status is used both for the case where we need to wait for
a zone finish and the one where the allocation was only partially
successful.

Change the calling convention so that done_offset is always updated and
0 is returned if some allocation was successful (a partial allocation
can still only happen for zoned devices), and so that waiting for a
zone finish is done internally in cow_file_range instead of in the
caller.

Also write a big fat comment explaining the logic.
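
With the new convention the caller's retry loop becomes uniform.  A
runnable userspace sketch (the 4KiB-per-call allocator below is made up
purely to force partial progress):

#include <stdio.h>

/* Stand-in for cow_file_range after this patch: 0 means "some progress
 * was made and *done_offset says how far", non-zero is a hard error. */
static int cow_range(unsigned long start, unsigned long end,
                     unsigned long *done_offset)
{
    unsigned long done = start + 4095;

    if (done > end)
        done = end;
    *done_offset = done;
    return 0;
}

int main(void)
{
    unsigned long start = 0, end = 16383, done_offset;

    while (start <= end) {
        if (cow_range(start, end, &done_offset))
            return 1;
        printf("write back %lu..%lu\n", start, done_offset);
        start = done_offset + 1;
    }
    return 0;
}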

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/inode.c | 58 ++++++++++++++++++++++++++----------------------
 1 file changed, 31 insertions(+), 27 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 556f63e8496ff8..2a4b62398ee7a3 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1364,7 +1364,8 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
 			 * compressed extent.
 			 */
 			unlock_page(locked_page);
-			return 1;
+			ret = 1;
+			goto done;
 		} else if (ret < 0) {
 			goto out_unlock;
 		}
@@ -1395,6 +1396,31 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
 		ret = btrfs_reserve_extent(root, cur_alloc_size, cur_alloc_size,
 					   min_alloc_size, 0, alloc_hint,
 					   &ins, 1, 1);
+		if (ret == -EAGAIN) {
+			/*
+			 * btrfs_reserve_extent only returns -EAGAIN for zoned
+			 * file systems, which is an indication that there are
+			 * no active zones to allocate from at the moment.
+			 *
+			 * If this is the first loop iteration, wait for at
+			 * least one zone to finish before retrying the
+			 * allocation.  Otherwise ask the caller to write out
+			 * the already allocated blocks before coming back to
+			 * us, or return -ENOSPC if it can't handle retries.
+			 */
+			ASSERT(btrfs_is_zoned(fs_info));
+			if (start == orig_start) {
+				wait_on_bit_io(&inode->root->fs_info->flags,
+					       BTRFS_FS_NEED_ZONE_FINISH,
+					       TASK_UNINTERRUPTIBLE);
+				continue;
+			}
+			if (done_offset) {
+				*done_offset = start - 1;
+				return 0;
+			}
+			ret = -ENOSPC;
+		}
 		if (ret < 0)
 			goto out_unlock;
 		cur_alloc_size = ins.offset;
@@ -1478,6 +1504,9 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
 		if (ret)
 			goto out_unlock;
 	}
+done:
+	if (done_offset)
+		*done_offset = end;
 	return ret;
 
 out_drop_extent_cache:
@@ -1486,21 +1515,6 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
 	btrfs_dec_block_group_reservations(fs_info, ins.objectid);
 	btrfs_free_reserved_extent(fs_info, ins.objectid, ins.offset, 1);
 out_unlock:
-	/*
-	 * If done_offset is non-NULL and ret == -EAGAIN, we expect the
-	 * caller to write out the successfully allocated region and retry.
-	 */
-	if (done_offset && ret == -EAGAIN) {
-		if (orig_start < start)
-			*done_offset = start - 1;
-		else
-			*done_offset = start;
-		return ret;
-	} else if (ret == -EAGAIN) {
-		/* Convert to -ENOSPC since the caller cannot retry. */
-		ret = -ENOSPC;
-	}
-
 	/*
 	 * Now, we have three regions to clean up:
 	 *
@@ -1711,19 +1725,9 @@ static noinline int run_delalloc_zoned(struct btrfs_inode *inode,
 	while (start <= end) {
 		ret = cow_file_range(inode, locked_page, start, end,
 				     &done_offset, CFR_KEEP_LOCKED);
-		if (ret && ret != -EAGAIN)
+		if (ret)
 			return ret;
 
-		if (ret == 0)
-			done_offset = end;
-
-		if (done_offset == start) {
-			wait_on_bit_io(&inode->root->fs_info->flags,
-				       BTRFS_FS_NEED_ZONE_FINISH,
-				       TASK_UNINTERRUPTIBLE);
-			continue;
-		}
-
 		if (!locked_page_done) {
 			__set_page_dirty_nobuffers(locked_page);
 			account_page_redirty(locked_page);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 21/23] btrfs: don't redirty locked_page in run_delalloc_zoned
  2023-06-28 15:31 btrfs compressed writeback cleanups Christoph Hellwig
                   ` (19 preceding siblings ...)
  2023-06-28 15:31 ` [PATCH 20/23] btrfs: refactor the zoned device handling in cow_file_range Christoph Hellwig
@ 2023-06-28 15:31 ` Christoph Hellwig
  2023-06-28 15:31 ` [PATCH 22/23] btrfs: fix zoned handling in submit_uncompressed_range Christoph Hellwig
                   ` (3 subsequent siblings)
  24 siblings, 0 replies; 40+ messages in thread
From: Christoph Hellwig @ 2023-06-28 15:31 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

extent_write_locked_range currently expects that either all or no
pages are dirty when it is called.  But run_delalloc_zoned is called
directly in the writepages path, where the dirty bit has only been
cleared for locked_page, the one page that extent_write_cache_pages is
currently operating on.  It currently works around this by redirtying
locked_page, but that is a bit inefficient and cumbersome.  Pass a
locked_page argument to run_delalloc_zoned so that clearing the dirty
bit can be skipped on just that page.
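
A toy model of the new rule in extent_write_locked_range (struct page
and the function below are stand-ins, not the kernel types): clear the
dirty bit on every page except locked_page, which the writepages path
has already handled:

#include <stdio.h>

struct page { int id; int dirty; };

static void write_locked_range(struct page *pages, int nr,
                               struct page *locked_page, int pages_dirty)
{
    for (int i = 0; i < nr; i++) {
        if (pages_dirty && &pages[i] != locked_page)
            pages[i].dirty = 0;
        printf("write page %d (dirty=%d)\n", pages[i].id,
               pages[i].dirty);
    }
}

int main(void)
{
    struct page pages[3] = { { 0, 0 }, { 1, 1 }, { 2, 1 } };

    /* pages[0] plays locked_page, its dirty bit is already clear */
    write_locked_range(pages, 3, &pages[0], 1);
    return 0;
}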

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/extent_io.c |  7 ++++---
 fs/btrfs/extent_io.h |  5 +++--
 fs/btrfs/inode.c     | 13 ++++---------
 3 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index e74153c02d84c7..efcac0b56b8252 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2133,8 +2133,9 @@ static int extent_write_cache_pages(struct address_space *mapping,
  * already been ran (aka, ordered extent inserted) and all pages are still
  * locked.
  */
-void extent_write_locked_range(struct inode *inode, u64 start, u64 end,
-			       struct writeback_control *wbc, bool pages_dirty)
+void extent_write_locked_range(struct inode *inode, struct page *locked_page,
+			       u64 start, u64 end, struct writeback_control *wbc,
+			       bool pages_dirty)
 {
 	bool found_error = false;
 	int ret = 0;
@@ -2161,7 +2162,7 @@ void extent_write_locked_range(struct inode *inode, u64 start, u64 end,
 
 		page = find_get_page(mapping, cur >> PAGE_SHIFT);
 		ASSERT(PageLocked(page));
-		if (pages_dirty) {
+		if (pages_dirty && page != locked_page) {
 			ASSERT(PageDirty(page));
 			clear_page_dirty_for_io(page);
 		}
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 2678906e87c506..c01f9c5ddc13c0 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -177,8 +177,9 @@ int try_release_extent_mapping(struct page *page, gfp_t mask);
 int try_release_extent_buffer(struct page *page);
 
 int btrfs_read_folio(struct file *file, struct folio *folio);
-void extent_write_locked_range(struct inode *inode, u64 start, u64 end,
-			       struct writeback_control *wbc, bool pages_dirty);
+void extent_write_locked_range(struct inode *inode, struct page *locked_page,
+			       u64 start, u64 end, struct writeback_control *wbc,
+			       bool pages_dirty);
 int extent_writepages(struct address_space *mapping,
 		      struct writeback_control *wbc);
 int btree_write_cache_pages(struct address_space *mapping,
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 2a4b62398ee7a3..ae5166d33253a5 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1105,7 +1105,8 @@ static void submit_uncompressed_range(struct btrfs_inode *inode,
 
 	/* All pages will be unlocked, including @locked_page */
 	wbc_attach_fdatawrite_inode(&wbc, &inode->vfs_inode);
-	extent_write_locked_range(&inode->vfs_inode, start, end, &wbc, false);
+	extent_write_locked_range(&inode->vfs_inode, NULL, start, end, &wbc,
+				  false);
 	wbc_detach_inode(&wbc);
 }
 
@@ -1720,7 +1721,6 @@ static noinline int run_delalloc_zoned(struct btrfs_inode *inode,
 {
 	u64 done_offset = end;
 	int ret;
-	bool locked_page_done = false;
 
 	while (start <= end) {
 		ret = cow_file_range(inode, locked_page, start, end,
@@ -1728,13 +1728,8 @@ static noinline int run_delalloc_zoned(struct btrfs_inode *inode,
 		if (ret)
 			return ret;
 
-		if (!locked_page_done) {
-			__set_page_dirty_nobuffers(locked_page);
-			account_page_redirty(locked_page);
-		}
-		locked_page_done = true;
-		extent_write_locked_range(&inode->vfs_inode, start, done_offset,
-					  wbc, true);
+		extent_write_locked_range(&inode->vfs_inode, locked_page, start,
+					  done_offset, wbc, true);
 		start = done_offset + 1;
 	}
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 22/23] btrfs: fix zoned handling in submit_uncompressed_range
  2023-06-28 15:31 btrfs compressed writeback cleanups Christoph Hellwig
                   ` (20 preceding siblings ...)
  2023-06-28 15:31 ` [PATCH 21/23] btrfs: don't redirty locked_page in run_delalloc_zoned Christoph Hellwig
@ 2023-06-28 15:31 ` Christoph Hellwig
  2023-06-28 15:31 ` [PATCH 23/23] mm: remove folio_account_redirty Christoph Hellwig
                   ` (2 subsequent siblings)
  24 siblings, 0 replies; 40+ messages in thread
From: Christoph Hellwig @ 2023-06-28 15:31 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

For zoned file systems we need to use run_delalloc_zoned to submit
writeback, as we need to write out partial allocations when running into
zone active limits.

submit_uncompressed_range currently always calls cow_file_range to
allocate blocks and thus misses the active zone limits handling.  Fix
this by passing the pages_dirty argument to run_delalloc_zoned and
always using it from submit_uncompressed_range, as it does the right
thing for both zoned and non-zoned file systems.

To account for the fact that run_delalloc_zoned is now also used for
non-zoned file systems, rename it to run_delalloc_cow and add a comment
describing it.
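
The two call sites after this patch, sketched as a compilable stub with
heavily simplified signatures: the writepages path still owns dirty
pages, while submit_uncompressed_range runs after compress_file_range
has already cleared the dirty bits:

#include <stdbool.h>
#include <stdio.h>

/* Heavily simplified stand-in for run_delalloc_cow. */
static int run_delalloc_cow(unsigned long start, unsigned long end,
                            bool pages_dirty)
{
    printf("cow %lu..%lu, dirty bits %s cleared\n", start, end,
           pages_dirty ? "still to be" : "already");
    return 0;
}

int main(void)
{
    /* writepages path: pages in the range are still dirty */
    run_delalloc_cow(0, 4095, true);
    /* submit_uncompressed_range: compress_file_range cleared them */
    run_delalloc_cow(4096, 8191, false);
    return 0;
}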

Fixes: 42c011000963 ("btrfs: zoned: introduce dedicated data write path for zoned filesystems")
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/inode.c | 52 +++++++++++++++++++-----------------------------
 1 file changed, 20 insertions(+), 32 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index ae5166d33253a5..2079bf48629b59 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -125,12 +125,10 @@ static struct kmem_cache *btrfs_inode_cachep;
 static int btrfs_setsize(struct inode *inode, struct iattr *attr);
 static int btrfs_truncate(struct btrfs_inode *inode, bool skip_writeback);
 
-#define CFR_KEEP_LOCKED		(1 << 0)
-#define CFR_NOINLINE		(1 << 1)
-static noinline int cow_file_range(struct btrfs_inode *inode,
-				   struct page *locked_page,
-				   u64 start, u64 end, u64 *done_offset,
-				   u32 flags);
+static noinline int run_delalloc_cow(struct btrfs_inode *inode,
+				     struct page *locked_page, u64 start,
+				     u64 end, struct writeback_control *wbc,
+				     bool pages_dirty);
 static struct extent_map *create_io_em(struct btrfs_inode *inode, u64 start,
 				       u64 len, u64 orig_start, u64 block_start,
 				       u64 block_len, u64 orig_block_len,
@@ -1071,19 +1069,9 @@ static void submit_uncompressed_range(struct btrfs_inode *inode,
 		.no_cgroup_owner	= 1,
 	};
 
-	/*
-	 * Call cow_file_range() to run the delalloc range directly, since we
-	 * won't go to NOCOW or async path again.
-	 *
-	 * Also we call cow_file_range() with @unlock_page == 0, so that we
-	 * can directly submit them without interruption.
-	 */
-	ret = cow_file_range(inode, locked_page, start, end, NULL,
-			     CFR_KEEP_LOCKED);
-	/* Inline extent inserted, page gets unlocked and everything is done */
-	if (ret == 1)
-		return;
-
+	wbc_attach_fdatawrite_inode(&wbc, &inode->vfs_inode);
+	ret = run_delalloc_cow(inode, locked_page, start, end, &wbc, false);
+	wbc_detach_inode(&wbc);
 	if (ret < 0) {
 		btrfs_cleanup_ordered_extents(inode, locked_page, start, end - start + 1);
 		if (locked_page) {
@@ -1100,14 +1088,7 @@ static void submit_uncompressed_range(struct btrfs_inode *inode,
 			mapping_set_error(locked_page->mapping, ret);
 			unlock_page(locked_page);
 		}
-		return;
 	}
-
-	/* All pages will be unlocked, including @locked_page */
-	wbc_attach_fdatawrite_inode(&wbc, &inode->vfs_inode);
-	extent_write_locked_range(&inode->vfs_inode, NULL, start, end, &wbc,
-				  false);
-	wbc_detach_inode(&wbc);
 }
 
 static void submit_one_async_extent(struct async_chunk *async_chunk,
@@ -1290,6 +1271,8 @@ static u64 get_extent_allocation_hint(struct btrfs_inode *inode, u64 start,
  * btrfs_cleanup_ordered_extents(). See btrfs_run_delalloc_range() for
  * example.
  */
+#define CFR_KEEP_LOCKED		(1 << 0)
+#define CFR_NOINLINE		(1 << 1)
 static noinline int cow_file_range(struct btrfs_inode *inode,
 				   struct page *locked_page, u64 start, u64 end,
 				   u64 *done_offset, u32 flags)
@@ -1715,9 +1698,14 @@ static bool run_delalloc_compressed(struct btrfs_inode *inode,
 	return true;
 }
 
-static noinline int run_delalloc_zoned(struct btrfs_inode *inode,
-				       struct page *locked_page, u64 start,
-				       u64 end, struct writeback_control *wbc)
+/*
+ * Run the delalloc range from start to end, and write back any dirty pages
+ * covered by the range.
+ */
+static noinline int run_delalloc_cow(struct btrfs_inode *inode,
+				     struct page *locked_page, u64 start,
+				     u64 end, struct writeback_control *wbc,
+				     bool pages_dirty)
 {
 	u64 done_offset = end;
 	int ret;
@@ -1727,9 +1715,8 @@ static noinline int run_delalloc_zoned(struct btrfs_inode *inode,
 				     &done_offset, CFR_KEEP_LOCKED);
 		if (ret)
 			return ret;
-
 		extent_write_locked_range(&inode->vfs_inode, locked_page, start,
-					  done_offset, wbc, true);
+					  done_offset, wbc, pages_dirty);
 		start = done_offset + 1;
 	}
 
@@ -2295,7 +2282,8 @@ int btrfs_run_delalloc_range(struct btrfs_inode *inode, struct page *locked_page
 		return 1;
 
 	if (zoned)
-		ret = run_delalloc_zoned(inode, locked_page, start, end, wbc);
+		ret = run_delalloc_cow(inode, locked_page, start, end, wbc,
+				       true);
 	else
 		ret = cow_file_range(inode, locked_page, start, end, NULL, 0);
 
-- 
2.39.2
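
For reference, the two call sites of run_delalloc_cow after this patch,
side by side (excerpted from the hunks above; surrounding code omitted):

	/*
	 * In submit_uncompressed_range(): the pages were cleaned before
	 * being handed to the compression workers, so pages_dirty is false.
	 */
	wbc_attach_fdatawrite_inode(&wbc, &inode->vfs_inode);
	ret = run_delalloc_cow(inode, locked_page, start, end, &wbc, false);
	wbc_detach_inode(&wbc);

	/*
	 * In btrfs_run_delalloc_range(): the pages are still dirty from the
	 * regular writeback path, so pages_dirty is true.
	 */
	if (zoned)
		ret = run_delalloc_cow(inode, locked_page, start, end, wbc,
				       true);
	else
		ret = cow_file_range(inode, locked_page, start, end, NULL, 0);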



* [PATCH 23/23] mm: remove folio_account_redirty
  2023-06-28 15:31 btrfs compressed writeback cleanups Christoph Hellwig
                   ` (21 preceding siblings ...)
  2023-06-28 15:31 ` [PATCH 22/23] btrfs: fix zoned handling in submit_uncompressed_range Christoph Hellwig
@ 2023-06-28 15:31 ` Christoph Hellwig
  2023-06-28 15:45   ` Matthew Wilcox
  2023-07-14 13:49 ` btrfs compressed writeback cleanups Josef Bacik
  2023-07-20 11:47 ` David Sterba
  24 siblings, 1 reply; 40+ messages in thread
From: Christoph Hellwig @ 2023-06-28 15:31 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

Fold folio_account_redirty into folio_redirty_for_writepage now
that all other users except for the also unused account_page_redirty
wrapper are gone.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 include/linux/writeback.h |  5 ----
 mm/page-writeback.c       | 49 +++++++++++----------------------------
 2 files changed, 14 insertions(+), 40 deletions(-)

diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index fba937999fbfd3..083387c00f0c8b 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -375,11 +375,6 @@ void tag_pages_for_writeback(struct address_space *mapping,
 			     pgoff_t start, pgoff_t end);
 
 bool filemap_dirty_folio(struct address_space *mapping, struct folio *folio);
-void folio_account_redirty(struct folio *folio);
-static inline void account_page_redirty(struct page *page)
-{
-	folio_account_redirty(page_folio(page));
-}
 bool folio_redirty_for_writepage(struct writeback_control *, struct folio *);
 bool redirty_page_for_writepage(struct writeback_control *, struct page *);
 
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index db794399900734..56074637ef4fe0 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1193,7 +1193,7 @@ static void wb_update_write_bandwidth(struct bdi_writeback *wb,
 	 * write_bandwidth = ---------------------------------------------------
 	 *                                          period
 	 *
-	 * @written may have decreased due to folio_account_redirty().
+	 * @written may have decreased due to folio_redirty_for_writepage().
 	 * Avoid underflowing @bw calculation.
 	 */
 	bw = written - min(written, wb->written_stamp);
@@ -2709,37 +2709,6 @@ bool filemap_dirty_folio(struct address_space *mapping, struct folio *folio)
 }
 EXPORT_SYMBOL(filemap_dirty_folio);
 
-/**
- * folio_account_redirty - Manually account for redirtying a page.
- * @folio: The folio which is being redirtied.
- *
- * Most filesystems should call folio_redirty_for_writepage() instead
- * of this fuction.  If your filesystem is doing writeback outside the
- * context of a writeback_control(), it can call this when redirtying
- * a folio, to de-account the dirty counters (NR_DIRTIED, WB_DIRTIED,
- * tsk->nr_dirtied), so that they match the written counters (NR_WRITTEN,
- * WB_WRITTEN) in long term. The mismatches will lead to systematic errors
- * in balanced_dirty_ratelimit and the dirty pages position control.
- */
-void folio_account_redirty(struct folio *folio)
-{
-	struct address_space *mapping = folio->mapping;
-
-	if (mapping && mapping_can_writeback(mapping)) {
-		struct inode *inode = mapping->host;
-		struct bdi_writeback *wb;
-		struct wb_lock_cookie cookie = {};
-		long nr = folio_nr_pages(folio);
-
-		wb = unlocked_inode_to_wb_begin(inode, &cookie);
-		current->nr_dirtied -= nr;
-		node_stat_mod_folio(folio, NR_DIRTIED, -nr);
-		wb_stat_mod(wb, WB_DIRTIED, -nr);
-		unlocked_inode_to_wb_end(inode, &cookie);
-	}
-}
-EXPORT_SYMBOL(folio_account_redirty);
-
 /**
  * folio_redirty_for_writepage - Decline to write a dirty folio.
  * @wbc: The writeback control.
@@ -2755,13 +2724,23 @@ EXPORT_SYMBOL(folio_account_redirty);
 bool folio_redirty_for_writepage(struct writeback_control *wbc,
 		struct folio *folio)
 {
-	bool ret;
+	struct address_space *mapping = folio->mapping;
 	long nr = folio_nr_pages(folio);
+	bool ret;
 
 	wbc->pages_skipped += nr;
-	ret = filemap_dirty_folio(folio->mapping, folio);
-	folio_account_redirty(folio);
+	ret = filemap_dirty_folio(mapping, folio);
+	if (mapping && mapping_can_writeback(mapping)) {
+		struct inode *inode = mapping->host;
+		struct bdi_writeback *wb;
+		struct wb_lock_cookie cookie = {};
 
+		wb = unlocked_inode_to_wb_begin(inode, &cookie);
+		current->nr_dirtied -= nr;
+		node_stat_mod_folio(folio, NR_DIRTIED, -nr);
+		wb_stat_mod(wb, WB_DIRTIED, -nr);
+		unlocked_inode_to_wb_end(inode, &cookie);
+	}
 	return ret;
 }
 EXPORT_SYMBOL(folio_redirty_for_writepage);
-- 
2.39.2
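
For context, the pattern folio_redirty_for_writepage serves is a writeback
callback that declines to write a folio.  A generic sketch, not tied to any
particular filesystem; can_write_now and do_writepage are hypothetical
helpers:

static int example_writepage(struct page *page, struct writeback_control *wbc)
{
	struct folio *folio = page_folio(page);

	if (!can_write_now(folio)) {
		/*
		 * Mark the folio dirty again and, with this patch, also
		 * de-account NR_DIRTIED/WB_DIRTIED in the same call so
		 * the dirtied counters keep matching the written ones.
		 */
		folio_redirty_for_writepage(wbc, folio);
		folio_unlock(folio);
		return 0;
	}
	/* otherwise actually write the folio back */
	return do_writepage(folio, wbc);
}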



* Re: [PATCH 23/23] mm: remove folio_account_redirty
  2023-06-28 15:31 ` [PATCH 23/23] mm: remove folio_account_redirty Christoph Hellwig
@ 2023-06-28 15:45   ` Matthew Wilcox
  0 siblings, 0 replies; 40+ messages in thread
From: Matthew Wilcox @ 2023-06-28 15:45 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Chris Mason, Josef Bacik, David Sterba, linux-btrfs, linux-fsdevel

On Wed, Jun 28, 2023 at 05:31:44PM +0200, Christoph Hellwig wrote:
> Fold folio_account_redirty into folio_redirty_for_writepage now
> that all other users except for the also unused account_page_redirty
> wrapper are gone.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>



* Re: [PATCH 01/23] btrfs: pass a flags argument to cow_file_range
  2023-06-28 15:31 ` [PATCH 01/23] btrfs: pass a flags argument to cow_file_range Christoph Hellwig
@ 2023-07-04  8:47   ` Johannes Thumshirn
  2023-07-20 11:22   ` David Sterba
  1 sibling, 0 replies; 40+ messages in thread
From: Johannes Thumshirn @ 2023-07-04  8:47 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>


* Re: [PATCH 04/23] btrfs: remove btrfs_writepage_endio_finish_ordered
  2023-06-28 15:31 ` [PATCH 04/23] btrfs: remove btrfs_writepage_endio_finish_ordered Christoph Hellwig
@ 2023-07-04  8:50   ` Johannes Thumshirn
  0 siblings, 0 replies; 40+ messages in thread
From: Johannes Thumshirn @ 2023-07-04  8:50 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>


* Re: [PATCH 06/23] btrfs: reduce debug spam from submit_compressed_extents
  2023-06-28 15:31 ` [PATCH 06/23] btrfs: reduce debug spam from submit_compressed_extents Christoph Hellwig
@ 2023-07-04  8:54   ` Johannes Thumshirn
  0 siblings, 0 replies; 40+ messages in thread
From: Johannes Thumshirn @ 2023-07-04  8:54 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>


* Re: [PATCH 07/23] btrfs: remove the return value from submit_uncompressed_range
  2023-06-28 15:31 ` [PATCH 07/23] btrfs: remove the return value from submit_uncompressed_range Christoph Hellwig
@ 2023-07-04  8:56   ` Johannes Thumshirn
  0 siblings, 0 replies; 40+ messages in thread
From: Johannes Thumshirn @ 2023-07-04  8:56 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>


* Re: [PATCH 08/23] btrfs: remove the return value from extent_write_locked_range
  2023-06-28 15:31 ` [PATCH 08/23] btrfs: remove the return value from extent_write_locked_range Christoph Hellwig
@ 2023-07-04  8:56   ` Johannes Thumshirn
  0 siblings, 0 replies; 40+ messages in thread
From: Johannes Thumshirn @ 2023-07-04  8:56 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>


* Re: [PATCH 10/23] btrfs: reduce the number of arguments to btrfs_run_delalloc_range
  2023-06-28 15:31 ` [PATCH 10/23] btrfs: reduce the number of arguments to btrfs_run_delalloc_range Christoph Hellwig
@ 2023-07-04  9:00   ` Johannes Thumshirn
  0 siblings, 0 replies; 40+ messages in thread
From: Johannes Thumshirn @ 2023-07-04  9:00 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

On 28.06.23 17:35, Christoph Hellwig wrote:
> btrfs_run_delalloc_range already started writeback by itself, overload
> the return value with a positive 1 in additio to 0 and a negative error                               addition ~^

For the rest of the patch I don't feel competent enough to judge; I'm sorry


* Re: [PATCH 11/23] btrfs: clean up the check for uncompressed ranges in submit_one_async_extent
  2023-06-28 15:31 ` [PATCH 11/23] btrfs: clean up the check for uncompressed ranges in submit_one_async_extent Christoph Hellwig
@ 2023-07-04  9:07   ` Johannes Thumshirn
  0 siblings, 0 replies; 40+ messages in thread
From: Johannes Thumshirn @ 2023-07-04  9:07 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>


* Re: [PATCH 13/23] btrfs: merge async_cow_start and compress_file_range
  2023-06-28 15:31 ` [PATCH 13/23] btrfs: merge async_cow_start and compress_file_range Christoph Hellwig
@ 2023-07-04  9:09   ` Johannes Thumshirn
  0 siblings, 0 replies; 40+ messages in thread
From: Johannes Thumshirn @ 2023-07-04  9:09 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba
  Cc: Matthew Wilcox, linux-btrfs, linux-fsdevel

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>


* Re: [PATCH 16/23] btrfs: further simplify the compress or not logic in compress_file_range
  2023-06-28 15:31 ` [PATCH 16/23] btrfs: further simplify the compress or not logic in compress_file_range Christoph Hellwig
@ 2023-07-14 13:47   ` Josef Bacik
  0 siblings, 0 replies; 40+ messages in thread
From: Josef Bacik @ 2023-07-14 13:47 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Chris Mason, David Sterba, Matthew Wilcox, linux-btrfs, linux-fsdevel

On Wed, Jun 28, 2023 at 05:31:37PM +0200, Christoph Hellwig wrote:
> Currently the logic whether to compress or not in compress_file_range is
> a bit convoluted because it tries to share code for creating inline
> extents for the compressible [1] path and the bail to uncompressed path.
> 
> But the latter isn't needed at all, because cow_file_range as called by
> submit_uncompressed_range will already create inline extents as needed,
> so there is no need to have special handling for it if we can live with
> the fact that it will be called a bit later in the ->ordered_func of the
> workqueue instead of right now.
> 
> [1] there is undocumented logic that creates an uncompressed inline
> extent outside of the shall not compress logic if total_in is too small.
> This logic isn't explained in comments or any commit log I could find,
> so I've preserved it.  Documentation explaining it would be appreciated
> if anyone understands this code.
> 

Looks like it's just an optimization: no reason to allocate a whole async
extent if we can just inline a page and be done.  It's not strictly
required, but it could use a comment.  Thanks,

Josef
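
Purely as an illustration of the optimization Josef describes, and
emphatically not the actual compress_file_range logic (the helper names
and the threshold below are made up):

	/*
	 * Hypothetical sketch: if the data is small enough to live in an
	 * inline extent, store it directly and skip the cost of setting
	 * up an async extent for it.
	 */
	if (total_in <= PAGE_SIZE && inline_extent_allowed(inode, start)) {
		if (!store_uncompressed_inline(inode, start, end))
			return;		/* done, no async extent needed */
	}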


* Re: btrfs compressed writeback cleanups
  2023-06-28 15:31 btrfs compressed writeback cleanups Christoph Hellwig
                   ` (22 preceding siblings ...)
  2023-06-28 15:31 ` [PATCH 23/23] mm: remove folio_account_redirty Christoph Hellwig
@ 2023-07-14 13:49 ` Josef Bacik
  2023-07-20 11:47 ` David Sterba
  24 siblings, 0 replies; 40+ messages in thread
From: Josef Bacik @ 2023-07-14 13:49 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Chris Mason, David Sterba, Matthew Wilcox, linux-btrfs, linux-fsdevel

On Wed, Jun 28, 2023 at 05:31:21PM +0200, Christoph Hellwig wrote:
> Hi all,
> 
> this series is a prequel to another big set of writeback patches, and
> mostly deals with compressed writeback, which does a few things very
> different from other writeback code.  The biggest results are the removal
> of the magic redirtying when handing off to the worker thread, and a fix
> for out of zone active resources handling when using compressed writeback
> on a zoned file system, but mostly it just removes a whole bunch of code.
> 
> Note that the first 5 patches have been out on the btrfs list as
> standalone submissions for a while, but they are included for completeness
> so that this series can be easily applied to the btrfs misc-next tree.
> 

You can add

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

to the whole series.  Thanks,

Josef


* Re: [PATCH 01/23] btrfs: pass a flags argument to cow_file_range
  2023-06-28 15:31 ` [PATCH 01/23] btrfs: pass a flags argument to cow_file_range Christoph Hellwig
  2023-07-04  8:47   ` Johannes Thumshirn
@ 2023-07-20 11:22   ` David Sterba
  2023-07-20 13:25     ` Christoph Hellwig
  1 sibling, 1 reply; 40+ messages in thread
From: David Sterba @ 2023-07-20 11:22 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Chris Mason, Josef Bacik, David Sterba, Matthew Wilcox,
	linux-btrfs, linux-fsdevel

On Wed, Jun 28, 2023 at 05:31:22PM +0200, Christoph Hellwig wrote:
> The int used as bool unlock is not a very good way to describe the
> behavior, and the next patch will have to add another behavior modifier.
> Switch to pass a flag instead, with an initial CFR_KEEP_LOCKED flag that
> specifies the pages should always be kept locked.  This is the inverse
> of the old unlock argument for the reason that it requires a flag for
> the exceptional behavior.

Int is the wrong type, but I'm not sure that for two flags we should use
bit flags. Two bool parameters are IMHO fine, and "CFR" does not mean
anything; it's really only relevant for the function.


* Re: [PATCH 19/23] btrfs: don't redirty pages in compress_file_range
  2023-06-28 15:31 ` [PATCH 19/23] btrfs: don't redirty pages " Christoph Hellwig
@ 2023-07-20 11:41   ` David Sterba
  2023-07-20 13:26     ` Christoph Hellwig
  0 siblings, 1 reply; 40+ messages in thread
From: David Sterba @ 2023-07-20 11:41 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Chris Mason, Josef Bacik, David Sterba, Matthew Wilcox,
	linux-btrfs, linux-fsdevel

On Wed, Jun 28, 2023 at 05:31:40PM +0200, Christoph Hellwig wrote:
> compress_file_range needs to clear the dirty bit before handing off work
> to the compression worker threads to prevent processes coming in through
> mmap and changing the file contents while the compression is accessing
> the data (See commit 4adaa611020f ("Btrfs: fix race between mmap writes
> and compression")).
> 
> But when compress_file_range decides to not compress the data, it falls
> back to submit_uncompressed_range which uses extent_write_locked_range
> to write the uncompressed data.  extent_write_locked_range currently
> expects all pages to be marked dirty so that it can clear the dirty
> bit itself, and thus compress_file_range has to redirty the page range.
> 
> Redirtying the page range is rather inefficient and also pointless,
> so instead pass a pages_dirty parameter to extent_write_locked_range
> and skip the redirty game entirely.
> 
> Note that compress_file_range was even redirtying the locked_page twice
> given that extent_range_clear_dirty_for_io already redirties all pages
> in the range, which must include locked_page if there is one.

This is probably the only scary patch in the series. I don't see
anything obviously wrong; the redirtying logic added for the mmap
case is preserved.
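
For context, the invariant from commit 4adaa611020f that this patch
preserves, as a schematic (not actual btrfs code; compress_range stands
in for the compression worker entry point):

	/*
	 * Clearing the dirty bit also write-protects the page in the page
	 * tables, so a concurrent mmap write faults through page_mkwrite
	 * and redirties the page instead of silently changing the data
	 * mid-compression.
	 */
	clear_page_dirty_for_io(page);
	compress_range(inode, page);
	/* a redirtied page is simply picked up by a later writeback pass */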


* Re: btrfs compressed writeback cleanups
  2023-06-28 15:31 btrfs compressed writeback cleanups Christoph Hellwig
                   ` (23 preceding siblings ...)
  2023-07-14 13:49 ` btrfs compressed writeback cleanups Josef Bacik
@ 2023-07-20 11:47 ` David Sterba
  24 siblings, 0 replies; 40+ messages in thread
From: David Sterba @ 2023-07-20 11:47 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Chris Mason, Josef Bacik, David Sterba, Matthew Wilcox,
	linux-btrfs, linux-fsdevel

On Wed, Jun 28, 2023 at 05:31:21PM +0200, Christoph Hellwig wrote:
> Hi all,
> 
> this series is a prequel to another big set of writeback patches, and
> mostly deals with compressed writeback, which does a few things very
> different from other writeback code.  The biggest results are the removal
> of the magic redirtying when handing off to the worker thread, and a fix
> for out of zone active resources handling when using compressed writeback
> on a zoned file system, but mostly it just removes a whole bunch of code.
> 
> Note that the first 5 patches have been out on the btrfs list as
> standalone submissions for a while, but they are included for completeness
> so that this series can be easily applied to the btrfs misc-next tree.

Besides the minor comment for patch 1, the rest looks good; it rolls back
the evolution of the compression code to a good state. The flags-to-bool
change is trivial, so I'd rather do it locally and you don't need to
resend. Thanks.


* Re: [PATCH 01/23] btrfs: pass a flags argument to cow_file_range
  2023-07-20 11:22   ` David Sterba
@ 2023-07-20 13:25     ` Christoph Hellwig
  0 siblings, 0 replies; 40+ messages in thread
From: Christoph Hellwig @ 2023-07-20 13:25 UTC (permalink / raw)
  To: David Sterba
  Cc: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba,
	Matthew Wilcox, linux-btrfs, linux-fsdevel

On Thu, Jul 20, 2023 at 01:22:36PM +0200, David Sterba wrote:
> On Wed, Jun 28, 2023 at 05:31:22PM +0200, Christoph Hellwig wrote:
> > The int used as bool unlock is not a very good way to describe the
> > behavior, and the next patch will have to add another behavior modifier.
> > Switch to pass a flag instead, with an initial CFR_KEEP_LOCKED flag that
> > specifies the pages should always be kept locked.  This is the inverse
> > of the old unlock argument for the reason that it requires a flag for
> > the exceptional behavior.
> 
> Int is the wrong type but I'm not sure that for two flags we should use
> a bit flags. Two bool parameters are IMHO fine and "CFR" does not mean
> anything, it's really only relevant for the function.

CFR stands for cow_file_range.  The good news is that with my
(huge) stack of pending patches both flags will eventually go away.
The bad news is that for an intermediate change I actually need a third
one.


* Re: [PATCH 19/23] btrfs: don't redirty pages in compress_file_range
  2023-07-20 11:41   ` David Sterba
@ 2023-07-20 13:26     ` Christoph Hellwig
  0 siblings, 0 replies; 40+ messages in thread
From: Christoph Hellwig @ 2023-07-20 13:26 UTC (permalink / raw)
  To: David Sterba
  Cc: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba,
	Matthew Wilcox, linux-btrfs, linux-fsdevel

On Thu, Jul 20, 2023 at 01:41:11PM +0200, David Sterba wrote:
> > Note that compress_file_range was even redirtying the locked_page twice
> > given that extent_range_clear_dirty_for_io already redirties all pages
> > in the range, which must include locked_page if there is one.
> 
> This is probably the only scary patch in the series. I don't see
> anything obviously wrong; the redirtying logic added for the mmap
> case is preserved.

You'll be even more scared when I finally get to clearing the dirty
bit and setting the writeback bit in the proper place so that
the BTRFS_INODE_HAS_ASYNC_EXTENT flag can go away :)  But that's still
dozens of patches down the patch stack...


Thread overview: 40+ messages
2023-06-28 15:31 btrfs compressed writeback cleanups Christoph Hellwig
2023-06-28 15:31 ` [PATCH 01/23] btrfs: pass a flags argument to cow_file_range Christoph Hellwig
2023-07-04  8:47   ` Johannes Thumshirn
2023-07-20 11:22   ` David Sterba
2023-07-20 13:25     ` Christoph Hellwig
2023-06-28 15:31 ` [PATCH 02/23] btrfs: don't create inline extents in fallback_to_cow Christoph Hellwig
2023-06-28 15:31 ` [PATCH 03/23] btrfs: split page locking out of __process_pages_contig Christoph Hellwig
2023-06-28 15:31 ` [PATCH 04/23] btrfs: remove btrfs_writepage_endio_finish_ordered Christoph Hellwig
2023-07-04  8:50   ` Johannes Thumshirn
2023-06-28 15:31 ` [PATCH 05/23] btrfs: remove end_extent_writepage Christoph Hellwig
2023-06-28 15:31 ` [PATCH 06/23] btrfs: reduce debug spam from submit_compressed_extents Christoph Hellwig
2023-07-04  8:54   ` Johannes Thumshirn
2023-06-28 15:31 ` [PATCH 07/23] btrfs: remove the return value from submit_uncompressed_range Christoph Hellwig
2023-07-04  8:56   ` Johannes Thumshirn
2023-06-28 15:31 ` [PATCH 08/23] btrfs: remove the return value from extent_write_locked_range Christoph Hellwig
2023-07-04  8:56   ` Johannes Thumshirn
2023-06-28 15:31 ` [PATCH 09/23] btrfs: improve the delalloc_to_write calculation in writepage_delalloc Christoph Hellwig
2023-06-28 15:31 ` [PATCH 10/23] btrfs: reduce the number of arguments to btrfs_run_delalloc_range Christoph Hellwig
2023-07-04  9:00   ` Johannes Thumshirn
2023-06-28 15:31 ` [PATCH 11/23] btrfs: clean up the check for uncompressed ranges in submit_one_async_extent Christoph Hellwig
2023-07-04  9:07   ` Johannes Thumshirn
2023-06-28 15:31 ` [PATCH 12/23] btrfs: don't clear async_chunk->inode in async_cow_start Christoph Hellwig
2023-06-28 15:31 ` [PATCH 13/23] btrfs: merge async_cow_start and compress_file_range Christoph Hellwig
2023-07-04  9:09   ` Johannes Thumshirn
2023-06-28 15:31 ` [PATCH 14/23] btrfs: merge submit_compressed_extents and async_cow_submit Christoph Hellwig
2023-06-28 15:31 ` [PATCH 15/23] btrfs: streamline compress_file_range Christoph Hellwig
2023-06-28 15:31 ` [PATCH 16/23] btrfs: further simplify the compress or not logic in compress_file_range Christoph Hellwig
2023-07-14 13:47   ` Josef Bacik
2023-06-28 15:31 ` [PATCH 17/23] btrfs: use a separate label for the incompressible case " Christoph Hellwig
2023-06-28 15:31 ` [PATCH 18/23] btrfs: share the code to free the page array " Christoph Hellwig
2023-06-28 15:31 ` [PATCH 19/23] btrfs: don't redirty pages " Christoph Hellwig
2023-07-20 11:41   ` David Sterba
2023-07-20 13:26     ` Christoph Hellwig
2023-06-28 15:31 ` [PATCH 20/23] btrfs: refactor the zoned device handling in cow_file_range Christoph Hellwig
2023-06-28 15:31 ` [PATCH 21/23] btrfs: don't redirty locked_page in run_delalloc_zoned Christoph Hellwig
2023-06-28 15:31 ` [PATCH 22/23] btrfs: fix zoned handling in submit_uncompressed_range Christoph Hellwig
2023-06-28 15:31 ` [PATCH 23/23] mm: remove folio_account_redirty Christoph Hellwig
2023-06-28 15:45   ` Matthew Wilcox
2023-07-14 13:49 ` btrfs compressed writeback cleanups Josef Bacik
2023-07-20 11:47 ` David Sterba
