linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* btrfs I/O completion cleanup and single device I/O optimizations v3
@ 2022-08-06  8:03 Christoph Hellwig
  2022-08-06  8:03 ` [PATCH 01/11] btrfs: don't call bioset_integrity_create for btrfs_bioset Christoph Hellwig
                   ` (11 more replies)
  0 siblings, 12 replies; 32+ messages in thread
From: Christoph Hellwig @ 2022-08-06  8:03 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Nikolay Borisov, Johannes Thumshirn, linux-btrfs

Hi all,

this series cleans up the btrfs_bio API, most prominently by splitting
the end_io handler for the highlevel bio from the low-level bio
bi_end_io, which are really confusingly coupled in the current code.
Once that is done it then optimizes the bio submission to not allocate
a btrfs_io_context for I/Os that just go to a single device.

Changes since v2:
 - fix a small comment typo

Changes since v1:
 - add two previously submitted patches skipped from an earlier
   series
 - merged one of those patches with one from this series
 - split one of the patches in this series into three
 - improve various commit logs

Diffstat:
 compression.c    |   43 ++-----
 ctree.h          |    1 
 dev-replace.c    |    5 
 disk-io.c        |   16 +-
 extent-io-tree.h |    4 
 extent_io.c      |  117 +++----------------
 extent_io.h      |    3 
 inode.c          |   57 ++++-----
 raid56.c         |   45 +------
 raid56.h         |    4 
 scrub.c          |    7 -
 super.c          |    6 
 volumes.c        |  337 ++++++++++++++++++++++++++++++++++++++-----------------
 volumes.h        |   20 ++-
 14 files changed, 340 insertions(+), 325 deletions(-)

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 01/11] btrfs: don't call bioset_integrity_create for btrfs_bioset
  2022-08-06  8:03 btrfs I/O completion cleanup and single device I/O optimizations v3 Christoph Hellwig
@ 2022-08-06  8:03 ` Christoph Hellwig
  2022-08-08  5:06   ` Anand Jain
  2022-08-08 11:24   ` Anand Jain
  2022-08-06  8:03 ` [PATCH 02/11] btrfs: move btrfs_bio allocation to volumes.c Christoph Hellwig
                   ` (10 subsequent siblings)
  11 siblings, 2 replies; 32+ messages in thread
From: Christoph Hellwig @ 2022-08-06  8:03 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Nikolay Borisov, Johannes Thumshirn, linux-btrfs

btrfs never uses bio integrity data itself, so don't allocate
the integrity pools for btrfs_bioset.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Tested-by: Nikolay Borisov <nborisov@suse.com>
Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
 fs/btrfs/extent_io.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 6e8e936a8a1ef..ca8b79d991f5e 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -255,14 +255,8 @@ int __init extent_io_init(void)
 			BIOSET_NEED_BVECS))
 		goto free_buffer_cache;
 
-	if (bioset_integrity_create(&btrfs_bioset, BIO_POOL_SIZE))
-		goto free_bioset;
-
 	return 0;
 
-free_bioset:
-	bioset_exit(&btrfs_bioset);
-
 free_buffer_cache:
 	kmem_cache_destroy(extent_buffer_cache);
 	extent_buffer_cache = NULL;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 02/11] btrfs: move btrfs_bio allocation to volumes.c
  2022-08-06  8:03 btrfs I/O completion cleanup and single device I/O optimizations v3 Christoph Hellwig
  2022-08-06  8:03 ` [PATCH 01/11] btrfs: don't call bioset_integrity_create for btrfs_bioset Christoph Hellwig
@ 2022-08-06  8:03 ` Christoph Hellwig
  2022-08-10 14:42   ` Anand Jain
  2022-08-06  8:03 ` [PATCH 03/11] btrfs: pass the operation to btrfs_bio_alloc Christoph Hellwig
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 32+ messages in thread
From: Christoph Hellwig @ 2022-08-06  8:03 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Nikolay Borisov, Johannes Thumshirn, linux-btrfs

volumes.c is the place that implements the storage layer using the
btrfs_bio structure, so move the bio_set and allocation helpers there
as well.

To make up for the new initialization boilerplate, merge the two
init/exit helpers in extent_io.c into a single one.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Tested-by: Nikolay Borisov <nborisov@suse.com>
Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
 fs/btrfs/extent-io-tree.h |  4 +--
 fs/btrfs/extent_io.c      | 74 ++++-----------------------------------
 fs/btrfs/extent_io.h      |  3 --
 fs/btrfs/super.c          |  6 ++--
 fs/btrfs/volumes.c        | 58 ++++++++++++++++++++++++++++++
 fs/btrfs/volumes.h        |  3 ++
 6 files changed, 72 insertions(+), 76 deletions(-)

diff --git a/fs/btrfs/extent-io-tree.h b/fs/btrfs/extent-io-tree.h
index c3eb52dbe61cc..819763ace0aca 100644
--- a/fs/btrfs/extent-io-tree.h
+++ b/fs/btrfs/extent-io-tree.h
@@ -96,8 +96,8 @@ struct extent_state {
 #endif
 };
 
-int __init extent_state_cache_init(void);
-void __cold extent_state_cache_exit(void);
+int __init volumes_init(void);
+void __cold volumes_exit(void);
 
 void extent_io_tree_init(struct btrfs_fs_info *fs_info,
 			 struct extent_io_tree *tree, unsigned int owner,
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index ca8b79d991f5e..068208244d925 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -33,7 +33,6 @@
 
 static struct kmem_cache *extent_state_cache;
 static struct kmem_cache *extent_buffer_cache;
-static struct bio_set btrfs_bioset;
 
 static inline bool extent_state_in_tree(const struct extent_state *state)
 {
@@ -232,41 +231,23 @@ static void submit_write_bio(struct extent_page_data *epd, int ret)
 	}
 }
 
-int __init extent_state_cache_init(void)
+int __init extent_io_init(void)
 {
 	extent_state_cache = kmem_cache_create("btrfs_extent_state",
 			sizeof(struct extent_state), 0,
 			SLAB_MEM_SPREAD, NULL);
 	if (!extent_state_cache)
 		return -ENOMEM;
-	return 0;
-}
 
-int __init extent_io_init(void)
-{
 	extent_buffer_cache = kmem_cache_create("btrfs_extent_buffer",
 			sizeof(struct extent_buffer), 0,
 			SLAB_MEM_SPREAD, NULL);
-	if (!extent_buffer_cache)
+	if (!extent_buffer_cache) {
+		kmem_cache_destroy(extent_state_cache);
 		return -ENOMEM;
-
-	if (bioset_init(&btrfs_bioset, BIO_POOL_SIZE,
-			offsetof(struct btrfs_bio, bio),
-			BIOSET_NEED_BVECS))
-		goto free_buffer_cache;
+	}
 
 	return 0;
-
-free_buffer_cache:
-	kmem_cache_destroy(extent_buffer_cache);
-	extent_buffer_cache = NULL;
-	return -ENOMEM;
-}
-
-void __cold extent_state_cache_exit(void)
-{
-	btrfs_extent_state_leak_debug_check();
-	kmem_cache_destroy(extent_state_cache);
 }
 
 void __cold extent_io_exit(void)
@@ -277,7 +258,8 @@ void __cold extent_io_exit(void)
 	 */
 	rcu_barrier();
 	kmem_cache_destroy(extent_buffer_cache);
-	bioset_exit(&btrfs_bioset);
+	btrfs_extent_state_leak_debug_check();
+	kmem_cache_destroy(extent_state_cache);
 }
 
 /*
@@ -3156,50 +3138,6 @@ int btrfs_alloc_page_array(unsigned int nr_pages, struct page **page_array)
 	return 0;
 }
 
-/*
- * Initialize the members up to but not including 'bio'. Use after allocating a
- * new bio by bio_alloc_bioset as it does not initialize the bytes outside of
- * 'bio' because use of __GFP_ZERO is not supported.
- */
-static inline void btrfs_bio_init(struct btrfs_bio *bbio)
-{
-	memset(bbio, 0, offsetof(struct btrfs_bio, bio));
-}
-
-/*
- * Allocate a btrfs_io_bio, with @nr_iovecs as maximum number of iovecs.
- *
- * The bio allocation is backed by bioset and does not fail.
- */
-struct bio *btrfs_bio_alloc(unsigned int nr_iovecs)
-{
-	struct bio *bio;
-
-	ASSERT(0 < nr_iovecs && nr_iovecs <= BIO_MAX_VECS);
-	bio = bio_alloc_bioset(NULL, nr_iovecs, 0, GFP_NOFS, &btrfs_bioset);
-	btrfs_bio_init(btrfs_bio(bio));
-	return bio;
-}
-
-struct bio *btrfs_bio_clone_partial(struct bio *orig, u64 offset, u64 size)
-{
-	struct bio *bio;
-	struct btrfs_bio *bbio;
-
-	ASSERT(offset <= UINT_MAX && size <= UINT_MAX);
-
-	/* this will never fail when it's backed by a bioset */
-	bio = bio_alloc_clone(orig->bi_bdev, orig, GFP_NOFS, &btrfs_bioset);
-	ASSERT(bio);
-
-	bbio = btrfs_bio(bio);
-	btrfs_bio_init(bbio);
-
-	bio_trim(bio, offset >> 9, size >> 9);
-	bbio->iter = bio->bi_iter;
-	return bio;
-}
-
 /**
  * Attempt to add a page to bio
  *
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 4bc72a87b9a99..69a86ae6fd508 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -60,7 +60,6 @@ enum {
 struct btrfs_bio;
 struct btrfs_root;
 struct btrfs_inode;
-struct btrfs_io_bio;
 struct btrfs_fs_info;
 struct io_failure_record;
 struct extent_io_tree;
@@ -242,8 +241,6 @@ void extent_clear_unlock_delalloc(struct btrfs_inode *inode, u64 start, u64 end,
 				  u32 bits_to_clear, unsigned long page_ops);
 
 int btrfs_alloc_page_array(unsigned int nr_pages, struct page **page_array);
-struct bio *btrfs_bio_alloc(unsigned int nr_iovecs);
-struct bio *btrfs_bio_clone_partial(struct bio *orig, u64 offset, u64 size);
 
 void end_extent_writepage(struct page *page, int err, u64 start, u64 end);
 int btrfs_repair_eb_io_failure(const struct extent_buffer *eb, int mirror_num);
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 4c7089b1681b3..96b22ee7b12ce 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2664,7 +2664,7 @@ static int __init init_btrfs_fs(void)
 	if (err)
 		goto free_cachep;
 
-	err = extent_state_cache_init();
+	err = volumes_init();
 	if (err)
 		goto free_extent_io;
 
@@ -2723,7 +2723,7 @@ static int __init init_btrfs_fs(void)
 free_extent_map:
 	extent_map_exit();
 free_extent_state_cache:
-	extent_state_cache_exit();
+	volumes_exit();
 free_extent_io:
 	extent_io_exit();
 free_cachep:
@@ -2744,7 +2744,7 @@ static void __exit exit_btrfs_fs(void)
 	btrfs_prelim_ref_exit();
 	ordered_data_exit();
 	extent_map_exit();
-	extent_state_cache_exit();
+	volumes_exit();
 	extent_io_exit();
 	btrfs_interface_exit();
 	unregister_filesystem(&btrfs_fs_type);
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 8c64dda69404e..bb614bb890709 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -34,6 +34,8 @@
 #include "discard.h"
 #include "zoned.h"
 
+static struct bio_set btrfs_bioset;
+
 #define BTRFS_BLOCK_GROUP_STRIPE_MASK	(BTRFS_BLOCK_GROUP_RAID0 | \
 					 BTRFS_BLOCK_GROUP_RAID10 | \
 					 BTRFS_BLOCK_GROUP_RAID56_MASK)
@@ -6606,6 +6608,48 @@ int btrfs_map_sblock(struct btrfs_fs_info *fs_info, enum btrfs_map_op op,
 	return __btrfs_map_block(fs_info, op, logical, length, bioc_ret, 0, 1);
 }
 
+/*
+ * Initialize a btrfs_bio structure.  This skips the embedded bio itself as it
+ * is already initialized by the block layer.
+ */
+static inline void btrfs_bio_init(struct btrfs_bio *bbio)
+{
+	memset(bbio, 0, offsetof(struct btrfs_bio, bio));
+}
+
+/*
+ * Allocate a btrfs_bio structure.  The btrfs_bio is the main I/O container for
+ * btrfs, and is used for all I/O submitted through btrfs_submit_bio.
+ *
+ * Just like the underlying bio_alloc_bioset it will no fail as it is backed by
+ * a mempool.
+ */
+struct bio *btrfs_bio_alloc(unsigned int nr_vecs)
+{
+	struct bio *bio;
+
+	bio = bio_alloc_bioset(NULL, nr_vecs, 0, GFP_NOFS, &btrfs_bioset);
+	btrfs_bio_init(btrfs_bio(bio));
+	return bio;
+}
+
+struct bio *btrfs_bio_clone_partial(struct bio *orig, u64 offset, u64 size)
+{
+	struct bio *bio;
+	struct btrfs_bio *bbio;
+
+	ASSERT(offset <= UINT_MAX && size <= UINT_MAX);
+
+	bio = bio_alloc_clone(orig->bi_bdev, orig, GFP_NOFS, &btrfs_bioset);
+	bbio = btrfs_bio(bio);
+	btrfs_bio_init(bbio);
+
+	bio_trim(bio, offset >> 9, size >> 9);
+	bbio->iter = bio->bi_iter;
+	return bio;
+
+}
+
 static struct workqueue_struct *btrfs_end_io_wq(struct btrfs_io_context *bioc)
 {
 	if (bioc->orig_bio->bi_opf & REQ_META)
@@ -8283,3 +8327,17 @@ bool btrfs_repair_one_zone(struct btrfs_fs_info *fs_info, u64 logical)
 
 	return true;
 }
+
+int __init volumes_init(void)
+{
+	if (bioset_init(&btrfs_bioset, BIO_POOL_SIZE,
+			offsetof(struct btrfs_bio, bio),
+			BIOSET_NEED_BVECS))
+		return -ENOMEM;
+	return 0;
+}
+
+void __cold volumes_exit(void)
+{
+	bioset_exit(&btrfs_bioset);
+}
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 5639961b3626f..fb57b50fb60b1 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -393,6 +393,9 @@ static inline struct btrfs_bio *btrfs_bio(struct bio *bio)
 	return container_of(bio, struct btrfs_bio, bio);
 }
 
+struct bio *btrfs_bio_alloc(unsigned int nr_vecs);
+struct bio *btrfs_bio_clone_partial(struct bio *orig, u64 offset, u64 size);
+
 static inline void btrfs_bio_free_csum(struct btrfs_bio *bbio)
 {
 	if (bbio->csum != bbio->csum_inline) {
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 03/11] btrfs: pass the operation to btrfs_bio_alloc
  2022-08-06  8:03 btrfs I/O completion cleanup and single device I/O optimizations v3 Christoph Hellwig
  2022-08-06  8:03 ` [PATCH 01/11] btrfs: don't call bioset_integrity_create for btrfs_bioset Christoph Hellwig
  2022-08-06  8:03 ` [PATCH 02/11] btrfs: move btrfs_bio allocation to volumes.c Christoph Hellwig
@ 2022-08-06  8:03 ` Christoph Hellwig
  2022-08-08 11:20   ` Anand Jain
  2022-08-06  8:03 ` [PATCH 04/11] btrfs: don't take a bio_counter reference for cloned bios Christoph Hellwig
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 32+ messages in thread
From: Christoph Hellwig @ 2022-08-06  8:03 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Nikolay Borisov, Johannes Thumshirn, linux-btrfs

Pass the operation to btrfs_bio_alloc, matching what bio_alloc_bioset
set does.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Tested-by: Nikolay Borisov <nborisov@suse.com>
Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
 fs/btrfs/compression.c | 3 +--
 fs/btrfs/extent_io.c   | 6 ++----
 fs/btrfs/inode.c       | 3 +--
 fs/btrfs/volumes.c     | 4 ++--
 fs/btrfs/volumes.h     | 2 +-
 5 files changed, 7 insertions(+), 11 deletions(-)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index f3df9b9b43816..e124096eacddd 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -344,10 +344,9 @@ static struct bio *alloc_compressed_bio(struct compressed_bio *cb, u64 disk_byte
 	struct bio *bio;
 	int ret;
 
-	bio = btrfs_bio_alloc(BIO_MAX_VECS);
+	bio = btrfs_bio_alloc(BIO_MAX_VECS, opf);
 
 	bio->bi_iter.bi_sector = disk_bytenr >> SECTOR_SHIFT;
-	bio->bi_opf = opf;
 	bio->bi_private = cb;
 	bio->bi_end_io = endio_func;
 
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 068208244d925..fae8c1899226b 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2626,10 +2626,9 @@ int btrfs_repair_one_sector(struct inode *inode, struct btrfs_bio *failed_bbio,
 		return -EIO;
 	}
 
-	repair_bio = btrfs_bio_alloc(1);
+	repair_bio = btrfs_bio_alloc(1, REQ_OP_READ);
 	repair_bbio = btrfs_bio(repair_bio);
 	repair_bbio->file_offset = start;
-	repair_bio->bi_opf = REQ_OP_READ;
 	repair_bio->bi_end_io = failed_bio->bi_end_io;
 	repair_bio->bi_iter.bi_sector = failrec->logical >> 9;
 	repair_bio->bi_private = failed_bio->bi_private;
@@ -3266,7 +3265,7 @@ static int alloc_new_bio(struct btrfs_inode *inode,
 	struct bio *bio;
 	int ret;
 
-	bio = btrfs_bio_alloc(BIO_MAX_VECS);
+	bio = btrfs_bio_alloc(BIO_MAX_VECS, opf);
 	/*
 	 * For compressed page range, its disk_bytenr is always @disk_bytenr
 	 * passed in, no matter if we have added any range into previous bio.
@@ -3278,7 +3277,6 @@ static int alloc_new_bio(struct btrfs_inode *inode,
 	bio_ctrl->bio = bio;
 	bio_ctrl->compress_type = compress_type;
 	bio->bi_end_io = end_io_func;
-	bio->bi_opf = opf;
 	ret = calc_bio_boundaries(bio_ctrl, inode, file_offset);
 	if (ret < 0)
 		goto error;
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 8cdb173331c7c..dc2cf58095c2e 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -10489,12 +10489,11 @@ int btrfs_encoded_read_regular_fill_pages(struct btrfs_inode *inode,
 			size_t bytes = min_t(u64, remaining, PAGE_SIZE);
 
 			if (!bio) {
-				bio = btrfs_bio_alloc(BIO_MAX_VECS);
+				bio = btrfs_bio_alloc(BIO_MAX_VECS, REQ_OP_READ);
 				bio->bi_iter.bi_sector =
 					(disk_bytenr + cur) >> SECTOR_SHIFT;
 				bio->bi_end_io = btrfs_encoded_read_endio;
 				bio->bi_private = &priv;
-				bio->bi_opf = REQ_OP_READ;
 			}
 
 			if (!bytes ||
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index bb614bb890709..a73bac7f42624 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6624,11 +6624,11 @@ static inline void btrfs_bio_init(struct btrfs_bio *bbio)
  * Just like the underlying bio_alloc_bioset it will no fail as it is backed by
  * a mempool.
  */
-struct bio *btrfs_bio_alloc(unsigned int nr_vecs)
+struct bio *btrfs_bio_alloc(unsigned int nr_vecs, unsigned int opf)
 {
 	struct bio *bio;
 
-	bio = bio_alloc_bioset(NULL, nr_vecs, 0, GFP_NOFS, &btrfs_bioset);
+	bio = bio_alloc_bioset(NULL, nr_vecs, opf, GFP_NOFS, &btrfs_bioset);
 	btrfs_bio_init(btrfs_bio(bio));
 	return bio;
 }
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index fb57b50fb60b1..bd108c7ed1ac3 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -393,7 +393,7 @@ static inline struct btrfs_bio *btrfs_bio(struct bio *bio)
 	return container_of(bio, struct btrfs_bio, bio);
 }
 
-struct bio *btrfs_bio_alloc(unsigned int nr_vecs);
+struct bio *btrfs_bio_alloc(unsigned int nr_vecs, unsigned int opf);
 struct bio *btrfs_bio_clone_partial(struct bio *orig, u64 offset, u64 size);
 
 static inline void btrfs_bio_free_csum(struct btrfs_bio *bbio)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 04/11] btrfs: don't take a bio_counter reference for cloned bios
  2022-08-06  8:03 btrfs I/O completion cleanup and single device I/O optimizations v3 Christoph Hellwig
                   ` (2 preceding siblings ...)
  2022-08-06  8:03 ` [PATCH 03/11] btrfs: pass the operation to btrfs_bio_alloc Christoph Hellwig
@ 2022-08-06  8:03 ` Christoph Hellwig
  2022-08-18 11:30   ` Anand Jain
  2022-08-06  8:03 ` [PATCH 05/11] btrfs: remove bioc->stripes_pending Christoph Hellwig
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 32+ messages in thread
From: Christoph Hellwig @ 2022-08-06  8:03 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Nikolay Borisov, Johannes Thumshirn, linux-btrfs

Stop grabbing an extra bio_counter reference for each clone bio in a
mirrored write and instead just release the one original reference in
btrfs_end_bioc once all the bios for a single btfs_bio have completed
instead of at the end of btrfs_submit_bio once all bios have been
submitted.

This means the reference is now carried by the "upper" btrfs_bio only
instead of each lower bio.

Also remove the now unused btrfs_bio_counter_inc_noblocked helper.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
 fs/btrfs/ctree.h       | 1 -
 fs/btrfs/dev-replace.c | 5 -----
 fs/btrfs/volumes.c     | 6 ++----
 3 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 51c4804392637..e9a1d0016804a 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -4079,7 +4079,6 @@ static inline void btrfs_init_full_stripe_locks_tree(
 
 /* dev-replace.c */
 void btrfs_bio_counter_inc_blocked(struct btrfs_fs_info *fs_info);
-void btrfs_bio_counter_inc_noblocked(struct btrfs_fs_info *fs_info);
 void btrfs_bio_counter_sub(struct btrfs_fs_info *fs_info, s64 amount);
 
 static inline void btrfs_bio_counter_dec(struct btrfs_fs_info *fs_info)
diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index 488f2105c5d0d..c26b3196a3f17 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -1284,11 +1284,6 @@ int __pure btrfs_dev_replace_is_ongoing(struct btrfs_dev_replace *dev_replace)
 	return 1;
 }
 
-void btrfs_bio_counter_inc_noblocked(struct btrfs_fs_info *fs_info)
-{
-	percpu_counter_inc(&fs_info->dev_replace.bio_counter);
-}
-
 void btrfs_bio_counter_sub(struct btrfs_fs_info *fs_info, s64 amount)
 {
 	percpu_counter_sub(&fs_info->dev_replace.bio_counter, amount);
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index a73bac7f42624..014df2e64e67b 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6670,6 +6670,8 @@ static void btrfs_end_bioc(struct btrfs_io_context *bioc, bool async)
 	struct bio *orig_bio = bioc->orig_bio;
 	struct btrfs_bio *bbio = btrfs_bio(orig_bio);
 
+	btrfs_bio_counter_dec(bioc->fs_info);
+
 	bbio->mirror_num = bioc->mirror_num;
 	orig_bio->bi_private = bioc->private;
 	orig_bio->bi_end_io = bioc->end_io;
@@ -6717,7 +6719,6 @@ static void btrfs_end_bio(struct bio *bio)
 	if (bio != bioc->orig_bio)
 		bio_put(bio);
 
-	btrfs_bio_counter_dec(bioc->fs_info);
 	if (atomic_dec_and_test(&bioc->stripes_pending))
 		btrfs_end_bioc(bioc, true);
 }
@@ -6772,8 +6773,6 @@ static void submit_stripe_bio(struct btrfs_io_context *bioc,
 		(unsigned long)dev->bdev->bd_dev, rcu_str_deref(dev->name),
 		dev->devid, bio->bi_iter.bi_size);
 
-	btrfs_bio_counter_inc_noblocked(fs_info);
-
 	btrfsic_check_bio(bio);
 	submit_bio(bio);
 }
@@ -6825,7 +6824,6 @@ void btrfs_submit_bio(struct btrfs_fs_info *fs_info, struct bio *bio, int mirror
 
 		submit_stripe_bio(bioc, bio, dev_nr, should_clone);
 	}
-	btrfs_bio_counter_dec(fs_info);
 }
 
 static bool dev_args_match_fs_devices(const struct btrfs_dev_lookup_args *args,
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 05/11] btrfs: remove bioc->stripes_pending
  2022-08-06  8:03 btrfs I/O completion cleanup and single device I/O optimizations v3 Christoph Hellwig
                   ` (3 preceding siblings ...)
  2022-08-06  8:03 ` [PATCH 04/11] btrfs: don't take a bio_counter reference for cloned bios Christoph Hellwig
@ 2022-08-06  8:03 ` Christoph Hellwig
  2022-08-18 11:33   ` Anand Jain
  2022-08-18 23:37   ` Anand Jain
  2022-08-06  8:03 ` [PATCH 06/11] btrfs: properly abstract the parity raid bio handling Christoph Hellwig
                   ` (6 subsequent siblings)
  11 siblings, 2 replies; 32+ messages in thread
From: Christoph Hellwig @ 2022-08-06  8:03 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Nikolay Borisov, Johannes Thumshirn, linux-btrfs

The stripes_pending in the btrfs_io_context counts number of inflight
low-level bios for an upper btrfs_bio.  For reads this is generally
one as reads are never cloned, while for writes we can trivially use
the bio remaining mechanisms that is used for chained bios.

To be able to make use of that mechanism, split out a separate trivial
end_io handler for the cloned bios that does a minimal amount of error
tracking and which then calls bio_endio on the original bio to transfer
control to that, with the remaining counter making sure it is completed
last.  This then allows to merge btrfs_end_bioc into the original bio
bi_end_io handler.  To make this all work all error handling needs to
happen through the bi_end_io handler, which requires a small amount
of reshuffling in submit_stripe_bio so that the bio is cloned already
by the time the suitability of the device is checked.

This reduces the size of the btrfs_io_context and prepares splitting
the btrfs_bio at the stripe boundary.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
 fs/btrfs/volumes.c | 94 ++++++++++++++++++++++++----------------------
 fs/btrfs/volumes.h |  1 -
 2 files changed, 50 insertions(+), 45 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 014df2e64e67b..8775f2a635919 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -5892,7 +5892,6 @@ static struct btrfs_io_context *alloc_btrfs_io_context(struct btrfs_fs_info *fs_
 		sizeof(u64) * (total_stripes),
 		GFP_NOFS|__GFP_NOFAIL);
 
-	atomic_set(&bioc->error, 0);
 	refcount_set(&bioc->refs, 1);
 
 	bioc->fs_info = fs_info;
@@ -6647,7 +6646,21 @@ struct bio *btrfs_bio_clone_partial(struct bio *orig, u64 offset, u64 size)
 	bio_trim(bio, offset >> 9, size >> 9);
 	bbio->iter = bio->bi_iter;
 	return bio;
+}
+
+static void btrfs_log_dev_io_error(struct bio *bio, struct btrfs_device *dev)
+{
+	if (!dev || !dev->bdev)
+		return;
+	if (bio->bi_status != BLK_STS_IOERR && bio->bi_status != BLK_STS_TARGET)
+		return;
 
+	if (btrfs_op(bio) == BTRFS_MAP_WRITE)
+		btrfs_dev_stat_inc_and_print(dev, BTRFS_DEV_STAT_WRITE_ERRS);
+	if (!(bio->bi_opf & REQ_RAHEAD))
+		btrfs_dev_stat_inc_and_print(dev, BTRFS_DEV_STAT_READ_ERRS);
+	if (bio->bi_opf & REQ_PREFLUSH)
+		btrfs_dev_stat_inc_and_print(dev, BTRFS_DEV_STAT_FLUSH_ERRS);
 }
 
 static struct workqueue_struct *btrfs_end_io_wq(struct btrfs_io_context *bioc)
@@ -6665,62 +6678,54 @@ static void btrfs_end_bio_work(struct work_struct *work)
 	bio_endio(&bbio->bio);
 }
 
-static void btrfs_end_bioc(struct btrfs_io_context *bioc, bool async)
+static void btrfs_end_bio(struct bio *bio)
 {
-	struct bio *orig_bio = bioc->orig_bio;
-	struct btrfs_bio *bbio = btrfs_bio(orig_bio);
+	struct btrfs_io_stripe *stripe = bio->bi_private;
+	struct btrfs_io_context *bioc = stripe->bioc;
+	struct btrfs_bio *bbio = btrfs_bio(bio);
 
 	btrfs_bio_counter_dec(bioc->fs_info);
 
+	if (bio->bi_status) {
+		atomic_inc(&bioc->error);
+		btrfs_log_dev_io_error(bio, stripe->dev);
+	}
+
 	bbio->mirror_num = bioc->mirror_num;
-	orig_bio->bi_private = bioc->private;
-	orig_bio->bi_end_io = bioc->end_io;
+	bio->bi_end_io = bioc->end_io;
+	bio->bi_private = bioc->private;
 
 	/*
 	 * Only send an error to the higher layers if it is beyond the tolerance
 	 * threshold.
 	 */
 	if (atomic_read(&bioc->error) > bioc->max_errors)
-		orig_bio->bi_status = BLK_STS_IOERR;
+		bio->bi_status = BLK_STS_IOERR;
 	else
-		orig_bio->bi_status = BLK_STS_OK;
+		bio->bi_status = BLK_STS_OK;
 
-	if (btrfs_op(orig_bio) == BTRFS_MAP_READ && async) {
+	if (btrfs_op(bio) == BTRFS_MAP_READ) {
 		INIT_WORK(&bbio->end_io_work, btrfs_end_bio_work);
 		queue_work(btrfs_end_io_wq(bioc), &bbio->end_io_work);
 	} else {
-		bio_endio(orig_bio);
+		bio_endio(bio);
 	}
 
 	btrfs_put_bioc(bioc);
 }
 
-static void btrfs_end_bio(struct bio *bio)
+static void btrfs_clone_write_end_io(struct bio *bio)
 {
 	struct btrfs_io_stripe *stripe = bio->bi_private;
-	struct btrfs_io_context *bioc = stripe->bioc;
 
 	if (bio->bi_status) {
-		atomic_inc(&bioc->error);
-		if (bio->bi_status == BLK_STS_IOERR ||
-		    bio->bi_status == BLK_STS_TARGET) {
-			if (btrfs_op(bio) == BTRFS_MAP_WRITE)
-				btrfs_dev_stat_inc_and_print(stripe->dev,
-						BTRFS_DEV_STAT_WRITE_ERRS);
-			else if (!(bio->bi_opf & REQ_RAHEAD))
-				btrfs_dev_stat_inc_and_print(stripe->dev,
-						BTRFS_DEV_STAT_READ_ERRS);
-			if (bio->bi_opf & REQ_PREFLUSH)
-				btrfs_dev_stat_inc_and_print(stripe->dev,
-						BTRFS_DEV_STAT_FLUSH_ERRS);
-		}
+		atomic_inc(&stripe->bioc->error);
+		btrfs_log_dev_io_error(bio, stripe->dev);
 	}
 
-	if (bio != bioc->orig_bio)
-		bio_put(bio);
-
-	if (atomic_dec_and_test(&bioc->stripes_pending))
-		btrfs_end_bioc(bioc, true);
+	/* Pass on control to the original bio this one was cloned from */
+	bio_endio(stripe->bioc->orig_bio);
+	bio_put(bio);
 }
 
 static void submit_stripe_bio(struct btrfs_io_context *bioc,
@@ -6731,28 +6736,30 @@ static void submit_stripe_bio(struct btrfs_io_context *bioc,
 	u64 physical = bioc->stripes[dev_nr].physical;
 	struct bio *bio;
 
-	if (!dev || !dev->bdev ||
-	    test_bit(BTRFS_DEV_STATE_MISSING, &dev->dev_state) ||
-	    (btrfs_op(orig_bio) == BTRFS_MAP_WRITE &&
-	     !test_bit(BTRFS_DEV_STATE_WRITEABLE, &dev->dev_state))) {
-		atomic_inc(&bioc->error);
-		if (atomic_dec_and_test(&bioc->stripes_pending))
-			btrfs_end_bioc(bioc, false);
-		return;
-	}
-
 	if (clone) {
-		bio = bio_alloc_clone(dev->bdev, orig_bio, GFP_NOFS, &fs_bio_set);
+		bio = bio_alloc_clone(NULL, orig_bio, GFP_NOFS, &fs_bio_set);
+		bio_inc_remaining(orig_bio);
+		bio->bi_end_io = btrfs_clone_write_end_io;
 	} else {
 		bio = orig_bio;
-		bio_set_dev(bio, dev->bdev);
 		btrfs_bio(bio)->device = dev;
+		bio->bi_end_io = btrfs_end_bio;
 	}
 
 	bioc->stripes[dev_nr].bioc = bioc;
 	bio->bi_private = &bioc->stripes[dev_nr];
-	bio->bi_end_io = btrfs_end_bio;
 	bio->bi_iter.bi_sector = physical >> 9;
+
+	if (!dev || !dev->bdev ||
+	    test_bit(BTRFS_DEV_STATE_MISSING, &dev->dev_state) ||
+	    (btrfs_op(bio) == BTRFS_MAP_WRITE &&
+	     !test_bit(BTRFS_DEV_STATE_WRITEABLE, &dev->dev_state))) {
+		bio_io_error(bio);
+		return;
+	}
+
+	bio_set_dev(bio, dev->bdev);
+
 	/*
 	 * For zone append writing, bi_sector must point the beginning of the
 	 * zone
@@ -6801,7 +6808,6 @@ void btrfs_submit_bio(struct btrfs_fs_info *fs_info, struct bio *bio, int mirror
 	bioc->orig_bio = bio;
 	bioc->private = bio->bi_private;
 	bioc->end_io = bio->bi_end_io;
-	atomic_set(&bioc->stripes_pending, total_devs);
 
 	if ((bioc->map_type & BTRFS_BLOCK_GROUP_RAID56_MASK) &&
 	    ((btrfs_op(bio) == BTRFS_MAP_WRITE) || (mirror_num > 1))) {
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index bd108c7ed1ac3..9e984a9922c59 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -454,7 +454,6 @@ struct btrfs_discard_stripe {
  */
 struct btrfs_io_context {
 	refcount_t refs;
-	atomic_t stripes_pending;
 	struct btrfs_fs_info *fs_info;
 	u64 map_type; /* get from map_lookup->type */
 	bio_end_io_t *end_io;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 06/11] btrfs: properly abstract the parity raid bio handling
  2022-08-06  8:03 btrfs I/O completion cleanup and single device I/O optimizations v3 Christoph Hellwig
                   ` (4 preceding siblings ...)
  2022-08-06  8:03 ` [PATCH 05/11] btrfs: remove bioc->stripes_pending Christoph Hellwig
@ 2022-08-06  8:03 ` Christoph Hellwig
  2022-08-06  8:03 ` [PATCH 07/11] btrfs: give struct btrfs_bio a real end_io handler Christoph Hellwig
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 32+ messages in thread
From: Christoph Hellwig @ 2022-08-06  8:03 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Nikolay Borisov, Johannes Thumshirn, linux-btrfs

The parity raid write/recover functionality is currently not very well
abstracted from the bio submission and completion handling in volumes.c:

 - the raid56 code directly completes the original btrfs_bio fed into
   btrfs_submit_bio instead of dispatching back to volumes.c
 - the raid56 code consumes the bioc and bio_counter references taken
   by volumes.c, which also leads to ugly special casing of the calls
   from the scrub code into the raid56 code

To fix this up supply a bi_end_io handler that calls back into the
volumes.c machinery, which then puts the bioc, decrements the bio_counter
and completes the original bio, and updates the scrub code to also
take ownership of the bioc and bio_counter in all cases.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Tested-by: Nikolay Borisov <nborisov@suse.com>
Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
 fs/btrfs/raid56.c  | 45 +++++++--------------------------------------
 fs/btrfs/raid56.h  |  4 +---
 fs/btrfs/scrub.c   |  7 +++++--
 fs/btrfs/volumes.c | 18 +++++++++++++++++-
 4 files changed, 30 insertions(+), 44 deletions(-)

diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
index 1afe32d5ab017..d767814653249 100644
--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -275,7 +275,6 @@ static void merge_rbio(struct btrfs_raid_bio *dest,
 	/* Also inherit the bitmaps from @victim. */
 	bitmap_or(&dest->dbitmap, &victim->dbitmap, &dest->dbitmap,
 		  dest->stripe_nsectors);
-	dest->generic_bio_cnt += victim->generic_bio_cnt;
 	bio_list_init(&victim->bio_list);
 }
 
@@ -814,8 +813,6 @@ static void rbio_orig_end_io(struct btrfs_raid_bio *rbio, blk_status_t err)
 	struct bio *cur = bio_list_get(&rbio->bio_list);
 	struct bio *extra;
 
-	if (rbio->generic_bio_cnt)
-		btrfs_bio_counter_sub(rbio->bioc->fs_info, rbio->generic_bio_cnt);
 	/*
 	 * Clear the data bitmap, as the rbio may be cached for later usage.
 	 * do this before before unlock_stripe() so there will be no new bio
@@ -946,6 +943,7 @@ static struct btrfs_raid_bio *alloc_rbio(struct btrfs_fs_info *fs_info,
 	spin_lock_init(&rbio->bio_list_lock);
 	INIT_LIST_HEAD(&rbio->stripe_cache);
 	INIT_LIST_HEAD(&rbio->hash_list);
+	btrfs_get_bioc(bioc);
 	rbio->bioc = bioc;
 	rbio->nr_pages = num_pages;
 	rbio->nr_sectors = num_sectors;
@@ -1813,15 +1811,12 @@ void raid56_parity_write(struct bio *bio, struct btrfs_io_context *bioc)
 
 	rbio = alloc_rbio(fs_info, bioc);
 	if (IS_ERR(rbio)) {
-		btrfs_put_bioc(bioc);
 		ret = PTR_ERR(rbio);
-		goto out_dec_counter;
+		goto fail;
 	}
 	rbio->operation = BTRFS_RBIO_WRITE;
 	rbio_add_bio(rbio, bio);
 
-	rbio->generic_bio_cnt = 1;
-
 	/*
 	 * don't plug on full rbios, just get them out the door
 	 * as quickly as we can
@@ -1829,7 +1824,7 @@ void raid56_parity_write(struct bio *bio, struct btrfs_io_context *bioc)
 	if (rbio_is_full(rbio)) {
 		ret = full_stripe_write(rbio);
 		if (ret)
-			goto out_dec_counter;
+			goto fail;
 		return;
 	}
 
@@ -1844,13 +1839,12 @@ void raid56_parity_write(struct bio *bio, struct btrfs_io_context *bioc)
 	} else {
 		ret = __raid56_parity_write(rbio);
 		if (ret)
-			goto out_dec_counter;
+			goto fail;
 	}
 
 	return;
 
-out_dec_counter:
-	btrfs_bio_counter_dec(fs_info);
+fail:
 	bio->bi_status = errno_to_blk_status(ret);
 	bio_endio(bio);
 }
@@ -2198,18 +2192,11 @@ static int __raid56_parity_recover(struct btrfs_raid_bio *rbio)
  * of the drive.
  */
 void raid56_parity_recover(struct bio *bio, struct btrfs_io_context *bioc,
-			   int mirror_num, bool generic_io)
+			   int mirror_num)
 {
 	struct btrfs_fs_info *fs_info = bioc->fs_info;
 	struct btrfs_raid_bio *rbio;
 
-	if (generic_io) {
-		ASSERT(bioc->mirror_num == mirror_num);
-		btrfs_bio(bio)->mirror_num = mirror_num;
-	} else {
-		btrfs_get_bioc(bioc);
-	}
-
 	rbio = alloc_rbio(fs_info, bioc);
 	if (IS_ERR(rbio)) {
 		bio->bi_status = errno_to_blk_status(PTR_ERR(rbio));
@@ -2225,14 +2212,11 @@ void raid56_parity_recover(struct bio *bio, struct btrfs_io_context *bioc,
 "%s could not find the bad stripe in raid56 so that we cannot recover any more (bio has logical %llu len %llu, bioc has map_type %llu)",
 			   __func__, bio->bi_iter.bi_sector << 9,
 			   (u64)bio->bi_iter.bi_size, bioc->map_type);
-		kfree(rbio);
+		__free_raid_bio(rbio);
 		bio->bi_status = BLK_STS_IOERR;
 		goto out_end_bio;
 	}
 
-	if (generic_io)
-		rbio->generic_bio_cnt = 1;
-
 	/*
 	 * Loop retry:
 	 * for 'mirror == 2', reconstruct from all other stripes.
@@ -2261,8 +2245,6 @@ void raid56_parity_recover(struct bio *bio, struct btrfs_io_context *bioc,
 	return;
 
 out_end_bio:
-	btrfs_bio_counter_dec(fs_info);
-	btrfs_put_bioc(bioc);
 	bio_endio(bio);
 }
 
@@ -2326,13 +2308,6 @@ struct btrfs_raid_bio *raid56_parity_alloc_scrub_rbio(struct bio *bio,
 	ASSERT(i < rbio->real_stripes);
 
 	bitmap_copy(&rbio->dbitmap, dbitmap, stripe_nsectors);
-
-	/*
-	 * We have already increased bio_counter when getting bioc, record it
-	 * so we can free it at rbio_orig_end_io().
-	 */
-	rbio->generic_bio_cnt = 1;
-
 	return rbio;
 }
 
@@ -2772,12 +2747,6 @@ raid56_alloc_missing_rbio(struct bio *bio, struct btrfs_io_context *bioc)
 		return NULL;
 	}
 
-	/*
-	 * When we get bioc, we have already increased bio_counter, record it
-	 * so we can free it at rbio_orig_end_io()
-	 */
-	rbio->generic_bio_cnt = 1;
-
 	return rbio;
 }
 
diff --git a/fs/btrfs/raid56.h b/fs/btrfs/raid56.h
index 6f48f9e4c8694..91d5c0adad151 100644
--- a/fs/btrfs/raid56.h
+++ b/fs/btrfs/raid56.h
@@ -89,8 +89,6 @@ struct btrfs_raid_bio {
 	 */
 	int bio_list_bytes;
 
-	int generic_bio_cnt;
-
 	refcount_t refs;
 
 	atomic_t stripes_pending;
@@ -166,7 +164,7 @@ static inline int nr_data_stripes(const struct map_lookup *map)
 struct btrfs_device;
 
 void raid56_parity_recover(struct bio *bio, struct btrfs_io_context *bioc,
-			   int mirror_num, bool generic_io);
+			   int mirror_num);
 void raid56_parity_write(struct bio *bio, struct btrfs_io_context *bioc);
 
 void raid56_add_scrub_pages(struct btrfs_raid_bio *rbio, struct page *page,
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index d05025034b0aa..e1aad8eaab64c 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -1389,7 +1389,7 @@ static int scrub_submit_raid56_bio_wait(struct btrfs_fs_info *fs_info,
 	bio->bi_private = &done;
 	bio->bi_end_io = scrub_bio_wait_endio;
 	raid56_parity_recover(bio, sector->recover->bioc,
-			      sector->sblock->sectors[0]->mirror_num, false);
+			      sector->sblock->sectors[0]->mirror_num);
 
 	wait_for_completion_io(&done);
 	return blk_status_to_errno(bio->bi_status);
@@ -2093,6 +2093,7 @@ static void scrub_missing_raid56_end_io(struct bio *bio)
 	struct scrub_block *sblock = bio->bi_private;
 	struct btrfs_fs_info *fs_info = sblock->sctx->fs_info;
 
+	btrfs_bio_counter_dec(fs_info);
 	if (bio->bi_status)
 		sblock->no_io_error_seen = 0;
 
@@ -2195,6 +2196,7 @@ static void scrub_missing_raid56_pages(struct scrub_block *sblock)
 	scrub_block_get(sblock);
 	scrub_pending_bio_inc(sctx);
 	raid56_submit_missing_rbio(rbio);
+	btrfs_put_bioc(bioc);
 	return;
 
 rbio_out:
@@ -2765,6 +2767,7 @@ static void scrub_parity_bio_endio_worker(struct work_struct *work)
 						    work);
 	struct scrub_ctx *sctx = sparity->sctx;
 
+	btrfs_bio_counter_dec(sctx->fs_info);
 	scrub_free_parity(sparity);
 	scrub_pending_bio_dec(sctx);
 }
@@ -2815,6 +2818,7 @@ static void scrub_parity_check_and_repair(struct scrub_parity *sparity)
 					      sparity->scrub_dev,
 					      &sparity->dbitmap,
 					      sparity->nsectors);
+	btrfs_put_bioc(bioc);
 	if (!rbio)
 		goto rbio_out;
 
@@ -2826,7 +2830,6 @@ static void scrub_parity_check_and_repair(struct scrub_parity *sparity)
 	bio_put(bio);
 bioc_out:
 	btrfs_bio_counter_dec(fs_info);
-	btrfs_put_bioc(bioc);
 	bitmap_or(&sparity->ebitmap, &sparity->ebitmap, &sparity->dbitmap,
 		  sparity->nsectors);
 	spin_lock(&sctx->stat_lock);
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 8775f2a635919..0b2eea9ccf094 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6678,6 +6678,20 @@ static void btrfs_end_bio_work(struct work_struct *work)
 	bio_endio(&bbio->bio);
 }
 
+static void btrfs_raid56_end_io(struct bio *bio)
+{
+	struct btrfs_io_context *bioc = bio->bi_private;
+	struct btrfs_bio *bbio = btrfs_bio(bio);
+
+	btrfs_bio_counter_dec(bioc->fs_info);
+	bbio->mirror_num = bioc->mirror_num;
+	bio->bi_end_io = bioc->end_io;
+	bio->bi_private = bioc->private;
+	bio->bi_end_io(bio);
+
+	btrfs_put_bioc(bioc);
+}
+
 static void btrfs_end_bio(struct bio *bio)
 {
 	struct btrfs_io_stripe *stripe = bio->bi_private;
@@ -6811,10 +6825,12 @@ void btrfs_submit_bio(struct btrfs_fs_info *fs_info, struct bio *bio, int mirror
 
 	if ((bioc->map_type & BTRFS_BLOCK_GROUP_RAID56_MASK) &&
 	    ((btrfs_op(bio) == BTRFS_MAP_WRITE) || (mirror_num > 1))) {
+		bio->bi_private = bioc;
+		bio->bi_end_io = btrfs_raid56_end_io;
 		if (btrfs_op(bio) == BTRFS_MAP_WRITE)
 			raid56_parity_write(bio, bioc);
 		else
-			raid56_parity_recover(bio, bioc, mirror_num, true);
+			raid56_parity_recover(bio, bioc, mirror_num);
 		return;
 	}
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 07/11] btrfs: give struct btrfs_bio a real end_io handler
  2022-08-06  8:03 btrfs I/O completion cleanup and single device I/O optimizations v3 Christoph Hellwig
                   ` (5 preceding siblings ...)
  2022-08-06  8:03 ` [PATCH 06/11] btrfs: properly abstract the parity raid bio handling Christoph Hellwig
@ 2022-08-06  8:03 ` Christoph Hellwig
  2022-08-19  6:20   ` Anand Jain
  2022-08-06  8:03 ` [PATCH 08/11] btrfs: split submit_stripe_bio Christoph Hellwig
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 32+ messages in thread
From: Christoph Hellwig @ 2022-08-06  8:03 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Nikolay Borisov, Johannes Thumshirn, linux-btrfs

Currently btrfs_bio end I/O handling is a bit of a mess.  The bi_end_io
handler and bi_private pointer of the embedded struct bio are both used
to handle the completion of the high-level btrfs_bio and for the I/O
completion for the low-level device that the embedded bio ends up being
sent to.  To support this bi_end_io and bi_private are saved into the
btrfs_io_context structure and then restored after the bio sent to the
underlying device has completed the actual I/O.

Untangle this by adding an end I/O handler and private data to struct
btrfs_bio for the highlevel btrfs_bio based completions, and leave the
actual bio bi_end_io handler and bi_private pointer entirely to the
low-level device I/O.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Tested-by: Nikolay Borisov <nborisov@suse.com>
Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
 fs/btrfs/compression.c | 42 +++++++++++++------------------
 fs/btrfs/disk-io.c     | 16 ++++++------
 fs/btrfs/extent_io.c   | 35 +++++++++++++-------------
 fs/btrfs/inode.c       | 56 +++++++++++++++++++-----------------------
 fs/btrfs/volumes.c     | 30 +++++++++++-----------
 fs/btrfs/volumes.h     | 20 ++++++++++++---
 6 files changed, 96 insertions(+), 103 deletions(-)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index e124096eacddd..a2b43d137bd6b 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -152,9 +152,7 @@ static void finish_compressed_bio_read(struct compressed_bio *cb)
 	}
 
 	/* Do io completion on the original bio */
-	if (cb->status != BLK_STS_OK)
-		cb->orig_bio->bi_status = cb->status;
-	bio_endio(cb->orig_bio);
+	btrfs_bio_end_io(btrfs_bio(cb->orig_bio), cb->status);
 
 	/* Finally free the cb struct */
 	kfree(cb->compressed_pages);
@@ -166,16 +164,15 @@ static void finish_compressed_bio_read(struct compressed_bio *cb)
  * before decompressing it into the original bio and freeing the uncompressed
  * pages.
  */
-static void end_compressed_bio_read(struct bio *bio)
+static void end_compressed_bio_read(struct btrfs_bio *bbio)
 {
-	struct compressed_bio *cb = bio->bi_private;
+	struct compressed_bio *cb = bbio->private;
 	struct inode *inode = cb->inode;
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
 	struct btrfs_inode *bi = BTRFS_I(inode);
 	bool csum = !(bi->flags & BTRFS_INODE_NODATASUM) &&
 		    !test_bit(BTRFS_FS_STATE_NO_CSUMS, &fs_info->fs_state);
-	blk_status_t status = bio->bi_status;
-	struct btrfs_bio *bbio = btrfs_bio(bio);
+	blk_status_t status = bbio->bio.bi_status;
 	struct bvec_iter iter;
 	struct bio_vec bv;
 	u32 offset;
@@ -209,7 +206,7 @@ static void end_compressed_bio_read(struct bio *bio)
 	if (refcount_dec_and_test(&cb->pending_ios))
 		finish_compressed_bio_read(cb);
 	btrfs_bio_free_csum(bbio);
-	bio_put(bio);
+	bio_put(&bbio->bio);
 }
 
 /*
@@ -301,20 +298,20 @@ static void btrfs_finish_compressed_write_work(struct work_struct *work)
  * This also calls the writeback end hooks for the file pages so that metadata
  * and checksums can be updated in the file.
  */
-static void end_compressed_bio_write(struct bio *bio)
+static void end_compressed_bio_write(struct btrfs_bio *bbio)
 {
-	struct compressed_bio *cb = bio->bi_private;
+	struct compressed_bio *cb = bbio->private;
 
-	if (bio->bi_status)
-		cb->status = bio->bi_status;
+	if (bbio->bio.bi_status)
+		cb->status = bbio->bio.bi_status;
 
 	if (refcount_dec_and_test(&cb->pending_ios)) {
 		struct btrfs_fs_info *fs_info = btrfs_sb(cb->inode->i_sb);
 
-		btrfs_record_physical_zoned(cb->inode, cb->start, bio);
+		btrfs_record_physical_zoned(cb->inode, cb->start, &bbio->bio);
 		queue_work(fs_info->compressed_write_workers, &cb->write_end_work);
 	}
-	bio_put(bio);
+	bio_put(&bbio->bio);
 }
 
 /*
@@ -335,7 +332,8 @@ static void end_compressed_bio_write(struct bio *bio)
 
 
 static struct bio *alloc_compressed_bio(struct compressed_bio *cb, u64 disk_bytenr,
-					unsigned int opf, bio_end_io_t endio_func,
+					unsigned int opf,
+					btrfs_bio_end_io_t endio_func,
 					u64 *next_stripe_start)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(cb->inode->i_sb);
@@ -344,11 +342,8 @@ static struct bio *alloc_compressed_bio(struct compressed_bio *cb, u64 disk_byte
 	struct bio *bio;
 	int ret;
 
-	bio = btrfs_bio_alloc(BIO_MAX_VECS, opf);
-
+	bio = btrfs_bio_alloc(BIO_MAX_VECS, opf, endio_func, cb);
 	bio->bi_iter.bi_sector = disk_bytenr >> SECTOR_SHIFT;
-	bio->bi_private = cb;
-	bio->bi_end_io = endio_func;
 
 	em = btrfs_get_chunk_map(fs_info, disk_bytenr, fs_info->sectorsize);
 	if (IS_ERR(em)) {
@@ -477,8 +472,7 @@ blk_status_t btrfs_submit_compressed_write(struct btrfs_inode *inode, u64 start,
 			if (!skip_sum) {
 				ret = btrfs_csum_one_bio(inode, bio, start, true);
 				if (ret) {
-					bio->bi_status = ret;
-					bio_endio(bio);
+					btrfs_bio_end_io(btrfs_bio(bio), ret);
 					break;
 				}
 			}
@@ -798,8 +792,7 @@ void btrfs_submit_compressed_read(struct inode *inode, struct bio *bio,
 
 			ret = btrfs_lookup_bio_sums(inode, comp_bio, NULL);
 			if (ret) {
-				comp_bio->bi_status = ret;
-				bio_endio(comp_bio);
+				btrfs_bio_end_io(btrfs_bio(comp_bio), ret);
 				break;
 			}
 
@@ -825,8 +818,7 @@ void btrfs_submit_compressed_read(struct inode *inode, struct bio *bio,
 	kfree(cb);
 out:
 	free_extent_map(em);
-	bio->bi_status = ret;
-	bio_endio(bio);
+	btrfs_bio_end_io(btrfs_bio(bio), ret);
 	return;
 }
 
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 6268dafeeb2db..394cbb28b5a3f 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -647,16 +647,14 @@ static void run_one_async_start(struct btrfs_work *work)
  */
 static void run_one_async_done(struct btrfs_work *work)
 {
-	struct async_submit_bio *async;
-	struct inode *inode;
-
-	async = container_of(work, struct  async_submit_bio, work);
-	inode = async->inode;
+	struct async_submit_bio *async =
+		container_of(work, struct  async_submit_bio, work);
+	struct inode *inode = async->inode;
+	struct btrfs_bio *bbio = btrfs_bio(async->bio);
 
 	/* If an error occurred we just want to clean up the bio and move on */
 	if (async->status) {
-		async->bio->bi_status = async->status;
-		bio_endio(async->bio);
+		btrfs_bio_end_io(bbio, async->status);
 		return;
 	}
 
@@ -757,6 +755,7 @@ static bool should_async_write(struct btrfs_fs_info *fs_info,
 void btrfs_submit_metadata_bio(struct inode *inode, struct bio *bio, int mirror_num)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
+	struct btrfs_bio *bbio = btrfs_bio(bio);
 	blk_status_t ret;
 
 	bio->bi_opf |= REQ_META;
@@ -776,8 +775,7 @@ void btrfs_submit_metadata_bio(struct inode *inode, struct bio *bio, int mirror_
 
 	ret = btree_csum_one_bio(bio);
 	if (ret) {
-		bio->bi_status = ret;
-		bio_endio(bio);
+		btrfs_bio_end_io(bbio, ret);
 		return;
 	}
 
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index fae8c1899226b..25ab388c36cde 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -206,7 +206,7 @@ static void submit_one_bio(struct btrfs_bio_ctrl *bio_ctrl)
 		btrfs_submit_data_read_bio(inode, bio, mirror_num,
 					   bio_ctrl->compress_type);
 
-	/* The bio is owned by the bi_end_io handler now */
+	/* The bio is owned by the end_io handler now */
 	bio_ctrl->bio = NULL;
 }
 
@@ -222,9 +222,8 @@ static void submit_write_bio(struct extent_page_data *epd, int ret)
 
 	if (ret) {
 		ASSERT(ret < 0);
-		bio->bi_status = errno_to_blk_status(ret);
-		bio_endio(bio);
-		/* The bio is owned by the bi_end_io handler now */
+		btrfs_bio_end_io(btrfs_bio(bio), errno_to_blk_status(ret));
+		/* The bio is owned by the end_io handler now */
 		epd->bio_ctrl.bio = NULL;
 	} else {
 		submit_one_bio(&epd->bio_ctrl);
@@ -2626,12 +2625,11 @@ int btrfs_repair_one_sector(struct inode *inode, struct btrfs_bio *failed_bbio,
 		return -EIO;
 	}
 
-	repair_bio = btrfs_bio_alloc(1, REQ_OP_READ);
+	repair_bio = btrfs_bio_alloc(1, REQ_OP_READ, failed_bbio->end_io,
+				     failed_bbio->private);
 	repair_bbio = btrfs_bio(repair_bio);
 	repair_bbio->file_offset = start;
-	repair_bio->bi_end_io = failed_bio->bi_end_io;
 	repair_bio->bi_iter.bi_sector = failrec->logical >> 9;
-	repair_bio->bi_private = failed_bio->bi_private;
 
 	if (failed_bbio->csum) {
 		const u32 csum_size = fs_info->csum_size;
@@ -2798,8 +2796,9 @@ void end_extent_writepage(struct page *page, int err, u64 start, u64 end)
  * Scheduling is not allowed, so the extent state tree is expected
  * to have one and only one object corresponding to this IO.
  */
-static void end_bio_extent_writepage(struct bio *bio)
+static void end_bio_extent_writepage(struct btrfs_bio *bbio)
 {
+	struct bio *bio = &bbio->bio;
 	int error = blk_status_to_errno(bio->bi_status);
 	struct bio_vec *bvec;
 	u64 start;
@@ -2963,10 +2962,10 @@ static struct extent_buffer *find_extent_buffer_readpage(
  * Scheduling is not allowed, so the extent state tree is expected
  * to have one and only one object corresponding to this IO.
  */
-static void end_bio_extent_readpage(struct bio *bio)
+static void end_bio_extent_readpage(struct btrfs_bio *bbio)
 {
+	struct bio *bio = &bbio->bio;
 	struct bio_vec *bvec;
-	struct btrfs_bio *bbio = btrfs_bio(bio);
 	struct extent_io_tree *tree, *failure_tree;
 	struct processed_extent processed = { 0 };
 	/*
@@ -3257,7 +3256,7 @@ static int alloc_new_bio(struct btrfs_inode *inode,
 			 struct btrfs_bio_ctrl *bio_ctrl,
 			 struct writeback_control *wbc,
 			 unsigned int opf,
-			 bio_end_io_t end_io_func,
+			 btrfs_bio_end_io_t end_io_func,
 			 u64 disk_bytenr, u32 offset, u64 file_offset,
 			 enum btrfs_compression_type compress_type)
 {
@@ -3265,7 +3264,7 @@ static int alloc_new_bio(struct btrfs_inode *inode,
 	struct bio *bio;
 	int ret;
 
-	bio = btrfs_bio_alloc(BIO_MAX_VECS, opf);
+	bio = btrfs_bio_alloc(BIO_MAX_VECS, opf, end_io_func, NULL);
 	/*
 	 * For compressed page range, its disk_bytenr is always @disk_bytenr
 	 * passed in, no matter if we have added any range into previous bio.
@@ -3276,7 +3275,6 @@ static int alloc_new_bio(struct btrfs_inode *inode,
 		bio->bi_iter.bi_sector = (disk_bytenr + offset) >> SECTOR_SHIFT;
 	bio_ctrl->bio = bio;
 	bio_ctrl->compress_type = compress_type;
-	bio->bi_end_io = end_io_func;
 	ret = calc_bio_boundaries(bio_ctrl, inode, file_offset);
 	if (ret < 0)
 		goto error;
@@ -3315,8 +3313,7 @@ static int alloc_new_bio(struct btrfs_inode *inode,
 	return 0;
 error:
 	bio_ctrl->bio = NULL;
-	bio->bi_status = errno_to_blk_status(ret);
-	bio_endio(bio);
+	btrfs_bio_end_io(btrfs_bio(bio), errno_to_blk_status(ret));
 	return ret;
 }
 
@@ -3339,7 +3336,7 @@ static int submit_extent_page(unsigned int opf,
 			      struct btrfs_bio_ctrl *bio_ctrl,
 			      struct page *page, u64 disk_bytenr,
 			      size_t size, unsigned long pg_offset,
-			      bio_end_io_t end_io_func,
+			      btrfs_bio_end_io_t end_io_func,
 			      enum btrfs_compression_type compress_type,
 			      bool force_bio_submit)
 {
@@ -4336,8 +4333,9 @@ static struct extent_buffer *find_extent_buffer_nolock(
  * Unlike end_bio_extent_buffer_writepage(), we only call end_page_writeback()
  * after all extent buffers in the page has finished their writeback.
  */
-static void end_bio_subpage_eb_writepage(struct bio *bio)
+static void end_bio_subpage_eb_writepage(struct btrfs_bio *bbio)
 {
+	struct bio *bio = &bbio->bio;
 	struct btrfs_fs_info *fs_info;
 	struct bio_vec *bvec;
 	struct bvec_iter_all iter_all;
@@ -4393,8 +4391,9 @@ static void end_bio_subpage_eb_writepage(struct bio *bio)
 	bio_put(bio);
 }
 
-static void end_bio_extent_buffer_writepage(struct bio *bio)
+static void end_bio_extent_buffer_writepage(struct btrfs_bio *bbio)
 {
+	struct bio *bio = &bbio->bio;
 	struct bio_vec *bvec;
 	struct extent_buffer *eb;
 	int done;
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index dc2cf58095c2e..b933b13a39d2a 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2701,8 +2701,10 @@ void btrfs_submit_data_write_bio(struct inode *inode, struct bio *bio, int mirro
 	if (bio_op(bio) == REQ_OP_ZONE_APPEND) {
 		ret = extract_ordered_extent(bi, bio,
 				page_offset(bio_first_bvec_all(bio)->bv_page));
-		if (ret)
-			goto out;
+		if (ret) {
+			btrfs_bio_end_io(btrfs_bio(bio), ret);
+			return;
+		}
 	}
 
 	/*
@@ -2722,16 +2724,12 @@ void btrfs_submit_data_write_bio(struct inode *inode, struct bio *bio, int mirro
 			return;
 
 		ret = btrfs_csum_one_bio(bi, bio, (u64)-1, false);
-		if (ret)
-			goto out;
+		if (ret) {
+			btrfs_bio_end_io(btrfs_bio(bio), ret);
+			return;
+		}
 	}
 	btrfs_submit_bio(fs_info, bio, mirror_num);
-	return;
-out:
-	if (ret) {
-		bio->bi_status = ret;
-		bio_endio(bio);
-	}
 }
 
 void btrfs_submit_data_read_bio(struct inode *inode, struct bio *bio,
@@ -2758,8 +2756,7 @@ void btrfs_submit_data_read_bio(struct inode *inode, struct bio *bio,
 	 */
 	ret = btrfs_lookup_bio_sums(inode, bio, NULL);
 	if (ret) {
-		bio->bi_status = ret;
-		bio_endio(bio);
+		btrfs_bio_end_io(btrfs_bio(bio), ret);
 		return;
 	}
 
@@ -7974,7 +7971,7 @@ static void submit_dio_repair_bio(struct inode *inode, struct bio *bio,
 				  int mirror_num,
 				  enum btrfs_compression_type compress_type)
 {
-	struct btrfs_dio_private *dip = bio->bi_private;
+	struct btrfs_dio_private *dip = btrfs_bio(bio)->private;
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
 
 	BUG_ON(bio_op(bio) == REQ_OP_WRITE);
@@ -8027,10 +8024,10 @@ static blk_status_t btrfs_submit_bio_start_direct_io(struct inode *inode,
 	return btrfs_csum_one_bio(BTRFS_I(inode), bio, dio_file_offset, false);
 }
 
-static void btrfs_end_dio_bio(struct bio *bio)
+static void btrfs_end_dio_bio(struct btrfs_bio *bbio)
 {
-	struct btrfs_dio_private *dip = bio->bi_private;
-	struct btrfs_bio *bbio = btrfs_bio(bio);
+	struct btrfs_dio_private *dip = bbio->private;
+	struct bio *bio = &bbio->bio;
 	blk_status_t err = bio->bi_status;
 
 	if (err)
@@ -8056,7 +8053,7 @@ static void btrfs_submit_dio_bio(struct bio *bio, struct inode *inode,
 				 u64 file_offset, int async_submit)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
-	struct btrfs_dio_private *dip = bio->bi_private;
+	struct btrfs_dio_private *dip = btrfs_bio(bio)->private;
 	blk_status_t ret;
 
 	/* Save the original iter for read repair */
@@ -8079,8 +8076,7 @@ static void btrfs_submit_dio_bio(struct bio *bio, struct inode *inode,
 		 */
 		ret = btrfs_csum_one_bio(BTRFS_I(inode), bio, file_offset, false);
 		if (ret) {
-			bio->bi_status = ret;
-			bio_endio(bio);
+			btrfs_bio_end_io(btrfs_bio(bio), ret);
 			return;
 		}
 	} else {
@@ -8163,9 +8159,8 @@ static void btrfs_submit_direct(const struct iomap_iter *iter,
 		 * This will never fail as it's passing GPF_NOFS and
 		 * the allocation is backed by btrfs_bioset.
 		 */
-		bio = btrfs_bio_clone_partial(dio_bio, clone_offset, clone_len);
-		bio->bi_private = dip;
-		bio->bi_end_io = btrfs_end_dio_bio;
+		bio = btrfs_bio_clone_partial(dio_bio, clone_offset, clone_len,
+					      btrfs_end_dio_bio, dip);
 		btrfs_bio(bio)->file_offset = file_offset;
 
 		if (bio_op(bio) == REQ_OP_ZONE_APPEND) {
@@ -10375,7 +10370,7 @@ struct btrfs_encoded_read_private {
 static blk_status_t submit_encoded_read_bio(struct btrfs_inode *inode,
 					    struct bio *bio, int mirror_num)
 {
-	struct btrfs_encoded_read_private *priv = bio->bi_private;
+	struct btrfs_encoded_read_private *priv = btrfs_bio(bio)->private;
 	struct btrfs_fs_info *fs_info = inode->root->fs_info;
 	blk_status_t ret;
 
@@ -10393,7 +10388,7 @@ static blk_status_t submit_encoded_read_bio(struct btrfs_inode *inode,
 static blk_status_t btrfs_encoded_read_verify_csum(struct btrfs_bio *bbio)
 {
 	const bool uptodate = (bbio->bio.bi_status == BLK_STS_OK);
-	struct btrfs_encoded_read_private *priv = bbio->bio.bi_private;
+	struct btrfs_encoded_read_private *priv = bbio->private;
 	struct btrfs_inode *inode = priv->inode;
 	struct btrfs_fs_info *fs_info = inode->root->fs_info;
 	u32 sectorsize = fs_info->sectorsize;
@@ -10421,10 +10416,9 @@ static blk_status_t btrfs_encoded_read_verify_csum(struct btrfs_bio *bbio)
 	return BLK_STS_OK;
 }
 
-static void btrfs_encoded_read_endio(struct bio *bio)
+static void btrfs_encoded_read_endio(struct btrfs_bio *bbio)
 {
-	struct btrfs_encoded_read_private *priv = bio->bi_private;
-	struct btrfs_bio *bbio = btrfs_bio(bio);
+	struct btrfs_encoded_read_private *priv = bbio->private;
 	blk_status_t status;
 
 	status = btrfs_encoded_read_verify_csum(bbio);
@@ -10442,7 +10436,7 @@ static void btrfs_encoded_read_endio(struct bio *bio)
 	if (!atomic_dec_return(&priv->pending))
 		wake_up(&priv->wait);
 	btrfs_bio_free_csum(bbio);
-	bio_put(bio);
+	bio_put(&bbio->bio);
 }
 
 int btrfs_encoded_read_regular_fill_pages(struct btrfs_inode *inode,
@@ -10489,11 +10483,11 @@ int btrfs_encoded_read_regular_fill_pages(struct btrfs_inode *inode,
 			size_t bytes = min_t(u64, remaining, PAGE_SIZE);
 
 			if (!bio) {
-				bio = btrfs_bio_alloc(BIO_MAX_VECS, REQ_OP_READ);
+				bio = btrfs_bio_alloc(BIO_MAX_VECS, REQ_OP_READ,
+						      btrfs_encoded_read_endio,
+						      &priv);
 				bio->bi_iter.bi_sector =
 					(disk_bytenr + cur) >> SECTOR_SHIFT;
-				bio->bi_end_io = btrfs_encoded_read_endio;
-				bio->bi_private = &priv;
 			}
 
 			if (!bytes ||
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 0b2eea9ccf094..7c8f84ff659ea 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6611,9 +6611,12 @@ int btrfs_map_sblock(struct btrfs_fs_info *fs_info, enum btrfs_map_op op,
  * Initialize a btrfs_bio structure.  This skips the embedded bio itself as it
  * is already initialized by the block layer.
  */
-static inline void btrfs_bio_init(struct btrfs_bio *bbio)
+static inline void btrfs_bio_init(struct btrfs_bio *bbio,
+				  btrfs_bio_end_io_t end_io, void *private)
 {
 	memset(bbio, 0, offsetof(struct btrfs_bio, bio));
+	bbio->end_io = end_io;
+	bbio->private = private;
 }
 
 /*
@@ -6623,16 +6626,18 @@ static inline void btrfs_bio_init(struct btrfs_bio *bbio)
  * Just like the underlying bio_alloc_bioset it will no fail as it is backed by
  * a mempool.
  */
-struct bio *btrfs_bio_alloc(unsigned int nr_vecs, unsigned int opf)
+struct bio *btrfs_bio_alloc(unsigned int nr_vecs, unsigned int opf,
+			    btrfs_bio_end_io_t end_io, void *private)
 {
 	struct bio *bio;
 
 	bio = bio_alloc_bioset(NULL, nr_vecs, opf, GFP_NOFS, &btrfs_bioset);
-	btrfs_bio_init(btrfs_bio(bio));
+	btrfs_bio_init(btrfs_bio(bio), end_io, private);
 	return bio;
 }
 
-struct bio *btrfs_bio_clone_partial(struct bio *orig, u64 offset, u64 size)
+struct bio *btrfs_bio_clone_partial(struct bio *orig, u64 offset, u64 size,
+				    btrfs_bio_end_io_t end_io, void *private)
 {
 	struct bio *bio;
 	struct btrfs_bio *bbio;
@@ -6641,7 +6646,7 @@ struct bio *btrfs_bio_clone_partial(struct bio *orig, u64 offset, u64 size)
 
 	bio = bio_alloc_clone(orig->bi_bdev, orig, GFP_NOFS, &btrfs_bioset);
 	bbio = btrfs_bio(bio);
-	btrfs_bio_init(bbio);
+	btrfs_bio_init(bbio, end_io, private);
 
 	bio_trim(bio, offset >> 9, size >> 9);
 	bbio->iter = bio->bi_iter;
@@ -6675,7 +6680,7 @@ static void btrfs_end_bio_work(struct work_struct *work)
 	struct btrfs_bio *bbio =
 		container_of(work, struct btrfs_bio, end_io_work);
 
-	bio_endio(&bbio->bio);
+	bbio->end_io(bbio);
 }
 
 static void btrfs_raid56_end_io(struct bio *bio)
@@ -6685,9 +6690,7 @@ static void btrfs_raid56_end_io(struct bio *bio)
 
 	btrfs_bio_counter_dec(bioc->fs_info);
 	bbio->mirror_num = bioc->mirror_num;
-	bio->bi_end_io = bioc->end_io;
-	bio->bi_private = bioc->private;
-	bio->bi_end_io(bio);
+	bbio->end_io(bbio);
 
 	btrfs_put_bioc(bioc);
 }
@@ -6706,8 +6709,6 @@ static void btrfs_end_bio(struct bio *bio)
 	}
 
 	bbio->mirror_num = bioc->mirror_num;
-	bio->bi_end_io = bioc->end_io;
-	bio->bi_private = bioc->private;
 
 	/*
 	 * Only send an error to the higher layers if it is beyond the tolerance
@@ -6722,7 +6723,7 @@ static void btrfs_end_bio(struct bio *bio)
 		INIT_WORK(&bbio->end_io_work, btrfs_end_bio_work);
 		queue_work(btrfs_end_io_wq(bioc), &bbio->end_io_work);
 	} else {
-		bio_endio(bio);
+		bbio->end_io(bbio);
 	}
 
 	btrfs_put_bioc(bioc);
@@ -6813,15 +6814,12 @@ void btrfs_submit_bio(struct btrfs_fs_info *fs_info, struct bio *bio, int mirror
 				&map_length, &bioc, mirror_num, 1);
 	if (ret) {
 		btrfs_bio_counter_dec(fs_info);
-		bio->bi_status = errno_to_blk_status(ret);
-		bio_endio(bio);
+		btrfs_bio_end_io(btrfs_bio(bio), errno_to_blk_status(ret));
 		return;
 	}
 
 	total_devs = bioc->num_stripes;
 	bioc->orig_bio = bio;
-	bioc->private = bio->bi_private;
-	bioc->end_io = bio->bi_end_io;
 
 	if ((bioc->map_type & BTRFS_BLOCK_GROUP_RAID56_MASK) &&
 	    ((btrfs_op(bio) == BTRFS_MAP_WRITE) || (mirror_num > 1))) {
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 9e984a9922c59..d3667592fec76 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -361,6 +361,8 @@ struct btrfs_fs_devices {
  */
 #define BTRFS_MAX_BIO_SECTORS				(256)
 
+typedef void (*btrfs_bio_end_io_t)(struct btrfs_bio *bbio);
+
 /*
  * Additional info to pass along bio.
  *
@@ -378,6 +380,10 @@ struct btrfs_bio {
 	u8 csum_inline[BTRFS_BIO_INLINE_CSUM_SIZE];
 	struct bvec_iter iter;
 
+	/* End I/O information supplied to btrfs_bio_alloc */
+	btrfs_bio_end_io_t end_io;
+	void *private;
+
 	/* For read end I/O handling */
 	struct work_struct end_io_work;
 
@@ -393,8 +399,16 @@ static inline struct btrfs_bio *btrfs_bio(struct bio *bio)
 	return container_of(bio, struct btrfs_bio, bio);
 }
 
-struct bio *btrfs_bio_alloc(unsigned int nr_vecs, unsigned int opf);
-struct bio *btrfs_bio_clone_partial(struct bio *orig, u64 offset, u64 size);
+struct bio *btrfs_bio_alloc(unsigned int nr_vecs, unsigned int opf,
+			    btrfs_bio_end_io_t end_io, void *private);
+struct bio *btrfs_bio_clone_partial(struct bio *orig, u64 offset, u64 size,
+				    btrfs_bio_end_io_t end_io, void *private);
+
+static inline void btrfs_bio_end_io(struct btrfs_bio *bbio, blk_status_t status)
+{
+	bbio->bio.bi_status = status;
+	bbio->end_io(bbio);
+}
 
 static inline void btrfs_bio_free_csum(struct btrfs_bio *bbio)
 {
@@ -456,9 +470,7 @@ struct btrfs_io_context {
 	refcount_t refs;
 	struct btrfs_fs_info *fs_info;
 	u64 map_type; /* get from map_lookup->type */
-	bio_end_io_t *end_io;
 	struct bio *orig_bio;
-	void *private;
 	atomic_t error;
 	int max_errors;
 	int num_stripes;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 08/11] btrfs: split submit_stripe_bio
  2022-08-06  8:03 btrfs I/O completion cleanup and single device I/O optimizations v3 Christoph Hellwig
                   ` (6 preceding siblings ...)
  2022-08-06  8:03 ` [PATCH 07/11] btrfs: give struct btrfs_bio a real end_io handler Christoph Hellwig
@ 2022-08-06  8:03 ` Christoph Hellwig
  2022-08-19 23:37   ` Anand Jain
  2022-08-06  8:03 ` [PATCH 09/11] btrfs: simplify the submit_stripe_bio calling convention Christoph Hellwig
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 32+ messages in thread
From: Christoph Hellwig @ 2022-08-06  8:03 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Nikolay Borisov, Johannes Thumshirn, linux-btrfs

Split out a low-level btrfs_submit_dev_bio helper that just submits
the bio without any cloning decisions or setting up the end I/O handler
for future reuse by a different caller.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
---
 fs/btrfs/volumes.c | 51 +++++++++++++++++++++++++---------------------
 1 file changed, 28 insertions(+), 23 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 7c8f84ff659ea..5c5beb789594d 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6743,28 +6743,8 @@ static void btrfs_clone_write_end_io(struct bio *bio)
 	bio_put(bio);
 }
 
-static void submit_stripe_bio(struct btrfs_io_context *bioc,
-			      struct bio *orig_bio, int dev_nr, bool clone)
+static void btrfs_submit_dev_bio(struct btrfs_device *dev, struct bio *bio)
 {
-	struct btrfs_fs_info *fs_info = bioc->fs_info;
-	struct btrfs_device *dev = bioc->stripes[dev_nr].dev;
-	u64 physical = bioc->stripes[dev_nr].physical;
-	struct bio *bio;
-
-	if (clone) {
-		bio = bio_alloc_clone(NULL, orig_bio, GFP_NOFS, &fs_bio_set);
-		bio_inc_remaining(orig_bio);
-		bio->bi_end_io = btrfs_clone_write_end_io;
-	} else {
-		bio = orig_bio;
-		btrfs_bio(bio)->device = dev;
-		bio->bi_end_io = btrfs_end_bio;
-	}
-
-	bioc->stripes[dev_nr].bioc = bioc;
-	bio->bi_private = &bioc->stripes[dev_nr];
-	bio->bi_iter.bi_sector = physical >> 9;
-
 	if (!dev || !dev->bdev ||
 	    test_bit(BTRFS_DEV_STATE_MISSING, &dev->dev_state) ||
 	    (btrfs_op(bio) == BTRFS_MAP_WRITE &&
@@ -6780,8 +6760,11 @@ static void submit_stripe_bio(struct btrfs_io_context *bioc,
 	 * zone
 	 */
 	if (bio_op(bio) == REQ_OP_ZONE_APPEND) {
+		u64 physical = bio->bi_iter.bi_sector << SECTOR_SHIFT;
+
 		if (btrfs_dev_is_sequential(dev, physical)) {
-			u64 zone_start = round_down(physical, fs_info->zone_size);
+			u64 zone_start = round_down(physical,
+						    dev->fs_info->zone_size);
 
 			bio->bi_iter.bi_sector = zone_start >> SECTOR_SHIFT;
 		} else {
@@ -6789,7 +6772,7 @@ static void submit_stripe_bio(struct btrfs_io_context *bioc,
 			bio->bi_opf |= REQ_OP_WRITE;
 		}
 	}
-	btrfs_debug_in_rcu(fs_info,
+	btrfs_debug_in_rcu(dev->fs_info,
 	"%s: rw %d 0x%x, sector=%llu, dev=%lu (%s id %llu), size=%u",
 		__func__, bio_op(bio), bio->bi_opf, bio->bi_iter.bi_sector,
 		(unsigned long)dev->bdev->bd_dev, rcu_str_deref(dev->name),
@@ -6799,6 +6782,28 @@ static void submit_stripe_bio(struct btrfs_io_context *bioc,
 	submit_bio(bio);
 }
 
+static void submit_stripe_bio(struct btrfs_io_context *bioc,
+			      struct bio *orig_bio, int dev_nr, bool clone)
+{
+	struct bio *bio;
+
+	/* Reuse the bio embedded into the btrfs_bio for the last mirror */
+	if (!clone) {
+		bio = orig_bio;
+		btrfs_bio(bio)->device = bioc->stripes[dev_nr].dev;
+		bio->bi_end_io = btrfs_end_bio;
+	} else {
+		bio = bio_alloc_clone(NULL, orig_bio, GFP_NOFS, &fs_bio_set);
+		bio_inc_remaining(orig_bio);
+		bio->bi_end_io = btrfs_clone_write_end_io;
+	}
+
+	bio->bi_private = &bioc->stripes[dev_nr];
+	bio->bi_iter.bi_sector = bioc->stripes[dev_nr].physical >> SECTOR_SHIFT;
+	bioc->stripes[dev_nr].bioc = bioc;
+	btrfs_submit_dev_bio(bioc->stripes[dev_nr].dev, bio);
+}
+
 void btrfs_submit_bio(struct btrfs_fs_info *fs_info, struct bio *bio, int mirror_num)
 {
 	u64 logical = bio->bi_iter.bi_sector << 9;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 09/11] btrfs: simplify the submit_stripe_bio calling convention
  2022-08-06  8:03 btrfs I/O completion cleanup and single device I/O optimizations v3 Christoph Hellwig
                   ` (7 preceding siblings ...)
  2022-08-06  8:03 ` [PATCH 08/11] btrfs: split submit_stripe_bio Christoph Hellwig
@ 2022-08-06  8:03 ` Christoph Hellwig
  2022-08-19 23:53   ` Anand Jain
  2022-08-06  8:03 ` [PATCH 10/11] btrfs: make the btrfs_io_context allocation in __btrfs_map_block optional Christoph Hellwig
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 32+ messages in thread
From: Christoph Hellwig @ 2022-08-06  8:03 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Nikolay Borisov, Johannes Thumshirn, linux-btrfs

Remove the orig_bio argument as it can be derived from the bioc, and
the clone argument as it can be calculated from bioc and dev_nr.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
 fs/btrfs/volumes.c | 14 +++++---------
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 5c5beb789594d..df8e2a5b88bb0 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6782,13 +6782,12 @@ static void btrfs_submit_dev_bio(struct btrfs_device *dev, struct bio *bio)
 	submit_bio(bio);
 }
 
-static void submit_stripe_bio(struct btrfs_io_context *bioc,
-			      struct bio *orig_bio, int dev_nr, bool clone)
+static void submit_stripe_bio(struct btrfs_io_context *bioc, int dev_nr)
 {
-	struct bio *bio;
+	struct bio *orig_bio = bioc->orig_bio, *bio;
 
 	/* Reuse the bio embedded into the btrfs_bio for the last mirror */
-	if (!clone) {
+	if (dev_nr == bioc->num_stripes - 1) {
 		bio = orig_bio;
 		btrfs_bio(bio)->device = bioc->stripes[dev_nr].dev;
 		bio->bi_end_io = btrfs_end_bio;
@@ -6844,11 +6843,8 @@ void btrfs_submit_bio(struct btrfs_fs_info *fs_info, struct bio *bio, int mirror
 		BUG();
 	}
 
-	for (dev_nr = 0; dev_nr < total_devs; dev_nr++) {
-		const bool should_clone = (dev_nr < total_devs - 1);
-
-		submit_stripe_bio(bioc, bio, dev_nr, should_clone);
-	}
+	for (dev_nr = 0; dev_nr < total_devs; dev_nr++)
+		submit_stripe_bio(bioc, dev_nr);
 }
 
 static bool dev_args_match_fs_devices(const struct btrfs_dev_lookup_args *args,
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 10/11] btrfs: make the btrfs_io_context allocation in __btrfs_map_block optional
  2022-08-06  8:03 btrfs I/O completion cleanup and single device I/O optimizations v3 Christoph Hellwig
                   ` (8 preceding siblings ...)
  2022-08-06  8:03 ` [PATCH 09/11] btrfs: simplify the submit_stripe_bio calling convention Christoph Hellwig
@ 2022-08-06  8:03 ` Christoph Hellwig
  2022-08-20 11:34   ` Anand Jain
  2022-08-06  8:03 ` [PATCH 11/11] btrfs: stop allocation a btrfs_io_context for simple I/O Christoph Hellwig
  2022-08-23  4:33 ` btrfs I/O completion cleanup and single device I/O optimizations v3 Anand Jain
  11 siblings, 1 reply; 32+ messages in thread
From: Christoph Hellwig @ 2022-08-06  8:03 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Nikolay Borisov, Johannes Thumshirn, linux-btrfs

There is no need for most of the btrfs_io_context when doing I/O to a
single device.  To support such I/O without the extra btrfs_io_context
allocation, turn the mirror_num argument into a pointer so that it can
be used to output the selected mirror number, and add an optional
argument that points to a btrfs_io_stripe structure, which will be
filled with a single extent if provided by the caller.  In that case
the btrfs_io_context allocation can be skipped as all information for
the single device I/O is provided in the mirror_num argument and the
on-stack btrfs_io_stripe.  A caller that makes use of this new
argument will be added in the next commit.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Tested-by: Nikolay Borisov <nborisov@suse.com>
Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
 fs/btrfs/volumes.c | 60 +++++++++++++++++++++++++++++++++++-----------
 1 file changed, 46 insertions(+), 14 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index df8e2a5b88bb0..43ed4fb21f8ac 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -249,10 +249,10 @@ static int init_first_rw_device(struct btrfs_trans_handle *trans);
 static int btrfs_relocate_sys_chunks(struct btrfs_fs_info *fs_info);
 static void btrfs_dev_stat_print_on_load(struct btrfs_device *device);
 static int __btrfs_map_block(struct btrfs_fs_info *fs_info,
-			     enum btrfs_map_op op,
-			     u64 logical, u64 *length,
+			     enum btrfs_map_op op, u64 logical, u64 *length,
 			     struct btrfs_io_context **bioc_ret,
-			     int mirror_num, int need_raid_map);
+			     struct btrfs_io_stripe *smap,
+			     int *mirror_num_p, int need_raid_map);
 
 /*
  * Device locking
@@ -6087,7 +6087,7 @@ static int get_extra_mirror_from_replace(struct btrfs_fs_info *fs_info,
 	int ret = 0;
 
 	ret = __btrfs_map_block(fs_info, BTRFS_MAP_GET_READ_MIRRORS,
-				logical, &length, &bioc, 0, 0);
+				logical, &length, &bioc, NULL, NULL, 0);
 	if (ret) {
 		ASSERT(bioc == NULL);
 		return ret;
@@ -6344,11 +6344,19 @@ int btrfs_get_io_geometry(struct btrfs_fs_info *fs_info, struct extent_map *em,
 	return 0;
 }
 
+static void set_stripe(struct btrfs_io_stripe *dst, struct map_lookup *map,
+		       u32 stripe_index, u64 stripe_offset, u64 stripe_nr)
+{
+	dst->dev = map->stripes[stripe_index].dev;
+	dst->physical = map->stripes[stripe_index].physical +
+			stripe_offset + stripe_nr * map->stripe_len;
+}
+
 static int __btrfs_map_block(struct btrfs_fs_info *fs_info,
-			     enum btrfs_map_op op,
-			     u64 logical, u64 *length,
+			     enum btrfs_map_op op, u64 logical, u64 *length,
 			     struct btrfs_io_context **bioc_ret,
-			     int mirror_num, int need_raid_map)
+			     struct btrfs_io_stripe *smap,
+			     int *mirror_num_p, int need_raid_map)
 {
 	struct extent_map *em;
 	struct map_lookup *map;
@@ -6359,6 +6367,7 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info,
 	int data_stripes;
 	int i;
 	int ret = 0;
+	int mirror_num = mirror_num_p ? *mirror_num_p : 0;
 	int num_stripes;
 	int max_errors = 0;
 	int tgtdev_indexes = 0;
@@ -6519,6 +6528,29 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info,
 		tgtdev_indexes = num_stripes;
 	}
 
+	/*
+	 * If this I/O maps to a single device, try to return the device and
+	 * physical block information on the stack instead of allocating an
+	 * I/O context structure.
+	 */
+	if (smap && num_alloc_stripes == 1 &&
+	    !((map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) && mirror_num > 1) &&
+	    (!need_full_stripe(op) || !dev_replace_is_ongoing ||
+	     !dev_replace->tgtdev)) {
+		if (unlikely(patch_the_first_stripe_for_dev_replace)) {
+			smap->dev = dev_replace->tgtdev;
+			smap->physical = physical_to_patch_in_first_stripe;
+			*mirror_num_p = map->num_stripes + 1;
+		} else {
+			set_stripe(smap, map, stripe_index, stripe_offset,
+				   stripe_nr);
+			*mirror_num_p = mirror_num;
+		}
+		*bioc_ret = NULL;
+		ret = 0;
+		goto out;
+	}
+
 	bioc = alloc_btrfs_io_context(fs_info, num_alloc_stripes, tgtdev_indexes);
 	if (!bioc) {
 		ret = -ENOMEM;
@@ -6526,9 +6558,8 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info,
 	}
 
 	for (i = 0; i < num_stripes; i++) {
-		bioc->stripes[i].physical = map->stripes[stripe_index].physical +
-			stripe_offset + stripe_nr * map->stripe_len;
-		bioc->stripes[i].dev = map->stripes[stripe_index].dev;
+		set_stripe(&bioc->stripes[i], map, stripe_index, stripe_offset,
+			   stripe_nr);
 		stripe_index++;
 	}
 
@@ -6596,7 +6627,7 @@ int btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op,
 		      struct btrfs_io_context **bioc_ret, int mirror_num)
 {
 	return __btrfs_map_block(fs_info, op, logical, length, bioc_ret,
-				 mirror_num, 0);
+				 NULL, &mirror_num, 0);
 }
 
 /* For Scrub/replace */
@@ -6604,7 +6635,8 @@ int btrfs_map_sblock(struct btrfs_fs_info *fs_info, enum btrfs_map_op op,
 		     u64 logical, u64 *length,
 		     struct btrfs_io_context **bioc_ret)
 {
-	return __btrfs_map_block(fs_info, op, logical, length, bioc_ret, 0, 1);
+	return __btrfs_map_block(fs_info, op, logical, length, bioc_ret,
+				 NULL, NULL, 1);
 }
 
 /*
@@ -6814,8 +6846,8 @@ void btrfs_submit_bio(struct btrfs_fs_info *fs_info, struct bio *bio, int mirror
 	struct btrfs_io_context *bioc = NULL;
 
 	btrfs_bio_counter_inc_blocked(fs_info);
-	ret = __btrfs_map_block(fs_info, btrfs_op(bio), logical,
-				&map_length, &bioc, mirror_num, 1);
+	ret = __btrfs_map_block(fs_info, btrfs_op(bio), logical, &map_length,
+				&bioc, NULL, &mirror_num, 1);
 	if (ret) {
 		btrfs_bio_counter_dec(fs_info);
 		btrfs_bio_end_io(btrfs_bio(bio), errno_to_blk_status(ret));
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 11/11] btrfs: stop allocation a btrfs_io_context for simple I/O
  2022-08-06  8:03 btrfs I/O completion cleanup and single device I/O optimizations v3 Christoph Hellwig
                   ` (9 preceding siblings ...)
  2022-08-06  8:03 ` [PATCH 10/11] btrfs: make the btrfs_io_context allocation in __btrfs_map_block optional Christoph Hellwig
@ 2022-08-06  8:03 ` Christoph Hellwig
  2022-08-23  4:33 ` btrfs I/O completion cleanup and single device I/O optimizations v3 Anand Jain
  11 siblings, 0 replies; 32+ messages in thread
From: Christoph Hellwig @ 2022-08-06  8:03 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Nikolay Borisov, Johannes Thumshirn, linux-btrfs

The I/O context structure is only used to pass the btrfs_device to
the end I/O handler for I/Os that go to a single device.

Stop allocating the I/O context for these cases by passing the optional
btrfs_io_stripe argument to __btrfs_map_block to query the mapping
information and then using a fast path submission and I/O completion
handler.  As the old btrfs_io_context based I/O submission path is
only used for mirrored writes, rename the functions to make that
clear and stop setting the btrfs_bio device and mirror_num field
that is only used for reads.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Tested-by: Nikolay Borisov <nborisov@suse.com>
Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
 fs/btrfs/volumes.c | 94 ++++++++++++++++++++++++++++------------------
 1 file changed, 57 insertions(+), 37 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 43ed4fb21f8ac..70ff11c68f0f6 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6700,11 +6700,12 @@ static void btrfs_log_dev_io_error(struct bio *bio, struct btrfs_device *dev)
 		btrfs_dev_stat_inc_and_print(dev, BTRFS_DEV_STAT_FLUSH_ERRS);
 }
 
-static struct workqueue_struct *btrfs_end_io_wq(struct btrfs_io_context *bioc)
+static struct workqueue_struct *btrfs_end_io_wq(struct btrfs_fs_info *fs_info,
+						struct bio *bio)
 {
-	if (bioc->orig_bio->bi_opf & REQ_META)
-		return bioc->fs_info->endio_meta_workers;
-	return bioc->fs_info->endio_workers;
+	if (bio->bi_opf & REQ_META)
+		return fs_info->endio_meta_workers;
+	return fs_info->endio_workers;
 }
 
 static void btrfs_end_bio_work(struct work_struct *work)
@@ -6715,6 +6716,24 @@ static void btrfs_end_bio_work(struct work_struct *work)
 	bbio->end_io(bbio);
 }
 
+static void btrfs_simple_end_io(struct bio *bio)
+{
+	struct btrfs_fs_info *fs_info = bio->bi_private;
+	struct btrfs_bio *bbio = btrfs_bio(bio);
+
+	btrfs_bio_counter_dec(fs_info);
+
+	if (bio->bi_status)
+		btrfs_log_dev_io_error(bio, bbio->device);
+
+	if (bio_op(bio) == REQ_OP_READ) {
+		INIT_WORK(&bbio->end_io_work, btrfs_end_bio_work);
+		queue_work(btrfs_end_io_wq(fs_info, bio), &bbio->end_io_work);
+	} else {
+		bbio->end_io(bbio);
+	}
+}
+
 static void btrfs_raid56_end_io(struct bio *bio)
 {
 	struct btrfs_io_context *bioc = bio->bi_private;
@@ -6727,7 +6746,7 @@ static void btrfs_raid56_end_io(struct bio *bio)
 	btrfs_put_bioc(bioc);
 }
 
-static void btrfs_end_bio(struct bio *bio)
+static void btrfs_orig_write_end_io(struct bio *bio)
 {
 	struct btrfs_io_stripe *stripe = bio->bi_private;
 	struct btrfs_io_context *bioc = stripe->bioc;
@@ -6740,8 +6759,6 @@ static void btrfs_end_bio(struct bio *bio)
 		btrfs_log_dev_io_error(bio, stripe->dev);
 	}
 
-	bbio->mirror_num = bioc->mirror_num;
-
 	/*
 	 * Only send an error to the higher layers if it is beyond the tolerance
 	 * threshold.
@@ -6751,13 +6768,7 @@ static void btrfs_end_bio(struct bio *bio)
 	else
 		bio->bi_status = BLK_STS_OK;
 
-	if (btrfs_op(bio) == BTRFS_MAP_READ) {
-		INIT_WORK(&bbio->end_io_work, btrfs_end_bio_work);
-		queue_work(btrfs_end_io_wq(bioc), &bbio->end_io_work);
-	} else {
-		bbio->end_io(bbio);
-	}
-
+	bbio->end_io(bbio);
 	btrfs_put_bioc(bioc);
 }
 
@@ -6814,15 +6825,16 @@ static void btrfs_submit_dev_bio(struct btrfs_device *dev, struct bio *bio)
 	submit_bio(bio);
 }
 
-static void submit_stripe_bio(struct btrfs_io_context *bioc, int dev_nr)
+static void btrfs_submit_mirrored_bio(struct btrfs_io_context *bioc, int dev_nr)
 {
 	struct bio *orig_bio = bioc->orig_bio, *bio;
 
+	ASSERT(bio_op(orig_bio) != REQ_OP_READ);
+
 	/* Reuse the bio embedded into the btrfs_bio for the last mirror */
 	if (dev_nr == bioc->num_stripes - 1) {
 		bio = orig_bio;
-		btrfs_bio(bio)->device = bioc->stripes[dev_nr].dev;
-		bio->bi_end_io = btrfs_end_bio;
+		bio->bi_end_io = btrfs_orig_write_end_io;
 	} else {
 		bio = bio_alloc_clone(NULL, orig_bio, GFP_NOFS, &fs_bio_set);
 		bio_inc_remaining(orig_bio);
@@ -6840,34 +6852,19 @@ void btrfs_submit_bio(struct btrfs_fs_info *fs_info, struct bio *bio, int mirror
 	u64 logical = bio->bi_iter.bi_sector << 9;
 	u64 length = bio->bi_iter.bi_size;
 	u64 map_length = length;
-	int ret;
-	int dev_nr;
-	int total_devs;
 	struct btrfs_io_context *bioc = NULL;
+	struct btrfs_io_stripe smap;
+	int ret;
 
 	btrfs_bio_counter_inc_blocked(fs_info);
 	ret = __btrfs_map_block(fs_info, btrfs_op(bio), logical, &map_length,
-				&bioc, NULL, &mirror_num, 1);
+				&bioc, &smap, &mirror_num, 1);
 	if (ret) {
 		btrfs_bio_counter_dec(fs_info);
 		btrfs_bio_end_io(btrfs_bio(bio), errno_to_blk_status(ret));
 		return;
 	}
 
-	total_devs = bioc->num_stripes;
-	bioc->orig_bio = bio;
-
-	if ((bioc->map_type & BTRFS_BLOCK_GROUP_RAID56_MASK) &&
-	    ((btrfs_op(bio) == BTRFS_MAP_WRITE) || (mirror_num > 1))) {
-		bio->bi_private = bioc;
-		bio->bi_end_io = btrfs_raid56_end_io;
-		if (btrfs_op(bio) == BTRFS_MAP_WRITE)
-			raid56_parity_write(bio, bioc);
-		else
-			raid56_parity_recover(bio, bioc, mirror_num);
-		return;
-	}
-
 	if (map_length < length) {
 		btrfs_crit(fs_info,
 			   "mapping failed logical %llu bio len %llu len %llu",
@@ -6875,8 +6872,31 @@ void btrfs_submit_bio(struct btrfs_fs_info *fs_info, struct bio *bio, int mirror
 		BUG();
 	}
 
-	for (dev_nr = 0; dev_nr < total_devs; dev_nr++)
-		submit_stripe_bio(bioc, dev_nr);
+	if (!bioc) {
+		/* Single mirror read/write fast path */
+		btrfs_bio(bio)->mirror_num = mirror_num;
+		btrfs_bio(bio)->device = smap.dev;
+		bio->bi_iter.bi_sector = smap.physical >> SECTOR_SHIFT;
+		bio->bi_private = fs_info;
+		bio->bi_end_io = btrfs_simple_end_io;
+		btrfs_submit_dev_bio(smap.dev, bio);
+	} else if (bioc->map_type & BTRFS_BLOCK_GROUP_RAID56_MASK) {
+		/* Parity RAID write or read recovery */
+		bio->bi_private = bioc;
+		bio->bi_end_io = btrfs_raid56_end_io;
+		if (bio_op(bio) == REQ_OP_READ)
+			raid56_parity_recover(bio, bioc, mirror_num);
+		else
+			raid56_parity_write(bio, bioc);
+	} else {
+		/* Write to multiple mirrors */
+		int total_devs = bioc->num_stripes;
+		int dev_nr;
+
+		bioc->orig_bio = bio;
+		for (dev_nr = 0; dev_nr < total_devs; dev_nr++)
+			btrfs_submit_mirrored_bio(bioc, dev_nr);
+	}
 }
 
 static bool dev_args_match_fs_devices(const struct btrfs_dev_lookup_args *args,
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH 01/11] btrfs: don't call bioset_integrity_create for btrfs_bioset
  2022-08-06  8:03 ` [PATCH 01/11] btrfs: don't call bioset_integrity_create for btrfs_bioset Christoph Hellwig
@ 2022-08-08  5:06   ` Anand Jain
  2022-08-08  6:47     ` Christoph Hellwig
  2022-08-08 11:24   ` Anand Jain
  1 sibling, 1 reply; 32+ messages in thread
From: Anand Jain @ 2022-08-08  5:06 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba
  Cc: Nikolay Borisov, Johannes Thumshirn, linux-btrfs

On 06/08/2022 16:03, Christoph Hellwig wrote:
> btrfs never uses bio integrity data itself, so don't allocate
> the integrity pools for btrfs_bioset.
> 

  (I couldn't catch up with v1 / v2, sorry for the late comments).

  Two questions:

  This patch is a revert of the commit b208c2f7ceaf ("btrfs: Fix crash 
due to not allocating integrity data for a set"). So nowadays, integrity 
data pool allocation is not mandatory?

  Why not complete the support of bio integrity metadata instead?

Thanks, Anand

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Nikolay Borisov <nborisov@suse.com>
> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> Tested-by: Nikolay Borisov <nborisov@suse.com>
> Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> ---
>   fs/btrfs/extent_io.c | 6 ------
>   1 file changed, 6 deletions(-)
> 
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index 6e8e936a8a1ef..ca8b79d991f5e 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -255,14 +255,8 @@ int __init extent_io_init(void)
>   			BIOSET_NEED_BVECS))
>   		goto free_buffer_cache;
>   
> -	if (bioset_integrity_create(&btrfs_bioset, BIO_POOL_SIZE))
> -		goto free_bioset;
> -
>   	return 0;
>   
> -free_bioset:
> -	bioset_exit(&btrfs_bioset);
> -
>   free_buffer_cache:
>   	kmem_cache_destroy(extent_buffer_cache);
>   	extent_buffer_cache = NULL;


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 01/11] btrfs: don't call bioset_integrity_create for btrfs_bioset
  2022-08-08  5:06   ` Anand Jain
@ 2022-08-08  6:47     ` Christoph Hellwig
  0 siblings, 0 replies; 32+ messages in thread
From: Christoph Hellwig @ 2022-08-08  6:47 UTC (permalink / raw)
  To: Anand Jain
  Cc: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba,
	Nikolay Borisov, Johannes Thumshirn, linux-btrfs

On Mon, Aug 08, 2022 at 01:06:29PM +0800, Anand Jain wrote:
>  This patch is a revert of the commit b208c2f7ceaf ("btrfs: Fix crash due 
> to not allocating integrity data for a set"). So nowadays, integrity data 
> pool allocation is not mandatory?

Yes, the bio-integrity code now handles allocating the integrity
payload without that.

>  Why not complete the support of bio integrity metadata instead?

That is very much an unrelated and complex feature.

Where would you store the t10-pi style checksums in btrfs?  Note that
they are different algorithms from those currently supported by btrfs.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 03/11] btrfs: pass the operation to btrfs_bio_alloc
  2022-08-06  8:03 ` [PATCH 03/11] btrfs: pass the operation to btrfs_bio_alloc Christoph Hellwig
@ 2022-08-08 11:20   ` Anand Jain
  0 siblings, 0 replies; 32+ messages in thread
From: Anand Jain @ 2022-08-08 11:20 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba
  Cc: Nikolay Borisov, Johannes Thumshirn, linux-btrfs

On 8/6/22 16:03, Christoph Hellwig wrote:
> Pass the operation to btrfs_bio_alloc, matching what bio_alloc_bioset
> set does. 
Got it.
bio_alloc_bioset()
  bio_init()
::
	bio->bi_opf = opf;



Reviewed-by: Anand Jain <anand.jain@oracle.com>




> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Nikolay Borisov <nborisov@suse.com>
> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> Tested-by: Nikolay Borisov <nborisov@suse.com>
> Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> ---
>   fs/btrfs/compression.c | 3 +--
>   fs/btrfs/extent_io.c   | 6 ++----
>   fs/btrfs/inode.c       | 3 +--
>   fs/btrfs/volumes.c     | 4 ++--
>   fs/btrfs/volumes.h     | 2 +-
>   5 files changed, 7 insertions(+), 11 deletions(-)
> 
> diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
> index f3df9b9b43816..e124096eacddd 100644
> --- a/fs/btrfs/compression.c
> +++ b/fs/btrfs/compression.c
> @@ -344,10 +344,9 @@ static struct bio *alloc_compressed_bio(struct compressed_bio *cb, u64 disk_byte
>   	struct bio *bio;
>   	int ret;
>   
> -	bio = btrfs_bio_alloc(BIO_MAX_VECS);
> +	bio = btrfs_bio_alloc(BIO_MAX_VECS, opf);
>   
>   	bio->bi_iter.bi_sector = disk_bytenr >> SECTOR_SHIFT;
> -	bio->bi_opf = opf;
>   	bio->bi_private = cb;
>   	bio->bi_end_io = endio_func;
>   
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index 068208244d925..fae8c1899226b 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -2626,10 +2626,9 @@ int btrfs_repair_one_sector(struct inode *inode, struct btrfs_bio *failed_bbio,
>   		return -EIO;
>   	}
>   
> -	repair_bio = btrfs_bio_alloc(1);
> +	repair_bio = btrfs_bio_alloc(1, REQ_OP_READ);
>   	repair_bbio = btrfs_bio(repair_bio);
>   	repair_bbio->file_offset = start;
> -	repair_bio->bi_opf = REQ_OP_READ;
>   	repair_bio->bi_end_io = failed_bio->bi_end_io;
>   	repair_bio->bi_iter.bi_sector = failrec->logical >> 9;
>   	repair_bio->bi_private = failed_bio->bi_private;
> @@ -3266,7 +3265,7 @@ static int alloc_new_bio(struct btrfs_inode *inode,
>   	struct bio *bio;
>   	int ret;
>   
> -	bio = btrfs_bio_alloc(BIO_MAX_VECS);
> +	bio = btrfs_bio_alloc(BIO_MAX_VECS, opf);
>   	/*
>   	 * For compressed page range, its disk_bytenr is always @disk_bytenr
>   	 * passed in, no matter if we have added any range into previous bio.
> @@ -3278,7 +3277,6 @@ static int alloc_new_bio(struct btrfs_inode *inode,
>   	bio_ctrl->bio = bio;
>   	bio_ctrl->compress_type = compress_type;
>   	bio->bi_end_io = end_io_func;
> -	bio->bi_opf = opf;
>   	ret = calc_bio_boundaries(bio_ctrl, inode, file_offset);
>   	if (ret < 0)
>   		goto error;
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 8cdb173331c7c..dc2cf58095c2e 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -10489,12 +10489,11 @@ int btrfs_encoded_read_regular_fill_pages(struct btrfs_inode *inode,
>   			size_t bytes = min_t(u64, remaining, PAGE_SIZE);
>   
>   			if (!bio) {
> -				bio = btrfs_bio_alloc(BIO_MAX_VECS);
> +				bio = btrfs_bio_alloc(BIO_MAX_VECS, REQ_OP_READ);
>   				bio->bi_iter.bi_sector =
>   					(disk_bytenr + cur) >> SECTOR_SHIFT;
>   				bio->bi_end_io = btrfs_encoded_read_endio;
>   				bio->bi_private = &priv;
> -				bio->bi_opf = REQ_OP_READ;
>   			}
>   
>   			if (!bytes ||
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index bb614bb890709..a73bac7f42624 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -6624,11 +6624,11 @@ static inline void btrfs_bio_init(struct btrfs_bio *bbio)
>    * Just like the underlying bio_alloc_bioset it will no fail as it is backed by
>    * a mempool.
>    */
> -struct bio *btrfs_bio_alloc(unsigned int nr_vecs)
> +struct bio *btrfs_bio_alloc(unsigned int nr_vecs, unsigned int opf)
>   {
>   	struct bio *bio;
>   
> -	bio = bio_alloc_bioset(NULL, nr_vecs, 0, GFP_NOFS, &btrfs_bioset);
> +	bio = bio_alloc_bioset(NULL, nr_vecs, opf, GFP_NOFS, &btrfs_bioset);
>   	btrfs_bio_init(btrfs_bio(bio));
>   	return bio;
>   }
> diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
> index fb57b50fb60b1..bd108c7ed1ac3 100644
> --- a/fs/btrfs/volumes.h
> +++ b/fs/btrfs/volumes.h
> @@ -393,7 +393,7 @@ static inline struct btrfs_bio *btrfs_bio(struct bio *bio)
>   	return container_of(bio, struct btrfs_bio, bio);
>   }
>   
> -struct bio *btrfs_bio_alloc(unsigned int nr_vecs);
> +struct bio *btrfs_bio_alloc(unsigned int nr_vecs, unsigned int opf);
>   struct bio *btrfs_bio_clone_partial(struct bio *orig, u64 offset, u64 size);
>   
>   static inline void btrfs_bio_free_csum(struct btrfs_bio *bbio)


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 01/11] btrfs: don't call bioset_integrity_create for btrfs_bioset
  2022-08-06  8:03 ` [PATCH 01/11] btrfs: don't call bioset_integrity_create for btrfs_bioset Christoph Hellwig
  2022-08-08  5:06   ` Anand Jain
@ 2022-08-08 11:24   ` Anand Jain
  1 sibling, 0 replies; 32+ messages in thread
From: Anand Jain @ 2022-08-08 11:24 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba
  Cc: Nikolay Borisov, Johannes Thumshirn, linux-btrfs

On 8/6/22 16:03, Christoph Hellwig wrote:
> btrfs never uses bio integrity data itself, so don't allocate
> the integrity pools for btrfs_bioset.
> 

Reviewed-by: Anand Jain <anand.jain@oracle.com>



> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Nikolay Borisov <nborisov@suse.com>
> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> Tested-by: Nikolay Borisov <nborisov@suse.com>
> Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> ---
>   fs/btrfs/extent_io.c | 6 ------
>   1 file changed, 6 deletions(-)
> 
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index 6e8e936a8a1ef..ca8b79d991f5e 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -255,14 +255,8 @@ int __init extent_io_init(void)
>   			BIOSET_NEED_BVECS))
>   		goto free_buffer_cache;
>   
> -	if (bioset_integrity_create(&btrfs_bioset, BIO_POOL_SIZE))
> -		goto free_bioset;
> -
>   	return 0;
>   
> -free_bioset:
> -	bioset_exit(&btrfs_bioset);
> -
>   free_buffer_cache:
>   	kmem_cache_destroy(extent_buffer_cache);
>   	extent_buffer_cache = NULL;


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 02/11] btrfs: move btrfs_bio allocation to volumes.c
  2022-08-06  8:03 ` [PATCH 02/11] btrfs: move btrfs_bio allocation to volumes.c Christoph Hellwig
@ 2022-08-10 14:42   ` Anand Jain
  0 siblings, 0 replies; 32+ messages in thread
From: Anand Jain @ 2022-08-10 14:42 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba
  Cc: Nikolay Borisov, Johannes Thumshirn, linux-btrfs

On 06/08/2022 16:03, Christoph Hellwig wrote:
> volumes.c is the place that implements the storage layer using the
> btrfs_bio structure, so move the bio_set and allocation helpers there
> as well.
> 
> To make up for the new initialization boilerplate, merge the two
> init/exit helpers in extent_io.c into a single one.
> 


Reviewed-by: Anand Jain <anand.jain@oracle.com>

Thanks, Anand


> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Nikolay Borisov <nborisov@suse.com>
> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> Tested-by: Nikolay Borisov <nborisov@suse.com>
> Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> ---
>   fs/btrfs/extent-io-tree.h |  4 +--
>   fs/btrfs/extent_io.c      | 74 ++++-----------------------------------
>   fs/btrfs/extent_io.h      |  3 --
>   fs/btrfs/super.c          |  6 ++--
>   fs/btrfs/volumes.c        | 58 ++++++++++++++++++++++++++++++
>   fs/btrfs/volumes.h        |  3 ++
>   6 files changed, 72 insertions(+), 76 deletions(-)
> 
> diff --git a/fs/btrfs/extent-io-tree.h b/fs/btrfs/extent-io-tree.h
> index c3eb52dbe61cc..819763ace0aca 100644
> --- a/fs/btrfs/extent-io-tree.h
> +++ b/fs/btrfs/extent-io-tree.h
> @@ -96,8 +96,8 @@ struct extent_state {
>   #endif
>   };
>   
> -int __init extent_state_cache_init(void);
> -void __cold extent_state_cache_exit(void);
> +int __init volumes_init(void);
> +void __cold volumes_exit(void);
>   
>   void extent_io_tree_init(struct btrfs_fs_info *fs_info,
>   			 struct extent_io_tree *tree, unsigned int owner,
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index ca8b79d991f5e..068208244d925 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -33,7 +33,6 @@
>   
>   static struct kmem_cache *extent_state_cache;
>   static struct kmem_cache *extent_buffer_cache;
> -static struct bio_set btrfs_bioset;
>   
>   static inline bool extent_state_in_tree(const struct extent_state *state)
>   {
> @@ -232,41 +231,23 @@ static void submit_write_bio(struct extent_page_data *epd, int ret)
>   	}
>   }
>   
> -int __init extent_state_cache_init(void)
> +int __init extent_io_init(void)
>   {
>   	extent_state_cache = kmem_cache_create("btrfs_extent_state",
>   			sizeof(struct extent_state), 0,
>   			SLAB_MEM_SPREAD, NULL);
>   	if (!extent_state_cache)
>   		return -ENOMEM;
> -	return 0;
> -}
>   
> -int __init extent_io_init(void)
> -{
>   	extent_buffer_cache = kmem_cache_create("btrfs_extent_buffer",
>   			sizeof(struct extent_buffer), 0,
>   			SLAB_MEM_SPREAD, NULL);
> -	if (!extent_buffer_cache)
> +	if (!extent_buffer_cache) {
> +		kmem_cache_destroy(extent_state_cache);
>   		return -ENOMEM;
> -
> -	if (bioset_init(&btrfs_bioset, BIO_POOL_SIZE,
> -			offsetof(struct btrfs_bio, bio),
> -			BIOSET_NEED_BVECS))
> -		goto free_buffer_cache;
> +	}
>   
>   	return 0;
> -
> -free_buffer_cache:
> -	kmem_cache_destroy(extent_buffer_cache);
> -	extent_buffer_cache = NULL;
> -	return -ENOMEM;
> -}
> -
> -void __cold extent_state_cache_exit(void)
> -{
> -	btrfs_extent_state_leak_debug_check();
> -	kmem_cache_destroy(extent_state_cache);
>   }
>   
>   void __cold extent_io_exit(void)
> @@ -277,7 +258,8 @@ void __cold extent_io_exit(void)
>   	 */
>   	rcu_barrier();
>   	kmem_cache_destroy(extent_buffer_cache);
> -	bioset_exit(&btrfs_bioset);
> +	btrfs_extent_state_leak_debug_check();
> +	kmem_cache_destroy(extent_state_cache);
>   }
>   
>   /*
> @@ -3156,50 +3138,6 @@ int btrfs_alloc_page_array(unsigned int nr_pages, struct page **page_array)
>   	return 0;
>   }
>   
> -/*
> - * Initialize the members up to but not including 'bio'. Use after allocating a
> - * new bio by bio_alloc_bioset as it does not initialize the bytes outside of
> - * 'bio' because use of __GFP_ZERO is not supported.
> - */
> -static inline void btrfs_bio_init(struct btrfs_bio *bbio)
> -{
> -	memset(bbio, 0, offsetof(struct btrfs_bio, bio));
> -}
> -
> -/*
> - * Allocate a btrfs_io_bio, with @nr_iovecs as maximum number of iovecs.
> - *
> - * The bio allocation is backed by bioset and does not fail.
> - */
> -struct bio *btrfs_bio_alloc(unsigned int nr_iovecs)
> -{
> -	struct bio *bio;
> -
> -	ASSERT(0 < nr_iovecs && nr_iovecs <= BIO_MAX_VECS);

  This check was removed
> -	bio = bio_alloc_bioset(NULL, nr_iovecs, 0, GFP_NOFS, &btrfs_bioset);
> -	btrfs_bio_init(btrfs_bio(bio));
> -	return bio;
> -}
> -
> -struct bio *btrfs_bio_clone_partial(struct bio *orig, u64 offset, u64 size)
> -{
> -	struct bio *bio;
> -	struct btrfs_bio *bbio;
> -
> -	ASSERT(offset <= UINT_MAX && size <= UINT_MAX);
> -
> -	/* this will never fail when it's backed by a bioset */
> -	bio = bio_alloc_clone(orig->bi_bdev, orig, GFP_NOFS, &btrfs_bioset);
> -	ASSERT(bio);
> -
> -	bbio = btrfs_bio(bio);
> -	btrfs_bio_init(bbio);
> -
> -	bio_trim(bio, offset >> 9, size >> 9);
> -	bbio->iter = bio->bi_iter;
> -	return bio;
> -}
> -
>   /**
>    * Attempt to add a page to bio
>    *
> diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
> index 4bc72a87b9a99..69a86ae6fd508 100644
> --- a/fs/btrfs/extent_io.h
> +++ b/fs/btrfs/extent_io.h
> @@ -60,7 +60,6 @@ enum {
>   struct btrfs_bio;
>   struct btrfs_root;
>   struct btrfs_inode;
> -struct btrfs_io_bio;
>   struct btrfs_fs_info;
>   struct io_failure_record;
>   struct extent_io_tree;
> @@ -242,8 +241,6 @@ void extent_clear_unlock_delalloc(struct btrfs_inode *inode, u64 start, u64 end,
>   				  u32 bits_to_clear, unsigned long page_ops);
>   
>   int btrfs_alloc_page_array(unsigned int nr_pages, struct page **page_array);
> -struct bio *btrfs_bio_alloc(unsigned int nr_iovecs);
> -struct bio *btrfs_bio_clone_partial(struct bio *orig, u64 offset, u64 size);
>   
>   void end_extent_writepage(struct page *page, int err, u64 start, u64 end);
>   int btrfs_repair_eb_io_failure(const struct extent_buffer *eb, int mirror_num);
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index 4c7089b1681b3..96b22ee7b12ce 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -2664,7 +2664,7 @@ static int __init init_btrfs_fs(void)
>   	if (err)
>   		goto free_cachep;
>   
> -	err = extent_state_cache_init();
> +	err = volumes_init();
>   	if (err)
>   		goto free_extent_io;
>   
> @@ -2723,7 +2723,7 @@ static int __init init_btrfs_fs(void)
>   free_extent_map:
>   	extent_map_exit();
>   free_extent_state_cache:
> -	extent_state_cache_exit();
> +	volumes_exit();
>   free_extent_io:
>   	extent_io_exit();
>   free_cachep:
> @@ -2744,7 +2744,7 @@ static void __exit exit_btrfs_fs(void)
>   	btrfs_prelim_ref_exit();
>   	ordered_data_exit();
>   	extent_map_exit();
> -	extent_state_cache_exit();
> +	volumes_exit();
>   	extent_io_exit();
>   	btrfs_interface_exit();
>   	unregister_filesystem(&btrfs_fs_type);
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 8c64dda69404e..bb614bb890709 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -34,6 +34,8 @@
>   #include "discard.h"
>   #include "zoned.h"
>   
> +static struct bio_set btrfs_bioset;
> +
>   #define BTRFS_BLOCK_GROUP_STRIPE_MASK	(BTRFS_BLOCK_GROUP_RAID0 | \
>   					 BTRFS_BLOCK_GROUP_RAID10 | \
>   					 BTRFS_BLOCK_GROUP_RAID56_MASK)
> @@ -6606,6 +6608,48 @@ int btrfs_map_sblock(struct btrfs_fs_info *fs_info, enum btrfs_map_op op,
>   	return __btrfs_map_block(fs_info, op, logical, length, bioc_ret, 0, 1);
>   }
>   
> +/*
> + * Initialize a btrfs_bio structure.  This skips the embedded bio itself as it
> + * is already initialized by the block layer.
> + */
> +static inline void btrfs_bio_init(struct btrfs_bio *bbio)
> +{
> +	memset(bbio, 0, offsetof(struct btrfs_bio, bio));
> +}
> +
> +/*
> + * Allocate a btrfs_bio structure.  The btrfs_bio is the main I/O container for
> + * btrfs, and is used for all I/O submitted through btrfs_submit_bio.
> + *
> + * Just like the underlying bio_alloc_bioset it will no fail as it is backed by
> + * a mempool.
> + */
> +struct bio *btrfs_bio_alloc(unsigned int nr_vecs)
> +{
> +	struct bio *bio;
> +
> +	bio = bio_alloc_bioset(NULL, nr_vecs, 0, GFP_NOFS, &btrfs_bioset);
> +	btrfs_bio_init(btrfs_bio(bio));
> +	return bio;
> +}
> +
> +struct bio *btrfs_bio_clone_partial(struct bio *orig, u64 offset, u64 size)
> +{
> +	struct bio *bio;
> +	struct btrfs_bio *bbio;
> +
> +	ASSERT(offset <= UINT_MAX && size <= UINT_MAX);
> +
> +	bio = bio_alloc_clone(orig->bi_bdev, orig, GFP_NOFS, &btrfs_bioset);
> +	bbio = btrfs_bio(bio);
> +	btrfs_bio_init(bbio);
> +
> +	bio_trim(bio, offset >> 9, size >> 9);
> +	bbio->iter = bio->bi_iter;
> +	return bio;
> +
> +}
> +
>   static struct workqueue_struct *btrfs_end_io_wq(struct btrfs_io_context *bioc)
>   {
>   	if (bioc->orig_bio->bi_opf & REQ_META)
> @@ -8283,3 +8327,17 @@ bool btrfs_repair_one_zone(struct btrfs_fs_info *fs_info, u64 logical)
>   
>   	return true;
>   }
> +
> +int __init volumes_init(void)
> +{
> +	if (bioset_init(&btrfs_bioset, BIO_POOL_SIZE,
> +			offsetof(struct btrfs_bio, bio),
> +			BIOSET_NEED_BVECS))
> +		return -ENOMEM;
> +	return 0;
> +}
> +
> +void __cold volumes_exit(void)
> +{
> +	bioset_exit(&btrfs_bioset);
> +}
> diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
> index 5639961b3626f..fb57b50fb60b1 100644
> --- a/fs/btrfs/volumes.h
> +++ b/fs/btrfs/volumes.h
> @@ -393,6 +393,9 @@ static inline struct btrfs_bio *btrfs_bio(struct bio *bio)
>   	return container_of(bio, struct btrfs_bio, bio);
>   }
>   
> +struct bio *btrfs_bio_alloc(unsigned int nr_vecs);
> +struct bio *btrfs_bio_clone_partial(struct bio *orig, u64 offset, u64 size);
> +
>   static inline void btrfs_bio_free_csum(struct btrfs_bio *bbio)
>   {
>   	if (bbio->csum != bbio->csum_inline) {


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 04/11] btrfs: don't take a bio_counter reference for cloned bios
  2022-08-06  8:03 ` [PATCH 04/11] btrfs: don't take a bio_counter reference for cloned bios Christoph Hellwig
@ 2022-08-18 11:30   ` Anand Jain
  0 siblings, 0 replies; 32+ messages in thread
From: Anand Jain @ 2022-08-18 11:30 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba
  Cc: Nikolay Borisov, Johannes Thumshirn, linux-btrfs

On 8/6/22 16:03, Christoph Hellwig wrote:
> Stop grabbing an extra bio_counter reference for each clone bio in a
> mirrored write and instead just release the one original reference in
> btrfs_end_bioc once all the bios for a single btfs_bio have completed
> instead of at the end of btrfs_submit_bio once all bios have been
> submitted.
> 
> This means the reference is now carried by the "upper" btrfs_bio only
> instead of each lower bio.
> 
> Also remove the now unused btrfs_bio_counter_inc_noblocked helper.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Nikolay Borisov <nborisov@suse.com>
> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

Reviewed-by: Anand Jain <anand.jain@oracle.com>

> ---
>   fs/btrfs/ctree.h       | 1 -
>   fs/btrfs/dev-replace.c | 5 -----
>   fs/btrfs/volumes.c     | 6 ++----
>   3 files changed, 2 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index 51c4804392637..e9a1d0016804a 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -4079,7 +4079,6 @@ static inline void btrfs_init_full_stripe_locks_tree(
>   
>   /* dev-replace.c */
>   void btrfs_bio_counter_inc_blocked(struct btrfs_fs_info *fs_info);
> -void btrfs_bio_counter_inc_noblocked(struct btrfs_fs_info *fs_info);
>   void btrfs_bio_counter_sub(struct btrfs_fs_info *fs_info, s64 amount);
>   
>   static inline void btrfs_bio_counter_dec(struct btrfs_fs_info *fs_info)
> diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
> index 488f2105c5d0d..c26b3196a3f17 100644
> --- a/fs/btrfs/dev-replace.c
> +++ b/fs/btrfs/dev-replace.c
> @@ -1284,11 +1284,6 @@ int __pure btrfs_dev_replace_is_ongoing(struct btrfs_dev_replace *dev_replace)
>   	return 1;
>   }
>   
> -void btrfs_bio_counter_inc_noblocked(struct btrfs_fs_info *fs_info)
> -{
> -	percpu_counter_inc(&fs_info->dev_replace.bio_counter);
> -}
> -
>   void btrfs_bio_counter_sub(struct btrfs_fs_info *fs_info, s64 amount)
>   {
>   	percpu_counter_sub(&fs_info->dev_replace.bio_counter, amount);
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index a73bac7f42624..014df2e64e67b 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -6670,6 +6670,8 @@ static void btrfs_end_bioc(struct btrfs_io_context *bioc, bool async)
>   	struct bio *orig_bio = bioc->orig_bio;
>   	struct btrfs_bio *bbio = btrfs_bio(orig_bio);
>   
> +	btrfs_bio_counter_dec(bioc->fs_info);
> +
>   	bbio->mirror_num = bioc->mirror_num;
>   	orig_bio->bi_private = bioc->private;
>   	orig_bio->bi_end_io = bioc->end_io;
> @@ -6717,7 +6719,6 @@ static void btrfs_end_bio(struct bio *bio)
>   	if (bio != bioc->orig_bio)
>   		bio_put(bio);
>   
> -	btrfs_bio_counter_dec(bioc->fs_info);
>   	if (atomic_dec_and_test(&bioc->stripes_pending))
>   		btrfs_end_bioc(bioc, true);
>   }
> @@ -6772,8 +6773,6 @@ static void submit_stripe_bio(struct btrfs_io_context *bioc,
>   		(unsigned long)dev->bdev->bd_dev, rcu_str_deref(dev->name),
>   		dev->devid, bio->bi_iter.bi_size);
>   
> -	btrfs_bio_counter_inc_noblocked(fs_info);
> -
>   	btrfsic_check_bio(bio);
>   	submit_bio(bio);
>   }
> @@ -6825,7 +6824,6 @@ void btrfs_submit_bio(struct btrfs_fs_info *fs_info, struct bio *bio, int mirror
>   
>   		submit_stripe_bio(bioc, bio, dev_nr, should_clone);
>   	}
> -	btrfs_bio_counter_dec(fs_info);
>   }
>   
>   static bool dev_args_match_fs_devices(const struct btrfs_dev_lookup_args *args,


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 05/11] btrfs: remove bioc->stripes_pending
  2022-08-06  8:03 ` [PATCH 05/11] btrfs: remove bioc->stripes_pending Christoph Hellwig
@ 2022-08-18 11:33   ` Anand Jain
  2022-08-18 23:36     ` Anand Jain
  2022-08-18 23:37   ` Anand Jain
  1 sibling, 1 reply; 32+ messages in thread
From: Anand Jain @ 2022-08-18 11:33 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba
  Cc: Nikolay Borisov, Johannes Thumshirn, linux-btrfs

On 8/6/22 16:03, Christoph Hellwig wrote:
> The stripes_pending in the btrfs_io_context counts number of inflight
> low-level bios for an upper btrfs_bio.  For reads this is generally
> one as reads are never cloned, while for writes we can trivially use
> the bio remaining mechanisms that is used for chained bios.
> 
> To be able to make use of that mechanism, split out a separate trivial
> end_io handler for the cloned bios that does a minimal amount of error
> tracking and which then calls bio_endio on the original bio to transfer
> control to that, with the remaining counter making sure it is completed
> last.  This then allows to merge btrfs_end_bioc into the original bio
> bi_end_io handler.  To make this all work all error handling needs to
> happen through the bi_end_io handler, which requires a small amount
> of reshuffling in submit_stripe_bio so that the bio is cloned already
> by the time the suitability of the device is checked.
> 
> This reduces the size of the btrfs_io_context and prepares splitting
> the btrfs_bio at the stripe boundary.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Nikolay Borisov <nborisov@suse.com>
> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> ---
>   fs/btrfs/volumes.c | 94 ++++++++++++++++++++++++----------------------
>   fs/btrfs/volumes.h |  1 -
>   2 files changed, 50 insertions(+), 45 deletions(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 014df2e64e67b..8775f2a635919 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -5892,7 +5892,6 @@ static struct btrfs_io_context *alloc_btrfs_io_context(struct btrfs_fs_info *fs_
>   		sizeof(u64) * (total_stripes),
>   		GFP_NOFS|__GFP_NOFAIL);
>   
> -	atomic_set(&bioc->error, 0);
>   	refcount_set(&bioc->refs, 1);
>   
>   	bioc->fs_info = fs_info;
> @@ -6647,7 +6646,21 @@ struct bio *btrfs_bio_clone_partial(struct bio *orig, u64 offset, u64 size)
>   	bio_trim(bio, offset >> 9, size >> 9);
>   	bbio->iter = bio->bi_iter;
>   	return bio;
> +}
> +
> +static void btrfs_log_dev_io_error(struct bio *bio, struct btrfs_device *dev)
> +{
> +	if (!dev || !dev->bdev)
> +		return;
> +	if (bio->bi_status != BLK_STS_IOERR && bio->bi_status != BLK_STS_TARGET)
> +		return;
>   
> +	if (btrfs_op(bio) == BTRFS_MAP_WRITE)
> +		btrfs_dev_stat_inc_and_print(dev, BTRFS_DEV_STAT_WRITE_ERRS);
> +	if (!(bio->bi_opf & REQ_RAHEAD))
> +		btrfs_dev_stat_inc_and_print(dev, BTRFS_DEV_STAT_READ_ERRS);
> +	if (bio->bi_opf & REQ_PREFLUSH)
> +		btrfs_dev_stat_inc_and_print(dev, BTRFS_DEV_STAT_FLUSH_ERRS);
>   }
>   
>   static struct workqueue_struct *btrfs_end_io_wq(struct btrfs_io_context *bioc)
> @@ -6665,62 +6678,54 @@ static void btrfs_end_bio_work(struct work_struct *work)
>   	bio_endio(&bbio->bio);
>   }
>   
> -static void btrfs_end_bioc(struct btrfs_io_context *bioc, bool async)
> +static void btrfs_end_bio(struct bio *bio)
>   {
> -	struct bio *orig_bio = bioc->orig_bio;
> -	struct btrfs_bio *bbio = btrfs_bio(orig_bio);
> +	struct btrfs_io_stripe *stripe = bio->bi_private;
> +	struct btrfs_io_context *bioc = stripe->bioc;
> +	struct btrfs_bio *bbio = btrfs_bio(bio);
>   
>   	btrfs_bio_counter_dec(bioc->fs_info);
>   
> +	if (bio->bi_status) {
> +		atomic_inc(&bioc->error);
> +		btrfs_log_dev_io_error(bio, stripe->dev);
> +	}
> +
>   	bbio->mirror_num = bioc->mirror_num;
> -	orig_bio->bi_private = bioc->private;
> -	orig_bio->bi_end_io = bioc->end_io;
> +	bio->bi_end_io = bioc->end_io;
> +	bio->bi_private = bioc->private;



It looks like we are duplicating the copy of bi_end_io and bi_private 
which can be avoided. OR I am missing something.

Here is how:

Save Orig bio in bioc:


btrfs_submit_bio()

         bioc->orig_bio = bio;
         bioc->private = bio->bi_private;
         bioc->end_io = bio->bi_end_io;



Only non-clone bio shall use the btrfs_end_bio callback and orig bio.


submit_stripe_bio()

	if (clone) {
	<snip>
	} else {
                 bio = orig_bio;
                 btrfs_bio(bio)->device = dev;
                 bio->bi_end_io = btrfs_end_bio;
	}



So again, we copy back the same content to it. Which can be avoided. No?

btrfs_end_bio()

         bio->bi_end_io = bioc->end_io;
         bio->bi_private = bioc->private;


Thanks.

>   
>   	/*
>   	 * Only send an error to the higher layers if it is beyond the tolerance
>   	 * threshold.
>   	 */
>   	if (atomic_read(&bioc->error) > bioc->max_errors)
> -		orig_bio->bi_status = BLK_STS_IOERR;
> +		bio->bi_status = BLK_STS_IOERR;
>   	else
> -		orig_bio->bi_status = BLK_STS_OK;
> +		bio->bi_status = BLK_STS_OK;
>   
> -	if (btrfs_op(orig_bio) == BTRFS_MAP_READ && async) {
> +	if (btrfs_op(bio) == BTRFS_MAP_READ) {
>   		INIT_WORK(&bbio->end_io_work, btrfs_end_bio_work);
>   		queue_work(btrfs_end_io_wq(bioc), &bbio->end_io_work);
>   	} else {
> -		bio_endio(orig_bio);
> +		bio_endio(bio);
>   	}
>   
>   	btrfs_put_bioc(bioc);
>   }
>   
> -static void btrfs_end_bio(struct bio *bio)
> +static void btrfs_clone_write_end_io(struct bio *bio)
>   {
>   	struct btrfs_io_stripe *stripe = bio->bi_private;
> -	struct btrfs_io_context *bioc = stripe->bioc;
>   
>   	if (bio->bi_status) {
> -		atomic_inc(&bioc->error);
> -		if (bio->bi_status == BLK_STS_IOERR ||
> -		    bio->bi_status == BLK_STS_TARGET) {
> -			if (btrfs_op(bio) == BTRFS_MAP_WRITE)
> -				btrfs_dev_stat_inc_and_print(stripe->dev,
> -						BTRFS_DEV_STAT_WRITE_ERRS);
> -			else if (!(bio->bi_opf & REQ_RAHEAD))
> -				btrfs_dev_stat_inc_and_print(stripe->dev,
> -						BTRFS_DEV_STAT_READ_ERRS);
> -			if (bio->bi_opf & REQ_PREFLUSH)
> -				btrfs_dev_stat_inc_and_print(stripe->dev,
> -						BTRFS_DEV_STAT_FLUSH_ERRS);
> -		}
> +		atomic_inc(&stripe->bioc->error);
> +		btrfs_log_dev_io_error(bio, stripe->dev);
>   	}
>   
> -	if (bio != bioc->orig_bio)
> -		bio_put(bio);
> -
> -	if (atomic_dec_and_test(&bioc->stripes_pending))
> -		btrfs_end_bioc(bioc, true);
> +	/* Pass on control to the original bio this one was cloned from */
> +	bio_endio(stripe->bioc->orig_bio);
> +	bio_put(bio);
>   }
>   
>   static void submit_stripe_bio(struct btrfs_io_context *bioc,
> @@ -6731,28 +6736,30 @@ static void submit_stripe_bio(struct btrfs_io_context *bioc,
>   	u64 physical = bioc->stripes[dev_nr].physical;
>   	struct bio *bio;
>   
> -	if (!dev || !dev->bdev ||
> -	    test_bit(BTRFS_DEV_STATE_MISSING, &dev->dev_state) ||
> -	    (btrfs_op(orig_bio) == BTRFS_MAP_WRITE &&
> -	     !test_bit(BTRFS_DEV_STATE_WRITEABLE, &dev->dev_state))) {
> -		atomic_inc(&bioc->error);
> -		if (atomic_dec_and_test(&bioc->stripes_pending))
> -			btrfs_end_bioc(bioc, false);
> -		return;
> -	}
> -
>   	if (clone) {
> -		bio = bio_alloc_clone(dev->bdev, orig_bio, GFP_NOFS, &fs_bio_set);
> +		bio = bio_alloc_clone(NULL, orig_bio, GFP_NOFS, &fs_bio_set);
> +		bio_inc_remaining(orig_bio);
> +		bio->bi_end_io = btrfs_clone_write_end_io;
>   	} else {
>   		bio = orig_bio;
> -		bio_set_dev(bio, dev->bdev);
>   		btrfs_bio(bio)->device = dev;
> +		bio->bi_end_io = btrfs_end_bio;
>   	}
>   
>   	bioc->stripes[dev_nr].bioc = bioc;
>   	bio->bi_private = &bioc->stripes[dev_nr];
> -	bio->bi_end_io = btrfs_end_bio;
>   	bio->bi_iter.bi_sector = physical >> 9;
> +
> +	if (!dev || !dev->bdev ||
> +	    test_bit(BTRFS_DEV_STATE_MISSING, &dev->dev_state) ||
> +	    (btrfs_op(bio) == BTRFS_MAP_WRITE &&
> +	     !test_bit(BTRFS_DEV_STATE_WRITEABLE, &dev->dev_state))) {
> +		bio_io_error(bio);
> +		return;
> +	}
> +
> +	bio_set_dev(bio, dev->bdev);
> +
>   	/*
>   	 * For zone append writing, bi_sector must point the beginning of the
>   	 * zone
> @@ -6801,7 +6808,6 @@ void btrfs_submit_bio(struct btrfs_fs_info *fs_info, struct bio *bio, int mirror
>   	bioc->orig_bio = bio;
>   	bioc->private = bio->bi_private;
>   	bioc->end_io = bio->bi_end_io;
> -	atomic_set(&bioc->stripes_pending, total_devs);
>   
>   	if ((bioc->map_type & BTRFS_BLOCK_GROUP_RAID56_MASK) &&
>   	    ((btrfs_op(bio) == BTRFS_MAP_WRITE) || (mirror_num > 1))) {
> diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
> index bd108c7ed1ac3..9e984a9922c59 100644
> --- a/fs/btrfs/volumes.h
> +++ b/fs/btrfs/volumes.h
> @@ -454,7 +454,6 @@ struct btrfs_discard_stripe {
>    */
>   struct btrfs_io_context {
>   	refcount_t refs;
> -	atomic_t stripes_pending;
>   	struct btrfs_fs_info *fs_info;
>   	u64 map_type; /* get from map_lookup->type */
>   	bio_end_io_t *end_io;


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 05/11] btrfs: remove bioc->stripes_pending
  2022-08-18 11:33   ` Anand Jain
@ 2022-08-18 23:36     ` Anand Jain
  0 siblings, 0 replies; 32+ messages in thread
From: Anand Jain @ 2022-08-18 23:36 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Nikolay Borisov, Johannes Thumshirn, linux-btrfs, David Sterba,
	Josef Bacik, Chris Mason



On 8/18/22 19:33, Anand Jain wrote:
> On 8/6/22 16:03, Christoph Hellwig wrote:
>> The stripes_pending in the btrfs_io_context counts number of inflight
>> low-level bios for an upper btrfs_bio.  For reads this is generally
>> one as reads are never cloned, while for writes we can trivially use
>> the bio remaining mechanisms that is used for chained bios.
>>
>> To be able to make use of that mechanism, split out a separate trivial
>> end_io handler for the cloned bios that does a minimal amount of error
>> tracking and which then calls bio_endio on the original bio to transfer
>> control to that, with the remaining counter making sure it is completed
>> last.  This then allows to merge btrfs_end_bioc into the original bio
>> bi_end_io handler.  To make this all work all error handling needs to
>> happen through the bi_end_io handler, which requires a small amount
>> of reshuffling in submit_stripe_bio so that the bio is cloned already
>> by the time the suitability of the device is checked.
>>
>> This reduces the size of the btrfs_io_context and prepares splitting
>> the btrfs_bio at the stripe boundary.
>>
>> Signed-off-by: Christoph Hellwig <hch@lst.de>
>> Reviewed-by: Nikolay Borisov <nborisov@suse.com>
>> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
>> ---
>>   fs/btrfs/volumes.c | 94 ++++++++++++++++++++++++----------------------
>>   fs/btrfs/volumes.h |  1 -
>>   2 files changed, 50 insertions(+), 45 deletions(-)
>>
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index 014df2e64e67b..8775f2a635919 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -5892,7 +5892,6 @@ static struct btrfs_io_context 
>> *alloc_btrfs_io_context(struct btrfs_fs_info *fs_
>>           sizeof(u64) * (total_stripes),
>>           GFP_NOFS|__GFP_NOFAIL);
>> -    atomic_set(&bioc->error, 0);
>>       refcount_set(&bioc->refs, 1);
>>       bioc->fs_info = fs_info;
>> @@ -6647,7 +6646,21 @@ struct bio *btrfs_bio_clone_partial(struct bio 
>> *orig, u64 offset, u64 size)
>>       bio_trim(bio, offset >> 9, size >> 9);
>>       bbio->iter = bio->bi_iter;
>>       return bio;
>> +}
>> +
>> +static void btrfs_log_dev_io_error(struct bio *bio, struct 
>> btrfs_device *dev)
>> +{
>> +    if (!dev || !dev->bdev)
>> +        return;
>> +    if (bio->bi_status != BLK_STS_IOERR && bio->bi_status != 
>> BLK_STS_TARGET)
>> +        return;
>> +    if (btrfs_op(bio) == BTRFS_MAP_WRITE)
>> +        btrfs_dev_stat_inc_and_print(dev, BTRFS_DEV_STAT_WRITE_ERRS);
>> +    if (!(bio->bi_opf & REQ_RAHEAD))
>> +        btrfs_dev_stat_inc_and_print(dev, BTRFS_DEV_STAT_READ_ERRS);
>> +    if (bio->bi_opf & REQ_PREFLUSH)
>> +        btrfs_dev_stat_inc_and_print(dev, BTRFS_DEV_STAT_FLUSH_ERRS);
>>   }
>>   static struct workqueue_struct *btrfs_end_io_wq(struct 
>> btrfs_io_context *bioc)
>> @@ -6665,62 +6678,54 @@ static void btrfs_end_bio_work(struct 
>> work_struct *work)
>>       bio_endio(&bbio->bio);
>>   }
>> -static void btrfs_end_bioc(struct btrfs_io_context *bioc, bool async)
>> +static void btrfs_end_bio(struct bio *bio)
>>   {
>> -    struct bio *orig_bio = bioc->orig_bio;
>> -    struct btrfs_bio *bbio = btrfs_bio(orig_bio);
>> +    struct btrfs_io_stripe *stripe = bio->bi_private;
>> +    struct btrfs_io_context *bioc = stripe->bioc;
>> +    struct btrfs_bio *bbio = btrfs_bio(bio);
>>       btrfs_bio_counter_dec(bioc->fs_info);
>> +    if (bio->bi_status) {
>> +        atomic_inc(&bioc->error);
>> +        btrfs_log_dev_io_error(bio, stripe->dev);
>> +    }
>> +
>>       bbio->mirror_num = bioc->mirror_num;
>> -    orig_bio->bi_private = bioc->private;
>> -    orig_bio->bi_end_io = bioc->end_io;
>> +    bio->bi_end_io = bioc->end_io;
>> +    bio->bi_private = bioc->private;
> 
> 
> 
> It looks like we are duplicating the copy of bi_end_io and bi_private 
> which can be avoided. OR I am missing something.
> 

No, it is not duplicating. Now I know what I  missed. Sorry for the noise.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 05/11] btrfs: remove bioc->stripes_pending
  2022-08-06  8:03 ` [PATCH 05/11] btrfs: remove bioc->stripes_pending Christoph Hellwig
  2022-08-18 11:33   ` Anand Jain
@ 2022-08-18 23:37   ` Anand Jain
  1 sibling, 0 replies; 32+ messages in thread
From: Anand Jain @ 2022-08-18 23:37 UTC (permalink / raw)
  To: Christoph Hellwig, linux-btrfs
  Cc: Nikolay Borisov, David Sterba, Johannes Thumshirn, Chris Mason,
	Josef Bacik

On 8/6/22 16:03, Christoph Hellwig wrote:
> The stripes_pending in the btrfs_io_context counts number of inflight
> low-level bios for an upper btrfs_bio.  For reads this is generally
> one as reads are never cloned, while for writes we can trivially use
> the bio remaining mechanisms that is used for chained bios.
> 
> To be able to make use of that mechanism, split out a separate trivial
> end_io handler for the cloned bios that does a minimal amount of error
> tracking and which then calls bio_endio on the original bio to transfer
> control to that, with the remaining counter making sure it is completed
> last.  This then allows to merge btrfs_end_bioc into the original bio
> bi_end_io handler.  To make this all work all error handling needs to
> happen through the bi_end_io handler, which requires a small amount
> of reshuffling in submit_stripe_bio so that the bio is cloned already
> by the time the suitability of the device is checked.
> 
> This reduces the size of the btrfs_io_context and prepares splitting
> the btrfs_bio at the stripe boundary.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Nikolay Borisov <nborisov@suse.com>
> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>


Reviewed-by: Anand Jain <anand.jain@oracle.com>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 07/11] btrfs: give struct btrfs_bio a real end_io handler
  2022-08-06  8:03 ` [PATCH 07/11] btrfs: give struct btrfs_bio a real end_io handler Christoph Hellwig
@ 2022-08-19  6:20   ` Anand Jain
  0 siblings, 0 replies; 32+ messages in thread
From: Anand Jain @ 2022-08-19  6:20 UTC (permalink / raw)
  To: linux-btrfs
  Cc: Nikolay Borisov, Johannes Thumshirn, David Sterba, Josef Bacik,
	Chris Mason, Christoph Hellwig

On 8/6/22 16:03, Christoph Hellwig wrote:
> Currently btrfs_bio end I/O handling is a bit of a mess.  The bi_end_io
> handler and bi_private pointer of the embedded struct bio are both used
> to handle the completion of the high-level btrfs_bio and for the I/O
> completion for the low-level device that the embedded bio ends up being
> sent to.  To support this bi_end_io and bi_private are saved into the
> btrfs_io_context structure and then restored after the bio sent to the
> underlying device has completed the actual I/O.
> 
> Untangle this by adding an end I/O handler and private data to struct
> btrfs_bio for the highlevel btrfs_bio based completions, and leave the
> actual bio bi_end_io handler and bi_private pointer entirely to the
> low-level device I/O.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Nikolay Borisov <nborisov@suse.com>
> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> Tested-by: Nikolay Borisov <nborisov@suse.com>
> Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

Reviewed-by: Anand Jain <anand.jain@oracle.com>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 08/11] btrfs: split submit_stripe_bio
  2022-08-06  8:03 ` [PATCH 08/11] btrfs: split submit_stripe_bio Christoph Hellwig
@ 2022-08-19 23:37   ` Anand Jain
  2022-08-19 23:56     ` Anand Jain
  0 siblings, 1 reply; 32+ messages in thread
From: Anand Jain @ 2022-08-19 23:37 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Nikolay Borisov, Johannes Thumshirn, linux-btrfs, Chris Mason,
	David Sterba, Josef Bacik

On 8/6/22 16:03, Christoph Hellwig wrote:
> Split out a low-level btrfs_submit_dev_bio helper that just submits
> the bio without any cloning decisions or setting up the end I/O handler
> for future reuse by a different caller.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Nikolay Borisov <nborisov@suse.com>

Reviewed-by: Anand Jain <anand.jain@oracle.com>

> +static void submit_stripe_bio(struct btrfs_io_context *bioc,
> +			      struct bio *orig_bio, int dev_nr, bool clone)
> +{
> +	struct bio *bio;
> +
> +	/* Reuse the bio embedded into the btrfs_bio for the last mirror */
> +	if (!clone) {

Nit: Why not as in the original, as below?

	if (clone) {
		bio = bio_alloc_clone(NULL, orig_bio, GFP_NOFS, &fs_bio_set);
		<snip>
  	} else {
		<snip>
	}


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 09/11] btrfs: simplify the submit_stripe_bio calling convention
  2022-08-06  8:03 ` [PATCH 09/11] btrfs: simplify the submit_stripe_bio calling convention Christoph Hellwig
@ 2022-08-19 23:53   ` Anand Jain
  0 siblings, 0 replies; 32+ messages in thread
From: Anand Jain @ 2022-08-19 23:53 UTC (permalink / raw)
  To: Christoph Hellwig, linux-btrfs
  Cc: Nikolay Borisov, David Sterba, Johannes Thumshirn, Josef Bacik,
	Chris Mason

On 8/6/22 16:03, Christoph Hellwig wrote:
> Remove the orig_bio argument as it can be derived from the bioc, and
> the clone argument as it can be calculated from bioc and dev_nr.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Nikolay Borisov <nborisov@suse.com>
> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>


Reviewed-by: Anand Jain <anand.jain@oracle.com>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 08/11] btrfs: split submit_stripe_bio
  2022-08-19 23:37   ` Anand Jain
@ 2022-08-19 23:56     ` Anand Jain
  0 siblings, 0 replies; 32+ messages in thread
From: Anand Jain @ 2022-08-19 23:56 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Nikolay Borisov, Johannes Thumshirn, linux-btrfs, Chris Mason,
	David Sterba, Josef Bacik



On 8/20/22 07:37, Anand Jain wrote:
> On 8/6/22 16:03, Christoph Hellwig wrote:
>> Split out a low-level btrfs_submit_dev_bio helper that just submits
>> the bio without any cloning decisions or setting up the end I/O handler
>> for future reuse by a different caller.
>>
>> Signed-off-by: Christoph Hellwig <hch@lst.de>
>> Reviewed-by: Nikolay Borisov <nborisov@suse.com>
> 
> Reviewed-by: Anand Jain <anand.jain@oracle.com>
> 
>> +static void submit_stripe_bio(struct btrfs_io_context *bioc,
>> +                  struct bio *orig_bio, int dev_nr, bool clone)
>> +{
>> +    struct bio *bio;
>> +
>> +    /* Reuse the bio embedded into the btrfs_bio for the last mirror */
>> +    if (!clone) {
> 
> Nit: Why not as in the original, as below?

Got it. Patch 9/11 reasons it.

> 
>      if (clone) {
>          bio = bio_alloc_clone(NULL, orig_bio, GFP_NOFS, &fs_bio_set);
>          <snip>
>       } else {
>          <snip>
>      }
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 10/11] btrfs: make the btrfs_io_context allocation in __btrfs_map_block optional
  2022-08-06  8:03 ` [PATCH 10/11] btrfs: make the btrfs_io_context allocation in __btrfs_map_block optional Christoph Hellwig
@ 2022-08-20 11:34   ` Anand Jain
  0 siblings, 0 replies; 32+ messages in thread
From: Anand Jain @ 2022-08-20 11:34 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Nikolay Borisov, Johannes Thumshirn, linux-btrfs, Chris Mason,
	David Sterba, Josef Bacik

On 8/6/22 16:03, Christoph Hellwig wrote:
> There is no need for most of the btrfs_io_context when doing I/O to a
> single device.  To support such I/O without the extra btrfs_io_context
> allocation, turn the mirror_num argument into a pointer so that it can
> be used to output the selected mirror number, and add an optional
> argument that points to a btrfs_io_stripe structure, which will be
> filled with a single extent if provided by the caller.  In that case
> the btrfs_io_context allocation can be skipped as all information for
> the single device I/O is provided in the mirror_num argument and the
> on-stack btrfs_io_stripe.  A caller that makes use of this new
> argument will be added in the next commit.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Nikolay Borisov <nborisov@suse.com>
> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> Tested-by: Nikolay Borisov <nborisov@suse.com>
> Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>


Reviewed-by: Anand Jain <anand.jain@oracle.com>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: btrfs I/O completion cleanup and single device I/O optimizations v3
  2022-08-06  8:03 btrfs I/O completion cleanup and single device I/O optimizations v3 Christoph Hellwig
                   ` (10 preceding siblings ...)
  2022-08-06  8:03 ` [PATCH 11/11] btrfs: stop allocation a btrfs_io_context for simple I/O Christoph Hellwig
@ 2022-08-23  4:33 ` Anand Jain
  11 siblings, 0 replies; 32+ messages in thread
From: Anand Jain @ 2022-08-23  4:33 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Nikolay Borisov, Johannes Thumshirn, linux-btrfs, David Sterba,
	Chris Mason, Josef Bacik


Fio scripts completed, here are its results with and without this 
patchset on the latest misc-next.

Random READ
  Before 2016MB/s m:RAID0_d:RAID0   devs:2
  After  2011MB/s m:RAID0_d:RAID0   devs:2

  Before 1031MB/s m:single_d:single devs:1
  After  1015MB/s m:single_d:single devs:1

  Before  971MB/s m:RAID1_d:RAID1 devs:2
  After  1025MB/s m:RAID1_d:RAID1 devs:2


Random READWRITE
  Before 236MB/s m:RAID0_d:RAID0   devs:4
  After  234MB/s m:RAID0_d:RAID0   devs:4

  Before 204MB/s m:single_d:single devs:1
  After  202MB/s m:single_d:single devs:1

  Before 204MB/s m:RAID1_d:RAID1  devs:2
  After  200MB/s m:RAID1_d:RAID1  devs:2

We can say perforamnce is stable. Fio command use is...

fio --eta=auto --output=$CMDLOG \
--name fiotest --directory=/btrfs $OPT_IO \
--rw=$RW --bs=4k --size=4G --ioengine=libaio --iodepth=16 \
--time_based=1 --runtime=900 --randrepeat=1 --gtod_reduce=1  \
--group_reporting=1 --numjobs=$cpus




On 8/6/22 16:03, Christoph Hellwig wrote:
> Hi all,
> 
> this series cleans up the btrfs_bio API, most prominently by splitting
> the end_io handler for the highlevel bio from the low-level bio
> bi_end_io, which are really confusingly coupled in the current code.
> Once that is done it then optimizes the bio submission to not allocate
> a btrfs_io_context for I/Os that just go to a single device.
> 
> Changes since v2:
>   - fix a small comment typo
> 
> Changes since v1:
>   - add two previously submitted patches skipped from an earlier
>     series
>   - merged one of those patches with one from this series
>   - split one of the patches in this series into three
>   - improve various commit logs
> 
> Diffstat:
>   compression.c    |   43 ++-----
>   ctree.h          |    1
>   dev-replace.c    |    5
>   disk-io.c        |   16 +-
>   extent-io-tree.h |    4
>   extent_io.c      |  117 +++----------------
>   extent_io.h      |    3
>   inode.c          |   57 ++++-----
>   raid56.c         |   45 +------
>   raid56.h         |    4
>   scrub.c          |    7 -
>   super.c          |    6
>   volumes.c        |  337 ++++++++++++++++++++++++++++++++++++++-----------------
>   volumes.h        |   20 ++-
>   14 files changed, 340 insertions(+), 325 deletions(-)


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 08/11] btrfs: split submit_stripe_bio
  2022-07-13  6:13 ` [PATCH 08/11] btrfs: split submit_stripe_bio Christoph Hellwig
  2022-07-13  6:46   ` Johannes Thumshirn
@ 2022-07-14 14:02   ` Nikolay Borisov
  1 sibling, 0 replies; 32+ messages in thread
From: Nikolay Borisov @ 2022-07-14 14:02 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba
  Cc: Johannes Thumshirn, linux-btrfs



On 13.07.22 г. 9:13 ч., Christoph Hellwig wrote:
> Split out a low-level btrfs_submit_dev_bio helper that just submits
> the bio without any cloning decisions or setting up the end I/O handler
> for future reuse by a different caller.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Nikolay Borisov <nborisov@suse.com>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 08/11] btrfs: split submit_stripe_bio
  2022-07-13  6:50     ` Christoph Hellwig
@ 2022-07-13  6:52       ` Johannes Thumshirn
  0 siblings, 0 replies; 32+ messages in thread
From: Johannes Thumshirn @ 2022-07-13  6:52 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Chris Mason, Josef Bacik, David Sterba, Nikolay Borisov, linux-btrfs

On 13.07.22 08:51, Christoph Hellwig wrote:
> On Wed, Jul 13, 2022 at 06:46:41AM +0000, Johannes Thumshirn wrote:
>> On 13.07.22 08:14, Christoph Hellwig wrote:
>>> +static void btrfs_submit_dev_bio(struct btrfs_device *dev, struct bio *bio)>  {
>>
>> Nit:	struct btrfs_fs_info *fs_info = dev->fs_info;
>>
>> so we don't need dev->fs_info for the round_down() and 
>> btrfs_debug_in_rcu() calls.
> 
> We can't assign it here because it is before the NULL check for dev,
> and assigning it later just to save a second dereference does not
> seem all that useful.
> 

Damn missed this, time for another coffee I guess

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 08/11] btrfs: split submit_stripe_bio
  2022-07-13  6:46   ` Johannes Thumshirn
@ 2022-07-13  6:50     ` Christoph Hellwig
  2022-07-13  6:52       ` Johannes Thumshirn
  0 siblings, 1 reply; 32+ messages in thread
From: Christoph Hellwig @ 2022-07-13  6:50 UTC (permalink / raw)
  To: Johannes Thumshirn
  Cc: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba,
	Nikolay Borisov, linux-btrfs

On Wed, Jul 13, 2022 at 06:46:41AM +0000, Johannes Thumshirn wrote:
> On 13.07.22 08:14, Christoph Hellwig wrote:
> > +static void btrfs_submit_dev_bio(struct btrfs_device *dev, struct bio *bio)>  {
> 
> Nit:	struct btrfs_fs_info *fs_info = dev->fs_info;
> 
> so we don't need dev->fs_info for the round_down() and 
> btrfs_debug_in_rcu() calls.

We can't assign it here because it is before the NULL check for dev,
and assigning it later just to save a second dereference does not
seem all that useful.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 08/11] btrfs: split submit_stripe_bio
  2022-07-13  6:13 ` [PATCH 08/11] btrfs: split submit_stripe_bio Christoph Hellwig
@ 2022-07-13  6:46   ` Johannes Thumshirn
  2022-07-13  6:50     ` Christoph Hellwig
  2022-07-14 14:02   ` Nikolay Borisov
  1 sibling, 1 reply; 32+ messages in thread
From: Johannes Thumshirn @ 2022-07-13  6:46 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba
  Cc: Nikolay Borisov, linux-btrfs

On 13.07.22 08:14, Christoph Hellwig wrote:
> +static void btrfs_submit_dev_bio(struct btrfs_device *dev, struct bio *bio)>  {

Nit:	struct btrfs_fs_info *fs_info = dev->fs_info;

so we don't need dev->fs_info for the round_down() and 
btrfs_debug_in_rcu() calls.

> -	struct btrfs_fs_info *fs_info = bioc->fs_info;
> -	struct btrfs_device *dev = bioc->stripes[dev_nr].dev;
> -	u64 physical = bioc->stripes[dev_nr].physical;
> -	struct bio *bio;


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 08/11] btrfs: split submit_stripe_bio
  2022-07-13  6:13 btrfs I/O completion cleanup and single device I/O optimizations v2 Christoph Hellwig
@ 2022-07-13  6:13 ` Christoph Hellwig
  2022-07-13  6:46   ` Johannes Thumshirn
  2022-07-14 14:02   ` Nikolay Borisov
  0 siblings, 2 replies; 32+ messages in thread
From: Christoph Hellwig @ 2022-07-13  6:13 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba
  Cc: Nikolay Borisov, Johannes Thumshirn, linux-btrfs

Split out a low-level btrfs_submit_dev_bio helper that just submits
the bio without any cloning decisions or setting up the end I/O handler
for future reuse by a different caller.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/btrfs/volumes.c | 51 +++++++++++++++++++++++++---------------------
 1 file changed, 28 insertions(+), 23 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 7e092c69ee70b..d5b6777874e3c 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6746,28 +6746,8 @@ static void btrfs_clone_write_end_io(struct bio *bio)
 	bio_put(bio);
 }
 
-static void submit_stripe_bio(struct btrfs_io_context *bioc,
-			      struct bio *orig_bio, int dev_nr, bool clone)
+static void btrfs_submit_dev_bio(struct btrfs_device *dev, struct bio *bio)
 {
-	struct btrfs_fs_info *fs_info = bioc->fs_info;
-	struct btrfs_device *dev = bioc->stripes[dev_nr].dev;
-	u64 physical = bioc->stripes[dev_nr].physical;
-	struct bio *bio;
-
-	if (clone) {
-		bio = bio_alloc_clone(NULL, orig_bio, GFP_NOFS, &fs_bio_set);
-		bio_inc_remaining(orig_bio);
-		bio->bi_end_io = btrfs_clone_write_end_io;
-	} else {
-		bio = orig_bio;
-		btrfs_bio(bio)->device = dev;
-		bio->bi_end_io = btrfs_end_bio;
-	}
-
-	bioc->stripes[dev_nr].bioc = bioc;
-	bio->bi_private = &bioc->stripes[dev_nr];
-	bio->bi_iter.bi_sector = physical >> 9;
-
 	if (!dev || !dev->bdev ||
 	    test_bit(BTRFS_DEV_STATE_MISSING, &dev->dev_state) ||
 	    (btrfs_op(bio) == BTRFS_MAP_WRITE &&
@@ -6783,8 +6763,11 @@ static void submit_stripe_bio(struct btrfs_io_context *bioc,
 	 * zone
 	 */
 	if (bio_op(bio) == REQ_OP_ZONE_APPEND) {
+		u64 physical = bio->bi_iter.bi_sector << SECTOR_SHIFT;
+
 		if (btrfs_dev_is_sequential(dev, physical)) {
-			u64 zone_start = round_down(physical, fs_info->zone_size);
+			u64 zone_start = round_down(physical,
+						    dev->fs_info->zone_size);
 
 			bio->bi_iter.bi_sector = zone_start >> SECTOR_SHIFT;
 		} else {
@@ -6792,7 +6775,7 @@ static void submit_stripe_bio(struct btrfs_io_context *bioc,
 			bio->bi_opf |= REQ_OP_WRITE;
 		}
 	}
-	btrfs_debug_in_rcu(fs_info,
+	btrfs_debug_in_rcu(dev->fs_info,
 	"%s: rw %d 0x%x, sector=%llu, dev=%lu (%s id %llu), size=%u",
 		__func__, bio_op(bio), bio->bi_opf, bio->bi_iter.bi_sector,
 		(unsigned long)dev->bdev->bd_dev, rcu_str_deref(dev->name),
@@ -6802,6 +6785,28 @@ static void submit_stripe_bio(struct btrfs_io_context *bioc,
 	submit_bio(bio);
 }
 
+static void submit_stripe_bio(struct btrfs_io_context *bioc,
+			      struct bio *orig_bio, int dev_nr, bool clone)
+{
+	struct bio *bio;
+
+	/* Reuse the bio embedded into the btrfs_bio for the last mirror */
+	if (!clone) {
+		bio = orig_bio;
+		btrfs_bio(bio)->device = bioc->stripes[dev_nr].dev;
+		bio->bi_end_io = btrfs_end_bio;
+	} else {
+		bio = bio_alloc_clone(NULL, orig_bio, GFP_NOFS, &fs_bio_set);
+		bio_inc_remaining(orig_bio);
+		bio->bi_end_io = btrfs_clone_write_end_io;
+	}
+
+	bio->bi_private = &bioc->stripes[dev_nr];
+	bio->bi_iter.bi_sector = bioc->stripes[dev_nr].physical >> SECTOR_SHIFT;
+	bioc->stripes[dev_nr].bioc = bioc;
+	btrfs_submit_dev_bio(bioc->stripes[dev_nr].dev, bio);
+}
+
 void btrfs_submit_bio(struct btrfs_fs_info *fs_info, struct bio *bio, int mirror_num)
 {
 	u64 logical = bio->bi_iter.bi_sector << 9;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2022-08-23  4:33 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-06  8:03 btrfs I/O completion cleanup and single device I/O optimizations v3 Christoph Hellwig
2022-08-06  8:03 ` [PATCH 01/11] btrfs: don't call bioset_integrity_create for btrfs_bioset Christoph Hellwig
2022-08-08  5:06   ` Anand Jain
2022-08-08  6:47     ` Christoph Hellwig
2022-08-08 11:24   ` Anand Jain
2022-08-06  8:03 ` [PATCH 02/11] btrfs: move btrfs_bio allocation to volumes.c Christoph Hellwig
2022-08-10 14:42   ` Anand Jain
2022-08-06  8:03 ` [PATCH 03/11] btrfs: pass the operation to btrfs_bio_alloc Christoph Hellwig
2022-08-08 11:20   ` Anand Jain
2022-08-06  8:03 ` [PATCH 04/11] btrfs: don't take a bio_counter reference for cloned bios Christoph Hellwig
2022-08-18 11:30   ` Anand Jain
2022-08-06  8:03 ` [PATCH 05/11] btrfs: remove bioc->stripes_pending Christoph Hellwig
2022-08-18 11:33   ` Anand Jain
2022-08-18 23:36     ` Anand Jain
2022-08-18 23:37   ` Anand Jain
2022-08-06  8:03 ` [PATCH 06/11] btrfs: properly abstract the parity raid bio handling Christoph Hellwig
2022-08-06  8:03 ` [PATCH 07/11] btrfs: give struct btrfs_bio a real end_io handler Christoph Hellwig
2022-08-19  6:20   ` Anand Jain
2022-08-06  8:03 ` [PATCH 08/11] btrfs: split submit_stripe_bio Christoph Hellwig
2022-08-19 23:37   ` Anand Jain
2022-08-19 23:56     ` Anand Jain
2022-08-06  8:03 ` [PATCH 09/11] btrfs: simplify the submit_stripe_bio calling convention Christoph Hellwig
2022-08-19 23:53   ` Anand Jain
2022-08-06  8:03 ` [PATCH 10/11] btrfs: make the btrfs_io_context allocation in __btrfs_map_block optional Christoph Hellwig
2022-08-20 11:34   ` Anand Jain
2022-08-06  8:03 ` [PATCH 11/11] btrfs: stop allocation a btrfs_io_context for simple I/O Christoph Hellwig
2022-08-23  4:33 ` btrfs I/O completion cleanup and single device I/O optimizations v3 Anand Jain
  -- strict thread matches above, loose matches on Subject: below --
2022-07-13  6:13 btrfs I/O completion cleanup and single device I/O optimizations v2 Christoph Hellwig
2022-07-13  6:13 ` [PATCH 08/11] btrfs: split submit_stripe_bio Christoph Hellwig
2022-07-13  6:46   ` Johannes Thumshirn
2022-07-13  6:50     ` Christoph Hellwig
2022-07-13  6:52       ` Johannes Thumshirn
2022-07-14 14:02   ` Nikolay Borisov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).