linux-bcache.vger.kernel.org archive mirror
* [PATCH v2 00/26] Prep work for immutable bio vecs
@ 2012-09-11  0:22 Kent Overstreet
  2012-09-11  0:22 ` [PATCH v2 01/26] block: Convert integrity to bvec_alloc_bs(), and a bugfix Kent Overstreet
                   ` (20 more replies)
  0 siblings, 21 replies; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11  0:22 UTC (permalink / raw)
  To: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA
  Cc: axboe-tSWWG44O7X1aa/9Udqfwiw, Kent Overstreet,
	tj-DgEjT+Ai2ygdnm+yROfE0A, neilb-l3A5Bk7waGM

Random assortment of refactoring and trivial cleanups.

Immutable bio vecs and efficient bio splitting require auditing and
removing pretty much all bi_idx uses, among other things.

The reason is that with immutable bio vecs we can't use the bvec array
directly; if we have a partially completed bvec, that'll be indicated
with a field in struct bvec_iter (which gets embedded in struct bio) -
bi_bvec_done.

bio_for_each_segment() will handle this transparently, so code needs to
be converted to use it or some other generic accessor.

Also, bio splitting means that when a driver gets a bio, bi_idx and
bi_bvec_done may both be nonzero. Again, just need to use generic
accessors.
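
As a rough illustration of the idea (a userspace sketch, not kernel code),
the model below keeps the bvec array const and puts all mutable state in an
iterator. The names imm_bvec, imm_iter, imm_iter_bvec() and
imm_iter_advance() are hypothetical; only the field roles (size, idx,
bvec_done) are taken from the description above.

```c
#include <assert.h>

/* Userspace stand-in for struct bio_vec (page pointer omitted). */
struct imm_bvec {
	unsigned int len;
	unsigned int offset;
};

/* Stand-in for the proposed iterator: all mutable state lives here. */
struct imm_iter {
	unsigned int size;      /* bytes remaining, like bi_size */
	unsigned int idx;       /* current bvec, like bi_idx */
	unsigned int bvec_done; /* bytes consumed in current bvec */
};

/* Current segment as seen through the iterator; the array is untouched. */
static struct imm_bvec imm_iter_bvec(const struct imm_bvec *vec,
				     const struct imm_iter *it)
{
	struct imm_bvec cur;
	unsigned int left = vec[it->idx].len - it->bvec_done;

	cur.offset = vec[it->idx].offset + it->bvec_done;
	cur.len = left < it->size ? left : it->size;
	return cur;
}

/* Advance the iterator by some bytes without modifying the bvec array. */
static void imm_iter_advance(const struct imm_bvec *vec,
			     struct imm_iter *it, unsigned int bytes)
{
	assert(bytes <= it->size);
	it->size -= bytes;

	while (bytes) {
		unsigned int left = vec[it->idx].len - it->bvec_done;

		if (bytes >= left) {
			/* whole remainder of this bvec consumed */
			bytes -= left;
			it->idx++;
			it->bvec_done = 0;
		} else {
			/* partially completed bvec: only the iterator changes */
			it->bvec_done += bytes;
			bytes = 0;
		}
	}
}
```

A driver that reads vec[it->idx] directly would miss the bvec_done
adjustment, which is why direct bi_idx users have to be converted first.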

v2: Patch series now has all the prep work to be done before abstracting
out the bio iterator, I think.

Kent Overstreet (26):
  block: Convert integrity to bvec_alloc_bs(), and a bugfix
  block: Add bio_advance()
  block: Refactor blk_update_request()
  md: Convert md_trim_bio() to use bio_advance()
  block: Add bio_end()
  block: Use bio_sectors() more consistently
  block: Don't use bi_idx in bio_split() or require it to be 0
  block: Remove bi_idx references
  block: Remove some unnecessary bi_vcnt usage
  block: Add submit_bio_wait(), remove from md
  raid10: Use bio_reset()
  raid1: use bio_reset()
  raid5: use bio_reset()
  raid1: Refactor narrow_write_error() to not use bi_idx
  block: Add bio_copy_data()
  pktcdvd: use bio_copy_data()
  pktcdvd: Use bio_reset() in disabled code to kill bi_idx usage
  raid1: use bio_copy_data()
  bounce: Refactor __blk_queue_bounce to not use bi_io_vec
  block: Add bio_for_each_segment_all()
  block: Convert some code to bio_for_each_segment_all()
  block: Add bio_alloc_pages()
  raid1: use bio_alloc_pages()
  block: Add an explicit bio flag for bios that own their bvec
  bio-integrity: Add explicit field for owner of bip_buf
  block: Add BIO_SUBMITTED flag, kill BIO_CLONED

 block/blk-core.c                         |  88 +++---------
 block/cfq-iosched.c                      |   7 +-
 block/deadline-iosched.c                 |   2 +-
 drivers/block/aoe/aoeblk.c               |   2 +-
 drivers/block/aoe/aoecmd.c               |   2 +-
 drivers/block/brd.c                      |   3 +-
 drivers/block/drbd/drbd_req.c            |   8 +-
 drivers/block/floppy.c                   |   1 -
 drivers/block/pktcdvd.c                  | 102 ++++----------
 drivers/block/ps3vram.c                  |   2 +-
 drivers/md/dm-crypt.c                    |   3 +-
 drivers/md/dm-raid1.c                    |   2 +-
 drivers/md/dm-stripe.c                   |   2 +-
 drivers/md/dm-verity.c                   |   4 +-
 drivers/md/dm.c                          |   1 -
 drivers/md/faulty.c                      |   6 +-
 drivers/md/linear.c                      |   3 +-
 drivers/md/md.c                          |  19 +--
 drivers/md/raid0.c                       |   9 +-
 drivers/md/raid1.c                       | 131 ++++++------------
 drivers/md/raid10.c                      |  78 +++--------
 drivers/md/raid5.c                       |  50 +++----
 drivers/message/fusion/mptsas.c          |   6 +-
 drivers/s390/block/dcssblk.c             |   3 +-
 drivers/scsi/libsas/sas_expander.c       |   6 +-
 drivers/scsi/mpt2sas/mpt2sas_transport.c |  10 +-
 fs/bio-integrity.c                       | 134 ++++++------------
 fs/bio.c                                 | 226 +++++++++++++++++++++++++++----
 fs/btrfs/extent_io.c                     |   3 +-
 fs/buffer.c                              |   1 -
 fs/direct-io.c                           |   8 +-
 fs/exofs/ore.c                           |   2 +-
 fs/exofs/ore_raid.c                      |   2 +-
 fs/gfs2/lops.c                           |   2 +-
 fs/jfs/jfs_logmgr.c                      |   2 -
 fs/logfs/dev_bdev.c                      |   5 -
 include/linux/bio.h                      |  34 +++--
 include/linux/blk_types.h                |   3 +-
 include/trace/events/block.h             |  10 +-
 mm/bounce.c                              |  75 +++-------
 mm/page_io.c                             |   1 -
 41 files changed, 471 insertions(+), 587 deletions(-)

-- 
1.7.12


* [PATCH v2 01/26] block: Convert integrity to bvec_alloc_bs(), and a bugfix
  2012-09-11  0:22 [PATCH v2 00/26] Prep work for immutable bio vecs Kent Overstreet
@ 2012-09-11  0:22 ` Kent Overstreet
       [not found]   ` <1347322957-25260-2-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  2012-09-11  0:22 ` [PATCH v2 02/26] block: Add bio_advance() Kent Overstreet
                   ` (19 subsequent siblings)
  20 siblings, 1 reply; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11  0:22 UTC (permalink / raw)
  To: linux-bcache, linux-kernel, dm-devel
  Cc: axboe, tj, Kent Overstreet, Martin K. Petersen

This adds a pointer to the bvec array to struct bio_integrity_payload,
instead of the bvecs always being inline; then the bvecs are allocated
with bvec_alloc_bs().

This is needed eventually for immutable bio vecs - immutable bvecs
aren't useful if we still have to copy them, hence the need for the
pointer. Less code is always nice too, though.

Also fix an amusing bug in bio_integrity_split() - struct bio_pair
doesn't have the integrity bvecs after the bio_integrity_payloads, so
there was a buffer overrun. The code was confusing pointers with arrays.
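
A minimal userspace sketch of the pointer-plus-inline-array layout, for
illustration only. The names ex_vec, ex_payload, EX_INLINE_VECS and the
alloc/free helpers are hypothetical, and malloc-family calls stand in for
the mempool and bvec_alloc_bs(). Pointing the pointer at the embedded array
for small requests is an assumption about the intended usage, not something
taken from the hunks below.

```c
#include <assert.h>
#include <stdlib.h>

struct ex_vec {
	unsigned int len;
	unsigned int offset;
};

#define EX_INLINE_VECS 4	/* plays the role of BIP_INLINE_VECS */

struct ex_payload {
	unsigned int nr_vecs;
	struct ex_vec *vec;		/* what users dereference */
	struct ex_vec inline_vecs[];	/* optional embedded storage */
};

/* Allocate a payload; small requests use the embedded array. */
static struct ex_payload *ex_payload_alloc(unsigned int nr_vecs)
{
	struct ex_payload *p;

	p = calloc(1, sizeof(*p) + sizeof(struct ex_vec) * EX_INLINE_VECS);
	if (!p)
		return NULL;

	if (nr_vecs > EX_INLINE_VECS) {
		/* too big for the embedded array: separate allocation */
		p->vec = calloc(nr_vecs, sizeof(struct ex_vec));
		if (!p->vec) {
			free(p);
			return NULL;
		}
	} else {
		p->vec = p->inline_vecs;
	}

	p->nr_vecs = nr_vecs;
	return p;
}

/* Free the payload and, if it was allocated separately, the vec array. */
static void ex_payload_free(struct ex_payload *p)
{
	if (p->vec != p->inline_vecs)
		free(p->vec);
	free(p);
}
```

Because users only ever go through the pointer, a payload can also refer to
vecs stored somewhere else entirely, which is what the bio_integrity_split()
fix relies on.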

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
CC: Martin K. Petersen <martin.petersen@oracle.com>
---
 fs/bio-integrity.c  | 124 +++++++++++++++++-----------------------------------
 include/linux/bio.h |   5 ++-
 2 files changed, 43 insertions(+), 86 deletions(-)

diff --git a/fs/bio-integrity.c b/fs/bio-integrity.c
index a3f28f3..1d64f7f 100644
--- a/fs/bio-integrity.c
+++ b/fs/bio-integrity.c
@@ -27,48 +27,11 @@
 #include <linux/workqueue.h>
 #include <linux/slab.h>
 
-struct integrity_slab {
-	struct kmem_cache *slab;
-	unsigned short nr_vecs;
-	char name[8];
-};
-
-#define IS(x) { .nr_vecs = x, .name = "bip-"__stringify(x) }
-struct integrity_slab bip_slab[BIOVEC_NR_POOLS] __read_mostly = {
-	IS(1), IS(4), IS(16), IS(64), IS(128), IS(BIO_MAX_PAGES),
-};
-#undef IS
+#define BIP_INLINE_VECS	4
 
+static struct kmem_cache *bip_slab;
 static struct workqueue_struct *kintegrityd_wq;
 
-static inline unsigned int vecs_to_idx(unsigned int nr)
-{
-	switch (nr) {
-	case 1:
-		return 0;
-	case 2 ... 4:
-		return 1;
-	case 5 ... 16:
-		return 2;
-	case 17 ... 64:
-		return 3;
-	case 65 ... 128:
-		return 4;
-	case 129 ... BIO_MAX_PAGES:
-		return 5;
-	default:
-		BUG();
-	}
-}
-
-static inline int use_bip_pool(unsigned int idx)
-{
-	if (idx == BIOVEC_MAX_IDX)
-		return 1;
-
-	return 0;
-}
-
 /**
  * bio_integrity_alloc - Allocate integrity payload and attach it to bio
  * @bio:	bio to attach integrity metadata to
@@ -84,37 +47,38 @@ struct bio_integrity_payload *bio_integrity_alloc(struct bio *bio,
 						  unsigned int nr_vecs)
 {
 	struct bio_integrity_payload *bip;
-	unsigned int idx = vecs_to_idx(nr_vecs);
 	struct bio_set *bs = bio->bi_pool;
+	unsigned long idx = BIO_POOL_NONE;
+	unsigned inline_vecs;
+
+	if (!bs) {
+		bip = kmalloc(sizeof(struct bio_integrity_payload) +
+			      sizeof(struct bio_vec) * nr_vecs, gfp_mask);
+		inline_vecs = nr_vecs;
+	} else {
+		bip = mempool_alloc(bs->bio_integrity_pool, gfp_mask);
+		inline_vecs = BIP_INLINE_VECS;
+	}
 
-	if (!bs)
-		bs = fs_bio_set;
-
-	BUG_ON(bio == NULL);
-	bip = NULL;
+	if (unlikely(!bip))
+		return NULL;
 
-	/* Lower order allocations come straight from slab */
-	if (!use_bip_pool(idx))
-		bip = kmem_cache_alloc(bip_slab[idx].slab, gfp_mask);
+	memset(bip, 0, sizeof(struct bio_integrity_payload));
 
-	/* Use mempool if lower order alloc failed or max vecs were requested */
-	if (bip == NULL) {
-		idx = BIOVEC_MAX_IDX;  /* so we free the payload properly later */
-		bip = mempool_alloc(bs->bio_integrity_pool, gfp_mask);
-
-		if (unlikely(bip == NULL)) {
-			printk(KERN_ERR "%s: could not alloc bip\n", __func__);
-			return NULL;
-		}
+	if (nr_vecs > inline_vecs) {
+		bip->bip_vec = bvec_alloc_bs(gfp_mask, nr_vecs, &idx, bs);
+		if (!bip->bip_vec)
+			goto err;
 	}
 
-	memset(bip, 0, sizeof(*bip));
-
 	bip->bip_slab = idx;
 	bip->bip_bio = bio;
 	bio->bi_integrity = bip;
 
 	return bip;
+err:
+	mempool_free(bip, bs->bio_integrity_pool);
+	return NULL;
 }
 EXPORT_SYMBOL(bio_integrity_alloc);
 
@@ -130,20 +94,19 @@ void bio_integrity_free(struct bio *bio)
 	struct bio_integrity_payload *bip = bio->bi_integrity;
 	struct bio_set *bs = bio->bi_pool;
 
-	if (!bs)
-		bs = fs_bio_set;
-
-	BUG_ON(bip == NULL);
-
 	/* A cloned bio doesn't own the integrity metadata */
 	if (!bio_flagged(bio, BIO_CLONED) && !bio_flagged(bio, BIO_FS_INTEGRITY)
 	    && bip->bip_buf != NULL)
 		kfree(bip->bip_buf);
 
-	if (use_bip_pool(bip->bip_slab))
+	if (bs) {
+		if (bip->bip_slab != BIO_POOL_NONE)
+			bvec_free_bs(bs, bip->bip_vec, bip->bip_slab);
+
 		mempool_free(bip, bs->bio_integrity_pool);
-	else
-		kmem_cache_free(bip_slab[bip->bip_slab].slab, bip);
+	} else {
+		kfree(bip);
+	}
 
 	bio->bi_integrity = NULL;
 }
@@ -697,8 +660,8 @@ void bio_integrity_split(struct bio *bio, struct bio_pair *bp, int sectors)
 	bp->iv1 = bip->bip_vec[0];
 	bp->iv2 = bip->bip_vec[0];
 
-	bp->bip1.bip_vec[0] = bp->iv1;
-	bp->bip2.bip_vec[0] = bp->iv2;
+	bp->bip1.bip_vec = &bp->iv1;
+	bp->bip2.bip_vec = &bp->iv2;
 
 	bp->iv1.bv_len = sectors * bi->tuple_size;
 	bp->iv2.bv_offset += sectors * bi->tuple_size;
@@ -746,13 +709,10 @@ EXPORT_SYMBOL(bio_integrity_clone);
 
 int bioset_integrity_create(struct bio_set *bs, int pool_size)
 {
-	unsigned int max_slab = vecs_to_idx(BIO_MAX_PAGES);
-
 	if (bs->bio_integrity_pool)
 		return 0;
 
-	bs->bio_integrity_pool =
-		mempool_create_slab_pool(pool_size, bip_slab[max_slab].slab);
+	bs->bio_integrity_pool = mempool_create_slab_pool(pool_size, bip_slab);
 
 	if (!bs->bio_integrity_pool)
 		return -1;
@@ -770,8 +730,6 @@ EXPORT_SYMBOL(bioset_integrity_free);
 
 void __init bio_integrity_init(void)
 {
-	unsigned int i;
-
 	/*
 	 * kintegrityd won't block much but may burn a lot of CPU cycles.
 	 * Make it highpri CPU intensive wq with max concurrency of 1.
@@ -781,14 +739,10 @@ void __init bio_integrity_init(void)
 	if (!kintegrityd_wq)
 		panic("Failed to create kintegrityd\n");
 
-	for (i = 0 ; i < BIOVEC_NR_POOLS ; i++) {
-		unsigned int size;
-
-		size = sizeof(struct bio_integrity_payload)
-			+ bip_slab[i].nr_vecs * sizeof(struct bio_vec);
-
-		bip_slab[i].slab =
-			kmem_cache_create(bip_slab[i].name, size, 0,
-					  SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
-	}
+	bip_slab = kmem_cache_create("bio_integrity_payload",
+				     sizeof(struct bio_integrity_payload) +
+				     sizeof(struct bio_vec) * BIP_INLINE_VECS,
+				     0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
+	if (!bip_slab)
+		panic("Failed to create slab\n");
 }
diff --git a/include/linux/bio.h b/include/linux/bio.h
index c32ea0d..7873465 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -182,7 +182,10 @@ struct bio_integrity_payload {
 	unsigned short		bip_idx;	/* current bip_vec index */
 
 	struct work_struct	bip_work;	/* I/O completion */
-	struct bio_vec		bip_vec[0];	/* embedded bvec array */
+
+	struct bio_vec		*bip_vec;
+	struct bio_vec		bip_inline_vecs[0];/* embedded bvec array */
+
 };
 #endif /* CONFIG_BLK_DEV_INTEGRITY */
 
-- 
1.7.12


* [PATCH v2 02/26] block: Add bio_advance()
  2012-09-11  0:22 [PATCH v2 00/26] Prep work for immutable bio vecs Kent Overstreet
  2012-09-11  0:22 ` [PATCH v2 01/26] block: Convert integrity to bvec_alloc_bs(), and a bugfix Kent Overstreet
@ 2012-09-11  0:22 ` Kent Overstreet
       [not found]   ` <1347322957-25260-3-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  2012-09-11  0:22 ` [PATCH v2 03/26] block: Refactor blk_update_request() Kent Overstreet
                   ` (18 subsequent siblings)
  20 siblings, 1 reply; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11  0:22 UTC (permalink / raw)
  To: linux-bcache, linux-kernel, dm-devel; +Cc: axboe, tj, Kent Overstreet

This is prep work for immutable bio vecs; we first want to centralize
where bvecs are modified.

Next two patches convert some existing code to use this function.
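
For readers who want to experiment outside the kernel, here is a small
userspace model of the same bookkeeping. It is a sketch under assumptions:
model_bio and model_bvec are hypothetical stand-ins for struct bio and
struct bio_vec, integrity handling is left out, and asserts replace the
printk error path.

```c
#include <assert.h>

struct model_bvec {
	unsigned int len;
	unsigned int offset;
};

struct model_bio {
	unsigned long long sector;	/* like bi_sector, 512-byte units */
	unsigned int size;		/* like bi_size, in bytes */
	unsigned short idx;		/* like bi_idx */
	unsigned short vcnt;		/* like bi_vcnt */
	struct model_bvec *vec;		/* like bi_io_vec */
};

/* Complete the first bytes of the bio; the rest stays pending. */
static void model_bio_advance(struct model_bio *bio, unsigned int bytes)
{
	assert(bytes <= bio->size);

	bio->sector += bytes >> 9;	/* bytes to 512-byte sectors */
	bio->size -= bytes;

	if (!bio->size)
		return;

	while (bytes) {
		struct model_bvec *bv;

		assert(bio->idx < bio->vcnt);
		bv = &bio->vec[bio->idx];

		if (bytes >= bv->len) {
			/* this bvec is fully completed, move to the next */
			bytes -= bv->len;
			bio->idx++;
		} else {
			/* partial completion: trim the current bvec */
			bv->len -= bytes;
			bv->offset += bytes;
			bytes = 0;
		}
	}
}
```

Advancing a two-page bio at sector 8 by 4608 bytes leaves it at sector 17
with 3584 bytes remaining, starting 512 bytes into the second bvec.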

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
---
 fs/bio.c            | 41 +++++++++++++++++++++++++++++++++++++++++
 include/linux/bio.h |  2 ++
 2 files changed, 43 insertions(+)

diff --git a/fs/bio.c b/fs/bio.c
index 4783e31..07587c0 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -750,6 +750,47 @@ int bio_add_page(struct bio *bio, struct page *page, unsigned int len,
 }
 EXPORT_SYMBOL(bio_add_page);
 
+/**
+ * bio_advance - increment/complete a bio by some number of bytes
+ * @bio:	bio to advance
+ * @bytes:	number of bytes to complete
+ *
+ * This updates bi_sector, bi_size and bi_idx; if the number of bytes to
+ * complete doesn't align with a bvec boundary, then bv_len and bv_offset will
+ * be updated on the last bvec as well.
+ *
+ * @bio will then represent the remaining, uncompleted portion of the io.
+ */
+void bio_advance(struct bio *bio, unsigned bytes)
+{
+	if (bio_integrity(bio))
+		bio_integrity_advance(bio, bytes);
+
+	bio->bi_sector += bytes >> 9;
+	bio->bi_size -= bytes;
+
+	if (!bio->bi_size)
+		return;
+
+	while (bytes) {
+		if (unlikely(bio->bi_idx >= bio->bi_vcnt)) {
+			printk(KERN_ERR "%s: bio idx %d >= vcnt %d\n",
+			       __func__, bio->bi_idx, bio->bi_vcnt);
+			break;
+		}
+
+		if (bytes >= bio_iovec(bio)->bv_len) {
+			bytes -= bio_iovec(bio)->bv_len;
+			bio->bi_idx++;
+		} else {
+			bio_iovec(bio)->bv_len -= bytes;
+			bio_iovec(bio)->bv_offset += bytes;
+			bytes = 0;
+		}
+	}
+}
+EXPORT_SYMBOL(bio_advance);
+
 struct bio_map_data {
 	struct bio_vec *iovecs;
 	struct sg_iovec *sgvecs;
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 7873465..6763cdf 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -248,6 +248,8 @@ extern void bio_endio(struct bio *, int);
 struct request_queue;
 extern int bio_phys_segments(struct request_queue *, struct bio *);
 
+void bio_advance(struct bio *, unsigned);
+
 extern void bio_init(struct bio *);
 extern void bio_reset(struct bio *);
 
-- 
1.7.12


* [PATCH v2 03/26] block: Refactor blk_update_request()
  2012-09-11  0:22 [PATCH v2 00/26] Prep work for immutable bio vecs Kent Overstreet
  2012-09-11  0:22 ` [PATCH v2 01/26] block: Convert integrity to bvec_alloc_bs(), and a bugfix Kent Overstreet
  2012-09-11  0:22 ` [PATCH v2 02/26] block: Add bio_advance() Kent Overstreet
@ 2012-09-11  0:22 ` Kent Overstreet
       [not found]   ` <1347322957-25260-4-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  2012-09-11  0:22 ` [PATCH v2 04/26] md: Convert md_trim_bio() to use bio_advance() Kent Overstreet
                   ` (17 subsequent siblings)
  20 siblings, 1 reply; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11  0:22 UTC (permalink / raw)
  To: linux-bcache, linux-kernel, dm-devel; +Cc: axboe, Kent Overstreet, tj, neilb

Converts it to use bio_advance(), simplifying it quite a bit in the
process.
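
The shape of the simplified loop can be sketched in userspace as follows.
This is an illustrative model, not the kernel function: chain_bio and
chain_request are hypothetical types, and a plain size decrement stands in
for req_bio_endio() and bio_advance().

```c
#include <assert.h>
#include <stddef.h>

struct chain_bio {
	unsigned int size;		/* bytes still pending in this bio */
	struct chain_bio *next;		/* like bi_next */
};

struct chain_request {
	struct chain_bio *bio;		/* first bio with pending bytes */
	unsigned int data_len;		/* like __data_len */
};

/* Complete nr_bytes of the request; returns nonzero if bios remain. */
static int chain_update_request(struct chain_request *req,
				unsigned int nr_bytes)
{
	unsigned int total_bytes = 0;

	while (req->bio) {
		struct chain_bio *bio = req->bio;
		unsigned int bio_bytes =
			bio->size < nr_bytes ? bio->size : nr_bytes;

		/* a fully completed bio is unlinked from the request */
		if (bio_bytes == bio->size)
			req->bio = bio->next;

		/* stand-in for req_bio_endio(), which advances the bio */
		bio->size -= bio_bytes;

		total_bytes += bio_bytes;
		nr_bytes -= bio_bytes;

		if (!nr_bytes)
			break;
	}

	req->data_len -= total_bytes;
	return req->bio != NULL;
}
```

Partial completion no longer needs its own branch: a bio that is only
partly done is advanced by the same call and simply stays at the head of
the request.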

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
---
 block/blk-core.c | 84 +++++++++++---------------------------------------------
 1 file changed, 16 insertions(+), 68 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 2d739ca..55c833c9 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -153,25 +153,19 @@ EXPORT_SYMBOL(blk_rq_init);
 static void req_bio_endio(struct request *rq, struct bio *bio,
 			  unsigned int nbytes, int error)
 {
+	/*
+	 * XXX: bio_endio() does this. only need this because of the weird
+	 * flush seq thing.
+	 */
 	if (error)
 		clear_bit(BIO_UPTODATE, &bio->bi_flags);
 	else if (!test_bit(BIO_UPTODATE, &bio->bi_flags))
 		error = -EIO;
 
-	if (unlikely(nbytes > bio->bi_size)) {
-		printk(KERN_ERR "%s: want %u bytes done, %u left\n",
-		       __func__, nbytes, bio->bi_size);
-		nbytes = bio->bi_size;
-	}
-
 	if (unlikely(rq->cmd_flags & REQ_QUIET))
 		set_bit(BIO_QUIET, &bio->bi_flags);
 
-	bio->bi_size -= nbytes;
-	bio->bi_sector += (nbytes >> 9);
-
-	if (bio_integrity(bio))
-		bio_integrity_advance(bio, nbytes);
+	bio_advance(bio, nbytes);
 
 	/* don't actually finish bio if it's part of flush sequence */
 	if (bio->bi_size == 0 && !(rq->cmd_flags & REQ_FLUSH_SEQ))
@@ -2214,8 +2208,7 @@ EXPORT_SYMBOL(blk_fetch_request);
  **/
 bool blk_update_request(struct request *req, int error, unsigned int nr_bytes)
 {
-	int total_bytes, bio_nbytes, next_idx = 0;
-	struct bio *bio;
+	int total_bytes;
 
 	if (!req->bio)
 		return false;
@@ -2259,56 +2252,21 @@ bool blk_update_request(struct request *req, int error, unsigned int nr_bytes)
 
 	blk_account_io_completion(req, nr_bytes);
 
-	total_bytes = bio_nbytes = 0;
-	while ((bio = req->bio) != NULL) {
-		int nbytes;
+	total_bytes = 0;
+	while (req->bio) {
+		struct bio *bio = req->bio;
+		unsigned bio_bytes = min(bio->bi_size, nr_bytes);
 
-		if (nr_bytes >= bio->bi_size) {
+		if (bio_bytes == bio->bi_size)
 			req->bio = bio->bi_next;
-			nbytes = bio->bi_size;
-			req_bio_endio(req, bio, nbytes, error);
-			next_idx = 0;
-			bio_nbytes = 0;
-		} else {
-			int idx = bio->bi_idx + next_idx;
-
-			if (unlikely(idx >= bio->bi_vcnt)) {
-				blk_dump_rq_flags(req, "__end_that");
-				printk(KERN_ERR "%s: bio idx %d >= vcnt %d\n",
-				       __func__, idx, bio->bi_vcnt);
-				break;
-			}
-
-			nbytes = bio_iovec_idx(bio, idx)->bv_len;
-			BIO_BUG_ON(nbytes > bio->bi_size);
-
-			/*
-			 * not a complete bvec done
-			 */
-			if (unlikely(nbytes > nr_bytes)) {
-				bio_nbytes += nr_bytes;
-				total_bytes += nr_bytes;
-				break;
-			}
 
-			/*
-			 * advance to the next vector
-			 */
-			next_idx++;
-			bio_nbytes += nbytes;
-		}
+		req_bio_endio(req, bio, bio_bytes, error);
 
-		total_bytes += nbytes;
-		nr_bytes -= nbytes;
+		total_bytes += bio_bytes;
+		nr_bytes -= bio_bytes;
 
-		bio = req->bio;
-		if (bio) {
-			/*
-			 * end more in this run, or just return 'not-done'
-			 */
-			if (unlikely(nr_bytes <= 0))
-				break;
-		}
+		if (!nr_bytes)
+			break;
 	}
 
 	/*
@@ -2324,16 +2282,6 @@ bool blk_update_request(struct request *req, int error, unsigned int nr_bytes)
 		return false;
 	}
 
-	/*
-	 * if the request wasn't completed, update state
-	 */
-	if (bio_nbytes) {
-		req_bio_endio(req, bio, bio_nbytes, error);
-		bio->bi_idx += next_idx;
-		bio_iovec(bio)->bv_offset += nr_bytes;
-		bio_iovec(bio)->bv_len -= nr_bytes;
-	}
-
 	req->__data_len -= total_bytes;
 	req->buffer = bio_data(req->bio);
 
-- 
1.7.12


* [PATCH v2 04/26] md: Convert md_trim_bio() to use bio_advance()
  2012-09-11  0:22 [PATCH v2 00/26] Prep work for immutable bio vecs Kent Overstreet
                   ` (2 preceding siblings ...)
  2012-09-11  0:22 ` [PATCH v2 03/26] block: Refactor blk_update_request() Kent Overstreet
@ 2012-09-11  0:22 ` Kent Overstreet
       [not found]   ` <1347322957-25260-5-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  2012-09-11  0:22 ` [PATCH v2 05/26] block: Add bio_end() Kent Overstreet
                   ` (16 subsequent siblings)
  20 siblings, 1 reply; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11  0:22 UTC (permalink / raw)
  To: linux-bcache, linux-kernel, dm-devel; +Cc: axboe, tj, Kent Overstreet

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
CC: NeilBrown <neilb@suse.de>
---
 drivers/md/md.c | 19 +++++--------------
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 7a2b079..51ce48c 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -190,25 +190,16 @@ void md_trim_bio(struct bio *bio, int offset, int size)
 	struct bio_vec *bvec;
 	int sofar = 0;
 
-	size <<= 9;
 	if (offset == 0 && size == bio->bi_size)
 		return;
 
-	bio->bi_sector += offset;
-	bio->bi_size = size;
-	offset <<= 9;
 	clear_bit(BIO_SEG_VALID, &bio->bi_flags);
 
-	while (bio->bi_idx < bio->bi_vcnt &&
-	       bio->bi_io_vec[bio->bi_idx].bv_len <= offset) {
-		/* remove this whole bio_vec */
-		offset -= bio->bi_io_vec[bio->bi_idx].bv_len;
-		bio->bi_idx++;
-	}
-	if (bio->bi_idx < bio->bi_vcnt) {
-		bio->bi_io_vec[bio->bi_idx].bv_offset += offset;
-		bio->bi_io_vec[bio->bi_idx].bv_len -= offset;
-	}
+	bio_advance(bio, offset << 9);
+
+	size <<= 9;
+	bio->bi_size = size;
+
 	/* avoid any complications with bi_idx being non-zero*/
 	if (bio->bi_idx) {
 		memmove(bio->bi_io_vec, bio->bi_io_vec+bio->bi_idx,
-- 
1.7.12


* [PATCH v2 05/26] block: Add bio_end()
  2012-09-11  0:22 [PATCH v2 00/26] Prep work for immutable bio vecs Kent Overstreet
                   ` (3 preceding siblings ...)
  2012-09-11  0:22 ` [PATCH v2 04/26] md: Convert md_trim_bio() to use bio_advance() Kent Overstreet
@ 2012-09-11  0:22 ` Kent Overstreet
       [not found]   ` <1347322957-25260-6-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  2012-09-11  0:22 ` [PATCH v2 06/26] block: Use bio_sectors() more consistently Kent Overstreet
                   ` (15 subsequent siblings)
  20 siblings, 1 reply; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11  0:22 UTC (permalink / raw)
  To: linux-bcache, linux-kernel, dm-devel; +Cc: axboe, tj, Kent Overstreet

Just a little convenience macro. The main reason to add it now is to
prepare for immutable bio vecs: it will reduce the size of the patch that
puts bi_sector/bi_size/bi_idx into a struct bvec_iter.
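
Judging from the call sites it replaces (bio->bi_sector + bio_sectors(bio)),
the macro presumably computes the first sector past the end of the bio. The
userspace sketch below models that; end_bio, end_bio_sectors() and
end_bio_end() are hypothetical names, and the actual definition should be
taken from include/linux/bio.h in the patch.

```c
#include <assert.h>

typedef unsigned long long ex_sector_t;

struct end_bio {
	ex_sector_t sector;	/* like bi_sector, 512-byte units */
	unsigned int size;	/* like bi_size, in bytes */
};

/* Length of the bio in 512-byte sectors. */
#define end_bio_sectors(bio)	((bio)->size >> 9)

/* First sector past the end of the bio. */
#define end_bio_end(bio)	((bio)->sector + end_bio_sectors(bio))
```

For a zero-length bio the end equals the start, so "end - 1" names the
sector before the bio; callers that derive a last sector this way need to
treat empty bios separately.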

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
---
 block/blk-core.c              |  2 +-
 block/cfq-iosched.c           |  7 ++-----
 block/deadline-iosched.c      |  2 +-
 drivers/block/drbd/drbd_req.c |  2 +-
 drivers/block/pktcdvd.c       |  6 +++---
 drivers/md/dm-stripe.c        |  2 +-
 drivers/md/dm-verity.c        |  2 +-
 drivers/md/faulty.c           |  6 ++----
 drivers/md/linear.c           |  3 +--
 drivers/md/raid1.c            |  4 ++--
 drivers/md/raid5.c            | 14 +++++++-------
 drivers/s390/block/dcssblk.c  |  3 +--
 fs/btrfs/extent_io.c          |  3 +--
 fs/gfs2/lops.c                |  2 +-
 include/linux/bio.h           |  1 +
 15 files changed, 26 insertions(+), 33 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 55c833c9..97511cb 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1557,7 +1557,7 @@ static void handle_bad_sector(struct bio *bio)
 	printk(KERN_INFO "%s: rw=%ld, want=%Lu, limit=%Lu\n",
 			bdevname(bio->bi_bdev, b),
 			bio->bi_rw,
-			(unsigned long long)bio->bi_sector + bio_sectors(bio),
+			(unsigned long long)bio_end(bio),
 			(long long)(i_size_read(bio->bi_bdev->bd_inode) >> 9));
 
 	set_bit(BIO_EOF, &bio->bi_flags);
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index fb52df9..8eae0f3 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -1883,11 +1883,8 @@ cfq_find_rq_fmerge(struct cfq_data *cfqd, struct bio *bio)
 		return NULL;
 
 	cfqq = cic_to_cfqq(cic, cfq_bio_sync(bio));
-	if (cfqq) {
-		sector_t sector = bio->bi_sector + bio_sectors(bio);
-
-		return elv_rb_find(&cfqq->sort_list, sector);
-	}
+	if (cfqq)
+		return elv_rb_find(&cfqq->sort_list, bio_end(bio));
 
 	return NULL;
 }
diff --git a/block/deadline-iosched.c b/block/deadline-iosched.c
index 599b12e..a3b4df9 100644
--- a/block/deadline-iosched.c
+++ b/block/deadline-iosched.c
@@ -132,7 +132,7 @@ deadline_merge(struct request_queue *q, struct request **req, struct bio *bio)
 	 * check for front merge
 	 */
 	if (dd->front_merges) {
-		sector_t sector = bio->bi_sector + bio_sectors(bio);
+		sector_t sector = bio_end(bio);
 
 		__rq = elv_rb_find(&dd->sort_list[bio_data_dir(bio)], sector);
 		if (__rq) {
diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c
index 01b2ac6..af69a96 100644
--- a/drivers/block/drbd/drbd_req.c
+++ b/drivers/block/drbd/drbd_req.c
@@ -1144,7 +1144,7 @@ void drbd_make_request(struct request_queue *q, struct bio *bio)
 	/* to make some things easier, force alignment of requests within the
 	 * granularity of our hash tables */
 	s_enr = bio->bi_sector >> HT_SHIFT;
-	e_enr = bio->bi_size ? (bio->bi_sector+(bio->bi_size>>9)-1) >> HT_SHIFT : s_enr;
+	e_enr = bio->bi_size ? (bio_end(bio) - 1) >> HT_SHIFT : s_enr;
 
 	if (likely(s_enr == e_enr)) {
 		do {
diff --git a/drivers/block/pktcdvd.c b/drivers/block/pktcdvd.c
index 2e7de7a..8df3216 100644
--- a/drivers/block/pktcdvd.c
+++ b/drivers/block/pktcdvd.c
@@ -901,7 +901,7 @@ static void pkt_iosched_process_queue(struct pktcdvd_device *pd)
 			pd->iosched.successive_reads += bio->bi_size >> 10;
 		else {
 			pd->iosched.successive_reads = 0;
-			pd->iosched.last_write = bio->bi_sector + bio_sectors(bio);
+			pd->iosched.last_write = bio_end(bio);
 		}
 		if (pd->iosched.successive_reads >= HI_SPEED_SWITCH) {
 			if (pd->read_speed == pd->write_speed) {
@@ -2454,7 +2454,7 @@ static void pkt_make_request(struct request_queue *q, struct bio *bio)
 	zone = ZONE(bio->bi_sector, pd);
 	VPRINTK("pkt_make_request: start = %6llx stop = %6llx\n",
 		(unsigned long long)bio->bi_sector,
-		(unsigned long long)(bio->bi_sector + bio_sectors(bio)));
+		(unsigned long long)bio_end(bio));
 
 	/* Check if we have to split the bio */
 	{
@@ -2462,7 +2462,7 @@ static void pkt_make_request(struct request_queue *q, struct bio *bio)
 		sector_t last_zone;
 		int first_sectors;
 
-		last_zone = ZONE(bio->bi_sector + bio_sectors(bio) - 1, pd);
+		last_zone = ZONE(bio_end(bio) - 1, pd);
 		if (last_zone != zone) {
 			BUG_ON(last_zone != zone + pd->settings.size);
 			first_sectors = last_zone - bio->bi_sector;
diff --git a/drivers/md/dm-stripe.c b/drivers/md/dm-stripe.c
index a087bf2..047dd08 100644
--- a/drivers/md/dm-stripe.c
+++ b/drivers/md/dm-stripe.c
@@ -257,7 +257,7 @@ static int stripe_map_discard(struct stripe_c *sc, struct bio *bio,
 	sector_t begin, end;
 
 	stripe_map_range_sector(sc, bio->bi_sector, target_stripe, &begin);
-	stripe_map_range_sector(sc, bio->bi_sector + bio_sectors(bio),
+	stripe_map_range_sector(sc, bio_end(bio),
 				target_stripe, &end);
 	if (begin < end) {
 		bio->bi_bdev = sc->stripe[target_stripe].dev->bdev;
diff --git a/drivers/md/dm-verity.c b/drivers/md/dm-verity.c
index 254d192..18ef6c5 100644
--- a/drivers/md/dm-verity.c
+++ b/drivers/md/dm-verity.c
@@ -477,7 +477,7 @@ static int verity_map(struct dm_target *ti, struct bio *bio,
 		return -EIO;
 	}
 
-	if ((bio->bi_sector + bio_sectors(bio)) >>
+	if (bio_end(bio) >>
 	    (v->data_dev_block_bits - SECTOR_SHIFT) > v->data_blocks) {
 		DMERR_LIMIT("io out of range");
 		return -EIO;
diff --git a/drivers/md/faulty.c b/drivers/md/faulty.c
index 45135f6..a4d0ebd 100644
--- a/drivers/md/faulty.c
+++ b/drivers/md/faulty.c
@@ -185,8 +185,7 @@ static void make_request(struct mddev *mddev, struct bio *bio)
 			return;
 		}
 
-		if (check_sector(conf, bio->bi_sector, bio->bi_sector+(bio->bi_size>>9),
-				 WRITE))
+		if (check_sector(conf, bio->bi_sector, bio_end(bio), WRITE))
 			failit = 1;
 		if (check_mode(conf, WritePersistent)) {
 			add_sector(conf, bio->bi_sector, WritePersistent);
@@ -196,8 +195,7 @@ static void make_request(struct mddev *mddev, struct bio *bio)
 			failit = 1;
 	} else {
 		/* read request */
-		if (check_sector(conf, bio->bi_sector, bio->bi_sector + (bio->bi_size>>9),
-				 READ))
+		if (check_sector(conf, bio->bi_sector, bio_end(bio), READ))
 			failit = 1;
 		if (check_mode(conf, ReadTransient))
 			failit = 1;
diff --git a/drivers/md/linear.c b/drivers/md/linear.c
index fa211d8..3fc843c 100644
--- a/drivers/md/linear.c
+++ b/drivers/md/linear.c
@@ -304,8 +304,7 @@ static void linear_make_request(struct mddev *mddev, struct bio *bio)
 		bio_io_error(bio);
 		return;
 	}
-	if (unlikely(bio->bi_sector + (bio->bi_size >> 9) >
-		     tmp_dev->end_sector)) {
+	if (unlikely(bio_end(bio) > tmp_dev->end_sector)) {
 		/* This bio crosses a device boundary, so we have to
 		 * split it.
 		 */
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 611b5f7..a242578 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1010,7 +1010,7 @@ static void make_request(struct mddev *mddev, struct bio * bio)
 	md_write_start(mddev, bio); /* wait on superblock update early */
 
 	if (bio_data_dir(bio) == WRITE &&
-	    bio->bi_sector + bio->bi_size/512 > mddev->suspend_lo &&
+	    bio_end(bio) > mddev->suspend_lo &&
 	    bio->bi_sector < mddev->suspend_hi) {
 		/* As the suspend_* range is controlled by
 		 * userspace, we want an interruptible
@@ -1021,7 +1021,7 @@ static void make_request(struct mddev *mddev, struct bio * bio)
 			flush_signals(current);
 			prepare_to_wait(&conf->wait_barrier,
 					&w, TASK_INTERRUPTIBLE);
-			if (bio->bi_sector + bio->bi_size/512 <= mddev->suspend_lo ||
+			if (bio_end(bio) <= mddev->suspend_lo ||
 			    bio->bi_sector >= mddev->suspend_hi)
 				break;
 			schedule();
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index adda94d..7b36e2a 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -2377,11 +2377,11 @@ static int add_stripe_bio(struct stripe_head *sh, struct bio *bi, int dd_idx, in
 	} else
 		bip = &sh->dev[dd_idx].toread;
 	while (*bip && (*bip)->bi_sector < bi->bi_sector) {
-		if ((*bip)->bi_sector + ((*bip)->bi_size >> 9) > bi->bi_sector)
+		if (bio_end(*bip) > bi->bi_sector)
 			goto overlap;
 		bip = & (*bip)->bi_next;
 	}
-	if (*bip && (*bip)->bi_sector < bi->bi_sector + ((bi->bi_size)>>9))
+	if (*bip && (*bip)->bi_sector < bio_end(bi))
 		goto overlap;
 
 	BUG_ON(*bip && bi->bi_next && (*bip) != bi->bi_next);
@@ -2397,8 +2397,8 @@ static int add_stripe_bio(struct stripe_head *sh, struct bio *bi, int dd_idx, in
 		     sector < sh->dev[dd_idx].sector + STRIPE_SECTORS &&
 			     bi && bi->bi_sector <= sector;
 		     bi = r5_next_bio(bi, sh->dev[dd_idx].sector)) {
-			if (bi->bi_sector + (bi->bi_size>>9) >= sector)
-				sector = bi->bi_sector + (bi->bi_size>>9);
+			if (bio_end(bi) >= sector)
+				sector = bio_end(bi);
 		}
 		if (sector >= sh->dev[dd_idx].sector + STRIPE_SECTORS)
 			set_bit(R5_OVERWRITE, &sh->dev[dd_idx].flags);
@@ -3908,7 +3908,7 @@ static int chunk_aligned_read(struct mddev *mddev, struct bio * raid_bio)
 						    0,
 						    &dd_idx, NULL);
 
-	end_sector = align_bi->bi_sector + (align_bi->bi_size >> 9);
+	end_sector = bio_end(align_bi);
 	rcu_read_lock();
 	rdev = rcu_dereference(conf->disks[dd_idx].replacement);
 	if (!rdev || test_bit(Faulty, &rdev->flags) ||
@@ -4090,7 +4090,7 @@ static void make_request(struct mddev *mddev, struct bio * bi)
 		return;
 
 	logical_sector = bi->bi_sector & ~((sector_t)STRIPE_SECTORS-1);
-	last_sector = bi->bi_sector + (bi->bi_size>>9);
+	last_sector = bio_end(bi);
 	bi->bi_next = NULL;
 	bi->bi_phys_segments = 1;	/* over-loaded to count active stripes */
 
@@ -4553,7 +4553,7 @@ static int  retry_aligned_read(struct r5conf *conf, struct bio *raid_bio)
 	logical_sector = raid_bio->bi_sector & ~((sector_t)STRIPE_SECTORS-1);
 	sector = raid5_compute_sector(conf, logical_sector,
 				      0, &dd_idx, NULL);
-	last_sector = raid_bio->bi_sector + (raid_bio->bi_size>>9);
+	last_sector = bio_end(raid_bio);
 
 	for (; logical_sector < last_sector;
 	     logical_sector += STRIPE_SECTORS,
diff --git a/drivers/s390/block/dcssblk.c b/drivers/s390/block/dcssblk.c
index a5a55da..52f88b7 100644
--- a/drivers/s390/block/dcssblk.c
+++ b/drivers/s390/block/dcssblk.c
@@ -832,8 +832,7 @@ dcssblk_make_request(struct request_queue *q, struct bio *bio)
 	if ((bio->bi_sector & 7) != 0 || (bio->bi_size & 4095) != 0)
 		/* Request is not page-aligned. */
 		goto fail;
-	if (((bio->bi_size >> 9) + bio->bi_sector)
-			> get_capacity(bio->bi_bdev->bd_disk)) {
+	if (bio_end(bio) > get_capacity(bio->bi_bdev->bd_disk)) {
 		/* Request beyond end of DCSS segment. */
 		goto fail;
 	}
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 4c87847..435274a 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2480,8 +2480,7 @@ static int submit_extent_page(int rw, struct extent_io_tree *tree,
 		if (old_compressed)
 			contig = bio->bi_sector == sector;
 		else
-			contig = bio->bi_sector + (bio->bi_size >> 9) ==
-				sector;
+			contig = bio_end(bio) == sector;
 
 		if (prev_bio_flags != bio_flags || !contig ||
 		    merge_bio(tree, page, offset, page_size, bio, bio_flags) ||
diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c
index 8ff95a2..fc28f99 100644
--- a/fs/gfs2/lops.c
+++ b/fs/gfs2/lops.c
@@ -300,7 +300,7 @@ static struct bio *gfs2_log_get_bio(struct gfs2_sbd *sdp, u64 blkno)
 	u64 nblk;
 
 	if (bio) {
-		nblk = bio->bi_sector + bio_sectors(bio);
+		nblk = bio_end(bio);
 		nblk >>= sdp->sd_fsb2bb_shift;
 		if (blkno == nblk)
 			return bio;
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 6763cdf..92bff0e 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -67,6 +67,7 @@
 #define bio_offset(bio)		bio_iovec((bio))->bv_offset
 #define bio_segments(bio)	((bio)->bi_vcnt - (bio)->bi_idx)
 #define bio_sectors(bio)	((bio)->bi_size >> 9)
+#define bio_end(bio)		((bio)->bi_sector + bio_sectors(bio))
 
 static inline unsigned int bio_cur_bytes(struct bio *bio)
 {
-- 
1.7.12

^ permalink raw reply related	[flat|nested] 81+ messages in thread

* [PATCH v2 06/26] block: Use bio_sectors() more consistently
  2012-09-11  0:22 [PATCH v2 00/26] Prep work for immutable bio vecs Kent Overstreet
                   ` (4 preceding siblings ...)
  2012-09-11  0:22 ` [PATCH v2 05/26] block: Add bio_end() Kent Overstreet
@ 2012-09-11  0:22 ` Kent Overstreet
       [not found]   ` <1347322957-25260-7-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  2012-09-11  0:22 ` [PATCH v2 07/26] block: Don't use bi_idx in bio_split() or require it to be 0 Kent Overstreet
                   ` (14 subsequent siblings)
  20 siblings, 1 reply; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11  0:22 UTC (permalink / raw)
  To: linux-bcache, linux-kernel, dm-devel; +Cc: axboe, Kent Overstreet, tj, neilb

A bunch of places in the code weren't using bio_sectors() where they
could be - converting them will reduce the size of the patch that puts
bi_sector/bi_size/bi_idx into a struct bvec_iter.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
---
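For anyone reading along outside a kernel tree, here is a rough userspace
sketch of the equivalence the conversion relies on. The two-field struct bio
is a stand-in, not the kernel structure, and open_coded_sectors() and
size_in_kib() are hypothetical helpers that exist only for this illustration.
Only the bio_sectors() definition matches include/linux/bio.h as quoted in
this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sector_t;	/* assumption: 64-bit sector numbers */

/* Simplified stand-in for the kernel's struct bio (illustration only). */
struct bio {
	sector_t	bi_sector;	/* first sector of the I/O */
	unsigned int	bi_size;	/* remaining I/O size in bytes */
};

/* Same definition as include/linux/bio.h in this series. */
#define bio_sectors(bio)	((bio)->bi_size >> 9)

/* Hypothetical helper: the open-coded form that the patch replaces. */
static inline unsigned int open_coded_sectors(const struct bio *bio)
{
	assert(bio != NULL);
	return bio->bi_size >> 9;
}

/* Hypothetical helper: size in KiB, as printed by the raid0/raid10 messages. */
static inline unsigned int size_in_kib(const struct bio *bio)
{
	assert(bio != NULL);
	return bio_sectors(bio) / 2;	/* same value as bi_size >> 10 */
}
```

Note that bio_sectors() is a length in 512-byte sectors. It is not a
substitute for a byte offset derived from bi_sector.
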
 drivers/block/aoe/aoeblk.c   |  2 +-
 drivers/block/aoe/aoecmd.c   |  2 +-
 drivers/block/brd.c          |  3 +--
 drivers/block/pktcdvd.c      |  2 +-
 drivers/block/ps3vram.c      |  2 +-
 drivers/md/dm-raid1.c        |  2 +-
 drivers/md/raid0.c           |  6 +++---
 drivers/md/raid1.c           | 17 ++++++++---------
 drivers/md/raid10.c          | 24 +++++++++++-------------
 drivers/md/raid5.c           |  8 ++++----
 include/trace/events/block.h | 10 +++++-----
 11 files changed, 37 insertions(+), 41 deletions(-)

diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c
index 321de7b..6e4420a 100644
--- a/drivers/block/aoe/aoeblk.c
+++ b/drivers/block/aoe/aoeblk.c
@@ -199,7 +199,7 @@ aoeblk_make_request(struct request_queue *q, struct bio *bio)
 	buf->bio = bio;
 	buf->resid = bio->bi_size;
 	buf->sector = bio->bi_sector;
-	buf->bv = &bio->bi_io_vec[bio->bi_idx];
+	buf->bv = bio_iovec(bio);
 	buf->bv_resid = buf->bv->bv_len;
 	WARN_ON(buf->bv_resid == 0);
 	buf->bv_off = buf->bv->bv_offset;
diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index de0435e..2b52ebc 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -720,7 +720,7 @@ gettgt(struct aoedev *d, char *addr)
 static inline void
 diskstats(struct gendisk *disk, struct bio *bio, ulong duration, sector_t sector)
 {
-	unsigned long n_sect = bio->bi_size >> 9;
+	unsigned long n_sect = bio_sectors(bio);
 	const int rw = bio_data_dir(bio);
 	struct hd_struct *part;
 	int cpu;
diff --git a/drivers/block/brd.c b/drivers/block/brd.c
index 531ceb3..d5c4978 100644
--- a/drivers/block/brd.c
+++ b/drivers/block/brd.c
@@ -334,8 +334,7 @@ static void brd_make_request(struct request_queue *q, struct bio *bio)
 	int err = -EIO;
 
 	sector = bio->bi_sector;
-	if (sector + (bio->bi_size >> SECTOR_SHIFT) >
-						get_capacity(bdev->bd_disk))
+	if (sector + bio_sectors(bio) > get_capacity(bdev->bd_disk))
 		goto out;
 
 	if (unlikely(bio->bi_rw & REQ_DISCARD)) {
diff --git a/drivers/block/pktcdvd.c b/drivers/block/pktcdvd.c
index 8df3216..0824627 100644
--- a/drivers/block/pktcdvd.c
+++ b/drivers/block/pktcdvd.c
@@ -2433,7 +2433,7 @@ static void pkt_make_request(struct request_queue *q, struct bio *bio)
 		cloned_bio->bi_bdev = pd->bdev;
 		cloned_bio->bi_private = psd;
 		cloned_bio->bi_end_io = pkt_end_io_read_cloned;
-		pd->stats.secs_r += bio->bi_size >> 9;
+		pd->stats.secs_r += bio_sectors(bio);
 		pkt_queue_bio(pd, cloned_bio);
 		return;
 	}
diff --git a/drivers/block/ps3vram.c b/drivers/block/ps3vram.c
index f58cdcf..1ff38e8 100644
--- a/drivers/block/ps3vram.c
+++ b/drivers/block/ps3vram.c
@@ -553,7 +553,7 @@ static struct bio *ps3vram_do_bio(struct ps3_system_bus_device *dev,
 	struct ps3vram_priv *priv = ps3_system_bus_get_drvdata(dev);
 	int write = bio_data_dir(bio) == WRITE;
 	const char *op = write ? "write" : "read";
-	loff_t offset = bio->bi_sector << 9;
+	loff_t offset = bio->bi_sector << 9;
 	int error = 0;
 	struct bio_vec *bvec;
 	unsigned int i;
diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
index bc5ddba8..3dac2de 100644
--- a/drivers/md/dm-raid1.c
+++ b/drivers/md/dm-raid1.c
@@ -457,7 +457,7 @@ static void map_region(struct dm_io_region *io, struct mirror *m,
 {
 	io->bdev = m->dev->bdev;
 	io->sector = map_sector(m, bio);
-	io->count = bio->bi_size >> 9;
+	io->count = bio_sectors(bio);
 }
 
 static void hold_bio(struct mirror_set *ms, struct bio *bio)
diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index de63a1f..387cb89 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -484,11 +484,11 @@ static inline int is_io_in_chunk_boundary(struct mddev *mddev,
 {
 	if (likely(is_power_of_2(chunk_sects))) {
 		return chunk_sects >= ((bio->bi_sector & (chunk_sects-1))
-					+ (bio->bi_size >> 9));
+					+ bio_sectors(bio));
 	} else{
 		sector_t sector = bio->bi_sector;
 		return chunk_sects >= (sector_div(sector, chunk_sects)
-						+ (bio->bi_size >> 9));
+						+ bio_sectors(bio));
 	}
 }
 
@@ -542,7 +542,7 @@ bad_map:
 	printk("md/raid0:%s: make_request bug: can't convert block across chunks"
 	       " or bigger than %dk %llu %d\n",
 	       mdname(mddev), chunk_sects / 2,
-	       (unsigned long long)bio->bi_sector, bio->bi_size >> 10);
+	       (unsigned long long)bio->bi_sector, bio_sectors(bio) / 2);
 
 	bio_io_error(bio);
 	return;
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index a242578..2488440 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -267,7 +267,7 @@ static void raid_end_bio_io(struct r1bio *r1_bio)
 			 (bio_data_dir(bio) == WRITE) ? "write" : "read",
 			 (unsigned long long) bio->bi_sector,
 			 (unsigned long long) bio->bi_sector +
-			 (bio->bi_size >> 9) - 1);
+			 bio_sectors(bio) - 1);
 
 		call_bio_endio(r1_bio);
 	}
@@ -458,7 +458,7 @@ static void raid1_end_write_request(struct bio *bio, int error)
 					 " %llu-%llu\n",
 					 (unsigned long long) mbio->bi_sector,
 					 (unsigned long long) mbio->bi_sector +
-					 (mbio->bi_size >> 9) - 1);
+					 bio_sectors(mbio) - 1);
 				call_bio_endio(r1_bio);
 			}
 		}
@@ -1041,7 +1041,7 @@ static void make_request(struct mddev *mddev, struct bio * bio)
 	r1_bio = mempool_alloc(conf->r1bio_pool, GFP_NOIO);
 
 	r1_bio->master_bio = bio;
-	r1_bio->sectors = bio->bi_size >> 9;
+	r1_bio->sectors = bio_sectors(bio);
 	r1_bio->state = 0;
 	r1_bio->mddev = mddev;
 	r1_bio->sector = bio->bi_sector;
@@ -1119,7 +1119,7 @@ read_again:
 			r1_bio = mempool_alloc(conf->r1bio_pool, GFP_NOIO);
 
 			r1_bio->master_bio = bio;
-			r1_bio->sectors = (bio->bi_size >> 9) - sectors_handled;
+			r1_bio->sectors = bio_sectors(bio) - sectors_handled;
 			r1_bio->state = 0;
 			r1_bio->mddev = mddev;
 			r1_bio->sector = bio->bi_sector + sectors_handled;
@@ -1320,14 +1320,14 @@ read_again:
 	/* Mustn't call r1_bio_write_done before this next test,
 	 * as it could result in the bio being freed.
 	 */
-	if (sectors_handled < (bio->bi_size >> 9)) {
+	if (sectors_handled < bio_sectors(bio)) {
 		r1_bio_write_done(r1_bio);
 		/* We need another r1_bio.  It has already been counted
 		 * in bio->bi_phys_segments
 		 */
 		r1_bio = mempool_alloc(conf->r1bio_pool, GFP_NOIO);
 		r1_bio->master_bio = bio;
-		r1_bio->sectors = (bio->bi_size >> 9) - sectors_handled;
+		r1_bio->sectors = bio_sectors(bio) - sectors_handled;
 		r1_bio->state = 0;
 		r1_bio->mddev = mddev;
 		r1_bio->sector = bio->bi_sector + sectors_handled;
@@ -1936,7 +1936,7 @@ static void sync_request_write(struct mddev *mddev, struct r1bio *r1_bio)
 		wbio->bi_rw = WRITE;
 		wbio->bi_end_io = end_sync_write;
 		atomic_inc(&r1_bio->remaining);
-		md_sync_acct(conf->mirrors[i].rdev->bdev, wbio->bi_size >> 9);
+		md_sync_acct(conf->mirrors[i].rdev->bdev, bio_sectors(wbio));
 
 		generic_make_request(wbio);
 	}
@@ -2272,8 +2272,7 @@ read_more:
 			r1_bio = mempool_alloc(conf->r1bio_pool, GFP_NOIO);
 
 			r1_bio->master_bio = mbio;
-			r1_bio->sectors = (mbio->bi_size >> 9)
-					  - sectors_handled;
+			r1_bio->sectors = bio_sectors(mbio) - sectors_handled;
 			r1_bio->state = 0;
 			set_bit(R1BIO_ReadError, &r1_bio->state);
 			r1_bio->mddev = mddev;
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 1c2eb38..9715aaf 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1075,7 +1075,7 @@ static void make_request(struct mddev *mddev, struct bio * bio)
 	/* If this request crosses a chunk boundary, we need to
 	 * split it.  This will only happen for 1 PAGE (or less) requests.
 	 */
-	if (unlikely((bio->bi_sector & chunk_mask) + (bio->bi_size >> 9)
+	if (unlikely((bio->bi_sector & chunk_mask) + bio_sectors(bio)
 		     > chunk_sects
 		     && (conf->geo.near_copies < conf->geo.raid_disks
 			 || conf->prev.near_copies < conf->prev.raid_disks))) {
@@ -1115,7 +1115,7 @@ static void make_request(struct mddev *mddev, struct bio * bio)
 	bad_map:
 		printk("md/raid10:%s: make_request bug: can't convert block across chunks"
 		       " or bigger than %dk %llu %d\n", mdname(mddev), chunk_sects/2,
-		       (unsigned long long)bio->bi_sector, bio->bi_size >> 10);
+		       (unsigned long long)bio->bi_sector, bio_sectors(bio) / 2);
 
 		bio_io_error(bio);
 		return;
@@ -1130,7 +1130,7 @@ static void make_request(struct mddev *mddev, struct bio * bio)
 	 */
 	wait_barrier(conf);
 
-	sectors = bio->bi_size >> 9;
+	sectors = bio_sectors(bio);
 	while (test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) &&
 	    bio->bi_sector < conf->reshape_progress &&
 	    bio->bi_sector + sectors > conf->reshape_progress) {
@@ -1232,8 +1232,7 @@ read_again:
 			r10_bio = mempool_alloc(conf->r10bio_pool, GFP_NOIO);
 
 			r10_bio->master_bio = bio;
-			r10_bio->sectors = ((bio->bi_size >> 9)
-					    - sectors_handled);
+			r10_bio->sectors = bio_sectors(bio) - sectors_handled;
 			r10_bio->state = 0;
 			r10_bio->mddev = mddev;
 			r10_bio->sector = bio->bi_sector + sectors_handled;
@@ -1455,7 +1454,7 @@ retry_write:
 	 * after checking if we need to go around again.
 	 */
 
-	if (sectors_handled < (bio->bi_size >> 9)) {
+	if (sectors_handled < bio_sectors(bio)) {
 		one_write_done(r10_bio);
 		/* We need another r10_bio.  It has already been counted
 		 * in bio->bi_phys_segments.
@@ -1463,7 +1462,7 @@ retry_write:
 		r10_bio = mempool_alloc(conf->r10bio_pool, GFP_NOIO);
 
 		r10_bio->master_bio = bio;
-		r10_bio->sectors = (bio->bi_size >> 9) - sectors_handled;
+		r10_bio->sectors = bio_sectors(bio) - sectors_handled;
 
 		r10_bio->mddev = mddev;
 		r10_bio->sector = bio->bi_sector + sectors_handled;
@@ -1984,7 +1983,7 @@ static void sync_request_write(struct mddev *mddev, struct r10bio *r10_bio)
 		d = r10_bio->devs[i].devnum;
 		atomic_inc(&conf->mirrors[d].rdev->nr_pending);
 		atomic_inc(&r10_bio->remaining);
-		md_sync_acct(conf->mirrors[d].rdev->bdev, tbio->bi_size >> 9);
+		md_sync_acct(conf->mirrors[d].rdev->bdev, bio_sectors(tbio));
 
 		tbio->bi_sector += conf->mirrors[d].rdev->data_offset;
 		tbio->bi_bdev = conf->mirrors[d].rdev->bdev;
@@ -2009,7 +2008,7 @@ static void sync_request_write(struct mddev *mddev, struct r10bio *r10_bio)
 		d = r10_bio->devs[i].devnum;
 		atomic_inc(&r10_bio->remaining);
 		md_sync_acct(conf->mirrors[d].replacement->bdev,
-			     tbio->bi_size >> 9);
+			     bio_sectors(tbio));
 		generic_make_request(tbio);
 	}
 
@@ -2135,13 +2134,13 @@ static void recovery_request_write(struct mddev *mddev, struct r10bio *r10_bio)
 	wbio2 = r10_bio->devs[1].repl_bio;
 	if (wbio->bi_end_io) {
 		atomic_inc(&conf->mirrors[d].rdev->nr_pending);
-		md_sync_acct(conf->mirrors[d].rdev->bdev, wbio->bi_size >> 9);
+		md_sync_acct(conf->mirrors[d].rdev->bdev, bio_sectors(wbio));
 		generic_make_request(wbio);
 	}
 	if (wbio2 && wbio2->bi_end_io) {
 		atomic_inc(&conf->mirrors[d].replacement->nr_pending);
 		md_sync_acct(conf->mirrors[d].replacement->bdev,
-			     wbio2->bi_size >> 9);
+			     bio_sectors(wbio2));
 		generic_make_request(wbio2);
 	}
 }
@@ -2571,8 +2570,7 @@ read_more:
 		r10_bio = mempool_alloc(conf->r10bio_pool,
 					GFP_NOIO);
 		r10_bio->master_bio = mbio;
-		r10_bio->sectors = (mbio->bi_size >> 9)
-			- sectors_handled;
+		r10_bio->sectors = bio_sectors(mbio) - sectors_handled;
 		r10_bio->state = 0;
 		set_bit(R10BIO_ReadError,
 			&r10_bio->state);
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 7b36e2a..7c19dbe 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -88,7 +88,7 @@ static inline struct hlist_head *stripe_hash(struct r5conf *conf, sector_t sect)
  */
 static inline struct bio *r5_next_bio(struct bio *bio, sector_t sector)
 {
-	int sectors = bio->bi_size >> 9;
+	int sectors = bio_sectors(bio);
 	if (bio->bi_sector + sectors < sector + STRIPE_SECTORS)
 		return bio->bi_next;
 	else
@@ -3771,7 +3771,7 @@ static int in_chunk_boundary(struct mddev *mddev, struct bio *bio)
 {
 	sector_t sector = bio->bi_sector + get_start_sect(bio->bi_bdev);
 	unsigned int chunk_sectors = mddev->chunk_sectors;
-	unsigned int bio_sectors = bio->bi_size >> 9;
+	unsigned int bio_sectors = bio_sectors(bio);
 
 	if (mddev->new_chunk_sectors < mddev->chunk_sectors)
 		chunk_sectors = mddev->new_chunk_sectors;
@@ -3861,7 +3861,7 @@ static int bio_fits_rdev(struct bio *bi)
 {
 	struct request_queue *q = bdev_get_queue(bi->bi_bdev);
 
-	if ((bi->bi_size>>9) > queue_max_sectors(q))
+	if (bio_sectors(bi) > queue_max_sectors(q))
 		return 0;
 	blk_recount_segments(q, bi);
 	if (bi->bi_phys_segments > queue_max_segments(q))
@@ -3931,7 +3931,7 @@ static int chunk_aligned_read(struct mddev *mddev, struct bio * raid_bio)
 		align_bi->bi_flags &= ~(1 << BIO_SEG_VALID);
 
 		if (!bio_fits_rdev(align_bi) ||
-		    is_badblock(rdev, align_bi->bi_sector, align_bi->bi_size>>9,
+		    is_badblock(rdev, align_bi->bi_sector, bio_sectors(align_bi),
 				&first_bad, &bad_sectors)) {
 			/* too big in some way, or has a known bad block */
 			bio_put(align_bi);
diff --git a/include/trace/events/block.h b/include/trace/events/block.h
index 05c5e61..3c210ac 100644
--- a/include/trace/events/block.h
+++ b/include/trace/events/block.h
@@ -193,7 +193,7 @@ TRACE_EVENT(block_bio_bounce,
 		__entry->dev		= bio->bi_bdev ?
 					  bio->bi_bdev->bd_dev : 0;
 		__entry->sector		= bio->bi_sector;
-		__entry->nr_sector	= bio->bi_size >> 9;
+		__entry->nr_sector	= bio_sectors(bio);
 		blk_fill_rwbs(__entry->rwbs, bio->bi_rw, bio->bi_size);
 		memcpy(__entry->comm, current->comm, TASK_COMM_LEN);
 	),
@@ -230,7 +230,7 @@ TRACE_EVENT(block_bio_complete,
 	TP_fast_assign(
 		__entry->dev		= bio->bi_bdev->bd_dev;
 		__entry->sector		= bio->bi_sector;
-		__entry->nr_sector	= bio->bi_size >> 9;
+		__entry->nr_sector	= bio_sectors(bio);
 		__entry->error		= error;
 		blk_fill_rwbs(__entry->rwbs, bio->bi_rw, bio->bi_size);
 	),
@@ -258,7 +258,7 @@ DECLARE_EVENT_CLASS(block_bio,
 	TP_fast_assign(
 		__entry->dev		= bio->bi_bdev->bd_dev;
 		__entry->sector		= bio->bi_sector;
-		__entry->nr_sector	= bio->bi_size >> 9;
+		__entry->nr_sector	= bio_sectors(bio);
 		blk_fill_rwbs(__entry->rwbs, bio->bi_rw, bio->bi_size);
 		memcpy(__entry->comm, current->comm, TASK_COMM_LEN);
 	),
@@ -330,7 +330,7 @@ DECLARE_EVENT_CLASS(block_get_rq,
 	TP_fast_assign(
 		__entry->dev		= bio ? bio->bi_bdev->bd_dev : 0;
 		__entry->sector		= bio ? bio->bi_sector : 0;
-		__entry->nr_sector	= bio ? bio->bi_size >> 9 : 0;
+		__entry->nr_sector	= bio ? bio_sectors(bio) : 0;
 		blk_fill_rwbs(__entry->rwbs,
 			      bio ? bio->bi_rw : 0, __entry->nr_sector);
 		memcpy(__entry->comm, current->comm, TASK_COMM_LEN);
@@ -506,7 +506,7 @@ TRACE_EVENT(block_bio_remap,
 	TP_fast_assign(
 		__entry->dev		= bio->bi_bdev->bd_dev;
 		__entry->sector		= bio->bi_sector;
-		__entry->nr_sector	= bio->bi_size >> 9;
+		__entry->nr_sector	= bio_sectors(bio);
 		__entry->old_dev	= dev;
 		__entry->old_sector	= from;
 		blk_fill_rwbs(__entry->rwbs, bio->bi_rw, bio->bi_size);
-- 
1.7.12

^ permalink raw reply related	[flat|nested] 81+ messages in thread

* [PATCH v2 07/26] block: Don't use bi_idx in bio_split() or require it to be 0
  2012-09-11  0:22 [PATCH v2 00/26] Prep work for immutable bio vecs Kent Overstreet
                   ` (5 preceding siblings ...)
  2012-09-11  0:22 ` [PATCH v2 06/26] block: Use bio_sectors() more consistently Kent Overstreet
@ 2012-09-11  0:22 ` Kent Overstreet
       [not found]   ` <1347322957-25260-8-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  2012-09-11  0:22 ` [PATCH v2 09/26] block: Remove some unnecessary bi_vcnt usage Kent Overstreet
                   ` (13 subsequent siblings)
  20 siblings, 1 reply; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11  0:22 UTC (permalink / raw)
  To: linux-bcache, linux-kernel, dm-devel; +Cc: axboe, Kent Overstreet, tj, neilb

Prep work for immutable bio_vecs/efficient bio splitting: they require
auditing and removing most uses of bi_idx.

So here we convert bio_split() to respect the current value of bi_idx
and use the bio_iovec() macro, instead of assuming bi_idx will be 0.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
---
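A minimal userspace sketch of the idea, under stated assumptions: struct bio
and struct bio_vec below are cut-down stand-ins, bio_iovec() is modelled on
the kernel macro of the same name, and split_current_segment() is a
hypothetical helper that mirrors only the bvec arithmetic from the fs/bio.c
hunk. It returns an error where the kernel would BUG().

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel structures (illustration only). */
struct bio_vec {
	unsigned int	bv_len;		/* segment length in bytes */
	unsigned int	bv_offset;	/* offset into the page */
};

struct bio {
	unsigned short	bi_vcnt;	/* number of entries in bi_io_vec */
	unsigned short	bi_idx;		/* index of the current entry */
	struct bio_vec	*bi_io_vec;
};

#define bio_iovec_idx(bio, idx)	(&((bio)->bi_io_vec[(idx)]))
#define bio_iovec(bio)		bio_iovec_idx((bio), (bio)->bi_idx)
#define bio_segments(bio)	((bio)->bi_vcnt - (bio)->bi_idx)

/*
 * Hypothetical helper: split the one remaining segment of @bi after
 * @first_sectors sectors, starting from the current bvec rather than
 * from bi_io_vec[0].
 */
static int split_current_segment(const struct bio *bi,
				 unsigned int first_sectors,
				 struct bio_vec *bv1, struct bio_vec *bv2)
{
	assert(bi != NULL && bv1 != NULL && bv2 != NULL);

	if (bio_segments(bi) != 1)
		return -1;		/* the kernel BUG()s here instead */

	*bv1 = *bio_iovec(bi);
	*bv2 = *bio_iovec(bi);
	bv2->bv_offset += first_sectors << 9;
	bv2->bv_len -= first_sectors << 9;
	bv1->bv_len = first_sectors << 9;
	return 0;
}
```

With bi_vcnt == 2 and bi_idx == 1 the old checks would have refused the bio
outright; here the split operates on bi_io_vec[1], the only segment left.
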
 drivers/block/drbd/drbd_req.c | 6 +++---
 drivers/md/raid0.c            | 3 +--
 drivers/md/raid10.c           | 3 +--
 fs/bio-integrity.c            | 4 ++--
 fs/bio.c                      | 7 +++----
 5 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c
index af69a96..57eb253 100644
--- a/drivers/block/drbd/drbd_req.c
+++ b/drivers/block/drbd/drbd_req.c
@@ -1155,11 +1155,11 @@ void drbd_make_request(struct request_queue *q, struct bio *bio)
 
 	/* can this bio be split generically?
 	 * Maybe add our own split-arbitrary-bios function. */
-	if (bio->bi_vcnt != 1 || bio->bi_idx != 0 || bio->bi_size > DRBD_MAX_BIO_SIZE) {
+	if (bio_segments(bio) != 1 || bio->bi_size > DRBD_MAX_BIO_SIZE) {
 		/* rather error out here than BUG in bio_split */
 		dev_err(DEV, "bio would need to, but cannot, be split: "
-		    "(vcnt=%u,idx=%u,size=%u,sector=%llu)\n",
-		    bio->bi_vcnt, bio->bi_idx, bio->bi_size,
+		    "(segments=%u,size=%u,sector=%llu)\n",
+		    bio_segments(bio), bio->bi_size,
 		    (unsigned long long)bio->bi_sector);
 		bio_endio(bio, -EINVAL);
 	} else {
diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index 387cb89..0587450 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -509,8 +509,7 @@ static void raid0_make_request(struct mddev *mddev, struct bio *bio)
 		sector_t sector = bio->bi_sector;
 		struct bio_pair *bp;
 		/* Sanity check -- queue functions should prevent this happening */
-		if (bio->bi_vcnt != 1 ||
-		    bio->bi_idx != 0)
+		if (bio_segments(bio) != 1)
 			goto bad_map;
 		/* This is a one page bio that upper layers
 		 * refuse to split for us, so we need to split it.
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 9715aaf..bbd08f5 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1081,8 +1081,7 @@ static void make_request(struct mddev *mddev, struct bio * bio)
 			 || conf->prev.near_copies < conf->prev.raid_disks))) {
 		struct bio_pair *bp;
 		/* Sanity check -- queue functions should prevent this happening */
-		if (bio->bi_vcnt != 1 ||
-		    bio->bi_idx != 0)
+		if (bio_segments(bio) != 1)
 			goto bad_map;
 		/* This is a one page bio that upper layers
 		 * refuse to split for us, so we need to split it.
diff --git a/fs/bio-integrity.c b/fs/bio-integrity.c
index 1d64f7f..e8555a5 100644
--- a/fs/bio-integrity.c
+++ b/fs/bio-integrity.c
@@ -657,8 +657,8 @@ void bio_integrity_split(struct bio *bio, struct bio_pair *bp, int sectors)
 	bp->bio1.bi_integrity = &bp->bip1;
 	bp->bio2.bi_integrity = &bp->bip2;
 
-	bp->iv1 = bip->bip_vec[0];
-	bp->iv2 = bip->bip_vec[0];
+	bp->iv1 = bip->bip_vec[bip->bip_idx];
+	bp->iv2 = bip->bip_vec[bip->bip_idx];
 
 	bp->bip1.bip_vec = &bp->iv1;
 	bp->bip2.bip_vec = &bp->iv2;
diff --git a/fs/bio.c b/fs/bio.c
index 07587c0..addeac2 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -1616,8 +1616,7 @@ struct bio_pair *bio_split(struct bio *bi, int first_sectors)
 	trace_block_split(bdev_get_queue(bi->bi_bdev), bi,
 				bi->bi_sector + first_sectors);
 
-	BUG_ON(bi->bi_vcnt != 1);
-	BUG_ON(bi->bi_idx != 0);
+	BUG_ON(bio_segments(bi) != 1);
 	atomic_set(&bp->cnt, 3);
 	bp->error = 0;
 	bp->bio1 = *bi;
@@ -1626,8 +1625,8 @@ struct bio_pair *bio_split(struct bio *bi, int first_sectors)
 	bp->bio2.bi_size -= first_sectors << 9;
 	bp->bio1.bi_size = first_sectors << 9;
 
-	bp->bv1 = bi->bi_io_vec[0];
-	bp->bv2 = bi->bi_io_vec[0];
+	bp->bv1 = *bio_iovec(bi);
+	bp->bv2 = *bio_iovec(bi);
 	bp->bv2.bv_offset += first_sectors << 9;
 	bp->bv2.bv_len -= first_sectors << 9;
 	bp->bv1.bv_len = first_sectors << 9;
-- 
1.7.12

^ permalink raw reply related	[flat|nested] 81+ messages in thread

* [PATCH v2 08/26] block: Remove bi_idx references
       [not found] ` <1347322957-25260-1-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-11  0:22   ` Kent Overstreet
       [not found]     ` <1347322957-25260-9-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  2012-09-11  0:22   ` [PATCH v2 12/26] raid1: use bio_reset() Kent Overstreet
                     ` (6 subsequent siblings)
  7 siblings, 1 reply; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11  0:22 UTC (permalink / raw)
  To: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA
  Cc: axboe-tSWWG44O7X1aa/9Udqfwiw, Kent Overstreet,
	tj-DgEjT+Ai2ygdnm+yROfE0A, neilb-l3A5Bk7waGM

These were harmless but unnecessary, and getting rid of them makes the
code easier to audit since most of them need to be removed.

Signed-off-by: Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
CC: Jens Axboe <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>
---
 drivers/block/floppy.c | 1 -
 drivers/md/dm-verity.c | 2 +-
 drivers/md/raid10.c    | 1 -
 fs/buffer.c            | 1 -
 fs/jfs/jfs_logmgr.c    | 2 --
 fs/logfs/dev_bdev.c    | 5 -----
 mm/page_io.c           | 1 -
 7 files changed, 1 insertion(+), 12 deletions(-)

diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c
index 95e52879..24e5cef 100644
--- a/drivers/block/floppy.c
+++ b/drivers/block/floppy.c
@@ -3778,7 +3778,6 @@ static int __floppy_read_block_0(struct block_device *bdev)
 	bio_vec.bv_len = size;
 	bio_vec.bv_offset = 0;
 	bio.bi_vcnt = 1;
-	bio.bi_idx = 0;
 	bio.bi_size = size;
 	bio.bi_bdev = bdev;
 	bio.bi_sector = 0;
diff --git a/drivers/md/dm-verity.c b/drivers/md/dm-verity.c
index 18ef6c5..6956626 100644
--- a/drivers/md/dm-verity.c
+++ b/drivers/md/dm-verity.c
@@ -496,7 +496,7 @@ static int verity_map(struct dm_target *ti, struct bio *bio,
 
 	bio->bi_end_io = verity_end_io;
 	bio->bi_private = io;
-	io->io_vec_size = bio->bi_vcnt - bio->bi_idx;
+	io->io_vec_size = bio_segments(bio);
 	if (io->io_vec_size < DM_VERITY_IO_VEC_INLINE)
 		io->io_vec = io->io_vec_inline;
 	else
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index bbd08f5..6d06d83 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -4249,7 +4249,6 @@ read_more:
 	read_bio->bi_flags &= ~(BIO_POOL_MASK - 1);
 	read_bio->bi_flags |= 1 << BIO_UPTODATE;
 	read_bio->bi_vcnt = 0;
-	read_bio->bi_idx = 0;
 	read_bio->bi_size = 0;
 	r10_bio->master_bio = read_bio;
 	r10_bio->read_slot = r10_bio->devs[r10_bio->read_slot].devnum;
diff --git a/fs/buffer.c b/fs/buffer.c
index 58e2e7b..38d8793 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -2893,7 +2893,6 @@ int submit_bh(int rw, struct buffer_head * bh)
 	bio->bi_io_vec[0].bv_offset = bh_offset(bh);
 
 	bio->bi_vcnt = 1;
-	bio->bi_idx = 0;
 	bio->bi_size = bh->b_size;
 
 	bio->bi_end_io = end_bio_bh_io_sync;
diff --git a/fs/jfs/jfs_logmgr.c b/fs/jfs/jfs_logmgr.c
index 2eb952c..8ae5e35 100644
--- a/fs/jfs/jfs_logmgr.c
+++ b/fs/jfs/jfs_logmgr.c
@@ -2004,7 +2004,6 @@ static int lbmRead(struct jfs_log * log, int pn, struct lbuf ** bpp)
 	bio->bi_io_vec[0].bv_offset = bp->l_offset;
 
 	bio->bi_vcnt = 1;
-	bio->bi_idx = 0;
 	bio->bi_size = LOGPSIZE;
 
 	bio->bi_end_io = lbmIODone;
@@ -2145,7 +2144,6 @@ static void lbmStartIO(struct lbuf * bp)
 	bio->bi_io_vec[0].bv_offset = bp->l_offset;
 
 	bio->bi_vcnt = 1;
-	bio->bi_idx = 0;
 	bio->bi_size = LOGPSIZE;
 
 	bio->bi_end_io = lbmIODone;
diff --git a/fs/logfs/dev_bdev.c b/fs/logfs/dev_bdev.c
index e784a21..550475c 100644
--- a/fs/logfs/dev_bdev.c
+++ b/fs/logfs/dev_bdev.c
@@ -32,7 +32,6 @@ static int sync_request(struct page *page, struct block_device *bdev, int rw)
 	bio_vec.bv_len = PAGE_SIZE;
 	bio_vec.bv_offset = 0;
 	bio.bi_vcnt = 1;
-	bio.bi_idx = 0;
 	bio.bi_size = PAGE_SIZE;
 	bio.bi_bdev = bdev;
 	bio.bi_sector = page->index * (PAGE_SIZE >> 9);
@@ -108,7 +107,6 @@ static int __bdev_writeseg(struct super_block *sb, u64 ofs, pgoff_t index,
 		if (i >= max_pages) {
 			/* Block layer cannot split bios :( */
 			bio->bi_vcnt = i;
-			bio->bi_idx = 0;
 			bio->bi_size = i * PAGE_SIZE;
 			bio->bi_bdev = super->s_bdev;
 			bio->bi_sector = ofs >> 9;
@@ -136,7 +134,6 @@ static int __bdev_writeseg(struct super_block *sb, u64 ofs, pgoff_t index,
 		unlock_page(page);
 	}
 	bio->bi_vcnt = nr_pages;
-	bio->bi_idx = 0;
 	bio->bi_size = nr_pages * PAGE_SIZE;
 	bio->bi_bdev = super->s_bdev;
 	bio->bi_sector = ofs >> 9;
@@ -202,7 +199,6 @@ static int do_erase(struct super_block *sb, u64 ofs, pgoff_t index,
 		if (i >= max_pages) {
 			/* Block layer cannot split bios :( */
 			bio->bi_vcnt = i;
-			bio->bi_idx = 0;
 			bio->bi_size = i * PAGE_SIZE;
 			bio->bi_bdev = super->s_bdev;
 			bio->bi_sector = ofs >> 9;
@@ -224,7 +220,6 @@ static int do_erase(struct super_block *sb, u64 ofs, pgoff_t index,
 		bio->bi_io_vec[i].bv_offset = 0;
 	}
 	bio->bi_vcnt = nr_pages;
-	bio->bi_idx = 0;
 	bio->bi_size = nr_pages * PAGE_SIZE;
 	bio->bi_bdev = super->s_bdev;
 	bio->bi_sector = ofs >> 9;
diff --git a/mm/page_io.c b/mm/page_io.c
index 78eee32..8d3c0c0 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -35,7 +35,6 @@ static struct bio *get_swap_bio(gfp_t gfp_flags,
 		bio->bi_io_vec[0].bv_len = PAGE_SIZE;
 		bio->bi_io_vec[0].bv_offset = 0;
 		bio->bi_vcnt = 1;
-		bio->bi_idx = 0;
 		bio->bi_size = PAGE_SIZE;
 		bio->bi_end_io = end_io;
 	}
-- 
1.7.12

^ permalink raw reply related	[flat|nested] 81+ messages in thread

* [PATCH v2 09/26] block: Remove some unnecessary bi_vcnt usage
  2012-09-11  0:22 [PATCH v2 00/26] Prep work for immutable bio vecs Kent Overstreet
                   ` (6 preceding siblings ...)
  2012-09-11  0:22 ` [PATCH v2 07/26] block: Don't use bi_idx in bio_split() or require it to be 0 Kent Overstreet
@ 2012-09-11  0:22 ` Kent Overstreet
       [not found]   ` <1347322957-25260-10-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  2012-09-11  0:22 ` [PATCH v2 10/26] block: Add submit_bio_wait(), remove from md Kent Overstreet
                   ` (12 subsequent siblings)
  20 siblings, 1 reply; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11  0:22 UTC (permalink / raw)
  To: linux-bcache, linux-kernel, dm-devel; +Cc: axboe, Kent Overstreet, tj, neilb

More prep work for immutable bvecs/efficient bio splitting - usage of
bi_vcnt has to be audited, so getting rid of all the unnecessary usage
makes that easier.

Plus, bio_segments() is really what this code wanted, as it respects the
current value of bi_idx.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
---
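To illustrate the difference between the two checks, a small userspace
sketch. The two-field struct bio is a stand-in for the kernel structure, and
multi_segment_by_vcnt() and multi_segment_by_segments() are hypothetical
names for the old and new forms of the test. bio_segments() is defined as in
include/linux/bio.h at the time of this series.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct bio (illustration only). */
struct bio {
	unsigned short	bi_vcnt;	/* total entries in the bvec array */
	unsigned short	bi_idx;		/* index of the current entry */
};

#define bio_segments(bio)	((bio)->bi_vcnt - (bio)->bi_idx)

/* Hypothetical: the check as written before this patch. */
static inline int multi_segment_by_vcnt(const struct bio *bio)
{
	assert(bio != NULL);
	return bio->bi_vcnt > 1;
}

/* Hypothetical: the check as written after this patch. */
static inline int multi_segment_by_segments(const struct bio *bio)
{
	assert(bio != NULL);
	return bio_segments(bio) > 1;
}
```

The two agree whenever bi_idx is 0. They differ only for a bio whose bi_idx
has advanced: with bi_vcnt == 3 and bi_idx == 2 the old check reports multiple
segments although a single one remains.
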
 drivers/message/fusion/mptsas.c          |  6 +++---
 drivers/scsi/libsas/sas_expander.c       |  6 +++---
 drivers/scsi/mpt2sas/mpt2sas_transport.c | 10 +++++-----
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/message/fusion/mptsas.c b/drivers/message/fusion/mptsas.c
index 551262e..5406a9f 100644
--- a/drivers/message/fusion/mptsas.c
+++ b/drivers/message/fusion/mptsas.c
@@ -2235,10 +2235,10 @@ static int mptsas_smp_handler(struct Scsi_Host *shost, struct sas_rphy *rphy,
 	}
 
 	/* do we need to support multiple segments? */
-	if (req->bio->bi_vcnt > 1 || rsp->bio->bi_vcnt > 1) {
+	if (bio_segments(req->bio) > 1 || bio_segments(rsp->bio) > 1) {
 		printk(MYIOC_s_ERR_FMT "%s: multiple segments req %u %u, rsp %u %u\n",
-		    ioc->name, __func__, req->bio->bi_vcnt, blk_rq_bytes(req),
-		    rsp->bio->bi_vcnt, blk_rq_bytes(rsp));
+		    ioc->name, __func__, bio_segments(req->bio), blk_rq_bytes(req),
+		    bio_segments(rsp->bio), blk_rq_bytes(rsp));
 		return -EINVAL;
 	}
 
diff --git a/drivers/scsi/libsas/sas_expander.c b/drivers/scsi/libsas/sas_expander.c
index efc6e72..ee331a7 100644
--- a/drivers/scsi/libsas/sas_expander.c
+++ b/drivers/scsi/libsas/sas_expander.c
@@ -2151,10 +2151,10 @@ int sas_smp_handler(struct Scsi_Host *shost, struct sas_rphy *rphy,
 	}
 
 	/* do we need to support multiple segments? */
-	if (req->bio->bi_vcnt > 1 || rsp->bio->bi_vcnt > 1) {
+	if (bio_segments(req->bio) > 1 || bio_segments(rsp->bio) > 1) {
 		printk("%s: multiple segments req %u %u, rsp %u %u\n",
-		       __func__, req->bio->bi_vcnt, blk_rq_bytes(req),
-		       rsp->bio->bi_vcnt, blk_rq_bytes(rsp));
+		       __func__, bio_segments(req->bio), blk_rq_bytes(req),
+		       bio_segments(rsp->bio), blk_rq_bytes(rsp));
 		return -EINVAL;
 	}
 
diff --git a/drivers/scsi/mpt2sas/mpt2sas_transport.c b/drivers/scsi/mpt2sas/mpt2sas_transport.c
index c6cf20f..403a57b 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_transport.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_transport.c
@@ -1939,7 +1939,7 @@ _transport_smp_handler(struct Scsi_Host *shost, struct sas_rphy *rphy,
 	ioc->transport_cmds.status = MPT2_CMD_PENDING;
 
 	/* Check if the request is split across multiple segments */
-	if (req->bio->bi_vcnt > 1) {
+	if (bio_segments(req->bio) > 1) {
 		u32 offset = 0;
 
 		/* Allocate memory and copy the request */
@@ -1971,7 +1971,7 @@ _transport_smp_handler(struct Scsi_Host *shost, struct sas_rphy *rphy,
 
 	/* Check if the response needs to be populated across
 	 * multiple segments */
-	if (rsp->bio->bi_vcnt > 1) {
+	if (bio_segments(rsp->bio) > 1) {
 		pci_addr_in = pci_alloc_consistent(ioc->pdev, blk_rq_bytes(rsp),
 		    &pci_dma_in);
 		if (!pci_addr_in) {
@@ -2038,7 +2038,7 @@ _transport_smp_handler(struct Scsi_Host *shost, struct sas_rphy *rphy,
 	sgl_flags = (MPI2_SGE_FLAGS_SIMPLE_ELEMENT |
 	    MPI2_SGE_FLAGS_END_OF_BUFFER | MPI2_SGE_FLAGS_HOST_TO_IOC);
 	sgl_flags = sgl_flags << MPI2_SGE_FLAGS_SHIFT;
-	if (req->bio->bi_vcnt > 1) {
+	if (bio_segments(req->bio) > 1) {
 		ioc->base_add_sg_single(psge, sgl_flags |
 		    (blk_rq_bytes(req) - 4), pci_dma_out);
 	} else {
@@ -2054,7 +2054,7 @@ _transport_smp_handler(struct Scsi_Host *shost, struct sas_rphy *rphy,
 	    MPI2_SGE_FLAGS_LAST_ELEMENT | MPI2_SGE_FLAGS_END_OF_BUFFER |
 	    MPI2_SGE_FLAGS_END_OF_LIST);
 	sgl_flags = sgl_flags << MPI2_SGE_FLAGS_SHIFT;
-	if (rsp->bio->bi_vcnt > 1) {
+	if (bio_segments(rsp->bio) > 1) {
 		ioc->base_add_sg_single(psge, sgl_flags |
 		    (blk_rq_bytes(rsp) + 4), pci_dma_in);
 	} else {
@@ -2099,7 +2099,7 @@ _transport_smp_handler(struct Scsi_Host *shost, struct sas_rphy *rphy,
 		    le16_to_cpu(mpi_reply->ResponseDataLength);
 		/* check if the resp needs to be copied from the allocated
 		 * pci mem */
-		if (rsp->bio->bi_vcnt > 1) {
+		if (bio_segments(rsp->bio) > 1) {
 			u32 offset = 0;
 			u32 bytes_to_copy =
 			    le16_to_cpu(mpi_reply->ResponseDataLength);
-- 
1.7.12

* [PATCH v2 10/26] block: Add submit_bio_wait(), remove from md
  2012-09-11  0:22 [PATCH v2 00/26] Prep work for immutable bio vecs Kent Overstreet
                   ` (7 preceding siblings ...)
  2012-09-11  0:22 ` [PATCH v2 09/26] block: Remove some unnecessary bi_vcnt usage Kent Overstreet
@ 2012-09-11  0:22 ` Kent Overstreet
       [not found]   ` <1347322957-25260-11-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  2012-09-11  0:22 ` [PATCH v2 11/26] raid10: Use bio_reset() Kent Overstreet
                   ` (11 subsequent siblings)
  20 siblings, 1 reply; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11  0:22 UTC (permalink / raw)
  To: linux-bcache, linux-kernel, dm-devel; +Cc: axboe, tj, Kent Overstreet

Random cleanup - this code was duplicated and it's not really specific
to md.

Also added the ability to return the actual error code.
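
For illustration only, here is a userspace sketch of the pattern the new
helper uses: the end_io hook stores the error next to the completion, and
the waiter returns it. Every fake_* name is hypothetical, the "device"
completes synchronously, and a plain flag stands in for struct completion.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace stand-in for struct bio; only the fields this sketch needs. */
struct fake_bio {
	void (*bi_end_io)(struct fake_bio *bio, int error);
	void *bi_private;
	int fail;	/* hypothetical knob: nonzero makes the fake device fail */
};

/* Mirrors struct submit_bio_ret: a completion flag plus the error code. */
struct fake_submit_ret {
	int done;	/* stands in for struct completion */
	int error;
};

static void fake_wait_endio(struct fake_bio *bio, int error)
{
	struct fake_submit_ret *ret = bio->bi_private;

	ret->error = error;
	ret->done = 1;	/* stands in for complete() */
}

/* Hypothetical device: completes the bio synchronously. */
static void fake_submit_bio(struct fake_bio *bio)
{
	bio->bi_end_io(bio, bio->fail ? -EIO : 0);
}

/* Returns 0 on success or the negative errno passed to the end_io hook. */
static int fake_submit_bio_wait(struct fake_bio *bio)
{
	struct fake_submit_ret ret = { 0, 0 };

	bio->bi_private = &ret;
	bio->bi_end_io = fake_wait_endio;
	fake_submit_bio(bio);
	assert(ret.done);	/* the real code sleeps in wait_for_completion() */

	return ret.error;
}
```

Note the return convention: the md-local helpers being removed returned
the BIO_UPTODATE bit (nonzero on success), while this one returns 0 on
success. Callers written against the old convention would presumably need
their checks inverted; that is an observation about the two helpers, not
something this diff shows.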

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
CC: NeilBrown <neilb@suse.de>
---
 drivers/md/raid1.c  | 19 -------------------
 drivers/md/raid10.c | 19 -------------------
 fs/bio.c            | 36 ++++++++++++++++++++++++++++++++++++
 include/linux/bio.h |  1 +
 4 files changed, 37 insertions(+), 38 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 2488440..ee85154 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2048,25 +2048,6 @@ static void fix_read_error(struct r1conf *conf, int read_disk,
 	}
 }
 
-static void bi_complete(struct bio *bio, int error)
-{
-	complete((struct completion *)bio->bi_private);
-}
-
-static int submit_bio_wait(int rw, struct bio *bio)
-{
-	struct completion event;
-	rw |= REQ_SYNC;
-
-	init_completion(&event);
-	bio->bi_private = &event;
-	bio->bi_end_io = bi_complete;
-	submit_bio(rw, bio);
-	wait_for_completion(&event);
-
-	return test_bit(BIO_UPTODATE, &bio->bi_flags);
-}
-
 static int narrow_write_error(struct r1bio *r1_bio, int i)
 {
 	struct mddev *mddev = r1_bio->mddev;
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 6d06d83..f001c1b 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -2410,25 +2410,6 @@ static void fix_read_error(struct r10conf *conf, struct mddev *mddev, struct r10
 	}
 }
 
-static void bi_complete(struct bio *bio, int error)
-{
-	complete((struct completion *)bio->bi_private);
-}
-
-static int submit_bio_wait(int rw, struct bio *bio)
-{
-	struct completion event;
-	rw |= REQ_SYNC;
-
-	init_completion(&event);
-	bio->bi_private = &event;
-	bio->bi_end_io = bi_complete;
-	submit_bio(rw, bio);
-	wait_for_completion(&event);
-
-	return test_bit(BIO_UPTODATE, &bio->bi_flags);
-}
-
 static int narrow_write_error(struct r10bio *r10_bio, int i)
 {
 	struct bio *bio = r10_bio->master_bio;
diff --git a/fs/bio.c b/fs/bio.c
index addeac2..1342a16 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -750,6 +750,42 @@ int bio_add_page(struct bio *bio, struct page *page, unsigned int len,
 }
 EXPORT_SYMBOL(bio_add_page);
 
+struct submit_bio_ret {
+	struct completion event;
+	int error;
+};
+
+static void submit_bio_wait_endio(struct bio *bio, int error)
+{
+	struct submit_bio_ret *ret = bio->bi_private;
+
+	ret->error = error;
+	complete(&ret->event);
+}
+
+/**
+ * submit_bio_wait - submit a bio, and wait until it completes
+ * @rw: whether to %READ or %WRITE, or maybe to %READA (read ahead)
+ * @bio: The &struct bio which describes the I/O
+ *
+ * Simple wrapper around submit_bio(). Returns 0 on success, or the error from
+ * bio_endio() on failure.
+ */
+int submit_bio_wait(int rw, struct bio *bio)
+{
+	struct submit_bio_ret ret;
+
+	rw |= REQ_SYNC;
+	init_completion(&ret.event);
+	bio->bi_private = &ret;
+	bio->bi_end_io = submit_bio_wait_endio;
+	submit_bio(rw, bio);
+	wait_for_completion(&ret.event);
+
+	return ret.error;
+}
+EXPORT_SYMBOL(submit_bio_wait);
+
 /**
  * bio_advance - increment/complete a bio by some number of bytes
  * @bio:	bio to advance
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 92bff0e..949c48a 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -249,6 +249,7 @@ extern void bio_endio(struct bio *, int);
 struct request_queue;
 extern int bio_phys_segments(struct request_queue *, struct bio *);
 
+extern int submit_bio_wait(int rw, struct bio *bio);
 void bio_advance(struct bio *, unsigned);
 
 extern void bio_init(struct bio *);
-- 
1.7.12

* [PATCH v2 11/26] raid10: Use bio_reset()
  2012-09-11  0:22 [PATCH v2 00/26] Prep work for immutable bio vecs Kent Overstreet
                   ` (8 preceding siblings ...)
  2012-09-11  0:22 ` [PATCH v2 10/26] block: Add submit_bio_wait(), remove from md Kent Overstreet
@ 2012-09-11  0:22 ` Kent Overstreet
  2012-09-20 23:59   ` Tejun Heo
  2012-09-11  0:22 ` [PATCH v2 13/26] raid5: use bio_reset() Kent Overstreet
                   ` (10 subsequent siblings)
  20 siblings, 1 reply; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11  0:22 UTC (permalink / raw)
  To: linux-bcache, linux-kernel, dm-devel; +Cc: axboe, Kent Overstreet, tj, neilb

More prep work for immutable bio vecs, mainly getting rid of references
to bi_idx.

bio_reset was being open coded in a few places. The one in sync_request
was a bit nontrivial to convert, so could use some extra eyeballs.
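
As a rough userspace model of what the open-coded sequences amount to,
assuming bio_reset() clears every field ahead of bi_max_vecs, keeps the
high flag bits and sets BIO_UPTODATE again. The struct layout, the
FAKE_RESET_BITS split and all fake_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FAKE_BIO_UPTODATE	0	/* bit number, as with BIO_UPTODATE */
#define FAKE_RESET_BITS		12	/* hypothetical: flag bits >= this survive */

/* Userspace stand-in for struct bio. */
struct fake_bio {
	struct fake_bio *bi_next;
	unsigned long bi_flags;
	unsigned long bi_rw;
	unsigned short bi_vcnt;
	unsigned short bi_idx;
	unsigned int bi_size;
	void (*bi_end_io)(struct fake_bio *, int);
	void *bi_private;
	/* everything from here on is preserved across a reset */
	unsigned int bi_max_vecs;
	void *bi_io_vec;
};

#define FAKE_RESET_BYTES offsetof(struct fake_bio, bi_max_vecs)

static void fake_bio_reset(struct fake_bio *bio)
{
	/* keep only the high flag bits */
	unsigned long flags = bio->bi_flags & (~0UL << FAKE_RESET_BITS);

	memset(bio, 0, FAKE_RESET_BYTES);
	bio->bi_flags = flags | (1UL << FAKE_BIO_UPTODATE);
}
```

Under that model the reset also wipes bi_vcnt, bi_size, bi_rw, bi_end_io
and bi_private, which is why the converted call sites assign those fields
after calling bio_reset() rather than before.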

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
CC: NeilBrown <neilb@suse.de>
---
 drivers/md/raid10.c | 31 +++++++++----------------------
 1 file changed, 9 insertions(+), 22 deletions(-)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index f001c1b..6b83207 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1958,13 +1958,10 @@ static void sync_request_write(struct mddev *mddev, struct r10bio *r10_bio)
 		 * First we need to fixup bv_offset, bv_len and
 		 * bi_vecs, as the read request might have corrupted these
 		 */
+		bio_reset(tbio);
+
 		tbio->bi_vcnt = vcnt;
 		tbio->bi_size = r10_bio->sectors << 9;
-		tbio->bi_idx = 0;
-		tbio->bi_phys_segments = 0;
-		tbio->bi_flags &= ~(BIO_POOL_MASK - 1);
-		tbio->bi_flags |= 1 << BIO_UPTODATE;
-		tbio->bi_next = NULL;
 		tbio->bi_rw = WRITE;
 		tbio->bi_private = r10_bio;
 		tbio->bi_sector = r10_bio->devs[i].addr;
@@ -2970,6 +2967,7 @@ static sector_t sync_request(struct mddev *mddev, sector_t sector_nr,
 					}
 				}
 				bio = r10_bio->devs[0].bio;
+				bio_reset(bio);
 				bio->bi_next = biolist;
 				biolist = bio;
 				bio->bi_private = r10_bio;
@@ -2994,6 +2992,7 @@ static sector_t sync_request(struct mddev *mddev, sector_t sector_nr,
 				rdev = mirror->rdev;
 				if (!test_bit(In_sync, &rdev->flags)) {
 					bio = r10_bio->devs[1].bio;
+					bio_reset(bio);
 					bio->bi_next = biolist;
 					biolist = bio;
 					bio->bi_private = r10_bio;
@@ -3022,6 +3021,7 @@ static sector_t sync_request(struct mddev *mddev, sector_t sector_nr,
 				if (rdev == NULL || bio == NULL ||
 				    test_bit(Faulty, &rdev->flags))
 					break;
+				bio_reset(bio);
 				bio->bi_next = biolist;
 				biolist = bio;
 				bio->bi_private = r10_bio;
@@ -3120,7 +3120,7 @@ static sector_t sync_request(struct mddev *mddev, sector_t sector_nr,
 				r10_bio->devs[i].repl_bio->bi_end_io = NULL;
 
 			bio = r10_bio->devs[i].bio;
-			bio->bi_end_io = NULL;
+			bio_reset(bio);
 			clear_bit(BIO_UPTODATE, &bio->bi_flags);
 			if (conf->mirrors[d].rdev == NULL ||
 			    test_bit(Faulty, &conf->mirrors[d].rdev->flags))
@@ -3157,6 +3157,7 @@ static sector_t sync_request(struct mddev *mddev, sector_t sector_nr,
 
 			/* Need to set up for writing to the replacement */
 			bio = r10_bio->devs[i].repl_bio;
+			bio_reset(bio);
 			clear_bit(BIO_UPTODATE, &bio->bi_flags);
 
 			sector = r10_bio->devs[i].addr;
@@ -3190,17 +3191,6 @@ static sector_t sync_request(struct mddev *mddev, sector_t sector_nr,
 		}
 	}
 
-	for (bio = biolist; bio ; bio=bio->bi_next) {
-
-		bio->bi_flags &= ~(BIO_POOL_MASK - 1);
-		if (bio->bi_end_io)
-			bio->bi_flags |= 1 << BIO_UPTODATE;
-		bio->bi_vcnt = 0;
-		bio->bi_idx = 0;
-		bio->bi_phys_segments = 0;
-		bio->bi_size = 0;
-	}
-
 	nr_sectors = 0;
 	if (sector_nr + max_sync < max_sector)
 		max_sector = sector_nr + max_sync;
@@ -4253,17 +4243,14 @@ read_more:
 		}
 		if (!rdev2 || test_bit(Faulty, &rdev2->flags))
 			continue;
+
+		bio_reset(b);
 		b->bi_bdev = rdev2->bdev;
 		b->bi_sector = r10_bio->devs[s/2].addr + rdev2->new_data_offset;
 		b->bi_private = r10_bio;
 		b->bi_end_io = end_reshape_write;
 		b->bi_rw = WRITE;
-		b->bi_flags &= ~(BIO_POOL_MASK - 1);
-		b->bi_flags |= 1 << BIO_UPTODATE;
 		b->bi_next = blist;
-		b->bi_vcnt = 0;
-		b->bi_idx = 0;
-		b->bi_size = 0;
 		blist = b;
 	}
 
-- 
1.7.12

* [PATCH v2 12/26] raid1: use bio_reset()
       [not found] ` <1347322957-25260-1-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  2012-09-11  0:22   ` [PATCH v2 08/26] block: Remove bi_idx references Kent Overstreet
@ 2012-09-11  0:22   ` Kent Overstreet
       [not found]     ` <1347322957-25260-13-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  2012-09-11  0:22   ` [PATCH v2 19/26] bounce: Refactor __blk_queue_bounce to not use bi_io_vec Kent Overstreet
                     ` (5 subsequent siblings)
  7 siblings, 1 reply; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11  0:22 UTC (permalink / raw)
  To: linux-bcache, linux-kernel, dm-devel; +Cc: axboe, Kent Overstreet, tj, neilb

I couldn't figure out what sbio->bi_end_io in process_checks() was
supposed to be, so I took the easy way out.
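
A userspace sketch of the "easy way out" as I read it: remember the
callback, reset, then put the callback back. The hunk below saves
sbio->bi_end_io before bio_reset(); the write-back step is an assumption
of this sketch and is not visible in the context shown. All fake_* names
are hypothetical and the reset is simplified to clearing the whole struct.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct fake_bio;
typedef void (fake_end_io_t)(struct fake_bio *, int);

struct fake_bio {
	unsigned int bi_size;
	fake_end_io_t *bi_end_io;
};

/* Simplified reset: clears every field, including the completion callback. */
static void fake_bio_reset(struct fake_bio *bio)
{
	memset(bio, 0, sizeof(*bio));
}

/* Placeholder completion handler, only here so there is a pointer to keep. */
static void fake_end_sync_write(struct fake_bio *bio, int error)
{
	(void)bio;
	(void)error;
}

/* Reset a bio for reuse while keeping whatever callback it already had. */
static void fake_fixup_for_reuse(struct fake_bio *sbio, unsigned int bytes)
{
	fake_end_io_t *bi_end_io = sbio->bi_end_io;	/* save before the reset */

	fake_bio_reset(sbio);
	sbio->bi_size = bytes;
	sbio->bi_end_io = bi_end_io;			/* assumed restore step */
}
```

Without the restore step the bio would be resubmitted with a NULL
completion callback, so it is worth checking for when reviewing the hunk.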

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
CC: NeilBrown <neilb@suse.de>
---
 drivers/md/raid1.c | 22 +++++-----------------
 1 file changed, 5 insertions(+), 17 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index ee85154..bd3e3b9 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1835,6 +1835,7 @@ static int process_checks(struct r1bio *r1_bio)
 	int primary;
 	int i;
 	int vcnt;
+	bio_end_io_t *bi_end_io;
 
 	for (primary = 0; primary < conf->raid_disks * 2; primary++)
 		if (r1_bio->bios[primary]->bi_end_io == end_sync_read &&
@@ -1876,13 +1877,11 @@ static int process_checks(struct r1bio *r1_bio)
 			continue;
 		}
 		/* fixup the bio for reuse */
+		bi_end_io = sbio->bi_end_io;
+		bio_reset(sbio);
+
 		sbio->bi_vcnt = vcnt;
 		sbio->bi_size = r1_bio->sectors << 9;
-		sbio->bi_idx = 0;
-		sbio->bi_phys_segments = 0;
-		sbio->bi_flags &= ~(BIO_POOL_MASK - 1);
-		sbio->bi_flags |= 1 << BIO_UPTODATE;
-		sbio->bi_next = NULL;
 		sbio->bi_sector = r1_bio->sector +
 			conf->mirrors[i].rdev->data_offset;
 		sbio->bi_bdev = conf->mirrors[i].rdev->bdev;
@@ -2426,18 +2425,7 @@ static sector_t sync_request(struct mddev *mddev, sector_t sector_nr, int *skipp
 	for (i = 0; i < conf->raid_disks * 2; i++) {
 		struct md_rdev *rdev;
 		bio = r1_bio->bios[i];
-
-		/* take from bio_init */
-		bio->bi_next = NULL;
-		bio->bi_flags &= ~(BIO_POOL_MASK-1);
-		bio->bi_flags |= 1 << BIO_UPTODATE;
-		bio->bi_rw = READ;
-		bio->bi_vcnt = 0;
-		bio->bi_idx = 0;
-		bio->bi_phys_segments = 0;
-		bio->bi_size = 0;
-		bio->bi_end_io = NULL;
-		bio->bi_private = NULL;
+		bio_reset(bio);
 
 		rdev = rcu_dereference(conf->mirrors[i].rdev);
 		if (rdev == NULL ||
-- 
1.7.12

* [PATCH v2 13/26] raid5: use bio_reset()
  2012-09-11  0:22 [PATCH v2 00/26] Prep work for immutable bio vecs Kent Overstreet
                   ` (9 preceding siblings ...)
  2012-09-11  0:22 ` [PATCH v2 11/26] raid10: Use bio_reset() Kent Overstreet
@ 2012-09-11  0:22 ` Kent Overstreet
       [not found]   ` <1347322957-25260-14-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  2012-09-11  0:22 ` [PATCH v2 14/26] raid1: Refactor narrow_write_error() to not use bi_idx Kent Overstreet
                   ` (9 subsequent siblings)
  20 siblings, 1 reply; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11  0:22 UTC (permalink / raw)
  To: linux-bcache, linux-kernel, dm-devel; +Cc: axboe, Kent Overstreet, tj, neilb

Had to shuffle the code around a bit (where bi_rw and bi_end_io were
set), but there shouldn't really be anything tricky here.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
CC: NeilBrown <neilb@suse.de>
---
 drivers/md/raid5.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 7c19dbe..ebe43f7 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -561,14 +561,6 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
 		bi = &sh->dev[i].req;
 		rbi = &sh->dev[i].rreq; /* For writing to replacement */
 
-		bi->bi_rw = rw;
-		rbi->bi_rw = rw;
-		if (rw & WRITE) {
-			bi->bi_end_io = raid5_end_write_request;
-			rbi->bi_end_io = raid5_end_write_request;
-		} else
-			bi->bi_end_io = raid5_end_read_request;
-
 		rcu_read_lock();
 		rrdev = rcu_dereference(conf->disks[i].replacement);
 		smp_mb(); /* Ensure that if rrdev is NULL, rdev won't be */
@@ -643,7 +635,14 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
 
 			set_bit(STRIPE_IO_STARTED, &sh->state);
 
+			bio_reset(bi);
 			bi->bi_bdev = rdev->bdev;
+			bi->bi_rw = rw;
+			bi->bi_end_io = (rw & WRITE)
+				? raid5_end_write_request
+				: raid5_end_read_request;
+			bi->bi_private = sh;
+
 			pr_debug("%s: for %llu schedule op %ld on disc %d\n",
 				__func__, (unsigned long long)sh->sector,
 				bi->bi_rw, i);
@@ -657,12 +656,9 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
 			if (test_bit(R5_ReadNoMerge, &sh->dev[i].flags))
 				bi->bi_rw |= REQ_FLUSH;
 
-			bi->bi_flags = 1 << BIO_UPTODATE;
-			bi->bi_idx = 0;
 			bi->bi_io_vec[0].bv_len = STRIPE_SIZE;
 			bi->bi_io_vec[0].bv_offset = 0;
 			bi->bi_size = STRIPE_SIZE;
-			bi->bi_next = NULL;
 			if (rrdev)
 				set_bit(R5_DOUBLE_LOCKED, &sh->dev[i].flags);
 			generic_make_request(bi);
@@ -674,7 +670,14 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
 
 			set_bit(STRIPE_IO_STARTED, &sh->state);
 
+			bio_reset(rbi);
 			rbi->bi_bdev = rrdev->bdev;
+			rbi->bi_rw = rw;
+			rbi->bi_end_io = (rw & WRITE)
+				? raid5_end_write_request
+				: raid5_end_read_request;
+			rbi->bi_private = sh;
+
 			pr_debug("%s: for %llu schedule op %ld on "
 				 "replacement disc %d\n",
 				__func__, (unsigned long long)sh->sector,
@@ -686,12 +689,9 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
 			else
 				rbi->bi_sector = (sh->sector
 						  + rrdev->data_offset);
-			rbi->bi_flags = 1 << BIO_UPTODATE;
-			rbi->bi_idx = 0;
 			rbi->bi_io_vec[0].bv_len = STRIPE_SIZE;
 			rbi->bi_io_vec[0].bv_offset = 0;
 			rbi->bi_size = STRIPE_SIZE;
-			rbi->bi_next = NULL;
 			generic_make_request(rbi);
 		}
 		if (!rdev && !rrdev) {
-- 
1.7.12

* [PATCH v2 14/26] raid1: Refactor narrow_write_error() to not use bi_idx
  2012-09-11  0:22 [PATCH v2 00/26] Prep work for immutable bio vecs Kent Overstreet
                   ` (10 preceding siblings ...)
  2012-09-11  0:22 ` [PATCH v2 13/26] raid5: use bio_reset() Kent Overstreet
@ 2012-09-11  0:22 ` Kent Overstreet
  2012-09-11  0:22 ` [PATCH v2 15/26] block: Add bio_copy_data() Kent Overstreet
                   ` (8 subsequent siblings)
  20 siblings, 0 replies; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11  0:22 UTC (permalink / raw)
  To: linux-bcache, linux-kernel, dm-devel; +Cc: axboe, Kent Overstreet, tj, neilb

More bi_idx removal. This code was just open coding bio_clone(). This
could probably be further improved by using bio_advance() instead of
skipping over null pages, but that'd be a larger rework.
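
The null-page skipping in the behind-IO branch can be sketched in
userspace like this. fake_bio_vec and fake_skip_null_pages() are
hypothetical names, and the bound check on the count is an addition of
the sketch; the loop in the patch has no such check.

```c
#include <assert.h>
#include <stddef.h>

struct fake_bio_vec {
	void *bv_page;
	unsigned int bv_len;
	unsigned int bv_offset;
};

/*
 * Advance past leading entries with no page, shrinking the count to match.
 * Returns the first populated entry, or NULL if there is none (a bound the
 * patch's loop does not check; added here as an assumption for safety).
 */
static struct fake_bio_vec *fake_skip_null_pages(struct fake_bio_vec *vec,
						 unsigned int *vcnt)
{
	while (*vcnt && !vec->bv_page) {
		vec++;
		(*vcnt)--;
	}
	return *vcnt ? vec : NULL;
}
```

Moving the pointer and shrinking the count means the copied vector always
starts at index 0 of the new bio, so no starting index has to be stored in
bi_idx.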

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
CC: NeilBrown <neilb@suse.de>
---
 drivers/md/raid1.c | 36 ++++++++++++++++++------------------
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index bd3e3b9..b1072da 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2052,8 +2052,6 @@ static int narrow_write_error(struct r1bio *r1_bio, int i)
 	struct mddev *mddev = r1_bio->mddev;
 	struct r1conf *conf = mddev->private;
 	struct md_rdev *rdev = conf->mirrors[i].rdev;
-	int vcnt, idx;
-	struct bio_vec *vec;
 
 	/* bio has the data to be written to device 'i' where
 	 * we just recently had a write error.
@@ -2081,30 +2079,32 @@ static int narrow_write_error(struct r1bio *r1_bio, int i)
 		   & ~(sector_t)(block_sectors - 1))
 		- sector;
 
-	if (test_bit(R1BIO_BehindIO, &r1_bio->state)) {
-		vcnt = r1_bio->behind_page_count;
-		vec = r1_bio->behind_bvecs;
-		idx = 0;
-		while (vec[idx].bv_page == NULL)
-			idx++;
-	} else {
-		vcnt = r1_bio->master_bio->bi_vcnt;
-		vec = r1_bio->master_bio->bi_io_vec;
-		idx = r1_bio->master_bio->bi_idx;
-	}
 	while (sect_to_write) {
 		struct bio *wbio;
 		if (sectors > sect_to_write)
 			sectors = sect_to_write;
 		/* Write at 'sector' for 'sectors'*/
 
-		wbio = bio_alloc_mddev(GFP_NOIO, vcnt, mddev);
-		memcpy(wbio->bi_io_vec, vec, vcnt * sizeof(struct bio_vec));
-		wbio->bi_sector = r1_bio->sector;
+		if (test_bit(R1BIO_BehindIO, &r1_bio->state)) {
+			unsigned vcnt = r1_bio->behind_page_count;
+			struct bio_vec *vec = r1_bio->behind_bvecs;
+
+			while (!vec->bv_page) {
+				vec++;
+				vcnt--;
+			}
+
+			wbio = bio_alloc_mddev(GFP_NOIO, vcnt, mddev);
+			memcpy(wbio->bi_io_vec, vec, vcnt * sizeof(struct bio_vec));
+
+			wbio->bi_vcnt = vcnt;
+		} else {
+			wbio = bio_clone_mddev(r1_bio->master_bio, GFP_NOIO, mddev);
+		}
+
 		wbio->bi_rw = WRITE;
-		wbio->bi_vcnt = vcnt;
+		wbio->bi_sector = r1_bio->sector;
 		wbio->bi_size = r1_bio->sectors << 9;
-		wbio->bi_idx = idx;
 
 		md_trim_bio(wbio, sector - r1_bio->sector, sectors);
 		wbio->bi_sector += rdev->data_offset;
-- 
1.7.12

* [PATCH v2 15/26] block: Add bio_copy_data()
  2012-09-11  0:22 [PATCH v2 00/26] Prep work for immutable bio vecs Kent Overstreet
                   ` (11 preceding siblings ...)
  2012-09-11  0:22 ` [PATCH v2 14/26] raid1: Refactor narrow_write_error() to not use bi_idx Kent Overstreet
@ 2012-09-11  0:22 ` Kent Overstreet
       [not found]   ` <1347322957-25260-16-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  2012-09-11  0:22 ` [PATCH v2 16/26] pktcdvd: use bio_copy_data() Kent Overstreet
                   ` (7 subsequent siblings)
  20 siblings, 1 reply; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11  0:22 UTC (permalink / raw)
  To: linux-bcache, linux-kernel, dm-devel; +Cc: axboe, Kent Overstreet, tj, neilb

This gets open coded quite a bit and it's tricky to get right, so make a
generic version and convert some existing users over to it instead.
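
To show where the trickiness comes from, here is a userspace sketch of
copying between two segment lists whose segment boundaries do not line
up. It is a simplified model (plain buffers instead of pages, a single
list per side rather than bi_next chains), and fake_seg and
fake_copy_data() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for a bio_vec: a plain buffer replaces the page. */
struct fake_seg {
	char *buf;
	size_t len;
};

/*
 * Copy from src segments to dst segments until either side runs out.
 * Returns the number of bytes copied.
 */
static size_t fake_copy_data(struct fake_seg *dst, size_t ndst,
			     const struct fake_seg *src, size_t nsrc)
{
	size_t di = 0, si = 0;		/* current segment on each side */
	size_t doff = 0, soff = 0;	/* progress within that segment */
	size_t total = 0;

	while (di < ndst && si < nsrc) {
		size_t dleft = dst[di].len - doff;
		size_t sleft = src[si].len - soff;
		size_t bytes = dleft < sleft ? dleft : sleft;

		/* offset by progress so far, not by the segment start */
		memcpy(dst[di].buf + doff, src[si].buf + soff, bytes);

		doff += bytes;
		soff += bytes;
		total += bytes;

		if (doff == dst[di].len) {
			di++;
			doff = 0;
		}
		if (soff == src[si].len) {
			si++;
			soff = 0;
		}
	}
	return total;
}
```

Each step copies the smaller of the two remainders, so a segment on one
side can be filled or drained across several steps; both the source and
the destination address must therefore include the running offset.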

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
---
 fs/bio.c            | 70 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/bio.h |  2 ++
 2 files changed, 72 insertions(+)

diff --git a/fs/bio.c b/fs/bio.c
index 1342a16..7fb9f4e 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -827,6 +827,76 @@ void bio_advance(struct bio *bio, unsigned bytes)
 }
 EXPORT_SYMBOL(bio_advance);
 
+/**
+ * bio_copy_data - copy contents of data buffers from one chain of bios to
+ * another
+ * @src: source bio list
+ * @dst: destination bio list
+ *
+ * If @src and @dst are single bios, bi_next must be NULL - otherwise, treats
+ * @src and @dst as linked lists of bios.
+ *
+ * Stops when it reaches the end of either @src or @dst - that is, copies
+ * min(src->bi_size, dst->bi_size) bytes (or the equivalent for lists of bios).
+ */
+void bio_copy_data(struct bio *dst, struct bio *src)
+{
+	struct bio_vec *src_bv, *dst_bv;
+	unsigned src_offset, dst_offset, bytes;
+	void *src_p, *dst_p;
+
+	src_bv = bio_iovec(src);
+	dst_bv = bio_iovec(dst);
+
+	src_offset = src_bv->bv_offset;
+	dst_offset = dst_bv->bv_offset;
+
+	while (1) {
+		if (src_offset == src_bv->bv_offset + src_bv->bv_len) {
+			src_bv++;
+			if (src_bv == bio_iovec_idx(src, src->bi_vcnt)) {
+				src = src->bi_next;
+				if (!src)
+					break;
+
+				src_bv = bio_iovec(src);
+			}
+
+			src_offset = src_bv->bv_offset;
+		}
+
+		if (dst_offset == dst_bv->bv_offset + dst_bv->bv_len) {
+			dst_bv++;
+			if (dst_bv == bio_iovec_idx(dst, dst->bi_vcnt)) {
+				dst = dst->bi_next;
+				if (!dst)
+					break;
+
+				dst_bv = bio_iovec(dst);
+			}
+
+			dst_offset = dst_bv->bv_offset;
+		}
+
+		bytes = min(dst_bv->bv_offset + dst_bv->bv_len - dst_offset,
+			    src_bv->bv_offset + src_bv->bv_len - src_offset);
+
+		src_p = kmap_atomic(src_bv->bv_page);
+		dst_p = kmap_atomic(dst_bv->bv_page);
+
+		memcpy(dst_p + dst_offset,
+		       src_p + src_offset,
+		       bytes);
+
+		kunmap_atomic(dst_p);
+		kunmap_atomic(src_p);
+
+		src_offset += bytes;
+		dst_offset += bytes;
+	}
+}
+EXPORT_SYMBOL(bio_copy_data);
+
 struct bio_map_data {
 	struct bio_vec *iovecs;
 	struct sg_iovec *sgvecs;
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 949c48a..92015ce 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -286,6 +286,8 @@ static inline void bio_flush_dcache_pages(struct bio *bi)
 }
 #endif
 
+extern void bio_copy_data(struct bio *dst, struct bio *src);
+
 extern struct bio *bio_copy_user(struct request_queue *, struct rq_map_data *,
 				 unsigned long, unsigned int, int, gfp_t);
 extern struct bio *bio_copy_user_iov(struct request_queue *,
-- 
1.7.12

* [PATCH v2 16/26] pktcdvd: use bio_copy_data()
  2012-09-11  0:22 [PATCH v2 00/26] Prep work for immutable bio vecs Kent Overstreet
                   ` (12 preceding siblings ...)
  2012-09-11  0:22 ` [PATCH v2 15/26] block: Add bio_copy_data() Kent Overstreet
@ 2012-09-11  0:22 ` Kent Overstreet
  2012-09-11  0:22 ` [PATCH v2 17/26] pktcdvd: Use bio_reset() in disabled code to kill bi_idx usage Kent Overstreet
                   ` (6 subsequent siblings)
  20 siblings, 0 replies; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11  0:22 UTC (permalink / raw)
  To: linux-bcache, linux-kernel, dm-devel
  Cc: axboe, Kent Overstreet, tj, neilb, Jiri Kosina

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
CC: Jiri Kosina <jkosina@suse.cz>
---
 drivers/block/pktcdvd.c | 79 ++++++++-----------------------------------------
 1 file changed, 12 insertions(+), 67 deletions(-)

diff --git a/drivers/block/pktcdvd.c b/drivers/block/pktcdvd.c
index 0824627..1079a77 100644
--- a/drivers/block/pktcdvd.c
+++ b/drivers/block/pktcdvd.c
@@ -948,31 +948,6 @@ static int pkt_set_segment_merging(struct pktcdvd_device *pd, struct request_que
 }
 
 /*
- * Copy CD_FRAMESIZE bytes from src_bio into a destination page
- */
-static void pkt_copy_bio_data(struct bio *src_bio, int seg, int offs, struct page *dst_page, int dst_offs)
-{
-	unsigned int copy_size = CD_FRAMESIZE;
-
-	while (copy_size > 0) {
-		struct bio_vec *src_bvl = bio_iovec_idx(src_bio, seg);
-		void *vfrom = kmap_atomic(src_bvl->bv_page) +
-			src_bvl->bv_offset + offs;
-		void *vto = page_address(dst_page) + dst_offs;
-		int len = min_t(int, copy_size, src_bvl->bv_len - offs);
-
-		BUG_ON(len < 0);
-		memcpy(vto, vfrom, len);
-		kunmap_atomic(vfrom);
-
-		seg++;
-		offs = 0;
-		dst_offs += len;
-		copy_size -= len;
-	}
-}
-
-/*
  * Copy all data for this packet to pkt->pages[], so that
  * a) The number of required segments for the write bio is minimized, which
  *    is necessary for some scsi controllers.
@@ -1325,55 +1300,35 @@ try_next_bio:
  */
 static void pkt_start_write(struct pktcdvd_device *pd, struct packet_data *pkt)
 {
-	struct bio *bio;
 	int f;
-	int frames_write;
 	struct bio_vec *bvec = pkt->w_bio->bi_io_vec;
 
+	bio_reset(pkt->w_bio);
+	pkt->w_bio->bi_sector = pkt->sector;
+	pkt->w_bio->bi_bdev = pd->bdev;
+	pkt->w_bio->bi_end_io = pkt_end_io_packet_write;
+	pkt->w_bio->bi_private = pkt;
+
+	/* XXX: locking? */
 	for (f = 0; f < pkt->frames; f++) {
 		bvec[f].bv_page = pkt->pages[(f * CD_FRAMESIZE) / PAGE_SIZE];
 		bvec[f].bv_offset = (f * CD_FRAMESIZE) % PAGE_SIZE;
+		if (!bio_add_page(pkt->w_bio, bvec[f].bv_page, CD_FRAMESIZE, bvec[f].bv_offset))
+			BUG();
 	}
+	VPRINTK(DRIVER_NAME": vcnt=%d\n", pkt->w_bio->bi_vcnt);
 
 	/*
 	 * Fill-in bvec with data from orig_bios.
 	 */
-	frames_write = 0;
 	spin_lock(&pkt->lock);
-	bio_list_for_each(bio, &pkt->orig_bios) {
-		int segment = bio->bi_idx;
-		int src_offs = 0;
-		int first_frame = (bio->bi_sector - pkt->sector) / (CD_FRAMESIZE >> 9);
-		int num_frames = bio->bi_size / CD_FRAMESIZE;
-		BUG_ON(first_frame < 0);
-		BUG_ON(first_frame + num_frames > pkt->frames);
-		for (f = first_frame; f < first_frame + num_frames; f++) {
-			struct bio_vec *src_bvl = bio_iovec_idx(bio, segment);
-
-			while (src_offs >= src_bvl->bv_len) {
-				src_offs -= src_bvl->bv_len;
-				segment++;
-				BUG_ON(segment >= bio->bi_vcnt);
-				src_bvl = bio_iovec_idx(bio, segment);
-			}
+	bio_copy_data(pkt->w_bio, pkt->orig_bios.head);
 
-			if (src_bvl->bv_len - src_offs >= CD_FRAMESIZE) {
-				bvec[f].bv_page = src_bvl->bv_page;
-				bvec[f].bv_offset = src_bvl->bv_offset + src_offs;
-			} else {
-				pkt_copy_bio_data(bio, segment, src_offs,
-						  bvec[f].bv_page, bvec[f].bv_offset);
-			}
-			src_offs += CD_FRAMESIZE;
-			frames_write++;
-		}
-	}
 	pkt_set_state(pkt, PACKET_WRITE_WAIT_STATE);
 	spin_unlock(&pkt->lock);
 
 	VPRINTK("pkt_start_write: Writing %d frames for zone %llx\n",
-		frames_write, (unsigned long long)pkt->sector);
-	BUG_ON(frames_write != pkt->write_size);
+		pkt->write_size, (unsigned long long)pkt->sector);
 
 	if (test_bit(PACKET_MERGE_SEGS, &pd->flags) || (pkt->write_size < pkt->frames)) {
 		pkt_make_local_copy(pkt, bvec);
@@ -1383,16 +1338,6 @@ static void pkt_start_write(struct pktcdvd_device *pd, struct packet_data *pkt)
 	}
 
 	/* Start the write request */
-	bio_reset(pkt->w_bio);
-	pkt->w_bio->bi_sector = pkt->sector;
-	pkt->w_bio->bi_bdev = pd->bdev;
-	pkt->w_bio->bi_end_io = pkt_end_io_packet_write;
-	pkt->w_bio->bi_private = pkt;
-	for (f = 0; f < pkt->frames; f++)
-		if (!bio_add_page(pkt->w_bio, bvec[f].bv_page, CD_FRAMESIZE, bvec[f].bv_offset))
-			BUG();
-	VPRINTK(DRIVER_NAME": vcnt=%d\n", pkt->w_bio->bi_vcnt);
-
 	atomic_set(&pkt->io_wait, 1);
 	pkt->w_bio->bi_rw = WRITE;
 	pkt_queue_bio(pd, pkt->w_bio);
-- 
1.7.12

* [PATCH v2 17/26] pktcdvd: Use bio_reset() in disabled code to kill bi_idx usage
  2012-09-11  0:22 [PATCH v2 00/26] Prep work for immutable bio vecs Kent Overstreet
                   ` (13 preceding siblings ...)
  2012-09-11  0:22 ` [PATCH v2 16/26] pktcdvd: use bio_copy_data() Kent Overstreet
@ 2012-09-11  0:22 ` Kent Overstreet
  2012-09-11  0:22 ` [PATCH v2 18/26] raid1: use bio_copy_data() Kent Overstreet
                   ` (5 subsequent siblings)
  20 siblings, 0 replies; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11  0:22 UTC (permalink / raw)
  To: linux-bcache, linux-kernel, dm-devel
  Cc: axboe, Kent Overstreet, tj, neilb, Jiri Kosina

In the short term this'll help with code auditing, and if this code ever
gets used now it's converted :)

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jiri Kosina <jkosina@suse.cz>
---
 drivers/block/pktcdvd.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/drivers/block/pktcdvd.c b/drivers/block/pktcdvd.c
index 1079a77..5318ad39 100644
--- a/drivers/block/pktcdvd.c
+++ b/drivers/block/pktcdvd.c
@@ -1156,16 +1156,15 @@ static int pkt_start_recovery(struct packet_data *pkt)
 	new_sector = new_block * (CD_FRAMESIZE >> 9);
 	pkt->sector = new_sector;
 
+	bio_reset(pkt->bio);
+	pkt->bio->bi_bdev = pd->bdev;
+	pkt->bio->bi_rw = REQ_WRITE;
 	pkt->bio->bi_sector = new_sector;
-	pkt->bio->bi_next = NULL;
-	pkt->bio->bi_flags = 1 << BIO_UPTODATE;
-	pkt->bio->bi_idx = 0;
-
-	BUG_ON(pkt->bio->bi_rw != REQ_WRITE);
-	BUG_ON(pkt->bio->bi_vcnt != pkt->frames);
-	BUG_ON(pkt->bio->bi_size != pkt->frames * CD_FRAMESIZE);
-	BUG_ON(pkt->bio->bi_end_io != pkt_end_io_packet_write);
-	BUG_ON(pkt->bio->bi_private != pkt);
+	pkt->bio->bi_size = pkt->frames * CD_FRAMESIZE;
+	pkt->bio->bi_vcnt = pkt->frames;
+
+	pkt->bio->bi_end_io = pkt_end_io_packet_write;
+	pkt->bio->bi_private = pkt;
 
 	drop_super(sb);
 	return 1;
-- 
1.7.12

* [PATCH v2 18/26] raid1: use bio_copy_data()
  2012-09-11  0:22 [PATCH v2 00/26] Prep work for immutable bio vecs Kent Overstreet
                   ` (14 preceding siblings ...)
  2012-09-11  0:22 ` [PATCH v2 17/26] pktcdvd: Use bio_reset() in disabled code to kill bi_idx usage Kent Overstreet
@ 2012-09-11  0:22 ` Kent Overstreet
  2012-09-11  0:22 ` [PATCH v2 20/26] block: Add bio_for_each_segment_all() Kent Overstreet
                   ` (4 subsequent siblings)
  20 siblings, 0 replies; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11  0:22 UTC (permalink / raw)
  To: linux-bcache, linux-kernel, dm-devel; +Cc: axboe, Kent Overstreet, tj, neilb

This doesn't really delete any code _yet_, but once immutable bvecs are
done we can just delete the rest of the code in that loop.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
CC: NeilBrown <neilb@suse.de>
---
 drivers/md/raid1.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index b1072da..6cd1fb2 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1895,10 +1895,9 @@ static int process_checks(struct r1bio *r1_bio)
 			else
 				bi->bv_len = size;
 			size -= PAGE_SIZE;
-			memcpy(page_address(bi->bv_page),
-			       page_address(pbio->bi_io_vec[j].bv_page),
-			       PAGE_SIZE);
 		}
+
+		bio_copy_data(sbio, pbio);
 	}
 	return 0;
 }
-- 
1.7.12

* [PATCH v2 19/26] bounce: Refactor __blk_queue_bounce to not use bi_io_vec
       [not found] ` <1347322957-25260-1-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  2012-09-11  0:22   ` [PATCH v2 08/26] block: Remove bi_idx references Kent Overstreet
  2012-09-11  0:22   ` [PATCH v2 12/26] raid1: use bio_reset() Kent Overstreet
@ 2012-09-11  0:22   ` Kent Overstreet
       [not found]     ` <1347322957-25260-20-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  2012-09-11  0:22   ` [PATCH v2 21/26] block: Convert some code to bio_for_each_segment_all() Kent Overstreet
                     ` (4 subsequent siblings)
  7 siblings, 1 reply; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11  0:22 UTC (permalink / raw)
  To: linux-bcache, linux-kernel, dm-devel; +Cc: axboe, Kent Overstreet, tj, neilb

A bunch of what __blk_queue_bounce() was doing was problematic for the
immutable bvec work; this cleans that up and the code is quite a bit
smaller, too.

The __bio_for_each_segment() in copy_to_high_bio_irq() was changed
because that one's looping over the original bio, not the bounce bio -
since the bounce code doesn't own that bio the __ version wasn't
correct.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
---
 mm/bounce.c | 73 ++++++++++++++++---------------------------------------------
 1 file changed, 19 insertions(+), 54 deletions(-)

diff --git a/mm/bounce.c b/mm/bounce.c
index 0420867..3068300 100644
--- a/mm/bounce.c
+++ b/mm/bounce.c
@@ -101,7 +101,7 @@ static void copy_to_high_bio_irq(struct bio *to, struct bio *from)
 	struct bio_vec *tovec, *fromvec;
 	int i;
 
-	__bio_for_each_segment(tovec, to, i, 0) {
+	bio_for_each_segment(tovec, to, i) {
 		fromvec = from->bi_io_vec + i;
 
 		/*
@@ -181,78 +181,43 @@ static void bounce_end_io_read_isa(struct bio *bio, int err)
 static void __blk_queue_bounce(struct request_queue *q, struct bio **bio_orig,
 			       mempool_t *pool)
 {
-	struct page *page;
-	struct bio *bio = NULL;
-	int i, rw = bio_data_dir(*bio_orig);
+	struct bio *bio;
+	int rw = bio_data_dir(*bio_orig);
 	struct bio_vec *to, *from;
+	unsigned i;
 
-	bio_for_each_segment(from, *bio_orig, i) {
-		page = from->bv_page;
+	bio_for_each_segment(from, *bio_orig, i)
+		if (page_to_pfn(from->bv_page) > queue_bounce_pfn(q))
+			goto bounce;
 
-		/*
-		 * is destination page below bounce pfn?
-		 */
-		if (page_to_pfn(page) <= queue_bounce_pfn(q))
-			continue;
-
-		/*
-		 * irk, bounce it
-		 */
-		if (!bio) {
-			unsigned int cnt = (*bio_orig)->bi_vcnt;
+	return;
+bounce:
+	bio = bio_clone_bioset(*bio_orig, GFP_NOIO, fs_bio_set);
 
-			bio = bio_alloc(GFP_NOIO, cnt);
-			memset(bio->bi_io_vec, 0, cnt * sizeof(struct bio_vec));
-		}
-			
+	bio_for_each_segment(to, bio, i) {
+		struct page *page = to->bv_page;
 
-		to = bio->bi_io_vec + i;
+		if (page_to_pfn(page) <= queue_bounce_pfn(q))
+			continue;
 
-		to->bv_page = mempool_alloc(pool, q->bounce_gfp);
-		to->bv_len = from->bv_len;
-		to->bv_offset = from->bv_offset;
 		inc_zone_page_state(to->bv_page, NR_BOUNCE);
+		to->bv_page = mempool_alloc(pool, q->bounce_gfp);
 
 		if (rw == WRITE) {
 			char *vto, *vfrom;
 
-			flush_dcache_page(from->bv_page);
+			flush_dcache_page(page);
+
 			vto = page_address(to->bv_page) + to->bv_offset;
-			vfrom = kmap(from->bv_page) + from->bv_offset;
+			vfrom = kmap_atomic(page) + to->bv_offset;
 			memcpy(vto, vfrom, to->bv_len);
-			kunmap(from->bv_page);
+			kunmap_atomic(vfrom);
 		}
 	}
 
-	/*
-	 * no pages bounced
-	 */
-	if (!bio)
-		return;
-
 	trace_block_bio_bounce(q, *bio_orig);
 
-	/*
-	 * at least one page was bounced, fill in possible non-highmem
-	 * pages
-	 */
-	__bio_for_each_segment(from, *bio_orig, i, 0) {
-		to = bio_iovec_idx(bio, i);
-		if (!to->bv_page) {
-			to->bv_page = from->bv_page;
-			to->bv_len = from->bv_len;
-			to->bv_offset = from->bv_offset;
-		}
-	}
-
-	bio->bi_bdev = (*bio_orig)->bi_bdev;
 	bio->bi_flags |= (1 << BIO_BOUNCED);
-	bio->bi_sector = (*bio_orig)->bi_sector;
-	bio->bi_rw = (*bio_orig)->bi_rw;
-
-	bio->bi_vcnt = (*bio_orig)->bi_vcnt;
-	bio->bi_idx = (*bio_orig)->bi_idx;
-	bio->bi_size = (*bio_orig)->bi_size;
 
 	if (pool == page_pool) {
 		bio->bi_end_io = bounce_end_io_write;
-- 
1.7.12

* [PATCH v2 20/26] block: Add bio_for_each_segment_all()
  2012-09-11  0:22 [PATCH v2 00/26] Prep work for immutable bio vecs Kent Overstreet
                   ` (15 preceding siblings ...)
  2012-09-11  0:22 ` [PATCH v2 18/26] raid1: use bio_copy_data() Kent Overstreet
@ 2012-09-11  0:22 ` Kent Overstreet
       [not found]   ` <1347322957-25260-21-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  2012-09-11  0:22 ` [PATCH v2 22/26] block: Add bio_alloc_pages() Kent Overstreet
                   ` (3 subsequent siblings)
  20 siblings, 1 reply; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11  0:22 UTC (permalink / raw)
  To: linux-bcache, linux-kernel, dm-devel; +Cc: axboe, Kent Overstreet, tj, neilb

This is part of the immutable bvec prep work; bio_for_each_segment() is
going to have a different implementation so these need to be split
apart.

This change is also to better document the intent of code that's using
it - bio_for_each_segment_all() is only legal to use for code that owns
the bio.
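
For illustration only (not part of the patch): a minimal userspace sketch of
how the two iterators differ. The struct definitions and the helpers
pending_bytes() and total_bytes() are hypothetical stand-ins; only the two
macros follow the shape of the ones added to include/linux/bio.h below.

```c
#include <stddef.h>

/* Hypothetical userspace stand-ins; only the fields the macros touch. */
struct bio_vec {
	unsigned bv_len;
};

struct bio {
	unsigned short bi_vcnt;		/* number of bvecs in bi_io_vec */
	unsigned short bi_idx;		/* first bvec not yet completed */
	struct bio_vec *bi_io_vec;
};

#define bio_iovec_idx(bio, idx)	(&((bio)->bi_io_vec[(idx)]))

/* Same shape as the macros this patch adds to include/linux/bio.h. */
#define bio_for_each_segment_all(bvl, bio, i)				\
	for (i = 0;							\
	     bvl = bio_iovec_idx((bio), (i)), i < (bio)->bi_vcnt;	\
	     i++)

#define bio_for_each_segment(bvl, bio, i)				\
	for (i = (bio)->bi_idx;						\
	     bvl = bio_iovec_idx((bio), (i)), i < (bio)->bi_vcnt;	\
	     i++)

/* Bytes still pending: iteration starts at bi_idx. */
unsigned pending_bytes(struct bio *bio)
{
	struct bio_vec *bv;
	unsigned i, sum = 0;

	bio_for_each_segment(bv, bio, i)
		sum += bv->bv_len;
	return sum;
}

/* Bytes in every bvec: only meaningful for code that owns the bio. */
unsigned total_bytes(struct bio *bio)
{
	struct bio_vec *bv;
	unsigned i, sum = 0;

	bio_for_each_segment_all(bv, bio, i)
		sum += bv->bv_len;
	return sum;
}
```

With bi_idx nonzero the two helpers disagree, which is why code that walks
bvecs it did not allocate should stick to bio_for_each_segment().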

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
---
 drivers/md/raid1.c  |  2 +-
 fs/bio.c            | 12 ++++++------
 fs/exofs/ore.c      |  2 +-
 fs/exofs/ore_raid.c |  2 +-
 include/linux/bio.h | 16 +++++++++-------
 mm/bounce.c         |  2 +-
 6 files changed, 19 insertions(+), 17 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 6cd1fb2..ade95ac 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1283,7 +1283,7 @@ read_again:
 			 * know the original bi_idx, so we just free
 			 * them all
 			 */
-			__bio_for_each_segment(bvec, mbio, j, 0)
+			bio_for_each_segment_all(bvec, mbio, j)
 				bvec->bv_page = r1_bio->behind_bvecs[j].bv_page;
 			if (test_bit(WriteMostly, &conf->mirrors[i].rdev->flags))
 				atomic_inc(&r1_bio->behind_remaining);
diff --git a/fs/bio.c b/fs/bio.c
index 7fb9f4e..efdc437 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -959,7 +959,7 @@ static int __bio_copy_iov(struct bio *bio, struct bio_vec *iovecs,
 	int iov_idx = 0;
 	unsigned int iov_off = 0;
 
-	__bio_for_each_segment(bvec, bio, i, 0) {
+	bio_for_each_segment_all(bvec, bio, i) {
 		char *bv_addr = page_address(bvec->bv_page);
 		unsigned int bv_len = iovecs[i].bv_len;
 
@@ -1141,7 +1141,7 @@ struct bio *bio_copy_user_iov(struct request_queue *q,
 	return bio;
 cleanup:
 	if (!map_data)
-		bio_for_each_segment(bvec, bio, i)
+		bio_for_each_segment_all(bvec, bio, i)
 			__free_page(bvec->bv_page);
 
 	bio_put(bio);
@@ -1355,7 +1355,7 @@ static void __bio_unmap_user(struct bio *bio)
 	/*
 	 * make sure we dirty pages we wrote to
 	 */
-	__bio_for_each_segment(bvec, bio, i, 0) {
+	bio_for_each_segment_all(bvec, bio, i) {
 		if (bio_data_dir(bio) == READ)
 			set_page_dirty_lock(bvec->bv_page);
 
@@ -1461,7 +1461,7 @@ static void bio_copy_kern_endio(struct bio *bio, int err)
 	int i;
 	char *p = bmd->sgvecs[0].iov_base;
 
-	__bio_for_each_segment(bvec, bio, i, 0) {
+	bio_for_each_segment_all(bvec, bio, i) {
 		char *addr = page_address(bvec->bv_page);
 		int len = bmd->iovecs[i].bv_len;
 
@@ -1501,7 +1501,7 @@ struct bio *bio_copy_kern(struct request_queue *q, void *data, unsigned int len,
 	if (!reading) {
 		void *p = data;
 
-		bio_for_each_segment(bvec, bio, i) {
+		bio_for_each_segment_all(bvec, bio, i) {
 			char *addr = page_address(bvec->bv_page);
 
 			memcpy(addr, p, bvec->bv_len);
@@ -1780,7 +1780,7 @@ sector_t bio_sector_offset(struct bio *bio, unsigned short index,
 	if (index >= bio->bi_idx)
 		index = bio->bi_vcnt - 1;
 
-	__bio_for_each_segment(bv, bio, i, 0) {
+	bio_for_each_segment_all(bv, bio, i) {
 		if (i == index) {
 			if (offset > bv->bv_offset)
 				sectors += (offset - bv->bv_offset) / sector_sz;
diff --git a/fs/exofs/ore.c b/fs/exofs/ore.c
index f936cb5..b744228 100644
--- a/fs/exofs/ore.c
+++ b/fs/exofs/ore.c
@@ -401,7 +401,7 @@ static void _clear_bio(struct bio *bio)
 	struct bio_vec *bv;
 	unsigned i;
 
-	__bio_for_each_segment(bv, bio, i, 0) {
+	bio_for_each_segment_all(bv, bio, i) {
 		unsigned this_count = bv->bv_len;
 
 		if (likely(PAGE_SIZE == this_count))
diff --git a/fs/exofs/ore_raid.c b/fs/exofs/ore_raid.c
index 5f376d1..4dec928 100644
--- a/fs/exofs/ore_raid.c
+++ b/fs/exofs/ore_raid.c
@@ -432,7 +432,7 @@ static void _mark_read4write_pages_uptodate(struct ore_io_state *ios, int ret)
 		if (!bio)
 			continue;
 
-		__bio_for_each_segment(bv, bio, i, 0) {
+		bio_for_each_segment_all(bv, bio, i) {
 			struct page *page = bv->bv_page;
 
 			SetPageUptodate(page);
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 92015ce..b433ff8 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -137,16 +137,18 @@ static inline int bio_has_allocated_vec(struct bio *bio)
 #define bio_io_error(bio) bio_endio((bio), -EIO)
 
 /*
- * drivers should not use the __ version unless they _really_ want to
- * run through the entire bio and not just pending pieces
+ * drivers should _never_ use the all version - the bio may have been split
+ * before it got to the driver and the driver won't own all of it
  */
-#define __bio_for_each_segment(bvl, bio, i, start_idx)			\
-	for (bvl = bio_iovec_idx((bio), (start_idx)), i = (start_idx);	\
-	     i < (bio)->bi_vcnt;					\
-	     bvl++, i++)
+#define bio_for_each_segment_all(bvl, bio, i)				\
+	for (i = 0;							\
+	     bvl = bio_iovec_idx((bio), (i)), i < (bio)->bi_vcnt;	\
+	     i++)
 
 #define bio_for_each_segment(bvl, bio, i)				\
-	__bio_for_each_segment(bvl, bio, i, (bio)->bi_idx)
+	for (i = (bio)->bi_idx;						\
+	     bvl = bio_iovec_idx((bio), (i)), i < (bio)->bi_vcnt;	\
+	     i++)
 
 /*
  * get a reference to a bio, so it won't disappear. the intended use is
diff --git a/mm/bounce.c b/mm/bounce.c
index 3068300..89324e2 100644
--- a/mm/bounce.c
+++ b/mm/bounce.c
@@ -134,7 +134,7 @@ static void bounce_end_io(struct bio *bio, mempool_t *pool, int err)
 	/*
 	 * free up bounce indirect pages used
 	 */
-	__bio_for_each_segment(bvec, bio, i, 0) {
+	bio_for_each_segment_all(bvec, bio, i) {
 		org_vec = bio_orig->bi_io_vec + i;
 		if (bvec->bv_page == org_vec->bv_page)
 			continue;
-- 
1.7.12

* [PATCH v2 21/26] block: Convert some code to bio_for_each_segment_all()
       [not found] ` <1347322957-25260-1-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
                     ` (2 preceding siblings ...)
  2012-09-11  0:22   ` [PATCH v2 19/26] bounce: Refactor __blk_queue_bounce to not use bi_io_vec Kent Overstreet
@ 2012-09-11  0:22   ` Kent Overstreet
       [not found]     ` <1347322957-25260-22-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  2012-09-11  0:22   ` [PATCH v2 23/26] raid1: use bio_alloc_pages() Kent Overstreet
                     ` (3 subsequent siblings)
  7 siblings, 1 reply; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11  0:22 UTC (permalink / raw)
  To: linux-bcache, linux-kernel, dm-devel; +Cc: axboe, Kent Overstreet, tj, neilb

A few places in the code were either open coding or using the wrong
version - fix.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
CC: NeilBrown <neilb@suse.de>
---
 drivers/md/dm-crypt.c |  3 +--
 drivers/md/raid1.c    | 10 +++-------
 fs/bio.c              | 20 ++++++++++----------
 fs/direct-io.c        |  8 ++++----
 4 files changed, 18 insertions(+), 23 deletions(-)

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index bbf459b..f50798e 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -858,8 +858,7 @@ static void crypt_free_buffer_pages(struct crypt_config *cc, struct bio *clone)
 	unsigned int i;
 	struct bio_vec *bv;
 
-	for (i = 0; i < clone->bi_vcnt; i++) {
-		bv = bio_iovec_idx(clone, i);
+	bio_for_each_segment_all(bv, clone, i) {
 		BUG_ON(!bv->bv_page);
 		mempool_free(bv->bv_page, cc->page_pool);
 		bv->bv_page = NULL;
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index ade95ac..d30b4cb 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -921,7 +921,7 @@ static void alloc_behind_pages(struct bio *bio, struct r1bio *r1_bio)
 	if (unlikely(!bvecs))
 		return;
 
-	bio_for_each_segment(bvec, bio, i) {
+	bio_for_each_segment_all(bvec, bio, i) {
 		bvecs[i] = *bvec;
 		bvecs[i].bv_page = alloc_page(GFP_NOIO);
 		if (unlikely(!bvecs[i].bv_page))
@@ -1276,12 +1276,8 @@ read_again:
 			struct bio_vec *bvec;
 			int j;
 
-			/* Yes, I really want the '__' version so that
-			 * we clear any unused pointer in the io_vec, rather
-			 * than leave them unchanged.  This is important
-			 * because when we come to free the pages, we won't
-			 * know the original bi_idx, so we just free
-			 * them all
+			/*
+			 * We trimmed the bio, so _all is legit
 			 */
 			bio_for_each_segment_all(bvec, mbio, j)
 				bvec->bv_page = r1_bio->behind_bvecs[j].bv_page;
diff --git a/fs/bio.c b/fs/bio.c
index efdc437..d88ad77 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -1546,11 +1546,11 @@ EXPORT_SYMBOL(bio_copy_kern);
  */
 void bio_set_pages_dirty(struct bio *bio)
 {
-	struct bio_vec *bvec = bio->bi_io_vec;
+	struct bio_vec *bvec;
 	int i;
 
-	for (i = 0; i < bio->bi_vcnt; i++) {
-		struct page *page = bvec[i].bv_page;
+	bio_for_each_segment_all(bvec, bio, i) {
+		struct page *page = bvec->bv_page;
 
 		if (page && !PageCompound(page))
 			set_page_dirty_lock(page);
@@ -1559,11 +1559,11 @@ void bio_set_pages_dirty(struct bio *bio)
 
 static void bio_release_pages(struct bio *bio)
 {
-	struct bio_vec *bvec = bio->bi_io_vec;
+	struct bio_vec *bvec;
 	int i;
 
-	for (i = 0; i < bio->bi_vcnt; i++) {
-		struct page *page = bvec[i].bv_page;
+	bio_for_each_segment_all(bvec, bio, i) {
+		struct page *page = bvec->bv_page;
 
 		if (page)
 			put_page(page);
@@ -1612,16 +1612,16 @@ static void bio_dirty_fn(struct work_struct *work)
 
 void bio_check_pages_dirty(struct bio *bio)
 {
-	struct bio_vec *bvec = bio->bi_io_vec;
+	struct bio_vec *bvec;
 	int nr_clean_pages = 0;
 	int i;
 
-	for (i = 0; i < bio->bi_vcnt; i++) {
-		struct page *page = bvec[i].bv_page;
+	bio_for_each_segment_all(bvec, bio, i) {
+		struct page *page = bvec->bv_page;
 
 		if (PageDirty(page) || PageCompound(page)) {
 			page_cache_release(page);
-			bvec[i].bv_page = NULL;
+			bvec->bv_page = NULL;
 		} else {
 			nr_clean_pages++;
 		}
diff --git a/fs/direct-io.c b/fs/direct-io.c
index f86c720..6089916 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -441,8 +441,8 @@ static struct bio *dio_await_one(struct dio *dio)
 static int dio_bio_complete(struct dio *dio, struct bio *bio)
 {
 	const int uptodate = test_bit(BIO_UPTODATE, &bio->bi_flags);
-	struct bio_vec *bvec = bio->bi_io_vec;
-	int page_no;
+	struct bio_vec *bvec;
+	unsigned i;
 
 	if (!uptodate)
 		dio->io_error = -EIO;
@@ -450,8 +450,8 @@ static int dio_bio_complete(struct dio *dio, struct bio *bio)
 	if (dio->is_async && dio->rw == READ) {
 		bio_check_pages_dirty(bio);	/* transfers ownership */
 	} else {
-		for (page_no = 0; page_no < bio->bi_vcnt; page_no++) {
-			struct page *page = bvec[page_no].bv_page;
+		bio_for_each_segment_all(bvec, bio, i) {
+			struct page *page = bvec->bv_page;
 
 			if (dio->rw == READ && !PageCompound(page))
 				set_page_dirty_lock(page);
-- 
1.7.12

* [PATCH v2 22/26] block: Add bio_alloc_pages()
  2012-09-11  0:22 [PATCH v2 00/26] Prep work for immutable bio vecs Kent Overstreet
                   ` (16 preceding siblings ...)
  2012-09-11  0:22 ` [PATCH v2 20/26] block: Add bio_for_each_segment_all() Kent Overstreet
@ 2012-09-11  0:22 ` Kent Overstreet
       [not found]   ` <1347322957-25260-23-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  2012-09-11  0:22 ` [PATCH v2 24/26] block: Add an explicit bio flag for bios that own their bvec Kent Overstreet
                   ` (2 subsequent siblings)
  20 siblings, 1 reply; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11  0:22 UTC (permalink / raw)
  To: linux-bcache, linux-kernel, dm-devel; +Cc: axboe, Kent Overstreet, tj, neilb

More utility code to replace stuff that's getting open coded.
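
For illustration only (not part of the patch): a userspace sketch of the
allocate-all-or-nothing pattern that bio_alloc_pages() implements. The names
alloc_all(), struct vec, budget and budget_alloc() are hypothetical; malloc()
stands in for alloc_page(), and unlike the kernel helper the sketch also
clears the freed pointers so the result is easy to check.

```c
#include <errno.h>
#include <stdlib.h>

struct vec {
	void *page;
};

/* Hypothetical allocator hook so the failure path can be exercised. */
typedef void *(*alloc_fn)(void);

/* Test allocator: succeeds `budget` times, then returns NULL. */
int budget;

void *budget_alloc(void)
{
	return budget-- > 0 ? malloc(64) : NULL;
}

/*
 * Fill all n entries. Returns 0 on success, -ENOMEM on failure; on
 * failure every entry allocated so far is freed again.
 */
int alloc_all(struct vec *v, int n, alloc_fn alloc)
{
	int i;

	for (i = 0; i < n; i++) {
		v[i].page = alloc();
		if (!v[i].page) {
			/* Unwind: free entries i-1 .. 0, newest first. */
			while (i-- > 0) {
				free(v[i].page);
				v[i].page = NULL;
			}
			return -ENOMEM;
		}
	}
	return 0;
}
```

The caller only has to handle a single error return and never sees a
partially populated array.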

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
---
 fs/bio.c            | 28 ++++++++++++++++++++++++++++
 include/linux/bio.h |  1 +
 2 files changed, 29 insertions(+)

diff --git a/fs/bio.c b/fs/bio.c
index d88ad77..65e6eac 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -828,6 +828,34 @@ void bio_advance(struct bio *bio, unsigned bytes)
 EXPORT_SYMBOL(bio_advance);
 
 /**
+ * bio_alloc_pages - allocates a single page for each bvec in a bio
+ * @bio: bio to allocate pages for
+ * @gfp_mask: flags for allocation
+ *
+ * Allocates pages up to @bio->bi_vcnt.
+ *
+ * Returns 0 on success, -ENOMEM on failure. On failure, any allocated pages are
+ * freed.
+ */
+int bio_alloc_pages(struct bio *bio, gfp_t gfp_mask)
+{
+	int i;
+	struct bio_vec *bv;
+
+	bio_for_each_segment_all(bv, bio, i) {
+		bv->bv_page = alloc_page(gfp_mask);
+		if (!bv->bv_page) {
+			while (bv-- != bio->bi_io_vec)
+				__free_page(bv->bv_page);
+			return -ENOMEM;
+		}
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL(bio_alloc_pages);
+
+/**
  * bio_copy_data - copy contents of data buffers from one chain of bios to
  * another
  * @src: source bio list
diff --git a/include/linux/bio.h b/include/linux/bio.h
index b433ff8..bd45154 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -289,6 +289,7 @@ static inline void bio_flush_dcache_pages(struct bio *bi)
 #endif
 
 extern void bio_copy_data(struct bio *dst, struct bio *src);
+extern int bio_alloc_pages(struct bio *bio, gfp_t gfp);
 
 extern struct bio *bio_copy_user(struct request_queue *, struct rq_map_data *,
 				 unsigned long, unsigned int, int, gfp_t);
-- 
1.7.12

* [PATCH v2 23/26] raid1: use bio_alloc_pages()
       [not found] ` <1347322957-25260-1-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
                     ` (3 preceding siblings ...)
  2012-09-11  0:22   ` [PATCH v2 21/26] block: Convert some code to bio_for_each_segment_all() Kent Overstreet
@ 2012-09-11  0:22   ` Kent Overstreet
       [not found]     ` <1347322957-25260-24-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  2012-09-11  0:22   ` [PATCH v2 25/26] bio-integrity: Add explicit field for owner of bip_buf Kent Overstreet
                     ` (2 subsequent siblings)
  7 siblings, 1 reply; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11  0:22 UTC (permalink / raw)
  To: linux-bcache, linux-kernel, dm-devel; +Cc: axboe, Kent Overstreet, tj, neilb

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
CC: NeilBrown <neilb@suse.de>
---
 drivers/md/raid1.c | 16 +++-------------
 1 file changed, 3 insertions(+), 13 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index d30b4cb..18b743a 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -92,7 +92,6 @@ static void r1bio_pool_free(void *r1_bio, void *data)
 static void * r1buf_pool_alloc(gfp_t gfp_flags, void *data)
 {
 	struct pool_info *pi = data;
-	struct page *page;
 	struct r1bio *r1_bio;
 	struct bio *bio;
 	int i, j;
@@ -122,14 +121,10 @@ static void * r1buf_pool_alloc(gfp_t gfp_flags, void *data)
 		j = 1;
 	while(j--) {
 		bio = r1_bio->bios[j];
-		for (i = 0; i < RESYNC_PAGES; i++) {
-			page = alloc_page(gfp_flags);
-			if (unlikely(!page))
-				goto out_free_pages;
+		bio->bi_vcnt = RESYNC_PAGES;
 
-			bio->bi_io_vec[i].bv_page = page;
-			bio->bi_vcnt = i+1;
-		}
+		if (bio_alloc_pages(bio, gfp_flags))
+			goto out_free_bio;
 	}
 	/* If not user-requests, copy the page pointers to all bios */
 	if (!test_bit(MD_RECOVERY_REQUESTED, &pi->mddev->recovery)) {
@@ -143,11 +138,6 @@ static void * r1buf_pool_alloc(gfp_t gfp_flags, void *data)
 
 	return r1_bio;
 
-out_free_pages:
-	for (j=0 ; j < pi->raid_disks; j++)
-		for (i=0; i < r1_bio->bios[j]->bi_vcnt ; i++)
-			put_page(r1_bio->bios[j]->bi_io_vec[i].bv_page);
-	j = -1;
 out_free_bio:
 	while (++j < pi->raid_disks)
 		bio_put(r1_bio->bios[j]);
-- 
1.7.12

* [PATCH v2 24/26] block: Add an explicit bio flag for bios that own their bvec
  2012-09-11  0:22 [PATCH v2 00/26] Prep work for immutable bio vecs Kent Overstreet
                   ` (17 preceding siblings ...)
  2012-09-11  0:22 ` [PATCH v2 22/26] block: Add bio_alloc_pages() Kent Overstreet
@ 2012-09-11  0:22 ` Kent Overstreet
       [not found] ` <1347322957-25260-1-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  2012-09-20 23:22 ` Tejun Heo
  20 siblings, 0 replies; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11  0:22 UTC (permalink / raw)
  To: linux-bcache, linux-kernel, dm-devel; +Cc: axboe, Kent Overstreet, tj, neilb

This is for the new bio splitting code. When we split a bio, if the
split occurred on a bvec boundary we reuse the bvec for the new bio. But
that means bio_free() can't free it, hence the explicit flag.
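
For illustration only (not part of the patch): a userspace sketch of why an
explicit ownership bit is needed once two bios can point into the same bvec
array. struct mini_bio, OWNS_VEC and the mini_*() helpers are hypothetical
stand-ins. A pointer comparison like the one in bio_has_allocated_vec()
would treat the split bio below as the owner; the flag does not.

```c
#include <stdlib.h>

#define OWNS_VEC 0	/* hypothetical bit number, stands in for BIO_OWNS_VEC */

struct vec {
	int dummy;
};

struct mini_bio {
	unsigned long flags;
	struct vec *io_vec;
};

#define flagged(b, flag)	((b)->flags & (1UL << (flag)))

/* Allocate a bio together with its own vec array and mark ownership. */
struct mini_bio *mini_alloc(int nr)
{
	struct mini_bio *b = calloc(1, sizeof(*b));

	if (!b)
		return NULL;
	b->io_vec = calloc(nr, sizeof(*b->io_vec));
	if (!b->io_vec) {
		free(b);
		return NULL;
	}
	b->flags |= 1UL << OWNS_VEC;
	return b;
}

/* A split on a vec boundary: reuse the parent's array, without owning it. */
struct mini_bio *mini_split_shared(struct mini_bio *parent, int first)
{
	struct mini_bio *b = calloc(1, sizeof(*b));

	if (!b)
		return NULL;
	b->io_vec = parent->io_vec + first;
	return b;
}

/* Returns 1 if the vec array was freed too, 0 if only the bio was. */
int mini_free(struct mini_bio *b)
{
	int freed_vec = 0;

	if (flagged(b, OWNS_VEC)) {
		free(b->io_vec);
		freed_vec = 1;
	}
	free(b);
	return freed_vec;
}
```

Freeing the split bio leaves the shared array alone; only the bio that
allocated the array releases it.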

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
Acked-by: Tejun Heo <tj@kernel.org>
---
 fs/bio.c                  | 4 +++-
 include/linux/bio.h       | 5 -----
 include/linux/blk_types.h | 1 +
 3 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/fs/bio.c b/fs/bio.c
index 65e6eac..5e91e36 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -250,7 +250,7 @@ static void bio_free(struct bio *bio)
 	__bio_free(bio);
 
 	if (bs) {
-		if (bio_has_allocated_vec(bio))
+		if (bio_flagged(bio, BIO_OWNS_VEC))
 			bvec_free_bs(bs, bio->bi_io_vec, BIO_POOL_IDX(bio));
 
 		/*
@@ -449,6 +449,8 @@ struct bio *bio_alloc_bioset(gfp_t gfp_mask, int nr_iovecs, struct bio_set *bs)
 
 		if (unlikely(!bvl))
 			goto err_free;
+
+		bio->bi_flags |= 1 << BIO_OWNS_VEC;
 	} else if (nr_iovecs) {
 		bvl = bio->bi_inline_vecs;
 	}
diff --git a/include/linux/bio.h b/include/linux/bio.h
index bd45154..edd66f3 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -85,11 +85,6 @@ static inline void *bio_data(struct bio *bio)
 	return NULL;
 }
 
-static inline int bio_has_allocated_vec(struct bio *bio)
-{
-	return bio->bi_io_vec && bio->bi_io_vec != bio->bi_inline_vecs;
-}
-
 /*
  * will die
  */
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 3eefbb2..e9375cf 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -117,6 +117,7 @@ struct bio {
  * BIO_POOL_IDX()
  */
 #define BIO_RESET_BITS	12
+#define BIO_OWNS_VEC	12	/* bio_free() should free bvec */
 
 #define bio_flagged(bio, flag)	((bio)->bi_flags & (1 << (flag)))
 
-- 
1.7.12

* [PATCH v2 25/26] bio-integrity: Add explicit field for owner of bip_buf
       [not found] ` <1347322957-25260-1-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
                     ` (4 preceding siblings ...)
  2012-09-11  0:22   ` [PATCH v2 23/26] raid1: use bio_alloc_pages() Kent Overstreet
@ 2012-09-11  0:22   ` Kent Overstreet
       [not found]     ` <1347322957-25260-26-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  2012-09-11  0:22   ` [PATCH v2 26/26] block: Add BIO_SUBMITTED flag, kill BIO_CLONED Kent Overstreet
  2012-09-11  5:22   ` [PATCH v2 00/26] Prep work for immutable bio vecs NeilBrown
  7 siblings, 1 reply; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11  0:22 UTC (permalink / raw)
  To: linux-bcache, linux-kernel, dm-devel; +Cc: axboe, Kent Overstreet, tj, neilb, Martin K. Petersen

This was the only real user of BIO_CLONED, which didn't have very clear
semantics. Convert to its own flag so we can get rid of BIO_CLONED.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
CC: Martin K. Petersen <martin.petersen-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
---
 fs/bio-integrity.c  | 5 ++---
 include/linux/bio.h | 1 +
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/bio-integrity.c b/fs/bio-integrity.c
index e8555a5..462a131 100644
--- a/fs/bio-integrity.c
+++ b/fs/bio-integrity.c
@@ -94,9 +94,7 @@ void bio_integrity_free(struct bio *bio)
 	struct bio_integrity_payload *bip = bio->bi_integrity;
 	struct bio_set *bs = bio->bi_pool;
 
-	/* A cloned bio doesn't own the integrity metadata */
-	if (!bio_flagged(bio, BIO_CLONED) && !bio_flagged(bio, BIO_FS_INTEGRITY)
-	    && bip->bip_buf != NULL)
+	if (bip->bip_owns_buf)
 		kfree(bip->bip_buf);
 
 	if (bs) {
@@ -382,6 +380,7 @@ int bio_integrity_prep(struct bio *bio)
 		return -EIO;
 	}
 
+	bip->bip_owns_buf = 1;
 	bip->bip_buf = buf;
 	bip->bip_size = len;
 	bip->bip_sector = bio->bi_sector;
diff --git a/include/linux/bio.h b/include/linux/bio.h
index edd66f3..f429d0f 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -178,6 +178,7 @@ struct bio_integrity_payload {
 	unsigned short		bip_slab;	/* slab the bip came from */
 	unsigned short		bip_vcnt;	/* # of integrity bio_vecs */
 	unsigned short		bip_idx;	/* current bip_vec index */
+	unsigned		bip_owns_buf:1;	/* should free bip_buf */
 
 	struct work_struct	bip_work;	/* I/O completion */
 
-- 
1.7.12

* [PATCH v2 26/26] block: Add BIO_SUBMITTED flag, kill BIO_CLONED
       [not found] ` <1347322957-25260-1-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
                     ` (5 preceding siblings ...)
  2012-09-11  0:22   ` [PATCH v2 25/26] bio-integrity: Add explicit field for owner of bip_buf Kent Overstreet
@ 2012-09-11  0:22   ` Kent Overstreet
  2012-09-11  5:22   ` [PATCH v2 00/26] Prep work for immutable bio vecs NeilBrown
  7 siblings, 0 replies; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11  0:22 UTC (permalink / raw)
  To: linux-bcache, linux-kernel, dm-devel; +Cc: axboe, Kent Overstreet, tj, neilb

BIO_CLONED wasn't very useful, and didn't have very clear semantics, so
kill it.

Replace it with a more useful flag - BIO_SUBMITTED means the bio has
been passed to generic_make_request() and the bvec can no longer be
modified.

Roll both changes into the same patch so we can steal the old bit for
the new flag.
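
For illustration only (not part of the patch): a userspace sketch of the
intended semantics, assuming the behaviour described above. struct mini_bio,
add_page(), mini_submit() and mini_clone() are hypothetical stand-ins for
struct bio, __bio_add_page(), the flag set in generic_make_request_checks()
and __bio_clone().

```c
#define SUBMITTED 4	/* same bit number the patch assigns to BIO_SUBMITTED */

struct mini_bio {
	unsigned long flags;
	unsigned vcnt;		/* vecs in use */
	unsigned max_vecs;	/* vecs available */
};

#define flagged(b, f)	((b)->flags & (1UL << (f)))

/* Returns the number of bytes added, or 0 if the page was refused. */
unsigned add_page(struct mini_bio *b, unsigned len)
{
	if (flagged(b, SUBMITTED))
		return 0;	/* submitted bio must not modify vec list */
	if (b->vcnt >= b->max_vecs)
		return 0;
	b->vcnt++;
	return len;
}

/* Marks the bio as handed to the block layer. */
void mini_submit(struct mini_bio *b)
{
	b->flags |= 1UL << SUBMITTED;
}

/* The clone inherits only the SUBMITTED bit from its source. */
void mini_clone(struct mini_bio *dst, const struct mini_bio *src)
{
	dst->flags |= src->flags & (1UL << SUBMITTED);
	dst->vcnt = src->vcnt;
	dst->max_vecs = src->max_vecs;
}
```

A clone taken before submission can still grow; a clone of a submitted bio
cannot, which is the distinction BIO_CLONED was unable to express.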

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
---
 block/blk-core.c          | 2 ++
 drivers/md/dm.c           | 1 -
 fs/bio-integrity.c        | 1 -
 fs/bio.c                  | 8 +++++---
 include/linux/blk_types.h | 2 +-
 5 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 97511cb..1d4e893 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1638,6 +1638,8 @@ generic_make_request_checks(struct bio *bio)
 
 	might_sleep();
 
+	bio->bi_flags |= 1 << BIO_SUBMITTED;
+
 	if (bio_check_eod(bio, nr_sectors))
 		goto end_io;
 
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 8378797..777e70d 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1065,7 +1065,6 @@ static struct bio *split_bvec(struct bio *bio, sector_t sector,
 	clone->bi_size = to_bytes(len);
 	clone->bi_io_vec->bv_offset = offset;
 	clone->bi_io_vec->bv_len = clone->bi_size;
-	clone->bi_flags |= 1 << BIO_CLONED;
 
 	if (bio_integrity(bio)) {
 		bio_integrity_clone(clone, bio, GFP_NOIO);
diff --git a/fs/bio-integrity.c b/fs/bio-integrity.c
index 462a131..a77a566 100644
--- a/fs/bio-integrity.c
+++ b/fs/bio-integrity.c
@@ -621,7 +621,6 @@ void bio_integrity_trim(struct bio *bio, unsigned int offset,
 
 	BUG_ON(bip == NULL);
 	BUG_ON(bi == NULL);
-	BUG_ON(!bio_flagged(bio, BIO_CLONED));
 
 	nr_sectors = bio_integrity_hw_sectors(bi, sectors);
 	bip->bip_sector = bip->bip_sector + offset;
diff --git a/fs/bio.c b/fs/bio.c
index 5e91e36..d3b6e2a 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -531,7 +531,7 @@ void __bio_clone(struct bio *bio, struct bio *bio_src)
 	 */
 	bio->bi_sector = bio_src->bi_sector;
 	bio->bi_bdev = bio_src->bi_bdev;
-	bio->bi_flags |= 1 << BIO_CLONED;
+	bio->bi_flags |= (bio_src->bi_flags & (1 << BIO_SUBMITTED));
 	bio->bi_rw = bio_src->bi_rw;
 	bio->bi_vcnt = bio_src->bi_vcnt;
 	bio->bi_size = bio_src->bi_size;
@@ -604,9 +604,9 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page
 	struct bio_vec *bvec;
 
 	/*
-	 * cloned bio must not modify vec list
+	 * submitted bio must not modify vec list
 	 */
-	if (unlikely(bio_flagged(bio, BIO_CLONED)))
+	if (unlikely(bio_flagged(bio, BIO_SUBMITTED)))
 		return 0;
 
 	if (((bio->bi_size + len) >> 9) > max_sectors)
@@ -844,6 +844,8 @@ int bio_alloc_pages(struct bio *bio, gfp_t gfp_mask)
 	int i;
 	struct bio_vec *bv;
 
+	BUG_ON(bio_flagged(bio, BIO_SUBMITTED));
+
 	bio_for_each_segment_all(bv, bio, i) {
 		bv->bv_page = alloc_page(gfp_mask);
 		if (!bv->bv_page) {
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index e9375cf..fb49107 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -103,7 +103,7 @@ struct bio {
 #define BIO_RW_BLOCK	1	/* RW_AHEAD set, and read/write would block */
 #define BIO_EOF		2	/* out-out-bounds error */
 #define BIO_SEG_VALID	3	/* bi_phys_segments valid */
-#define BIO_CLONED	4	/* doesn't own data */
+#define BIO_SUBMITTED	4	/* bio has been submitted */
 #define BIO_BOUNCED	5	/* bio is a bounce bio */
 #define BIO_USER_MAPPED 6	/* contains user pages */
 #define BIO_EOPNOTSUPP	7	/* not supported */
-- 
1.7.12

* Re: [PATCH v2 12/26] raid1: use bio_reset()
       [not found]     ` <1347322957-25260-13-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-11  4:59       ` NeilBrown
  2012-09-11 18:28         ` Kent Overstreet
  0 siblings, 1 reply; 81+ messages in thread
From: NeilBrown @ 2012-09-11  4:59 UTC (permalink / raw)
  To: Kent Overstreet; +Cc: linux-bcache, linux-kernel, dm-devel, axboe, tj

On Mon, 10 Sep 2012 17:22:23 -0700 Kent Overstreet <koverstreet@google.com>
wrote:

> I couldn't figure out what sbio->bi_end_io in process_checks() was
> supposed to be, so I took the easy way out.

Almost.
You save 'sbio->bi_end_io' to 'bi_end_io', then do nothing with it...

A little way above the 'fixup the bio for reuse' comment you'll find:

		struct bio *sbio = r1_bio->bios[i];
....
		if (r1_bio->bios[i]->bi_end_io != end_sync_read)
			continue;

which implies that if we don't 'continue', then sbio->bi_end_io ==
end_sync_read.

So I suspect you want to add
    sbio->bi_end_io = end_sync_read;
somewhere after the 'bio_reset()'.

If you happened to also fix that 'if' that I quoted so that it reads:

     if (sbio->bi_end_io != end_sync_read)
           continue;

I wouldn't complain at all :-)
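
A minimal user-space sketch of the point above, assuming that bio_reset()
clears the completion callback and private pointer along with the rest of the
per-I/O state. struct toy_bio, toy_bio_reset() and toy_fixup_for_reuse() are
hypothetical stand-ins for illustration, not the kernel's definitions:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_bio;
typedef void (toy_end_io_t)(struct toy_bio *, int);

/* Hypothetical stand-in for struct bio: only the fields discussed here. */
struct toy_bio {
	unsigned long	bi_flags;
	toy_end_io_t	*bi_end_io;
	void		*bi_private;
	unsigned int	bi_max_vecs;	/* assumed to survive a reset */
};

/* Stand-in for end_sync_read(); does nothing in this model. */
static void toy_end_sync_read(struct toy_bio *bio, int error)
{
	(void)bio;
	(void)error;
}

/* Assumed behaviour: zero every field that precedes bi_max_vecs. */
static void toy_bio_reset(struct toy_bio *bio)
{
	memset(bio, 0, offsetof(struct toy_bio, bi_max_vecs));
}

/* Reset, then restore what the reset wiped out. */
static void toy_fixup_for_reuse(struct toy_bio *sbio, void *r1_bio)
{
	toy_bio_reset(sbio);
	sbio->bi_end_io = toy_end_sync_read;
	sbio->bi_private = r1_bio;
}
```

In this model, saving the old callback before the reset is only useful if it
is written back afterwards; assigning the known value directly avoids the
temporary altogether.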

Thanks,
NeilBrown


> 
> Signed-off-by: Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> CC: Jens Axboe <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>
> CC: NeilBrown <neilb-l3A5Bk7waGM@public.gmane.org>
> ---
>  drivers/md/raid1.c | 22 +++++-----------------
>  1 file changed, 5 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index ee85154..bd3e3b9 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -1835,6 +1835,7 @@ static int process_checks(struct r1bio *r1_bio)
>  	int primary;
>  	int i;
>  	int vcnt;
> +	bio_end_io_t *bi_end_io;
>  
>  	for (primary = 0; primary < conf->raid_disks * 2; primary++)
>  		if (r1_bio->bios[primary]->bi_end_io == end_sync_read &&
> @@ -1876,13 +1877,11 @@ static int process_checks(struct r1bio *r1_bio)
>  			continue;
>  		}
>  		/* fixup the bio for reuse */
> +		bi_end_io = sbio->bi_end_io;
> +		bio_reset(sbio);
> +
>  		sbio->bi_vcnt = vcnt;
>  		sbio->bi_size = r1_bio->sectors << 9;
> -		sbio->bi_idx = 0;
> -		sbio->bi_phys_segments = 0;
> -		sbio->bi_flags &= ~(BIO_POOL_MASK - 1);
> -		sbio->bi_flags |= 1 << BIO_UPTODATE;
> -		sbio->bi_next = NULL;
>  		sbio->bi_sector = r1_bio->sector +
>  			conf->mirrors[i].rdev->data_offset;
>  		sbio->bi_bdev = conf->mirrors[i].rdev->bdev;
> @@ -2426,18 +2425,7 @@ static sector_t sync_request(struct mddev *mddev, sector_t sector_nr, int *skipp
>  	for (i = 0; i < conf->raid_disks * 2; i++) {
>  		struct md_rdev *rdev;
>  		bio = r1_bio->bios[i];
> -
> -		/* take from bio_init */
> -		bio->bi_next = NULL;
> -		bio->bi_flags &= ~(BIO_POOL_MASK-1);
> -		bio->bi_flags |= 1 << BIO_UPTODATE;
> -		bio->bi_rw = READ;
> -		bio->bi_vcnt = 0;
> -		bio->bi_idx = 0;
> -		bio->bi_phys_segments = 0;
> -		bio->bi_size = 0;
> -		bio->bi_end_io = NULL;
> -		bio->bi_private = NULL;
> +		bio_reset(bio);
>  
>  		rdev = rcu_dereference(conf->mirrors[i].rdev);
>  		if (rdev == NULL ||



* Re: [PATCH v2 13/26] raid5: use bio_reset()
       [not found]   ` <1347322957-25260-14-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-11  5:03     ` NeilBrown
       [not found]       ` <20120911150326.79f066c0-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
  0 siblings, 1 reply; 81+ messages in thread
From: NeilBrown @ 2012-09-11  5:03 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	tj-DgEjT+Ai2ygdnm+yROfE0A

On Mon, 10 Sep 2012 17:22:24 -0700 Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
wrote:

> Had to shuffle the code around a bit (where bi_rw and bi_end_io were
> set), but shouldn't really be anything tricky here
> 
> Signed-off-by: Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> CC: Jens Axboe <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>
> CC: NeilBrown <neilb-l3A5Bk7waGM@public.gmane.org>
> ---
>  drivers/md/raid5.c | 28 ++++++++++++++--------------
>  1 file changed, 14 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 7c19dbe..ebe43f7 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -561,14 +561,6 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
>  		bi = &sh->dev[i].req;
>  		rbi = &sh->dev[i].rreq; /* For writing to replacement */
>  
> -		bi->bi_rw = rw;
> -		rbi->bi_rw = rw;
> -		if (rw & WRITE) {
> -			bi->bi_end_io = raid5_end_write_request;
> -			rbi->bi_end_io = raid5_end_write_request;
> -		} else
> -			bi->bi_end_io = raid5_end_read_request;
> -
>  		rcu_read_lock();
>  		rrdev = rcu_dereference(conf->disks[i].replacement);
>  		smp_mb(); /* Ensure that if rrdev is NULL, rdev won't be */
> @@ -643,7 +635,14 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
>  
>  			set_bit(STRIPE_IO_STARTED, &sh->state);
>  
> +			bio_reset(bi);
>  			bi->bi_bdev = rdev->bdev;
> +			bi->bi_rw = rw;
> +			bi->bi_end_io = (rw & WRITE)
> +				? raid5_end_write_request
> +				: raid5_end_read_request;
> +			bi->bi_private = sh;
> +
>  			pr_debug("%s: for %llu schedule op %ld on disc %d\n",
>  				__func__, (unsigned long long)sh->sector,
>  				bi->bi_rw, i);
> @@ -657,12 +656,9 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
>  			if (test_bit(R5_ReadNoMerge, &sh->dev[i].flags))
>  				bi->bi_rw |= REQ_FLUSH;
>  
> -			bi->bi_flags = 1 << BIO_UPTODATE;
> -			bi->bi_idx = 0;
>  			bi->bi_io_vec[0].bv_len = STRIPE_SIZE;
>  			bi->bi_io_vec[0].bv_offset = 0;
>  			bi->bi_size = STRIPE_SIZE;
> -			bi->bi_next = NULL;
>  			if (rrdev)
>  				set_bit(R5_DOUBLE_LOCKED, &sh->dev[i].flags);
>  			generic_make_request(bi);
> @@ -674,7 +670,14 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
>  
>  			set_bit(STRIPE_IO_STARTED, &sh->state);
>  
> +			bio_reset(rbi);
>  			rbi->bi_bdev = rrdev->bdev;
> +			rbi->bi_rw = rw;
> +			rbi->bi_end_io = (rw & WRITE)
> +				? raid5_end_write_request
> +				: raid5_end_read_request;

'rbi->bi_end_io' can only ever be raid5_end_write_request.  We only get here
on a write.
I'd be OK with 
    BUG_ON(!(rw & WRITE));
but I don't want the condition in the assignment.
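
A rough user-space sketch of that shape, assuming the replacement path is only
ever reached for writes. TOY_BUG_ON(), TOY_WRITE, struct toy_bio and
toy_setup_replacement() are hypothetical stand-ins, not kernel code:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define TOY_WRITE 1UL	/* hypothetical flag bit standing in for WRITE */

/* User-space stand-in for BUG_ON(): report and abort if cond holds. */
#define TOY_BUG_ON(cond)					\
	do {							\
		if (cond) {					\
			fprintf(stderr, "BUG: %s\n", #cond);	\
			abort();				\
		}						\
	} while (0)

struct toy_bio;
typedef void (toy_end_io_t)(struct toy_bio *, int);

/* Hypothetical stand-in for struct bio: only the two fields used here. */
struct toy_bio {
	unsigned long	bi_rw;
	toy_end_io_t	*bi_end_io;
};

/* Stand-in for raid5_end_write_request(); does nothing in this model. */
static void toy_end_write_request(struct toy_bio *bio, int error)
{
	(void)bio;
	(void)error;
}

/* Set up the replacement bio: assert the invariant, assign unconditionally. */
static void toy_setup_replacement(struct toy_bio *rbi, unsigned long rw)
{
	TOY_BUG_ON(!(rw & TOY_WRITE));
	rbi->bi_rw = rw;
	rbi->bi_end_io = toy_end_write_request;
}
```

The assertion documents the write-only invariant, while the assignment no
longer suggests that a read completion handler could ever be chosen here.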

The rest looks quite sane.

Thanks
NeilBrown



* Re: [PATCH v2 00/26] Prep work for immutable bio vecs
       [not found] ` <1347322957-25260-1-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
                     ` (6 preceding siblings ...)
  2012-09-11  0:22   ` [PATCH v2 26/26] block: Add BIO_SUBMITTED flag, kill BIO_CLONED Kent Overstreet
@ 2012-09-11  5:22   ` NeilBrown
  7 siblings, 0 replies; 81+ messages in thread
From: NeilBrown @ 2012-09-11  5:22 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	tj-DgEjT+Ai2ygdnm+yROfE0A

On Mon, 10 Sep 2012 17:22:11 -0700 Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
wrote:

> Random assortment of refactoring and trivial cleanups;
> 
> Immutable bio vecs and efficient bio splitting require auditing and
> removing pretty much all bi_idx uses, among other things.
> 
> The reason is that with immutable bio vecs we can't use the bvec array
> directly; if we have a partially completed bvec, that'll be indicated
> with a field in struct bvec_iter (which gets embedded in struct bio) -
> bi_bvec_done.
> 
> bio_for_each_segment() will handle this transparently, so code needs to
> be converted to use it or some other generic accessor.
> 
> Also, bio splitting means that when a driver gets a bio, bi_idx and
> bi_bvec_done may both be nonzero. Again, just need to use generic
> accessors.
> 
> v2: Patch series now has all the prep work to be done before abstracting
> out the bio iterator, I think.

Hi Kent,
 this looks pretty good to me.  I've only really looked closely at the md
 bits, but they all seem to make sense (with the minor issues that I reported
 separately).

 If/when there is another posting I'll try to allocate some time to testing
 them and looking more closely.

thanks!

NeilBrown



* Re: [PATCH v2 12/26] raid1: use bio_reset()
  2012-09-11  4:59       ` NeilBrown
@ 2012-09-11 18:28         ` Kent Overstreet
       [not found]           ` <20120911182825.GG19739-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11 18:28 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-bcache, linux-kernel, dm-devel, axboe, tj

On Tue, Sep 11, 2012 at 02:59:13PM +1000, NeilBrown wrote:
> On Mon, 10 Sep 2012 17:22:23 -0700 Kent Overstreet <koverstreet@google.com>
> wrote:
> 
> > I couldn't figure out what sbio->bi_end_io in process_checks() was
> > supposed to be, so I took the easy way out.
> 
> Almost.
> You save 'sbio->bi_end_io' to 'bi_end_io', then do nothing with it...

Whoops :) I think I must've gotten distracted and forgot to finish that
patch; I wasn't setting bi_private either.

> 
> A little way above the 'fixup the bio for reuse' comment you'll find:
> 
> 		struct bio *sbio = r1_bio->bios[i];
> ....
> 		if (r1_bio->bios[i]->bi_end_io != end_sync_read)
> 			continue;
> 
> which implies that if we don't 'continue', then sbio->bi_end_io ==
> end_sync_read.

Ahh. I remember reading that, but I missed that it was sbio being
checked.

> 
> So I suspect you want to add
>     sbio->bi_end_io = end_sync_read;
> somewhere after the 'bio_reset()'.
> 
> If you happened to also fix that 'if' that I quoted so that it reads:
> 
>      if (sbio->bi_end_io != end_sync_read)
>            continue;

Will do! How's this look?


commit 40a4645a4346edd040066baedcf2184ac4211ba7
Author: Kent Overstreet <koverstreet@google.com>
Date:   Tue Sep 11 11:26:12 2012 -0700

    raid1: use bio_reset()
    
    Signed-off-by: Kent Overstreet <koverstreet@google.com>
    CC: Jens Axboe <axboe@kernel.dk>
    CC: NeilBrown <neilb@suse.de>

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index ee85154..df68691 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1851,7 +1851,7 @@ static int process_checks(struct r1bio *r1_bio)
 		struct bio *sbio = r1_bio->bios[i];
 		int size;
 
-		if (r1_bio->bios[i]->bi_end_io != end_sync_read)
+		if (sbio->bi_end_io != end_sync_read)
 			continue;
 
 		if (test_bit(BIO_UPTODATE, &sbio->bi_flags)) {
@@ -1876,16 +1876,15 @@ static int process_checks(struct r1bio *r1_bio)
 			continue;
 		}
 		/* fixup the bio for reuse */
+		bio_reset(sbio);
 		sbio->bi_vcnt = vcnt;
 		sbio->bi_size = r1_bio->sectors << 9;
-		sbio->bi_idx = 0;
-		sbio->bi_phys_segments = 0;
-		sbio->bi_flags &= ~(BIO_POOL_MASK - 1);
-		sbio->bi_flags |= 1 << BIO_UPTODATE;
-		sbio->bi_next = NULL;
 		sbio->bi_sector = r1_bio->sector +
 			conf->mirrors[i].rdev->data_offset;
 		sbio->bi_bdev = conf->mirrors[i].rdev->bdev;
+		sbio->bi_end_io = end_sync_read;
+		sbio->bi_private = r1_bio;
+
 		size = sbio->bi_size;
 		for (j = 0; j < vcnt ; j++) {
 			struct bio_vec *bi;
@@ -2426,18 +2425,7 @@ static sector_t sync_request(struct mddev *mddev, sector_t sector_nr, int *skipp
 	for (i = 0; i < conf->raid_disks * 2; i++) {
 		struct md_rdev *rdev;
 		bio = r1_bio->bios[i];
-
-		/* take from bio_init */
-		bio->bi_next = NULL;
-		bio->bi_flags &= ~(BIO_POOL_MASK-1);
-		bio->bi_flags |= 1 << BIO_UPTODATE;
-		bio->bi_rw = READ;
-		bio->bi_vcnt = 0;
-		bio->bi_idx = 0;
-		bio->bi_phys_segments = 0;
-		bio->bi_size = 0;
-		bio->bi_end_io = NULL;
-		bio->bi_private = NULL;
+		bio_reset(bio);
 
 		rdev = rcu_dereference(conf->mirrors[i].rdev);
 		if (rdev == NULL ||

* Re: [PATCH v2 13/26] raid5: use bio_reset()
       [not found]       ` <20120911150326.79f066c0-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
@ 2012-09-11 19:26         ` Kent Overstreet
  0 siblings, 0 replies; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11 19:26 UTC (permalink / raw)
  To: NeilBrown
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	tj-DgEjT+Ai2ygdnm+yROfE0A

On Tue, Sep 11, 2012 at 03:03:26PM +1000, NeilBrown wrote:
> On Mon, 10 Sep 2012 17:22:24 -0700 Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> wrote:
> 
> > Had to shuffle the code around a bit (where bi_rw and bi_end_io were
> > set), but shouldn't really be anything tricky here
> > 
> > Signed-off-by: Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> > CC: Jens Axboe <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>
> > CC: NeilBrown <neilb-l3A5Bk7waGM@public.gmane.org>
> > ---
> >  drivers/md/raid5.c | 28 ++++++++++++++--------------
> >  1 file changed, 14 insertions(+), 14 deletions(-)
> > 
> > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> > index 7c19dbe..ebe43f7 100644
> > --- a/drivers/md/raid5.c
> > +++ b/drivers/md/raid5.c
> > @@ -561,14 +561,6 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
> >  		bi = &sh->dev[i].req;
> >  		rbi = &sh->dev[i].rreq; /* For writing to replacement */
> >  
> > -		bi->bi_rw = rw;
> > -		rbi->bi_rw = rw;
> > -		if (rw & WRITE) {
> > -			bi->bi_end_io = raid5_end_write_request;
> > -			rbi->bi_end_io = raid5_end_write_request;
> > -		} else
> > -			bi->bi_end_io = raid5_end_read_request;
> > -
> >  		rcu_read_lock();
> >  		rrdev = rcu_dereference(conf->disks[i].replacement);
> >  		smp_mb(); /* Ensure that if rrdev is NULL, rdev won't be */
> > @@ -643,7 +635,14 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
> >  
> >  			set_bit(STRIPE_IO_STARTED, &sh->state);
> >  
> > +			bio_reset(bi);
> >  			bi->bi_bdev = rdev->bdev;
> > +			bi->bi_rw = rw;
> > +			bi->bi_end_io = (rw & WRITE)
> > +				? raid5_end_write_request
> > +				: raid5_end_read_request;
> > +			bi->bi_private = sh;
> > +
> >  			pr_debug("%s: for %llu schedule op %ld on disc %d\n",
> >  				__func__, (unsigned long long)sh->sector,
> >  				bi->bi_rw, i);
> > @@ -657,12 +656,9 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
> >  			if (test_bit(R5_ReadNoMerge, &sh->dev[i].flags))
> >  				bi->bi_rw |= REQ_FLUSH;
> >  
> > -			bi->bi_flags = 1 << BIO_UPTODATE;
> > -			bi->bi_idx = 0;
> >  			bi->bi_io_vec[0].bv_len = STRIPE_SIZE;
> >  			bi->bi_io_vec[0].bv_offset = 0;
> >  			bi->bi_size = STRIPE_SIZE;
> > -			bi->bi_next = NULL;
> >  			if (rrdev)
> >  				set_bit(R5_DOUBLE_LOCKED, &sh->dev[i].flags);
> >  			generic_make_request(bi);
> > @@ -674,7 +670,14 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
> >  
> >  			set_bit(STRIPE_IO_STARTED, &sh->state);
> >  
> > +			bio_reset(rbi);
> >  			rbi->bi_bdev = rrdev->bdev;
> > +			rbi->bi_rw = rw;
> > +			rbi->bi_end_io = (rw & WRITE)
> > +				? raid5_end_write_request
> > +				: raid5_end_read_request;
> 
> 'rbi->bi_end_io' can only ever be raid5_end_write_request.  We only get here
> on a write.
> I'd be OK with 
>     BUG_ON(!(rw & WRITE));
> but I don't want the condition in the assignment.

I was thinking to myself that if I was doing a bit more with that code,
I'd factor out all the code in the if (rdev) {} into a separate function
that was called twice there. But, I didn't do that here so you're right
- I'll change it and stick a BUG_ON() there.

> The rest looks quite sane.

Thanks!

* Re: [dm-devel] [PATCH v2 01/26] block: Convert integrity to bvec_alloc_bs(), and a bugfix
       [not found]   ` <1347322957-25260-2-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-11 20:36     ` Vivek Goyal
  2012-09-11 20:48       ` Kent Overstreet
  2012-09-11 22:07       ` Kent Overstreet
  2012-09-20 21:53     ` Tejun Heo
  1 sibling, 2 replies; 81+ messages in thread
From: Vivek Goyal @ 2012-09-11 20:36 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	tj-DgEjT+Ai2ygdnm+yROfE0A, Martin K. Petersen

On Mon, Sep 10, 2012 at 05:22:12PM -0700, Kent Overstreet wrote:
> This adds a pointer to the bvec array to struct bio_integrity_payload,
> instead of the bvecs always being inline; then the bvecs are allocated
> with bvec_alloc_bs().

If you start allocating bvecs from the same mempool for both bio and bip,
are you not breaking the principle of avoiding multiple allocations from a
single mempool, and hence increasing the possibility of deadlock?
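
A toy model of that concern, assuming two submitters that each need two
elements (one for the bio's bvecs, one for the bip's) and each take their first
element before either takes its second. struct toy_pool and both functions are
hypothetical; a real mempool would put the caller to sleep rather than fail:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy pool: just a count of reserved elements still free. */
struct toy_pool {
	int free;
};

/* Take one element if any is left; failure models "would sleep forever". */
static bool toy_pool_take(struct toy_pool *p)
{
	if (p->free == 0)
		return false;
	p->free--;
	return true;
}

/*
 * Submitters A and B each take from 'first', then from 'second'.
 * Returns true if at least one of them obtained both elements and can
 * therefore complete and hand its elements back.
 */
static bool toy_two_submitters_progress(struct toy_pool *first,
					struct toy_pool *second)
{
	bool a1 = toy_pool_take(first);
	bool b1 = toy_pool_take(first);
	bool a2 = a1 && toy_pool_take(second);
	bool b2 = b1 && toy_pool_take(second);

	return a2 || b2;
}
```

Passing the same pool (with a reserve of two) for both arguments leaves each
submitter holding one element and waiting for the other; with two separate
pools of two elements each, at least one submitter gets everything it needs.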

Also there seems to be too much happening in this patch. Please break
it into two patches: first fix the bio integrity bug you mentioned, then
introduce your changes on top.

Thanks
Vivek

> 
> This is needed eventually for immutable bio vecs - immutable bvecs
> aren't useful if we still have to copy them, hence the need for the
> pointer. Less code is always nice too, though.
> 
> Also fix an amusing bug in bio_integrity_split() - struct bio_pair
> doesn't have the integrity bvecs after the bio_integrity_payloads, so
> there was a buffer overrun. The code was confusing pointers with arrays.
> 
> Signed-off-by: Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> CC: Jens Axboe <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>
> CC: Martin K. Petersen <martin.petersen-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
> ---
>  fs/bio-integrity.c  | 124 +++++++++++++++++-----------------------------------
>  include/linux/bio.h |   5 ++-
>  2 files changed, 43 insertions(+), 86 deletions(-)
> 
> diff --git a/fs/bio-integrity.c b/fs/bio-integrity.c
> index a3f28f3..1d64f7f 100644
> --- a/fs/bio-integrity.c
> +++ b/fs/bio-integrity.c
> @@ -27,48 +27,11 @@
>  #include <linux/workqueue.h>
>  #include <linux/slab.h>
>  
> -struct integrity_slab {
> -	struct kmem_cache *slab;
> -	unsigned short nr_vecs;
> -	char name[8];
> -};
> -
> -#define IS(x) { .nr_vecs = x, .name = "bip-"__stringify(x) }
> -struct integrity_slab bip_slab[BIOVEC_NR_POOLS] __read_mostly = {
> -	IS(1), IS(4), IS(16), IS(64), IS(128), IS(BIO_MAX_PAGES),
> -};
> -#undef IS
> +#define BIP_INLINE_VECS	4
>  
> +static struct kmem_cache *bip_slab;
>  static struct workqueue_struct *kintegrityd_wq;
>  
> -static inline unsigned int vecs_to_idx(unsigned int nr)
> -{
> -	switch (nr) {
> -	case 1:
> -		return 0;
> -	case 2 ... 4:
> -		return 1;
> -	case 5 ... 16:
> -		return 2;
> -	case 17 ... 64:
> -		return 3;
> -	case 65 ... 128:
> -		return 4;
> -	case 129 ... BIO_MAX_PAGES:
> -		return 5;
> -	default:
> -		BUG();
> -	}
> -}
> -
> -static inline int use_bip_pool(unsigned int idx)
> -{
> -	if (idx == BIOVEC_MAX_IDX)
> -		return 1;
> -
> -	return 0;
> -}
> -
>  /**
>   * bio_integrity_alloc - Allocate integrity payload and attach it to bio
>   * @bio:	bio to attach integrity metadata to
> @@ -84,37 +47,38 @@ struct bio_integrity_payload *bio_integrity_alloc(struct bio *bio,
>  						  unsigned int nr_vecs)
>  {
>  	struct bio_integrity_payload *bip;
> -	unsigned int idx = vecs_to_idx(nr_vecs);
>  	struct bio_set *bs = bio->bi_pool;
> +	unsigned long idx = BIO_POOL_NONE;
> +	unsigned inline_vecs;
> +
> +	if (!bs) {
> +		bip = kmalloc(sizeof(struct bio_integrity_payload) +
> +			      sizeof(struct bio_vec) * nr_vecs, gfp_mask);
> +		inline_vecs = nr_vecs;
> +	} else {
> +		bip = mempool_alloc(bs->bio_integrity_pool, gfp_mask);
> +		inline_vecs = BIP_INLINE_VECS;
> +	}
>  
> -	if (!bs)
> -		bs = fs_bio_set;
> -
> -	BUG_ON(bio == NULL);
> -	bip = NULL;
> +	if (unlikely(!bip))
> +		return NULL;
>  
> -	/* Lower order allocations come straight from slab */
> -	if (!use_bip_pool(idx))
> -		bip = kmem_cache_alloc(bip_slab[idx].slab, gfp_mask);
> +	memset(bip, 0, sizeof(struct bio_integrity_payload));
>  
> -	/* Use mempool if lower order alloc failed or max vecs were requested */
> -	if (bip == NULL) {
> -		idx = BIOVEC_MAX_IDX;  /* so we free the payload properly later */
> -		bip = mempool_alloc(bs->bio_integrity_pool, gfp_mask);
> -
> -		if (unlikely(bip == NULL)) {
> -			printk(KERN_ERR "%s: could not alloc bip\n", __func__);
> -			return NULL;
> -		}
> +	if (nr_vecs > inline_vecs) {
> +		bip->bip_vec = bvec_alloc_bs(gfp_mask, nr_vecs, &idx, bs);
> +		if (!bip->bip_vec)
> +			goto err;
>  	}
>  
> -	memset(bip, 0, sizeof(*bip));
> -
>  	bip->bip_slab = idx;
>  	bip->bip_bio = bio;
>  	bio->bi_integrity = bip;
>  
>  	return bip;
> +err:
> +	mempool_free(bip, bs->bio_integrity_pool);
> +	return NULL;
>  }
>  EXPORT_SYMBOL(bio_integrity_alloc);
>  
> @@ -130,20 +94,19 @@ void bio_integrity_free(struct bio *bio)
>  	struct bio_integrity_payload *bip = bio->bi_integrity;
>  	struct bio_set *bs = bio->bi_pool;
>  
> -	if (!bs)
> -		bs = fs_bio_set;
> -
> -	BUG_ON(bip == NULL);
> -
>  	/* A cloned bio doesn't own the integrity metadata */
>  	if (!bio_flagged(bio, BIO_CLONED) && !bio_flagged(bio, BIO_FS_INTEGRITY)
>  	    && bip->bip_buf != NULL)
>  		kfree(bip->bip_buf);
>  
> -	if (use_bip_pool(bip->bip_slab))
> +	if (bs) {
> +		if (bip->bip_slab != BIO_POOL_NONE)
> +			bvec_free_bs(bs, bip->bip_vec, bip->bip_slab);
> +
>  		mempool_free(bip, bs->bio_integrity_pool);
> -	else
> -		kmem_cache_free(bip_slab[bip->bip_slab].slab, bip);
> +	} else {
> +		kfree(bip);
> +	}
>  
>  	bio->bi_integrity = NULL;
>  }
> @@ -697,8 +660,8 @@ void bio_integrity_split(struct bio *bio, struct bio_pair *bp, int sectors)
>  	bp->iv1 = bip->bip_vec[0];
>  	bp->iv2 = bip->bip_vec[0];
>  
> -	bp->bip1.bip_vec[0] = bp->iv1;
> -	bp->bip2.bip_vec[0] = bp->iv2;
> +	bp->bip1.bip_vec = &bp->iv1;
> +	bp->bip2.bip_vec = &bp->iv2;
>  
>  	bp->iv1.bv_len = sectors * bi->tuple_size;
>  	bp->iv2.bv_offset += sectors * bi->tuple_size;
> @@ -746,13 +709,10 @@ EXPORT_SYMBOL(bio_integrity_clone);
>  
>  int bioset_integrity_create(struct bio_set *bs, int pool_size)
>  {
> -	unsigned int max_slab = vecs_to_idx(BIO_MAX_PAGES);
> -
>  	if (bs->bio_integrity_pool)
>  		return 0;
>  
> -	bs->bio_integrity_pool =
> -		mempool_create_slab_pool(pool_size, bip_slab[max_slab].slab);
> +	bs->bio_integrity_pool = mempool_create_slab_pool(pool_size, bip_slab);
>  
>  	if (!bs->bio_integrity_pool)
>  		return -1;
> @@ -770,8 +730,6 @@ EXPORT_SYMBOL(bioset_integrity_free);
>  
>  void __init bio_integrity_init(void)
>  {
> -	unsigned int i;
> -
>  	/*
>  	 * kintegrityd won't block much but may burn a lot of CPU cycles.
>  	 * Make it highpri CPU intensive wq with max concurrency of 1.
> @@ -781,14 +739,10 @@ void __init bio_integrity_init(void)
>  	if (!kintegrityd_wq)
>  		panic("Failed to create kintegrityd\n");
>  
> -	for (i = 0 ; i < BIOVEC_NR_POOLS ; i++) {
> -		unsigned int size;
> -
> -		size = sizeof(struct bio_integrity_payload)
> -			+ bip_slab[i].nr_vecs * sizeof(struct bio_vec);
> -
> -		bip_slab[i].slab =
> -			kmem_cache_create(bip_slab[i].name, size, 0,
> -					  SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
> -	}
> +	bip_slab = kmem_cache_create("bio_integrity_payload",
> +				     sizeof(struct bio_integrity_payload) +
> +				     sizeof(struct bio_vec) * BIP_INLINE_VECS,
> +				     0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
> +	if (!bip_slab)
> +		panic("Failed to create slab\n");
>  }
> diff --git a/include/linux/bio.h b/include/linux/bio.h
> index c32ea0d..7873465 100644
> --- a/include/linux/bio.h
> +++ b/include/linux/bio.h
> @@ -182,7 +182,10 @@ struct bio_integrity_payload {
>  	unsigned short		bip_idx;	/* current bip_vec index */
>  
>  	struct work_struct	bip_work;	/* I/O completion */
> -	struct bio_vec		bip_vec[0];	/* embedded bvec array */
> +
> +	struct bio_vec		*bip_vec;
> +	struct bio_vec		bip_inline_vecs[0];/* embedded bvec array */
> +
>  };
>  #endif /* CONFIG_BLK_DEV_INTEGRITY */
>  
> -- 
> 1.7.12
> 

* Re: [dm-devel] [PATCH v2 01/26] block: Convert integrity to bvec_alloc_bs(), and a bugfix
  2012-09-11 20:36     ` [dm-devel] " Vivek Goyal
@ 2012-09-11 20:48       ` Kent Overstreet
  2012-09-11 22:07       ` Kent Overstreet
  1 sibling, 0 replies; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11 20:48 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: linux-bcache, linux-kernel, dm-devel, axboe, tj, Martin K. Petersen

On Tue, Sep 11, 2012 at 04:36:43PM -0400, Vivek Goyal wrote:
> On Mon, Sep 10, 2012 at 05:22:12PM -0700, Kent Overstreet wrote:
> > This adds a pointer to the bvec array to struct bio_integrity_payload,
> > instead of the bvecs always being inline; then the bvecs are allocated
> > with bvec_alloc_bs().
> 
> If you start allocating bvecs from the same mempool for both bio and bip,
> are you not breaking the principle of avoiding multiple allocations from a
> single mempool, and hence increasing the possibility of deadlock?

Argh, you're right. I should've thought about that.

It might be nice if mempools had some kind of internal watermark... here
we know we're only going to do at most 2 allocations from the same
mempool, so if we just passed the number we already had allocated to
mempool_alloc(), it could do the right thing and ensure we never
deadlocked.

I _think_ that'd let us make more efficient use of the reserve. But, for
just this it's not really worth the extra code, I think I'll just have
to add another mempool.
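
One possible reading of the watermark idea, as a toy sketch only: the caller
tells the pool how many elements it already holds, and the pool grants a
request only if the reserve covers everything that caller still needs.
struct toy_reserve, toy_reserve_take() and toy_reserve_give() are
hypothetical; the kernel's mempool_alloc() takes no such argument:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reserve: a count of preallocated elements still free. */
struct toy_reserve {
	int free;
};

/*
 * held:       elements the caller already holds from this reserve
 * max_needed: the most elements the caller will ever hold at once
 *
 * Grant the request only if the reserve can cover all the elements the
 * caller still needs, including this one.
 */
static bool toy_reserve_take(struct toy_reserve *r, int held, int max_needed)
{
	int still_needed = max_needed - held;

	if (r->free < still_needed)
		return false;	/* a real pool would make the caller wait */
	r->free--;
	return true;
}

/* Hand n elements back to the reserve. */
static void toy_reserve_give(struct toy_reserve *r, int n)
{
	r->free += n;
}
```

With a reserve of two and callers that need two elements each, a second caller
is held off at its first allocation until the first caller has finished, so the
reserve is never split between two half-served callers.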

> Also there seems to be too much happening in this patch. Please break
> it into two patches: first fix the bio integrity bug you mentioned, then
> introduce your changes on top.

Ok, I can do that.

> Thanks
> Vivek
> 
> > 
> > This is needed eventually for immutable bio vecs - immutable bvecs
> > aren't useful if we still have to copy them, hence the need for the
> > pointer. Less code is always nice too, though.
> > 
> > Also fix an amusing bug in bio_integrity_split() - struct bio_pair
> > doesn't have the integrity bvecs after the bio_integrity_payloads, so
> > there was a buffer overrun. The code was confusing pointers with arrays.
> > 
> > Signed-off-by: Kent Overstreet <koverstreet@google.com>
> > CC: Jens Axboe <axboe@kernel.dk>
> > CC: Martin K. Petersen <martin.petersen@oracle.com>
> > ---
> >  fs/bio-integrity.c  | 124 +++++++++++++++++-----------------------------------
> >  include/linux/bio.h |   5 ++-
> >  2 files changed, 43 insertions(+), 86 deletions(-)
> > 
> > diff --git a/fs/bio-integrity.c b/fs/bio-integrity.c
> > index a3f28f3..1d64f7f 100644
> > --- a/fs/bio-integrity.c
> > +++ b/fs/bio-integrity.c
> > @@ -27,48 +27,11 @@
> >  #include <linux/workqueue.h>
> >  #include <linux/slab.h>
> >  
> > -struct integrity_slab {
> > -	struct kmem_cache *slab;
> > -	unsigned short nr_vecs;
> > -	char name[8];
> > -};
> > -
> > -#define IS(x) { .nr_vecs = x, .name = "bip-"__stringify(x) }
> > -struct integrity_slab bip_slab[BIOVEC_NR_POOLS] __read_mostly = {
> > -	IS(1), IS(4), IS(16), IS(64), IS(128), IS(BIO_MAX_PAGES),
> > -};
> > -#undef IS
> > +#define BIP_INLINE_VECS	4
> >  
> > +static struct kmem_cache *bip_slab;
> >  static struct workqueue_struct *kintegrityd_wq;
> >  
> > -static inline unsigned int vecs_to_idx(unsigned int nr)
> > -{
> > -	switch (nr) {
> > -	case 1:
> > -		return 0;
> > -	case 2 ... 4:
> > -		return 1;
> > -	case 5 ... 16:
> > -		return 2;
> > -	case 17 ... 64:
> > -		return 3;
> > -	case 65 ... 128:
> > -		return 4;
> > -	case 129 ... BIO_MAX_PAGES:
> > -		return 5;
> > -	default:
> > -		BUG();
> > -	}
> > -}
> > -
> > -static inline int use_bip_pool(unsigned int idx)
> > -{
> > -	if (idx == BIOVEC_MAX_IDX)
> > -		return 1;
> > -
> > -	return 0;
> > -}
> > -
> >  /**
> >   * bio_integrity_alloc - Allocate integrity payload and attach it to bio
> >   * @bio:	bio to attach integrity metadata to
> > @@ -84,37 +47,38 @@ struct bio_integrity_payload *bio_integrity_alloc(struct bio *bio,
> >  						  unsigned int nr_vecs)
> >  {
> >  	struct bio_integrity_payload *bip;
> > -	unsigned int idx = vecs_to_idx(nr_vecs);
> >  	struct bio_set *bs = bio->bi_pool;
> > +	unsigned long idx = BIO_POOL_NONE;
> > +	unsigned inline_vecs;
> > +
> > +	if (!bs) {
> > +		bip = kmalloc(sizeof(struct bio_integrity_payload) +
> > +			      sizeof(struct bio_vec) * nr_vecs, gfp_mask);
> > +		inline_vecs = nr_vecs;
> > +	} else {
> > +		bip = mempool_alloc(bs->bio_integrity_pool, gfp_mask);
> > +		inline_vecs = BIP_INLINE_VECS;
> > +	}
> >  
> > -	if (!bs)
> > -		bs = fs_bio_set;
> > -
> > -	BUG_ON(bio == NULL);
> > -	bip = NULL;
> > +	if (unlikely(!bip))
> > +		return NULL;
> >  
> > -	/* Lower order allocations come straight from slab */
> > -	if (!use_bip_pool(idx))
> > -		bip = kmem_cache_alloc(bip_slab[idx].slab, gfp_mask);
> > +	memset(bip, 0, sizeof(struct bio_integrity_payload));
> >  
> > -	/* Use mempool if lower order alloc failed or max vecs were requested */
> > -	if (bip == NULL) {
> > -		idx = BIOVEC_MAX_IDX;  /* so we free the payload properly later */
> > -		bip = mempool_alloc(bs->bio_integrity_pool, gfp_mask);
> > -
> > -		if (unlikely(bip == NULL)) {
> > -			printk(KERN_ERR "%s: could not alloc bip\n", __func__);
> > -			return NULL;
> > -		}
> > +	if (nr_vecs > inline_vecs) {
> > +		bip->bip_vec = bvec_alloc_bs(gfp_mask, nr_vecs, &idx, bs);
> > +		if (!bip->bip_vec)
> > +			goto err;
> >  	}
> >  
> > -	memset(bip, 0, sizeof(*bip));
> > -
> >  	bip->bip_slab = idx;
> >  	bip->bip_bio = bio;
> >  	bio->bi_integrity = bip;
> >  
> >  	return bip;
> > +err:
> > +	mempool_free(bip, bs->bio_integrity_pool);
> > +	return NULL;
> >  }
> >  EXPORT_SYMBOL(bio_integrity_alloc);
> >  
> > @@ -130,20 +94,19 @@ void bio_integrity_free(struct bio *bio)
> >  	struct bio_integrity_payload *bip = bio->bi_integrity;
> >  	struct bio_set *bs = bio->bi_pool;
> >  
> > -	if (!bs)
> > -		bs = fs_bio_set;
> > -
> > -	BUG_ON(bip == NULL);
> > -
> >  	/* A cloned bio doesn't own the integrity metadata */
> >  	if (!bio_flagged(bio, BIO_CLONED) && !bio_flagged(bio, BIO_FS_INTEGRITY)
> >  	    && bip->bip_buf != NULL)
> >  		kfree(bip->bip_buf);
> >  
> > -	if (use_bip_pool(bip->bip_slab))
> > +	if (bs) {
> > +		if (bip->bip_slab != BIO_POOL_NONE)
> > +			bvec_free_bs(bs, bip->bip_vec, bip->bip_slab);
> > +
> >  		mempool_free(bip, bs->bio_integrity_pool);
> > -	else
> > -		kmem_cache_free(bip_slab[bip->bip_slab].slab, bip);
> > +	} else {
> > +		kfree(bip);
> > +	}
> >  
> >  	bio->bi_integrity = NULL;
> >  }
> > @@ -697,8 +660,8 @@ void bio_integrity_split(struct bio *bio, struct bio_pair *bp, int sectors)
> >  	bp->iv1 = bip->bip_vec[0];
> >  	bp->iv2 = bip->bip_vec[0];
> >  
> > -	bp->bip1.bip_vec[0] = bp->iv1;
> > -	bp->bip2.bip_vec[0] = bp->iv2;
> > +	bp->bip1.bip_vec = &bp->iv1;
> > +	bp->bip2.bip_vec = &bp->iv2;
> >  
> >  	bp->iv1.bv_len = sectors * bi->tuple_size;
> >  	bp->iv2.bv_offset += sectors * bi->tuple_size;
> > @@ -746,13 +709,10 @@ EXPORT_SYMBOL(bio_integrity_clone);
> >  
> >  int bioset_integrity_create(struct bio_set *bs, int pool_size)
> >  {
> > -	unsigned int max_slab = vecs_to_idx(BIO_MAX_PAGES);
> > -
> >  	if (bs->bio_integrity_pool)
> >  		return 0;
> >  
> > -	bs->bio_integrity_pool =
> > -		mempool_create_slab_pool(pool_size, bip_slab[max_slab].slab);
> > +	bs->bio_integrity_pool = mempool_create_slab_pool(pool_size, bip_slab);
> >  
> >  	if (!bs->bio_integrity_pool)
> >  		return -1;
> > @@ -770,8 +730,6 @@ EXPORT_SYMBOL(bioset_integrity_free);
> >  
> >  void __init bio_integrity_init(void)
> >  {
> > -	unsigned int i;
> > -
> >  	/*
> >  	 * kintegrityd won't block much but may burn a lot of CPU cycles.
> >  	 * Make it highpri CPU intensive wq with max concurrency of 1.
> > @@ -781,14 +739,10 @@ void __init bio_integrity_init(void)
> >  	if (!kintegrityd_wq)
> >  		panic("Failed to create kintegrityd\n");
> >  
> > -	for (i = 0 ; i < BIOVEC_NR_POOLS ; i++) {
> > -		unsigned int size;
> > -
> > -		size = sizeof(struct bio_integrity_payload)
> > -			+ bip_slab[i].nr_vecs * sizeof(struct bio_vec);
> > -
> > -		bip_slab[i].slab =
> > -			kmem_cache_create(bip_slab[i].name, size, 0,
> > -					  SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
> > -	}
> > +	bip_slab = kmem_cache_create("bio_integrity_payload",
> > +				     sizeof(struct bio_integrity_payload) +
> > +				     sizeof(struct bio_vec) * BIP_INLINE_VECS,
> > +				     0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
> > +	if (!bip_slab)
> > +		panic("Failed to create slab\n");
> >  }
> > diff --git a/include/linux/bio.h b/include/linux/bio.h
> > index c32ea0d..7873465 100644
> > --- a/include/linux/bio.h
> > +++ b/include/linux/bio.h
> > @@ -182,7 +182,10 @@ struct bio_integrity_payload {
> >  	unsigned short		bip_idx;	/* current bip_vec index */
> >  
> >  	struct work_struct	bip_work;	/* I/O completion */
> > -	struct bio_vec		bip_vec[0];	/* embedded bvec array */
> > +
> > +	struct bio_vec		*bip_vec;
> > +	struct bio_vec		bip_inline_vecs[0];/* embedded bvec array */
> > +
> >  };
> >  #endif /* CONFIG_BLK_DEV_INTEGRITY */
> >  
> > -- 
> > 1.7.12
> > 

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 12/26] raid1: use bio_reset()
       [not found]           ` <20120911182825.GG19739-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-11 21:17             ` NeilBrown
  0 siblings, 0 replies; 81+ messages in thread
From: NeilBrown @ 2012-09-11 21:17 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	tj-DgEjT+Ai2ygdnm+yROfE0A

On Tue, 11 Sep 2012 11:28:25 -0700 Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
wrote:

> On Tue, Sep 11, 2012 at 02:59:13PM +1000, NeilBrown wrote:
> > On Mon, 10 Sep 2012 17:22:23 -0700 Kent Overstreet <koverstreet@google.com>
> > wrote:
> > 
> > > I couldn't figure out what sbio->bi_end_io in process_checks() was
> > > supposed to be, so I took the easy way out.
> > 
> > Almost.
> > You save 'sbio->bi_end_io' to 'bi_end_io', then do nothing with it...
> 
> Whoops :) I think I must've gotten distracted and forgot to finish with
> that patch, I wasn't setting bi_private either.
> 
> > 
> > A little way above the 'fixup the bio for reuse' comment you'll find:
> > 
> > 		struct bio *sbio = r1_bio->bios[i];
> > ....
> > 		if (r1_bio->bios[i]->bi_end_io != end_sync_read)
> > 			continue;
> > 
> > which implies that if we don't 'continue', then sbio->bi_end_io ==
> > end_sync_read.
> 
> Ahh. I remember reading that, but I missed that that was sbio that was
> being checked.
> 
> > 
> > So I suspect you want to add
> >     sbio->bi_end_io = end_sync_read;
> > somewhere after the 'bio_reset()'.
> > 
> > If you happened to also fix that 'if' that I quoted so that it reads:
> > 
> >      if (sbio->bi_end_io != end_sync_read)
> >            continue;
> 
> Will do! How's this look?

Looks good.

Acked-by: NeilBrown <neilb-l3A5Bk7waGM@public.gmane.org>

NeilBrown


> 
> 
> commit 40a4645a4346edd040066baedcf2184ac4211ba7
> Author: Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> Date:   Tue Sep 11 11:26:12 2012 -0700
> 
>     raid1: use bio_reset()
>     
>     Signed-off-by: Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
>     CC: Jens Axboe <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>
>     CC: NeilBrown <neilb-l3A5Bk7waGM@public.gmane.org>
> 
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index ee85154..df68691 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -1851,7 +1851,7 @@ static int process_checks(struct r1bio *r1_bio)
>  		struct bio *sbio = r1_bio->bios[i];
>  		int size;
>  
> -		if (r1_bio->bios[i]->bi_end_io != end_sync_read)
> +		if (sbio->bi_end_io != end_sync_read)
>  			continue;
>  
>  		if (test_bit(BIO_UPTODATE, &sbio->bi_flags)) {
> @@ -1876,16 +1876,15 @@ static int process_checks(struct r1bio *r1_bio)
>  			continue;
>  		}
>  		/* fixup the bio for reuse */
> +		bio_reset(sbio);
>  		sbio->bi_vcnt = vcnt;
>  		sbio->bi_size = r1_bio->sectors << 9;
> -		sbio->bi_idx = 0;
> -		sbio->bi_phys_segments = 0;
> -		sbio->bi_flags &= ~(BIO_POOL_MASK - 1);
> -		sbio->bi_flags |= 1 << BIO_UPTODATE;
> -		sbio->bi_next = NULL;
>  		sbio->bi_sector = r1_bio->sector +
>  			conf->mirrors[i].rdev->data_offset;
>  		sbio->bi_bdev = conf->mirrors[i].rdev->bdev;
> +		sbio->bi_end_io = end_sync_read;
> +		sbio->bi_private = r1_bio;
> +
>  		size = sbio->bi_size;
>  		for (j = 0; j < vcnt ; j++) {
>  			struct bio_vec *bi;
> @@ -2426,18 +2425,7 @@ static sector_t sync_request(struct mddev *mddev, sector_t sector_nr, int *skipp
>  	for (i = 0; i < conf->raid_disks * 2; i++) {
>  		struct md_rdev *rdev;
>  		bio = r1_bio->bios[i];
> -
> -		/* take from bio_init */
> -		bio->bi_next = NULL;
> -		bio->bi_flags &= ~(BIO_POOL_MASK-1);
> -		bio->bi_flags |= 1 << BIO_UPTODATE;
> -		bio->bi_rw = READ;
> -		bio->bi_vcnt = 0;
> -		bio->bi_idx = 0;
> -		bio->bi_phys_segments = 0;
> -		bio->bi_size = 0;
> -		bio->bi_end_io = NULL;
> -		bio->bi_private = NULL;
> +		bio_reset(bio);
>  
>  		rdev = rcu_dereference(conf->mirrors[i].rdev);
>  		if (rdev == NULL ||



^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [dm-devel] [PATCH v2 01/26] block: Convert integrity to bvec_alloc_bs(), and a bugfix
  2012-09-11 20:36     ` [dm-devel] " Vivek Goyal
  2012-09-11 20:48       ` Kent Overstreet
@ 2012-09-11 22:07       ` Kent Overstreet
       [not found]         ` <20120911220750.GM19739-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  1 sibling, 1 reply; 81+ messages in thread
From: Kent Overstreet @ 2012-09-11 22:07 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: linux-bcache, linux-kernel, dm-devel, axboe, tj, Martin K. Petersen

On Tue, Sep 11, 2012 at 04:36:43PM -0400, Vivek Goyal wrote:
> Also there seems to be too much happening in this patch. Please break
> it down in 2. First fix the bio integrity bug you mentioned then
> introduce your changes on top.

Oh, I forgot - the reason I squashed them into one patch is fixing the
bug came for free with the rest of the refactoring, and combining them
touches less code.

To fix the bug first, I'd have to reorder struct bio_pair and then just
delete two lines of code from bio_integrity_split(). But the reordering
is unnecessary with the refactoring.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [dm-devel] [PATCH v2 01/26] block: Convert integrity to bvec_alloc_bs(), and a bugfix
       [not found]         ` <20120911220750.GM19739-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-12 19:39           ` Martin K. Petersen
       [not found]             ` <yq1wqzzrpy1.fsf-+q57XtR/GgMb6DWv4sQWN6xOck334EZe@public.gmane.org>
  0 siblings, 1 reply; 81+ messages in thread
From: Martin K. Petersen @ 2012-09-12 19:39 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: Vivek Goyal, linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	tj-DgEjT+Ai2ygdnm+yROfE0A, Martin K. Petersen

>>>>> "Kent" == Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> writes:

Kent,

Kent> To fix the bug first, I'd have to reorder struct bio_pair and then
Kent> just delete two lines of code from bio_integrity_split(). But the
Kent> reordering is unnecessary with the refactoring.

Well, a bug is a bug and the fix needs to go into stable. So we will
need a patch that does not depend on your changes.

I don't have a problem with adding a pointer so clones can point to the
parent's vector. But embedding the vector into the bip was a feature.
If you check the git log you'll see that originally I did use separate
vector allocations.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 25/26] bio-integrity: Add explicit field for owner of bip_buf
       [not found]     ` <1347322957-25260-26-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-12 19:41       ` Martin K. Petersen
       [not found]         ` <yq1sjanrpu7.fsf-+q57XtR/GgMb6DWv4sQWN6xOck334EZe@public.gmane.org>
  0 siblings, 1 reply; 81+ messages in thread
From: Martin K. Petersen @ 2012-09-12 19:41 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	tj-DgEjT+Ai2ygdnm+yROfE0A, neilb-l3A5Bk7waGM, Martin K. Petersen

>>>>> "Kent" == Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> writes:

Kent> This was the only real user of BIO_CLONED, which didn't have very
Kent> clear semantics. Convert to its own flag so we can get rid of
Kent> BIO_CLONED.

I already have a patch in my queue that moves all integrity-relevant
flags from struct bio to the bip. So I'm ok with removing BIO_CLONED but
I'll send a separate patch for the integrity flags.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 05/26] block: Add bio_end()
       [not found]   ` <1347322957-25260-6-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-17  9:17     ` Steven Whitehouse
  2012-09-20 23:32     ` Tejun Heo
  1 sibling, 0 replies; 81+ messages in thread
From: Steven Whitehouse @ 2012-09-17  9:17 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	tj-DgEjT+Ai2ygdnm+yROfE0A, neilb-l3A5Bk7waGM

Hi,

On Mon, 2012-09-10 at 17:22 -0700, Kent Overstreet wrote:
> Just a little convenience macro - main reason to add it now is preparing
> for immutable bio vecs, it'll reduce the size of the patch that puts
> bi_sector/bi_size/bi_idx into a struct bvec_iter.
> 
> Signed-off-by: Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> CC: Jens Axboe <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>

GFS2 bits:
Acked-by: Steven Whitehouse <swhiteho-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

Steve.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [dm-devel] [PATCH v2 01/26] block: Convert integrity to bvec_alloc_bs(), and a bugfix
       [not found]             ` <yq1wqzzrpy1.fsf-+q57XtR/GgMb6DWv4sQWN6xOck334EZe@public.gmane.org>
@ 2012-09-17 21:08               ` Kent Overstreet
  0 siblings, 0 replies; 81+ messages in thread
From: Kent Overstreet @ 2012-09-17 21:08 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Vivek Goyal, linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	tj-DgEjT+Ai2ygdnm+yROfE0A

On Wed, Sep 12, 2012 at 03:39:18PM -0400, Martin K. Petersen wrote:
> >>>>> "Kent" == Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> writes:
> 
> Kent,
> 
> Kent> To fix the bug first, I'd have to reorder struct bio_pair and then
> Kent> just delete two lines of code from bio_integrity_split(). But the
> Kent> reordering is unnecessary with the refactoring.
> 
> Well, a bug is a bug and the fix needs to go into stable. So we will
> need a patch that does not depend on your changes.

Alright, good point.

> I don't have a problem with adding a pointer so clones can point to the
> parent's vector. But embedding the vector into the bip was a feature.
> If you check the git log you'll see that originally I did use separate
> vector allocations.

Looks like that was 7878cba9f0037f5599004b03a1260b32d9050360 - If I
follow your commit message your primary goal was to back the bip vecs by
a per bio set mempool?

I didn't break that (excepting the issue Vivek noted) - but it is true
that my patch adds another allocation (when nr_vecs > BIP_INLINE_VECS,
anyways).

I don't know how big of a deal you think that extra allocation is. If
you're against it, this patch isn't really necessary for the immutable
bvecs I'm working on - just need it if we want integrity bvecs to be
shared like regular bvecs will be.

Something else I noticed is bio_integrity_add_page() doesn't merge bvecs
when possible, like the regular bio_add_page(). If changing it to merge
bvecs wouldn't break anything, then probably most integrity bvecs would
be under BIP_INLINE_VECS.

Thoughts?
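
To make the tradeoff above concrete, here is a minimal userspace sketch of
the "pointer plus inline vecs" layout. Everything named mock_* and
MOCK_INLINE_VECS is a hypothetical stand-in rather than the kernel code, and
the free path compares pointers instead of checking bip_slab as the patch
does. It only illustrates that the second allocation happens solely when
nr_vecs exceeds the inline count.

```c
#include <assert.h>
#include <stdlib.h>

#define MOCK_INLINE_VECS 4	/* assumed value, stands in for BIP_INLINE_VECS */

struct mock_vec {
	void *page;
	unsigned int len;
	unsigned int offset;
};

struct mock_payload {
	struct mock_vec *vec;		/* inline_vecs or a separate array */
	struct mock_vec inline_vecs[];	/* room for MOCK_INLINE_VECS entries */
};

/* One allocation for small payloads, two when nr_vecs is too large. */
static struct mock_payload *mock_payload_alloc(unsigned int nr_vecs)
{
	struct mock_payload *p;

	p = malloc(sizeof(*p) + MOCK_INLINE_VECS * sizeof(struct mock_vec));
	if (!p)
		return NULL;

	if (nr_vecs <= MOCK_INLINE_VECS) {
		p->vec = p->inline_vecs;
		return p;
	}

	p->vec = calloc(nr_vecs, sizeof(*p->vec));
	if (!p->vec) {
		free(p);
		return NULL;
	}
	return p;
}

/* Free the separate array only if one was allocated. */
static void mock_payload_free(struct mock_payload *p)
{
	if (p->vec != p->inline_vecs)
		free(p->vec);
	free(p);
}
```

If bio_integrity_add_page() merged bvecs, most payloads would presumably
take the first branch and never see the extra allocation.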

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 25/26] bio-integrity: Add explicit field for owner of bip_buf
       [not found]         ` <yq1sjanrpu7.fsf-+q57XtR/GgMb6DWv4sQWN6xOck334EZe@public.gmane.org>
@ 2012-09-17 21:09           ` Kent Overstreet
  0 siblings, 0 replies; 81+ messages in thread
From: Kent Overstreet @ 2012-09-17 21:09 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	tj-DgEjT+Ai2ygdnm+yROfE0A, neilb-l3A5Bk7waGM

On Wed, Sep 12, 2012 at 03:41:36PM -0400, Martin K. Petersen wrote:
> >>>>> "Kent" == Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> writes:
> 
> Kent> This was the only real user of BIO_CLONED, which didn't have very
> Kent> clear semantics. Convert to its own flag so we can get rid of
> Kent> BIO_CLONED.
> 
> I already have a patch in my queue that moves all integrity-relevant
> flags from struct bio to the bip. So I'm ok with removing BIO_CLONED but
> I'll send a separate patch for the integrity flags.

Cool - have a git repo I can take a look at?

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 01/26] block: Convert integrity to bvec_alloc_bs(), and a bugfix
       [not found]   ` <1347322957-25260-2-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  2012-09-11 20:36     ` [dm-devel] " Vivek Goyal
@ 2012-09-20 21:53     ` Tejun Heo
  1 sibling, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2012-09-20 21:53 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	neilb-l3A5Bk7waGM, Martin K. Petersen, Vivek Goyal

Hello,

On Mon, Sep 10, 2012 at 05:22:12PM -0700, Kent Overstreet wrote:
> This adds a pointer to the bvec array to struct bio_integrity_payload,
> instead of the bvecs always being inline; then the bvecs are allocated
> with bvec_alloc_bs().
> 
> This is needed eventually for immutable bio vecs - immutable bvecs
> aren't useful if we still have to copy them, hence the need for the
> pointer. Less code is always nice too, though.
> 
> Also fix an amusing bug in bio_integrity_split() - struct bio_pair
> doesn't have the integrity bvecs after the bio_integrity_payloads, so
> there was a buffer overrun. The code was confusing pointers with arrays.

Aside from what Martin and Vivek already pointed out, it generally
looks okay to me but here are some thoughts.

I'm quite doubtful how much we're gaining by this complex set of slabs
and mempools approach compared to just using kmalloc + one mempool for
the largest size.  I can't think of anything to be gained in terms of
cacheline hotness.  In terms of memory usage too, we're probably
introducing more fragmentation with all the different slabs.

Jens, what are your thoughts?  For forward progress guarantee, single
mempool per bioset serving the largest one is enough and we can
significantly simplify the whole bioset thing.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 02/26] block: Add bio_advance()
       [not found]   ` <1347322957-25260-3-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-20 21:58     ` Tejun Heo
  2012-09-20 23:13       ` Kent Overstreet
  0 siblings, 1 reply; 81+ messages in thread
From: Tejun Heo @ 2012-09-20 21:58 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	neilb-l3A5Bk7waGM

On Mon, Sep 10, 2012 at 05:22:13PM -0700, Kent Overstreet wrote:
> +/**
> + * bio_advance - increment/complete a bio by some number of bytes
> + * @bio:	bio to advance
> + * @bytes:	number of bytes to complete
> + *
> + * This updates bi_sector, bi_size and bi_idx; if the number of bytes to
> + * complete doesn't align with a bvec boundary, then bv_len and bv_offset will
> + * be updated on the last bvec as well.
> + *
> + * @bio will then represent the remaining, uncompleted portion of the io.
> + */
> +void bio_advance(struct bio *bio, unsigned bytes)
> +{
> +	if (bio_integrity(bio))
> +		bio_integrity_advance(bio, bytes);
> +
> +	bio->bi_sector += bytes >> 0;

Hmmm.... bytes >> 0?

> +	bio->bi_size -= bytes;
> +
> +	if (!bio->bi_size)
> +		return;
> +
> +	while (bytes) {
> +		if (unlikely(bio->bi_idx >= bio->bi_vcnt)) {
> +			printk(KERN_ERR "%s: bio idx %d >= vcnt %d\n",

pr_err() is preferred but maybe WARN_ON_ONCE() is better fit here?
This happening would be a bug, right?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 02/26] block: Add bio_advance()
  2012-09-20 21:58     ` Tejun Heo
@ 2012-09-20 23:13       ` Kent Overstreet
       [not found]         ` <20120920231308.GB5519-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 81+ messages in thread
From: Kent Overstreet @ 2012-09-20 23:13 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-bcache, linux-kernel, dm-devel, axboe, neilb

On Thu, Sep 20, 2012 at 02:58:27PM -0700, Tejun Heo wrote:
> On Mon, Sep 10, 2012 at 05:22:13PM -0700, Kent Overstreet wrote:
> > +/**
> > + * bio_advance - increment/complete a bio by some number of bytes
> > + * @bio:	bio to advance
> > + * @bytes:	number of bytes to complete
> > + *
> > + * This updates bi_sector, bi_size and bi_idx; if the number of bytes to
> > + * complete doesn't align with a bvec boundary, then bv_len and bv_offset will
> > + * be updated on the last bvec as well.
> > + *
> > + * @bio will then represent the remaining, uncompleted portion of the io.
> > + */
> > +void bio_advance(struct bio *bio, unsigned bytes)
> > +{
> > +	if (bio_integrity(bio))
> > +		bio_integrity_advance(bio, bytes);
> > +
> > +	bio->bi_sector += bytes >> 0;
> 
> Hmmm.... bytes >> 0?

Whoops...
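
For reference, a minimal userspace sketch of the intended arithmetic,
assuming 512-byte sectors (the same shift by 9 that bio_sectors() uses).
struct mock_bio and mock_bio_advance() are hypothetical stand-ins, not the
kernel definitions, and the bvec walk is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two struct bio fields at issue. */
struct mock_bio {
	uint64_t bi_sector;	/* start sector, in 512-byte units */
	unsigned int bi_size;	/* remaining bytes */
};

/* Bytes become sectors with >> 9; >> 0 would add the raw byte count. */
static void mock_bio_advance(struct mock_bio *bio, unsigned int bytes)
{
	bio->bi_sector += bytes >> 9;
	bio->bi_size -= bytes;
}
```

Advancing by 4096 bytes should move the start sector forward by 8, not by
4096.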

> > +	bio->bi_size -= bytes;
> > +
> > +	if (!bio->bi_size)
> > +		return;
> > +
> > +	while (bytes) {
> > +		if (unlikely(bio->bi_idx >= bio->bi_vcnt)) {
> > +			printk(KERN_ERR "%s: bio idx %d >= vcnt %d\n",
> 
> pr_err() is preferred but maybe WARN_ON_ONCE() is better fit here?
> This happening would be a bug, right?

I just cut and pasted that from blk_update_request(), which is what the
next patch refactors...

But yes it would be a bug. It gets converted to a BUG_ON() in a later
patch (not in this series), as this gets further abstracted into a
wrapper around bvec_advance_iter() which doesn't know about struct bio
(as bio integrity gets its own iterator).

Might drop it entirely, depending on what exactly I end up doing with
bi_vcnt...

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 03/26] block: Refactor blk_update_request()
       [not found]   ` <1347322957-25260-4-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-20 23:20     ` Tejun Heo
  2012-09-20 23:36       ` Kent Overstreet
  0 siblings, 1 reply; 81+ messages in thread
From: Tejun Heo @ 2012-09-20 23:20 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	neilb-l3A5Bk7waGM, Vivek Goyal

Hello,

On Mon, Sep 10, 2012 at 05:22:14PM -0700, Kent Overstreet wrote:
>  static void req_bio_endio(struct request *rq, struct bio *bio,
>  			  unsigned int nbytes, int error)
>  {
> +	/*
> +	 * XXX: bio_endio() does this. only need this because of the weird
> +	 * flush seq thing.
> +	 */
>  	if (error)
>  		clear_bit(BIO_UPTODATE, &bio->bi_flags);
>  	else if (!test_bit(BIO_UPTODATE, &bio->bi_flags))
>  		error = -EIO;

Isn't this also necessary to record errors on partial completions?

Other than that, I definitely like this.  It would be nice to note
that the custom partial bio advancing in blk_update_request() is
replaced with multiple calls to req_bio_endio().  I don't think it has
any meaningful performance implications.  It's just nice to future
readers of the commit.

Also, it would be really nice if you can verify this actually works
with partial blk_update_request().  sector update bug in the previous
patch scares me a bit.  Implementing some debug hacks in the
completion path might be the easiest way to verify.  A subtle bug here
could be pretty painful.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 00/26] Prep work for immutable bio vecs
  2012-09-11  0:22 [PATCH v2 00/26] Prep work for immutable bio vecs Kent Overstreet
                   ` (19 preceding siblings ...)
       [not found] ` <1347322957-25260-1-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-20 23:22 ` Tejun Heo
  20 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2012-09-20 23:22 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcache, linux-kernel, dm-devel, axboe, neilb, Vivek Goyal

On Mon, Sep 10, 2012 at 05:22:11PM -0700, Kent Overstreet wrote:
> Random assortment of refactoring and trivial cleanups;
> 
> Immutable bio vecs and efficient bio splitting require auditing and
> removing pretty much all bi_idx uses, among other things.
> 
> The reason is that with immutable bio vecs we can't use the bvec array
> directly; if we have a partially completed bvec, that'll be indicated
> with a field in struct bvec_iter (which gets embedded in struct bio) -
> bi_bvec_done.
> 
> bio_for_each_segments() will handle this transparently, so code needs to
> be converted to use it or some other generic accessor.
> 
> Also, bio splitting means that when a driver gets a bio, bi_idx and
> bi_bvec_done may both be nonzero. Again, just need to use generic
> accessors.
> 
> v2: Patch series now has all the prep work to be done before abstracting
> out the bio iterator, I think.

Cc'ing Vivek.  Kent, can you please add Vivek to Cc on block layer
patches?

Vivek, can you please review this series?  It's generic block stuff
and definitely can use your review.

  http://thread.gmane.org/gmane.linux.kernel.bcache.devel/1055

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 02/26] block: Add bio_advance()
       [not found]         ` <20120920231308.GB5519-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-20 23:25           ` Tejun Heo
       [not found]             ` <20120920232506.GI7264-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 81+ messages in thread
From: Tejun Heo @ 2012-09-20 23:25 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	neilb-l3A5Bk7waGM

Hello,

On Thu, Sep 20, 2012 at 04:13:08PM -0700, Kent Overstreet wrote:
> I just cut and pasted that from blk_update_request(), which is what the
> next patch refactors...

Yeah, well, that was written when we didn't have WARNs.

> But yes it would be a bug. It gets converted to a BUG_ON() in a later
> patch (not in this series), as this gets further abstracted into a
> wrapper around bvec_advance_iter() which doesn't know about struct bio
> (as bio integrity gets its own iterator).

WARN() is generally preferable unless there's no way at all to continue.
Storage layer could be a bit different if immediate danger for data
corruption exists but the general consensus seems that we're too
trigger happy with BUG_ON()s.
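
A rough userspace sketch of the "warn once and keep going" pattern, in case
it helps. The names mock_warn_once(), mock_check_idx() and mock_warn_count
are hypothetical and this is not how the kernel's WARN_ON_ONCE() is
implemented. It only shows the behaviour: report the first violation, return
an error to the caller, never abort.

```c
#include <assert.h>
#include <stdio.h>

static int mock_warn_count;	/* number of warnings actually printed */

/* Print at most once per call site (tracked by *warned); return cond. */
static int mock_warn_once(int cond, int *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = 1;
		mock_warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond != 0;
}

/* Returns 0 if idx is in range, -1 otherwise; the caller can bail out. */
static int mock_check_idx(unsigned int idx, unsigned int vcnt)
{
	static int warned;

	if (mock_warn_once(idx >= vcnt, &warned, "bio idx >= vcnt"))
		return -1;
	return 0;
}
```

A BUG_ON() at the same spot would take the machine down instead of letting
the caller fail the request.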

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 04/26] md: Convert md_trim_bio() to use bio_advance()
       [not found]   ` <1347322957-25260-5-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-20 23:27     ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2012-09-20 23:27 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	neilb-l3A5Bk7waGM

On Mon, Sep 10, 2012 at 05:22:15PM -0700, Kent Overstreet wrote:
> Signed-off-by: Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> CC: Jens Axboe <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>
> CC: NeilBrown <neilb-l3A5Bk7waGM@public.gmane.org>

Looks good to me.  Neil?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 05/26] block: Add bio_end()
       [not found]   ` <1347322957-25260-6-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  2012-09-17  9:17     ` Steven Whitehouse
@ 2012-09-20 23:32     ` Tejun Heo
       [not found]       ` <20120920233225.GK7264-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  1 sibling, 1 reply; 81+ messages in thread
From: Tejun Heo @ 2012-09-20 23:32 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	neilb-l3A5Bk7waGM

On Mon, Sep 10, 2012 at 05:22:16PM -0700, Kent Overstreet wrote:
> Just a little convenience macro - main reason to add it now is preparing
> for immutable bio vecs, it'll reduce the size of the patch that puts
> bi_sector/bi_size/bi_idx into a struct bvec_iter.
> 
> Signed-off-by: Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> CC: Jens Axboe <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>
> diff --git a/include/linux/bio.h b/include/linux/bio.h
> index 6763cdf..92bff0e 100644
> --- a/include/linux/bio.h
> +++ b/include/linux/bio.h
> @@ -67,6 +67,7 @@
>  #define bio_offset(bio)		bio_iovec((bio))->bv_offset
>  #define bio_segments(bio)	((bio)->bi_vcnt - (bio)->bi_idx)
>  #define bio_sectors(bio)	((bio)->bi_size >> 9)
> +#define bio_end(bio)		((bio)->bi_sector + bio_sectors(bio))

Maybe bio_end_sector() is a better name?  bio_end() looks a bit too
close to bio_endio().
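
Something along these lines, as a userspace sketch with a mocked-up struct.
The mock_ prefix marks names that are hypothetical; only the arithmetic
mirrors the macro in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the struct bio fields the macro touches. */
struct mock_bio {
	uint64_t bi_sector;	/* start sector */
	unsigned int bi_size;	/* size in bytes */
};

#define mock_bio_sectors(bio)		((bio)->bi_size >> 9)
/* First sector past the end of the bio; the name says what it returns. */
#define mock_bio_end_sector(bio)	((bio)->bi_sector + mock_bio_sectors(bio))
```

With that name a range check such as "does this bio end before sector X"
reads naturally, and nobody mistakes it for a completion call.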

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 06/26] block: Use bio_sectors() more consistently
       [not found]   ` <1347322957-25260-7-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-20 23:36     ` Tejun Heo
       [not found]       ` <20120920233618.GL7264-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 81+ messages in thread
From: Tejun Heo @ 2012-09-20 23:36 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	neilb-l3A5Bk7waGM

On Mon, Sep 10, 2012 at 05:22:17PM -0700, Kent Overstreet wrote:
> diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c
> index 321de7b..6e4420a 100644
> --- a/drivers/block/aoe/aoeblk.c
> +++ b/drivers/block/aoe/aoeblk.c
> @@ -199,7 +199,7 @@ aoeblk_make_request(struct request_queue *q, struct bio *bio)
>  	buf->bio = bio;
>  	buf->resid = bio->bi_size;
>  	buf->sector = bio->bi_sector;
> -	buf->bv = &bio->bi_io_vec[bio->bi_idx];
> +	buf->bv = bio_iovec(bio);

Contamination?

Also, in general, please cc at least the maintainers of the files that
you modify.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 03/26] block: Refactor blk_update_request()
  2012-09-20 23:20     ` Tejun Heo
@ 2012-09-20 23:36       ` Kent Overstreet
       [not found]         ` <20120920233632.GC5519-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 81+ messages in thread
From: Kent Overstreet @ 2012-09-20 23:36 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-bcache, linux-kernel, dm-devel, axboe, neilb, Vivek Goyal

On Thu, Sep 20, 2012 at 04:20:00PM -0700, Tejun Heo wrote:
> Hello,
> 
> On Mon, Sep 10, 2012 at 05:22:14PM -0700, Kent Overstreet wrote:
> >  static void req_bio_endio(struct request *rq, struct bio *bio,
> >  			  unsigned int nbytes, int error)
> >  {
> > +	/*
> > +	 * XXX: bio_endio() does this. only need this because of the weird
> > +	 * flush seq thing.
> > +	 */
> >  	if (error)
> >  		clear_bit(BIO_UPTODATE, &bio->bi_flags);
> >  	else if (!test_bit(BIO_UPTODATE, &bio->bi_flags))
> >  		error = -EIO;
> 
> Isn't this also necessary to record errors on partial completions?

Ah yeah, you're right. Meant to delete that comment anyways.

> Other than that, I definitely like this.  It would be nice to note
> that the custom partial bio advancing in blk_update_request() is
> replaced with multiple calls to req_bio_endio().  I don't think it has
> any meaningful performance implications.  It's just nice to future
> readers of the commit.

The number of calls to req_bio_endio() isn't changing...
blk_update_request() called it for partial completions before. It's just
where the bio itself is updated that's getting shuffled around.

Or did you mean that bio_advance() is getting called on every bio
instead of the custom advancing in blk_update_request() before? That is
different, yeah - it's now always looping over the iovec, not just for
partial completions.

Yeah, I will note that in the commit message, in case Jens sees a
performance regression from it :)

> Also, it would be really nice if you can verify this actually works
> with partial blk_update_request().  sector update bug in the previous
> patch scares me a bit.  Implementing some debug hacks in the
> completion path might be the easiest way to verify.  A subtle bug here
> could be pretty painful.

Any suggestions on how to trigger partial updates?

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 02/26] block: Add bio_advance()
       [not found]             ` <20120920232506.GI7264-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-20 23:38               ` Kent Overstreet
  0 siblings, 0 replies; 81+ messages in thread
From: Kent Overstreet @ 2012-09-20 23:38 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	neilb-l3A5Bk7waGM

On Thu, Sep 20, 2012 at 04:25:06PM -0700, Tejun Heo wrote:
> Hello,
> 
> On Thu, Sep 20, 2012 at 04:13:08PM -0700, Kent Overstreet wrote:
> > I just cut and pasted that from blk_update_request(), which is what the
> > next patch refactors...
> 
> Yeah, well, that was written when we didn't have WARNs.
> 
> > But yes it would be a bug. It gets converted to a BUG_ON() in a later
> > patch (not in this series), as this gets further abstracted into a
> > wrapper around bvec_advance_iter() which doesn't know about struct bio
> > (as bio integrity gets its own iterator).
> 
> WARN() is generally preferable unless there's no way at all to continue.
> Storage layer could be a bit different if immediate danger for data
> corruption exists but the general consensus seems that we're too
> trigger happy with BUG_ON()s.

Yeah. Changed it to a WARN_ONCE().

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 03/26] block: Refactor blk_update_request()
       [not found]         ` <20120920233632.GC5519-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-20 23:41           ` Tejun Heo
       [not found]             ` <20120920234133.GM7264-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 81+ messages in thread
From: Tejun Heo @ 2012-09-20 23:41 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	neilb-l3A5Bk7waGM, Vivek Goyal

Hey,

On Thu, Sep 20, 2012 at 04:36:32PM -0700, Kent Overstreet wrote:
> > Other than that, I definitely like this.  It would be nice to note
> > that the custom partial bio advancing in blk_update_request() is
> > replaced with multiple calls to req_bio_endio().  I don't think it has
> > any meaningful performance implications.  It's just nice to future
> > readers of the commit.
> 
> The number of calls to req_bio_endio() isn't changing...
> blk_update_request() called it for partial completions before. It's just
> where the bio itself is updated that's getting shuffled around.
>
> Or did you mean that bio_advance() is getting called on every bio
> instead of the custom advancing in blk_update_request() before? That is
> different, yeah - it's now always looping over the iovec, not just for
> partial completions.
> 
> Yeah, I will note that in the commit message, in case Jens sees a
> performance regression from it :)

I don't think there's any performance implication.  It's just nice to
explain how the complexity went away.  If for nothing else, to point
out how silly the original code was. :)

> > Also, it would be really nice if you can verify this actually works
> > with partial blk_update_request().  sector update bug in the previous
> > patch scares me a bit.  Implementing some debug hacks in the
> > completion path might be the easiest way to verify.  A subtle bug here
> > could be pretty painful.
> 
> Any suggestions on how to trigger partial updates?

ide along with many legacy drivers do it.  Any SCSI driver including
libata only does full completion.  I don't know.  Even just trying to
call the function and comparing before & after with the original code
would be good.  I'd like to see at least some form of verification
because the manifested bugs could be extremely nasty and difficult to
track down.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 05/26] block: Add bio_end()
       [not found]       ` <20120920233225.GK7264-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-20 23:44         ` Kent Overstreet
  0 siblings, 0 replies; 81+ messages in thread
From: Kent Overstreet @ 2012-09-20 23:44 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	neilb-l3A5Bk7waGM

On Thu, Sep 20, 2012 at 04:32:25PM -0700, Tejun Heo wrote:
> On Mon, Sep 10, 2012 at 05:22:16PM -0700, Kent Overstreet wrote:
> > Just a little convenience macro - main reason to add it now is preparing
> > for immutable bio vecs, it'll reduce the size of the patch that puts
> > bi_sector/bi_size/bi_idx into a struct bvec_iter.
> > 
> > Signed-off-by: Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> > CC: Jens Axboe <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>
> > diff --git a/include/linux/bio.h b/include/linux/bio.h
> > index 6763cdf..92bff0e 100644
> > --- a/include/linux/bio.h
> > +++ b/include/linux/bio.h
> > @@ -67,6 +67,7 @@
> >  #define bio_offset(bio)		bio_iovec((bio))->bv_offset
> >  #define bio_segments(bio)	((bio)->bi_vcnt - (bio)->bi_idx)
> >  #define bio_sectors(bio)	((bio)->bi_size >> 9)
> > +#define bio_end(bio)		((bio)->bi_sector + bio_sectors(bio))
> 
> Maybe bio_end_sector() is a better name?  bio_end() looks a bit too
> close to bio_endio().

Bit verbose for my tastes, but I tend to be more terse than most :P I'm
used to bio_end(), but I'll probably change it.
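
For reference, a minimal user-space sketch of the helper under the name
suggested above. bio_sectors() and the end-sector expression are copied from
the quoted hunk; struct model_bio, the sector_t typedef and
model_bios_contiguous() are hypothetical stand-ins for illustration, not
kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: a 64-bit sector_t; the kernel's width depends on config. */
typedef uint64_t sector_t;

/* Hypothetical stand-in carrying only the two fields the macros touch. */
struct model_bio {
	sector_t	bi_sector;	/* first sector of the remaining I/O */
	unsigned int	bi_size;	/* remaining bytes */
};

/* Same expressions as the quoted hunk, under the proposed longer name. */
#define bio_sectors(bio)	((bio)->bi_size >> 9)
#define bio_end_sector(bio)	((bio)->bi_sector + bio_sectors(bio))

/* Example user: does bio b start exactly where bio a ends? */
static int model_bios_contiguous(const struct model_bio *a,
				 const struct model_bio *b)
{
	/* bi_size is assumed to be a whole number of 512-byte sectors */
	assert((a->bi_size & 511) == 0);
	return bio_end_sector(a) == b->bi_sector;
}
```

The point of the macro is that callers stop open-coding
bi_sector + (bi_size >> 9), which is what shrinks the later patch that moves
those fields.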

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 07/26] block: Don't use bi_idx in bio_split() or require it to be 0
       [not found]   ` <1347322957-25260-8-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-20 23:45     ` Tejun Heo
       [not found]       ` <20120920234544.GN7264-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 81+ messages in thread
From: Tejun Heo @ 2012-09-20 23:45 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	neilb-l3A5Bk7waGM

On Mon, Sep 10, 2012 at 05:22:18PM -0700, Kent Overstreet wrote:
> Prep work for immutable bio_vecs/efficient bio splitting: they require
> auditing and removing most uses of bi_idx.
> 
> So here we convert bio_split() to respect the current value of bi_idx
> and use the bio_iovec() macro, instead of assuming bi_idx will be 0.

I find the description a bit cryptic.

> Signed-off-by: Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> CC: Jens Axboe <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>
> ---
>  drivers/block/drbd/drbd_req.c | 6 +++---
>  drivers/md/raid0.c            | 3 +--
>  drivers/md/raid10.c           | 3 +--
>  fs/bio-integrity.c            | 4 ++--
>  fs/bio.c                      | 7 +++----
>  5 files changed, 10 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c
> index af69a96..57eb253 100644
> --- a/drivers/block/drbd/drbd_req.c
> +++ b/drivers/block/drbd/drbd_req.c
> @@ -1155,11 +1155,11 @@ void drbd_make_request(struct request_queue *q, struct bio *bio)
>  
>  	/* can this bio be split generically?
>  	 * Maybe add our own split-arbitrary-bios function. */
> -	if (bio->bi_vcnt != 1 || bio->bi_idx != 0 || bio->bi_size > DRBD_MAX_BIO_SIZE) {
> +	if (bio_segments(bio) != 1 || bio->bi_size > DRBD_MAX_BIO_SIZE) {
>  		/* rather error out here than BUG in bio_split */
>  		dev_err(DEV, "bio would need to, but cannot, be split: "
> -		    "(vcnt=%u,idx=%u,size=%u,sector=%llu)\n",
> -		    bio->bi_vcnt, bio->bi_idx, bio->bi_size,
> +		    "(segments=%u,size=%u,sector=%llu)\n",
> +		    bio_segments(bio), bio->bi_size,
>  		    (unsigned long long)bio->bi_sector);
>  		bio_endio(bio, -EINVAL);
>  	} else {
> diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
> index 387cb89..0587450 100644
> --- a/drivers/md/raid0.c
> +++ b/drivers/md/raid0.c
> @@ -509,8 +509,7 @@ static void raid0_make_request(struct mddev *mddev, struct bio *bio)
>  		sector_t sector = bio->bi_sector;
>  		struct bio_pair *bp;
>  		/* Sanity check -- queue functions should prevent this happening */
> -		if (bio->bi_vcnt != 1 ||
> -		    bio->bi_idx != 0)
> +		if (bio_segments(bio) != 1)
>  			goto bad_map;
>  		/* This is a one page bio that upper layers
>  		 * refuse to split for us, so we need to split it.
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index 9715aaf..bbd08f5 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -1081,8 +1081,7 @@ static void make_request(struct mddev *mddev, struct bio * bio)
>  			 || conf->prev.near_copies < conf->prev.raid_disks))) {
>  		struct bio_pair *bp;
>  		/* Sanity check -- queue functions should prevent this happening */
> -		if (bio->bi_vcnt != 1 ||
> -		    bio->bi_idx != 0)
> +		if (bio_segments(bio) != 1)
>  			goto bad_map;
>  		/* This is a one page bio that upper layers
>  		 * refuse to split for us, so we need to split it.

And wonder how the description applies to the above.

> --- a/fs/bio.c
> +++ b/fs/bio.c
> @@ -1616,8 +1616,7 @@ struct bio_pair *bio_split(struct bio *bi, int first_sectors)
>  	trace_block_split(bdev_get_queue(bi->bi_bdev), bi,
>  				bi->bi_sector + first_sectors);
>  
> -	BUG_ON(bi->bi_vcnt != 1);
> -	BUG_ON(bi->bi_idx != 0);
> +	BUG_ON(bio_segments(bi) != 1);
>  	atomic_set(&bp->cnt, 3);
>  	bp->error = 0;
>  	bp->bio1 = *bi;
> @@ -1626,8 +1625,8 @@ struct bio_pair *bio_split(struct bio *bi, int first_sectors)
>  	bp->bio2.bi_size -= first_sectors << 9;
>  	bp->bio1.bi_size = first_sectors << 9;
>  
> -	bp->bv1 = bi->bi_io_vec[0];
> -	bp->bv2 = bi->bi_io_vec[0];
> +	bp->bv1 = *bio_iovec(bi);
> +	bp->bv2 = *bio_iovec(bi);

This conflicts with a recent commit from Martin.  You probably wanna
rebase.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 06/26] block: Use bio_sectors() more consistently
       [not found]       ` <20120920233618.GL7264-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-20 23:47         ` Kent Overstreet
  0 siblings, 0 replies; 81+ messages in thread
From: Kent Overstreet @ 2012-09-20 23:47 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	neilb-l3A5Bk7waGM

On Thu, Sep 20, 2012 at 04:36:18PM -0700, Tejun Heo wrote:
> On Mon, Sep 10, 2012 at 05:22:17PM -0700, Kent Overstreet wrote:
> > diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c
> > index 321de7b..6e4420a 100644
> > --- a/drivers/block/aoe/aoeblk.c
> > +++ b/drivers/block/aoe/aoeblk.c
> > @@ -199,7 +199,7 @@ aoeblk_make_request(struct request_queue *q, struct bio *bio)
> >  	buf->bio = bio;
> >  	buf->resid = bio->bi_size;
> >  	buf->sector = bio->bi_sector;
> > -	buf->bv = &bio->bi_io_vec[bio->bi_idx];
> > +	buf->bv = bio_iovec(bio);
> 
> Contamination?

Whoops, yes.

> Also, in general, please cc at least the maintainers of the files that
> you modify.

Meant to ask you about these patches that basically just rename things -
will do.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 08/26] block: Remove bi_idx references
       [not found]     ` <1347322957-25260-9-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-20 23:49       ` Tejun Heo
  2012-09-21  0:04         ` Kent Overstreet
  0 siblings, 1 reply; 81+ messages in thread
From: Tejun Heo @ 2012-09-20 23:49 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	neilb-l3A5Bk7waGM

On Mon, Sep 10, 2012 at 05:22:19PM -0700, Kent Overstreet wrote:
> These were harmless but unnecessary, and getting rid of them makes the
> code easier to audit since most of them need to be removed.

I find the descriptions a bit too terse.  Why do they need to be
removed?  So, I suppose you wanted to say explicit initializations to
0 are unnecessary, but there are bio_segments() conversions too.

The patch is simple and this isn't a big deal but I really hope for
better (correct) descriptions.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 03/26] block: Refactor blk_update_request()
       [not found]             ` <20120920234133.GM7264-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-20 23:50               ` Kent Overstreet
  0 siblings, 0 replies; 81+ messages in thread
From: Kent Overstreet @ 2012-09-20 23:50 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	neilb-l3A5Bk7waGM, Vivek Goyal

On Thu, Sep 20, 2012 at 04:41:33PM -0700, Tejun Heo wrote:
> Hey,
> 
> On Thu, Sep 20, 2012 at 04:36:32PM -0700, Kent Overstreet wrote:
> > > Other than that, I definitely like this.  It would be nice to note
> > > that the custom partial bio advancing in blk_update_request() is
> > > replaced with multiple calls to req_bio_endio().  I don't think it has
> > > any meaningful performance implications.  It's just nice to future
> > > readers of the commit.
> > 
> > The number of calls to req_bio_endio() isn't changing...
> > blk_update_request() called it for partial completions before. It's just
> > where the bio itself is updated that's getting shuffled around.
> >
> > Or did you mean that bio_advance() is getting called on every bio
> > instead of the custom advancing in blk_update_request() before? That is
> > different, yeah - it's now always looping over the iovec, not just for
> > partial completions.
> > 
> > Yeah, I will note that in the commit message, in case Jens sees a
> > performance regression from it :)
> 
> I don't think there's any performance implication.  It's just nice to
> explain how the complexity went away.  If for nothing else, to point
> out how silly the original code was. :)

New patch below - does that commit message have what you're after?

> > > Also, it would be really nice if you can verify this actually works
> > > with partial blk_update_request().  sector update bug in the previous
> > > patch scares me a bit.  Implementing some debug hacks in the
> > > completion path might be the easiest way to verify.  A subtle bug here
> > > could be pretty painful.
> > 
> > Any suggestions on how to trigger partial updates?
> 
> ide along with many legacy drivers do it.  Any SCSI driver including
> libata only does full completion.  I don't know.  Even just trying to
> call the function and comparing before & after with the original code
> would be good.  I'd like to see at least some form of verification
> because the manifested bugs could be extremely nasty and difficult to
> track down.

Multiple partial completions should have the same semantics as a single
full completion, so maybe I'll try rigging up some test code that wraps
blk_update_request(), turning full completions into partial completions,
and verifies stuff...
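
A rough user-space sketch of that check, assuming a simplified model rather
than the real block layer: struct model_bio, model_bio_advance() and the
other model_* names are hypothetical, and the advance loop only follows the
general shape of bio_advance() (bump the sector, shrink the size, walk the
bvec array). It splits one completion into fixed-size chunks and compares
the resulting position with a single full advance.

```c
#include <assert.h>

#define MODEL_MAX_VECS 8

/* Hypothetical stand-ins for struct bio_vec / struct bio. */
struct model_bvec {
	unsigned len;
	unsigned offset;
};

struct model_bio {
	unsigned long long sector;	/* current start sector */
	unsigned size;			/* bytes remaining */
	unsigned idx;			/* current bvec */
	unsigned vcnt;			/* number of bvecs */
	struct model_bvec vec[MODEL_MAX_VECS];
};

/* Advance by nbytes, in the spirit of bio_advance(). */
static void model_bio_advance(struct model_bio *bio, unsigned nbytes)
{
	assert(nbytes <= bio->size);

	bio->sector += nbytes >> 9;
	bio->size -= nbytes;

	while (nbytes) {
		struct model_bvec *bv;

		assert(bio->idx < bio->vcnt);
		bv = &bio->vec[bio->idx];

		if (nbytes >= bv->len) {
			/* whole bvec done, move to the next one */
			nbytes -= bv->len;
			bio->idx++;
		} else {
			/* partially completed bvec */
			bv->len -= nbytes;
			bv->offset += nbytes;
			nbytes = 0;
		}
	}
}

/* Compare only what is still live: sector, size, idx, current bvec. */
static int model_same_position(const struct model_bio *a,
			       const struct model_bio *b)
{
	if (a->sector != b->sector || a->size != b->size || a->idx != b->idx)
		return 0;
	if (a->idx == a->vcnt)
		return 1;
	return a->vec[a->idx].len == b->vec[b->idx].len &&
	       a->vec[a->idx].offset == b->vec[b->idx].offset;
}

/* Does advancing by total in chunk-sized steps match one full advance? */
static int model_partial_matches_full(const struct model_bio *orig,
				      unsigned total, unsigned chunk)
{
	struct model_bio full = *orig, part = *orig;
	unsigned done = 0;

	/* sector math only adds up for whole 512-byte sectors */
	assert(chunk && chunk % 512 == 0 && total % 512 == 0);

	model_bio_advance(&full, total);

	while (done < total) {
		unsigned n = total - done < chunk ? total - done : chunk;

		model_bio_advance(&part, n);
		done += n;
	}

	return model_same_position(&full, &part);
}
```

Already consumed bvecs are deliberately left out of the comparison: the
partial path modifies bv_len/bv_offset of a bvec before stepping past it,
the full path does not, and nothing should be looking at them afterwards.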


commit fef0ddc82214f87de71ec6fb051eb28a6de0be74
Author: Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
Date:   Thu Sep 20 16:38:30 2012 -0700

    block: Refactor blk_update_request()
    
    Converts it to use bio_advance(), simplifying it quite a bit in the
    process.
    
    Note that req_bio_endio() now always calls bio_advance() - which means
    it always loops over the biovec, not just on partial completions. Don't
    expect it to affect performance, but worth noting.
    
    Signed-off-by: Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
    CC: Jens Axboe <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>

diff --git a/block/blk-core.c b/block/blk-core.c
index 2d739ca..9f8cb16 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -158,20 +158,10 @@ static void req_bio_endio(struct request *rq, struct bio *bio,
 	else if (!test_bit(BIO_UPTODATE, &bio->bi_flags))
 		error = -EIO;
 
-	if (unlikely(nbytes > bio->bi_size)) {
-		printk(KERN_ERR "%s: want %u bytes done, %u left\n",
-		       __func__, nbytes, bio->bi_size);
-		nbytes = bio->bi_size;
-	}
-
 	if (unlikely(rq->cmd_flags & REQ_QUIET))
 		set_bit(BIO_QUIET, &bio->bi_flags);
 
-	bio->bi_size -= nbytes;
-	bio->bi_sector += (nbytes >> 9);
-
-	if (bio_integrity(bio))
-		bio_integrity_advance(bio, nbytes);
+	bio_advance(bio, nbytes);
 
 	/* don't actually finish bio if it's part of flush sequence */
 	if (bio->bi_size == 0 && !(rq->cmd_flags & REQ_FLUSH_SEQ))
@@ -2214,8 +2204,7 @@ EXPORT_SYMBOL(blk_fetch_request);
  **/
 bool blk_update_request(struct request *req, int error, unsigned int nr_bytes)
 {
-	int total_bytes, bio_nbytes, next_idx = 0;
-	struct bio *bio;
+	int total_bytes;
 
 	if (!req->bio)
 		return false;
@@ -2259,56 +2248,21 @@ bool blk_update_request(struct request *req, int error, unsigned int nr_bytes)
 
 	blk_account_io_completion(req, nr_bytes);
 
-	total_bytes = bio_nbytes = 0;
-	while ((bio = req->bio) != NULL) {
-		int nbytes;
+	total_bytes = 0;
+	while (req->bio) {
+		struct bio *bio = req->bio;
+		unsigned bio_bytes = min(bio->bi_size, nr_bytes);
 
-		if (nr_bytes >= bio->bi_size) {
+		if (bio_bytes == bio->bi_size)
 			req->bio = bio->bi_next;
-			nbytes = bio->bi_size;
-			req_bio_endio(req, bio, nbytes, error);
-			next_idx = 0;
-			bio_nbytes = 0;
-		} else {
-			int idx = bio->bi_idx + next_idx;
-
-			if (unlikely(idx >= bio->bi_vcnt)) {
-				blk_dump_rq_flags(req, "__end_that");
-				printk(KERN_ERR "%s: bio idx %d >= vcnt %d\n",
-				       __func__, idx, bio->bi_vcnt);
-				break;
-			}
-
-			nbytes = bio_iovec_idx(bio, idx)->bv_len;
-			BIO_BUG_ON(nbytes > bio->bi_size);
-
-			/*
-			 * not a complete bvec done
-			 */
-			if (unlikely(nbytes > nr_bytes)) {
-				bio_nbytes += nr_bytes;
-				total_bytes += nr_bytes;
-				break;
-			}
 
-			/*
-			 * advance to the next vector
-			 */
-			next_idx++;
-			bio_nbytes += nbytes;
-		}
+		req_bio_endio(req, bio, bio_bytes, error);
 
-		total_bytes += nbytes;
-		nr_bytes -= nbytes;
+		total_bytes += bio_bytes;
+		nr_bytes -= bio_bytes;
 
-		bio = req->bio;
-		if (bio) {
-			/*
-			 * end more in this run, or just return 'not-done'
-			 */
-			if (unlikely(nr_bytes <= 0))
-				break;
-		}
+		if (!nr_bytes)
+			break;
 	}
 
 	/*
@@ -2324,16 +2278,6 @@ bool blk_update_request(struct request *req, int error, unsigned int nr_bytes)
 		return false;
 	}
 
-	/*
-	 * if the request wasn't completed, update state
-	 */
-	if (bio_nbytes) {
-		req_bio_endio(req, bio, bio_nbytes, error);
-		bio->bi_idx += next_idx;
-		bio_iovec(bio)->bv_offset += nr_bytes;
-		bio_iovec(bio)->bv_len -= nr_bytes;
-	}
-
 	req->__data_len -= total_bytes;
 	req->buffer = bio_data(req->bio);
 

^ permalink raw reply related	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 09/26] block: Remove some unnecessary bi_vcnt usage
       [not found]   ` <1347322957-25260-10-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-20 23:51     ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2012-09-20 23:51 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	neilb-l3A5Bk7waGM

On Mon, Sep 10, 2012 at 05:22:20PM -0700, Kent Overstreet wrote:
> More prep work for immutable bvecs/efficient bio splitting - usage of
> bi_vcnt has to be audited, so getting rid of all the unnecessary usage
> makes that easier.
> 
> Plus, bio_segments() is really what this code wanted, as it respects the
> current value of bi_idx.

Looks good to me but definitely need reviews from SCSI people.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 10/26] block: Add submit_bio_wait(), remove from md
       [not found]   ` <1347322957-25260-11-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-20 23:56     ` Tejun Heo
       [not found]       ` <20120920235643.GQ7264-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 81+ messages in thread
From: Tejun Heo @ 2012-09-20 23:56 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	neilb-l3A5Bk7waGM

On Mon, Sep 10, 2012 at 05:22:21PM -0700, Kent Overstreet wrote:
> Random cleanup - this code was duplicated and it's not really specific
> to md.
> 
> Also added the ability to return the actual error code.
> 
> Signed-off-by: Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> CC: Jens Axboe <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>
> CC: NeilBrown <neilb-l3A5Bk7waGM@public.gmane.org>

Acked-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>

> --- a/include/linux/bio.h
> +++ b/include/linux/bio.h
> @@ -249,6 +249,7 @@ extern void bio_endio(struct bio *, int);
>  struct request_queue;
>  extern int bio_phys_segments(struct request_queue *, struct bio *);
>  
> +extern int submit_bio_wait(int rw, struct bio *bio);
>  void bio_advance(struct bio *, unsigned);

Heh, this is one of the reasons why I don't like extern on function
prototypes.  It's not necessary and people end up jumping between the
two forms.  :(

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 11/26] raid10: Use bio_reset()
  2012-09-11  0:22 ` [PATCH v2 11/26] raid10: Use bio_reset() Kent Overstreet
@ 2012-09-20 23:59   ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2012-09-20 23:59 UTC (permalink / raw)
  To: Kent Overstreet; +Cc: linux-bcache, linux-kernel, dm-devel, axboe, neilb

On Mon, Sep 10, 2012 at 05:22:22PM -0700, Kent Overstreet wrote:
> More prep work for immutable bio vecs, mainly getting rid of references
> to bi_idx.
> 
> bio_reset was being open coded in a few places. The one in sync_request
> was a bit nontrivial to convert, so could use some extra eyeballs.
> 
> Signed-off-by: Kent Overstreet <koverstreet@google.com>
> CC: Jens Axboe <axboe@kernel.dk>
> CC: NeilBrown <neilb@suse.de>

No idea at all.  Neil?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 07/26] block: Don't use bi_idx in bio_split() or require it to be 0
       [not found]       ` <20120920234544.GN7264-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-21  0:00         ` Kent Overstreet
  0 siblings, 0 replies; 81+ messages in thread
From: Kent Overstreet @ 2012-09-21  0:00 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	neilb-l3A5Bk7waGM

On Thu, Sep 20, 2012 at 04:45:44PM -0700, Tejun Heo wrote:
> On Mon, Sep 10, 2012 at 05:22:18PM -0700, Kent Overstreet wrote:
> > Prep work for immutable bio_vecs/efficient bio splitting: they require
> > auditing and removing most uses of bi_idx.
> > 
> > So here we convert bio_split() to respect the current value of bi_idx
> > and use the bio_iovec() macro, instead of assuming bi_idx will be 0.
> 
> I find the description a bit cryptic.

Yeah, I wasn't able to come up with a better description at the time...
how's this:

Change bio_split() to respect the current value of bi_idx

In the current code bio_split() won't be seeing partially completed bios
so this doesn't change any behaviour, but this makes the code a bit
clearer as to what bio_split() actually requires.

The immediate purpose of the patch is removing unnecessary bi_idx
references, but the end goal is to allow partially completed bios to be
submitted, which along with immutable biovecs enables efficient bio
splitting.
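
To make the difference concrete, here is a small user-space sketch of the
old and new preconditions. bio_segments() matches the definition quoted
earlier in the thread and bio_iovec() indexes bi_io_vec by bi_idx; struct
model_bio and the model_* functions are hypothetical stand-ins, not the
kernel's code.

```c
#include <assert.h>

/* Hypothetical stand-ins with only the fields the checks look at. */
struct model_bvec {
	unsigned bv_len;
	unsigned bv_offset;
};

struct model_bio {
	unsigned short		bi_vcnt;	/* total bvecs */
	unsigned short		bi_idx;		/* current bvec */
	struct model_bvec	*bi_io_vec;
};

#define bio_iovec(bio)		(&(bio)->bi_io_vec[(bio)->bi_idx])
#define bio_segments(bio)	((bio)->bi_vcnt - (bio)->bi_idx)

/* Old precondition: exactly one bvec and nothing consumed yet. */
static int model_old_split_ok(const struct model_bio *bio)
{
	return bio->bi_vcnt == 1 && bio->bi_idx == 0;
}

/* New precondition: exactly one bvec remaining, wherever bi_idx is. */
static int model_new_split_ok(const struct model_bio *bio)
{
	return bio_segments(bio) == 1;
}

/* The bvec a split would copy: the current one, not bi_io_vec[0]. */
static struct model_bvec model_split_source(const struct model_bio *bio)
{
	assert(model_new_split_ok(bio));
	return *bio_iovec(bio);
}
```

A bio with bi_vcnt == 3 and bi_idx == 2 fails the old check but passes the
new one, and the bvec that gets copied is bi_io_vec[2].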


> 
> > Signed-off-by: Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> > CC: Jens Axboe <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>
> > ---
> >  drivers/block/drbd/drbd_req.c | 6 +++---
> >  drivers/md/raid0.c            | 3 +--
> >  drivers/md/raid10.c           | 3 +--
> >  fs/bio-integrity.c            | 4 ++--
> >  fs/bio.c                      | 7 +++----
> >  5 files changed, 10 insertions(+), 13 deletions(-)
> > 
> > diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c
> > index af69a96..57eb253 100644
> > --- a/drivers/block/drbd/drbd_req.c
> > +++ b/drivers/block/drbd/drbd_req.c
> > @@ -1155,11 +1155,11 @@ void drbd_make_request(struct request_queue *q, struct bio *bio)
> >  
> >  	/* can this bio be split generically?
> >  	 * Maybe add our own split-arbitrary-bios function. */
> > -	if (bio->bi_vcnt != 1 || bio->bi_idx != 0 || bio->bi_size > DRBD_MAX_BIO_SIZE) {
> > +	if (bio_segments(bio) != 1 || bio->bi_size > DRBD_MAX_BIO_SIZE) {
> >  		/* rather error out here than BUG in bio_split */
> >  		dev_err(DEV, "bio would need to, but cannot, be split: "
> > -		    "(vcnt=%u,idx=%u,size=%u,sector=%llu)\n",
> > -		    bio->bi_vcnt, bio->bi_idx, bio->bi_size,
> > +		    "(segments=%u,size=%u,sector=%llu)\n",
> > +		    bio_segments(bio), bio->bi_size,
> >  		    (unsigned long long)bio->bi_sector);
> >  		bio_endio(bio, -EINVAL);
> >  	} else {
> > diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
> > index 387cb89..0587450 100644
> > --- a/drivers/md/raid0.c
> > +++ b/drivers/md/raid0.c
> > @@ -509,8 +509,7 @@ static void raid0_make_request(struct mddev *mddev, struct bio *bio)
> >  		sector_t sector = bio->bi_sector;
> >  		struct bio_pair *bp;
> >  		/* Sanity check -- queue functions should prevent this happening */
> > -		if (bio->bi_vcnt != 1 ||
> > -		    bio->bi_idx != 0)
> > +		if (bio_segments(bio) != 1)
> >  			goto bad_map;
> >  		/* This is a one page bio that upper layers
> >  		 * refuse to split for us, so we need to split it.
> > diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> > index 9715aaf..bbd08f5 100644
> > --- a/drivers/md/raid10.c
> > +++ b/drivers/md/raid10.c
> > @@ -1081,8 +1081,7 @@ static void make_request(struct mddev *mddev, struct bio * bio)
> >  			 || conf->prev.near_copies < conf->prev.raid_disks))) {
> >  		struct bio_pair *bp;
> >  		/* Sanity check -- queue functions should prevent this happening */
> > -		if (bio->bi_vcnt != 1 ||
> > -		    bio->bi_idx != 0)
> > +		if (bio_segments(bio) != 1)
> >  			goto bad_map;
> >  		/* This is a one page bio that upper layers
> >  		 * refuse to split for us, so we need to split it.
> 
> And wonder how the description applies to the above.
> 
> > --- a/fs/bio.c
> > +++ b/fs/bio.c
> > @@ -1616,8 +1616,7 @@ struct bio_pair *bio_split(struct bio *bi, int first_sectors)
> >  	trace_block_split(bdev_get_queue(bi->bi_bdev), bi,
> >  				bi->bi_sector + first_sectors);
> >  
> > -	BUG_ON(bi->bi_vcnt != 1);
> > -	BUG_ON(bi->bi_idx != 0);
> > +	BUG_ON(bio_segments(bi) != 1);
> >  	atomic_set(&bp->cnt, 3);
> >  	bp->error = 0;
> >  	bp->bio1 = *bi;
> > @@ -1626,8 +1625,8 @@ struct bio_pair *bio_split(struct bio *bi, int first_sectors)
> >  	bp->bio2.bi_size -= first_sectors << 9;
> >  	bp->bio1.bi_size = first_sectors << 9;
> >  
> > -	bp->bv1 = bi->bi_io_vec[0];
> > -	bp->bv2 = bi->bi_io_vec[0];
> > +	bp->bv1 = *bio_iovec(bi);
> > +	bp->bv2 = *bio_iovec(bi);
> 
> This conflicts with a recent commit from Martin.  You probably wanna
> rebase.
> 
> Thanks.
> 
> -- 
> tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 08/26] block: Remove bi_idx references
  2012-09-20 23:49       ` Tejun Heo
@ 2012-09-21  0:04         ` Kent Overstreet
  0 siblings, 0 replies; 81+ messages in thread
From: Kent Overstreet @ 2012-09-21  0:04 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-bcache, linux-kernel, dm-devel, axboe, neilb

On Thu, Sep 20, 2012 at 04:49:53PM -0700, Tejun Heo wrote:
> On Mon, Sep 10, 2012 at 05:22:19PM -0700, Kent Overstreet wrote:
> > These were harmless but unnecessary, and getting rid of them makes the
> > code easier to audit since most of them need to be removed.
> 
> I find the descriptions a bit too terse.  Why do they need to be
> removed?  So, I suppose you wanted to say explicit initializations to
> 0 are unnecessary, but there are bio_segments() conversions too.
> 
> The patch is simple and this isn't a big deal but I really hope for
> better (correct) descriptions.

It's because for the bvec iterator stuff and immutable bvecs, direct
bi_idx usage tends to be either wrong or unnecessary - I had to audit
all the uses in the kernel.

Reason for doing it now is a later patch moves bi_idx (also bi_sector
and bi_size) into a different struct - so doing these cleanup patches
first means a bit less code churn.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 10/26] block: Add submit_bio_wait(), remove from md
       [not found]       ` <20120920235643.GQ7264-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-21  0:06         ` Kent Overstreet
  0 siblings, 0 replies; 81+ messages in thread
From: Kent Overstreet @ 2012-09-21  0:06 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	neilb-l3A5Bk7waGM

On Thu, Sep 20, 2012 at 04:56:43PM -0700, Tejun Heo wrote:
> On Mon, Sep 10, 2012 at 05:22:21PM -0700, Kent Overstreet wrote:
> > Random cleanup - this code was duplicated and it's not really specific
> > to md.
> > 
> > Also added the ability to return the actual error code.
> > 
> > Signed-off-by: Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> > CC: Jens Axboe <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>
> > CC: NeilBrown <neilb-l3A5Bk7waGM@public.gmane.org>
> 
> Acked-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> 
> > --- a/include/linux/bio.h
> > +++ b/include/linux/bio.h
> > @@ -249,6 +249,7 @@ extern void bio_endio(struct bio *, int);
> >  struct request_queue;
> >  extern int bio_phys_segments(struct request_queue *, struct bio *);
> >  
> > +extern int submit_bio_wait(int rw, struct bio *bio);
> >  void bio_advance(struct bio *, unsigned);
> 
> Heh, this is one of the reasons why I don't like extern on function
> prototypes.  It's not necessary and people end up jumping between the
> two forms.  :(

Yeah, I dislike it too but I was trying to follow the style in that file
- I did fix bio_advance() a few minutes ago.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 15/26] block: Add bio_copy_data()
       [not found]   ` <1347322957-25260-16-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-21  0:06     ` Tejun Heo
  2012-09-21  0:09       ` Kent Overstreet
  2012-09-21  0:09       ` Tejun Heo
  0 siblings, 2 replies; 81+ messages in thread
From: Tejun Heo @ 2012-09-21  0:06 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	neilb-l3A5Bk7waGM

Hello,

On Mon, Sep 10, 2012 at 05:22:26PM -0700, Kent Overstreet wrote:
> +void bio_copy_data(struct bio *dst, struct bio *src)
> +{
...
> +		src_p = kmap_atomic(src_bv->bv_page);
> +		dst_p = kmap_atomic(dst_bv->bv_page);
> +
> +		memcpy(dst_p + dst_bv->bv_offset,
> +		       src_p + src_bv->bv_offset,
> +		       bytes);
> +
> +		kunmap_atomic(dst_p);
> +		kunmap_atomic(src_p);

Wrap these in preempt_disable/enable() to allow the function to be
called from any context?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 15/26] block: Add bio_copy_data()
  2012-09-21  0:06     ` Tejun Heo
@ 2012-09-21  0:09       ` Kent Overstreet
       [not found]         ` <20120921000945.GK5519-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  2012-09-21  0:09       ` Tejun Heo
  1 sibling, 1 reply; 81+ messages in thread
From: Kent Overstreet @ 2012-09-21  0:09 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-bcache, linux-kernel, dm-devel, axboe, neilb

On Thu, Sep 20, 2012 at 05:06:32PM -0700, Tejun Heo wrote:
> Hello,
> 
> On Mon, Sep 10, 2012 at 05:22:26PM -0700, Kent Overstreet wrote:
> > +void bio_copy_data(struct bio *dst, struct bio *src)
> > +{
> ...
> > +		src_p = kmap_atomic(src_bv->bv_page);
> > +		dst_p = kmap_atomic(dst_bv->bv_page);
> > +
> > +		memcpy(dst_p + dst_bv->bv_offset,
> > +		       src_p + src_bv->bv_offset,
> > +		       bytes);
> > +
> > +		kunmap_atomic(dst_p);
> > +		kunmap_atomic(src_p);
> 
> Wrap these in preempt_disable/enable() to allow the function to be
> called from any context?

I checked the implementation of kmap_atomic(), it already does
preempt_disable() so it's safe in process context - if I understand
correctly it needs local_irq_save()/restore() to be safe in any context
and I figured calling this from irq context is not the norm so that
should be the caller's responsibility.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 15/26] block: Add bio_copy_data()
  2012-09-21  0:06     ` Tejun Heo
  2012-09-21  0:09       ` Kent Overstreet
@ 2012-09-21  0:09       ` Tejun Heo
       [not found]         ` <20120921000947.GT7264-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  1 sibling, 1 reply; 81+ messages in thread
From: Tejun Heo @ 2012-09-21  0:09 UTC (permalink / raw)
  To: Kent Overstreet; +Cc: linux-bcache, linux-kernel, dm-devel, axboe, neilb

On Thu, Sep 20, 2012 at 05:06:32PM -0700, Tejun Heo wrote:
> Hello,
> 
> On Mon, Sep 10, 2012 at 05:22:26PM -0700, Kent Overstreet wrote:
> > +void bio_copy_data(struct bio *dst, struct bio *src)
> > +{
> ...
> > +		src_p = kmap_atomic(src_bv->bv_page);
> > +		dst_p = kmap_atomic(dst_bv->bv_page);
> > +
> > +		memcpy(dst_p + dst_bv->bv_offset,
> > +		       src_p + src_bv->bv_offset,
> > +		       bytes);
> > +
> > +		kunmap_atomic(dst_p);
> > +		kunmap_atomic(src_p);
> 
> Wrap these in preempt_disable/enable() to allow the function to be
> called from any context?

Ooh, and maybe return the amount of copied data?

-- 
tejun

* Re: [PATCH v2 15/26] block: Add bio_copy_data()
       [not found]         ` <20120921000947.GT7264-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-21  0:13           ` Kent Overstreet
  0 siblings, 0 replies; 81+ messages in thread
From: Kent Overstreet @ 2012-09-21  0:13 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	neilb-l3A5Bk7waGM

On Thu, Sep 20, 2012 at 05:09:47PM -0700, Tejun Heo wrote:
> On Thu, Sep 20, 2012 at 05:06:32PM -0700, Tejun Heo wrote:
> > Hello,
> > 
> > On Mon, Sep 10, 2012 at 05:22:26PM -0700, Kent Overstreet wrote:
> > > +void bio_copy_data(struct bio *dst, struct bio *src)
> > > +{
> > ...
> > > +		src_p = kmap_atomic(src_bv->bv_page);
> > > +		dst_p = kmap_atomic(dst_bv->bv_page);
> > > +
> > > +		memcpy(dst_p + dst_bv->bv_offset,
> > > +		       src_p + src_bv->bv_offset,
> > > +		       bytes);
> > > +
> > > +		kunmap_atomic(dst_p);
> > > +		kunmap_atomic(src_p);
> > 
> > Wrap these in preempt_disable/enable() to allow the function to be
> > called from any context?
> 
> Ooh, and maybe return the amount of copied data?

Possibly, but I think I want to wait until a user needs it before adding
something like that.

From looking at other code that copies bio data, a parameter that
specifies the amount of data to be copied might be more useful.

I'm not sure I've seen all the places where bio data is copied yet, so
I've just been waiting until I find more uses to make it do more.
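
To make the two ideas above concrete (returning the amount copied, and
taking a limit on how much to copy), here is a minimal user-space
sketch. It is not the kernel implementation: struct seg and
seg_copy_data() are hypothetical stand-ins for struct bio_vec and
bio_copy_data(), and there is no page mapping, only plain memory. What
it does show is the core of the problem, which is that the source and
destination may be split at different segment boundaries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct bio_vec: one contiguous chunk. */
struct seg {
	char   *base;
	size_t  len;
};

/*
 * Copy up to 'limit' bytes from the src segments to the dst segments.
 * The two lists may be split at different boundaries, so each step
 * copies the largest run that fits in both current segments.
 * Returns the number of bytes actually copied.
 */
static size_t seg_copy_data(struct seg *dst, size_t ndst,
			    const struct seg *src, size_t nsrc,
			    size_t limit)
{
	size_t di = 0, si = 0;		/* current segment indices */
	size_t doff = 0, soff = 0;	/* offsets inside those segments */
	size_t copied = 0;

	assert(dst && src);

	while (di < ndst && si < nsrc && copied < limit) {
		size_t bytes = dst[di].len - doff;

		if (src[si].len - soff < bytes)
			bytes = src[si].len - soff;
		if (limit - copied < bytes)
			bytes = limit - copied;

		memcpy(dst[di].base + doff, src[si].base + soff, bytes);

		copied += bytes;
		doff += bytes;
		soff += bytes;

		/* Move on whenever a segment has been used up. */
		if (doff == dst[di].len) {
			di++;
			doff = 0;
		}
		if (soff == src[si].len) {
			si++;
			soff = 0;
		}
	}

	return copied;
}
```

With a return value the caller can tell a short copy (one side ran out
first) from a full one, and the limit covers callers that only want to
copy part of the data.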

* Re: [PATCH v2 15/26] block: Add bio_copy_data()
       [not found]         ` <20120921000945.GK5519-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-21  0:15           ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2012-09-21  0:15 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	neilb-l3A5Bk7waGM

On Thu, Sep 20, 2012 at 05:09:45PM -0700, Kent Overstreet wrote:
> On Thu, Sep 20, 2012 at 05:06:32PM -0700, Tejun Heo wrote:
> > Hello,
> > 
> > On Mon, Sep 10, 2012 at 05:22:26PM -0700, Kent Overstreet wrote:
> > > +void bio_copy_data(struct bio *dst, struct bio *src)
> > > +{
> > ...
> > > +		src_p = kmap_atomic(src_bv->bv_page);
> > > +		dst_p = kmap_atomic(dst_bv->bv_page);
> > > +
> > > +		memcpy(dst_p + dst_bv->bv_offset,
> > > +		       src_p + src_bv->bv_offset,
> > > +		       bytes);
> > > +
> > > +		kunmap_atomic(dst_p);
> > > +		kunmap_atomic(src_p);
> > 
> > Wrap these in preempt_disable/enable() to allow the function to be
> > called from any context?
> 
> I checked the implementation of kmap_atomic(), it already does
> preempt_disable() so it's safe in process context - if I understand
> correctly it needs local_irq_save()/restore() to be safe in any context
> and I figured calling this from irq context is not the norm so that
> should be the caller's responsibility.

Ooh, that means the patch I just sent Andrew about sg_mapping_iter is
still too strict.

Thanks.

-- 
tejun

* Re: [PATCH v2 19/26] bounce: Refactor __blk_queue_bounce to not use bi_io_vec
       [not found]     ` <1347322957-25260-20-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-21  0:25       ` Tejun Heo
  2012-09-21  0:29         ` Kent Overstreet
  2012-09-21  0:27       ` Tejun Heo
  1 sibling, 1 reply; 81+ messages in thread
From: Tejun Heo @ 2012-09-21  0:25 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	neilb-l3A5Bk7waGM

Hello, Kent.

On Mon, Sep 10, 2012 at 05:22:30PM -0700, Kent Overstreet wrote:
> A bunch of what __blk_queue_bounce() was doing was problematic for the
> immutable bvec work; this cleans that up and the code is quite a bit
> smaller, too.
> 
> The __bio_for_each_segment() in copy_to_high_bio_irq() was changed
> because that one's looping over the original bio, not the bounce bio -
> since the bounce code doesn't own that bio the __ version wasn't
> correct.

I do like the new implementation.  I think the function is broken
before and after tho.  Allocating from fs_bio_set from block layer is
never safe and nothing seems to prevent multiple allocators from
competing in the bounce page mempool.  This will need a separate bioset
and the multiple mempool allocation would have to be put inside a mutex.

Also, how was this tested?

Thanks.

-- 
tejun

* Re: [PATCH v2 19/26] bounce: Refactor __blk_queue_bounce to not use bi_io_vec
       [not found]     ` <1347322957-25260-20-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  2012-09-21  0:25       ` Tejun Heo
@ 2012-09-21  0:27       ` Tejun Heo
  2012-09-21  0:34         ` Kent Overstreet
  1 sibling, 1 reply; 81+ messages in thread
From: Tejun Heo @ 2012-09-21  0:27 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	neilb-l3A5Bk7waGM

On Mon, Sep 10, 2012 at 05:22:30PM -0700, Kent Overstreet wrote:
> A bunch of what __blk_queue_bounce() was doing was problematic for the
> immutable bvec work; this cleans that up and the code is quite a bit
> smaller, too.
> 
> The __bio_for_each_segment() in copy_to_high_bio_irq() was changed
> because that one's looping over the original bio, not the bounce bio -
> since the bounce code doesn't own that bio the __ version wasn't
> correct.

Also, I can't understand the above at all.  I can think why it
wouldn't be necessary but why is it wrong because bounce code doesn't
own it?

-- 
tejun

* Re: [PATCH v2 19/26] bounce: Refactor __blk_queue_bounce to not use bi_io_vec
  2012-09-21  0:25       ` Tejun Heo
@ 2012-09-21  0:29         ` Kent Overstreet
  0 siblings, 0 replies; 81+ messages in thread
From: Kent Overstreet @ 2012-09-21  0:29 UTC (permalink / raw)
  To: Tejun Heo; +Cc: axboe, linux-bcache, linux-kernel, dm-devel

On Thu, Sep 20, 2012 at 05:25:55PM -0700, Tejun Heo wrote:
> Hello, Kent.
> 
> On Mon, Sep 10, 2012 at 05:22:30PM -0700, Kent Overstreet wrote:
> > A bunch of what __blk_queue_bounce() was doing was problematic for the
> > immutable bvec work; this cleans that up and the code is quite a bit
> > smaller, too.
> > 
> > The __bio_for_each_segment() in copy_to_high_bio_irq() was changed
> > because that one's looping over the original bio, not the bounce bio -
> > since the bounce code doesn't own that bio the __ version wasn't
> > correct.
> 
> I do like the new implementation.  I think the function is broken
> before and after tho.  Allocating from fs_bio_set from block layer is
> never safe and nothing seems to prevent multiple allocators from
> competing in the bounce page mempool.  This will need a separate bioset
> and the multiple mempool allocation would have to be put inside a mutex.

Yeah, I should've at least made a note of that.

I should really add "audit all uses of fs_bio_set" to my todo list.

> Also, how was this tested?

Changed queue_bounce_pfn() to return 0, forcing all io to be bounced.

* Re: [PATCH v2 19/26] bounce: Refactor __blk_queue_bounce to not use bi_io_vec
  2012-09-21  0:27       ` Tejun Heo
@ 2012-09-21  0:34         ` Kent Overstreet
  0 siblings, 0 replies; 81+ messages in thread
From: Kent Overstreet @ 2012-09-21  0:34 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-bcache, linux-kernel, dm-devel, axboe, neilb

On Thu, Sep 20, 2012 at 05:27:06PM -0700, Tejun Heo wrote:
> On Mon, Sep 10, 2012 at 05:22:30PM -0700, Kent Overstreet wrote:
> > A bunch of what __blk_queue_bounce() was doing was problematic for the
> > immutable bvec work; this cleans that up and the code is quite a bit
> > smaller, too.
> > 
> > The __bio_for_each_segment() in copy_to_high_bio_irq() was changed
> > because that one's looping over the original bio, not the bounce bio -
> > since the bounce code doesn't own that bio the __ version wasn't
> > correct.
> 
> Also, I can't understand the above at all.  I can think why it
> wouldn't be necessary but why is it wrong because bounce code doesn't
> own it?

Another prep work thing - in current code, it isn't really wrong
(slightly inconsistent though).

But the idea is that anything that doesn't own the bio shouldn't assume
anything about bi_idx; the bounce code should loop over the bio starting
from wherever it was when the bio got to the bounce code, not the start
of the bio.

A later patch makes this clearer - __bio_for_each_segment() gets removed
in favor of bio_for_each_segment_all(), and it documents that
bio_for_each_segment_all() is only for code that owns the bio.
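
A rough user-space sketch of the distinction, assuming a toy model in
which the bio is just an array plus a current index. None of these
names are kernel API: struct toy_bio, toy_for_each_segment() and
toy_for_each_segment_all() are made up for illustration, with idx and
vcnt playing the roles of bi_idx and bi_vcnt.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for struct bio_vec and struct bio. */
struct toy_vec {
	size_t len;
};

struct toy_bio {
	struct toy_vec *vecs;	/* plays the role of bi_io_vec */
	unsigned int    vcnt;	/* plays the role of bi_vcnt */
	unsigned int    idx;	/* plays the role of bi_idx */
};

/* Start at the current position: for code that was handed the bio. */
#define toy_for_each_segment(bv, bio, i)			\
	for ((i) = (bio)->idx, (bv) = (bio)->vecs + (i);	\
	     (i) < (bio)->vcnt;					\
	     (i)++, (bv)++)

/* Start at index 0: only for code that owns the bio. */
#define toy_for_each_segment_all(bv, bio, i)			\
	for ((i) = 0, (bv) = (bio)->vecs;			\
	     (i) < (bio)->vcnt;					\
	     (i)++, (bv)++)

/* Bytes from the current position to the end. */
static size_t toy_bytes_remaining(struct toy_bio *bio)
{
	struct toy_vec *bv;
	unsigned int i;
	size_t bytes = 0;

	assert(bio->idx <= bio->vcnt);
	toy_for_each_segment(bv, bio, i)
		bytes += bv->len;
	return bytes;
}

/* Bytes in every segment, including those already advanced past. */
static size_t toy_bytes_total(struct toy_bio *bio)
{
	struct toy_vec *bv;
	unsigned int i;
	size_t bytes = 0;

	toy_for_each_segment_all(bv, bio, i)
		bytes += bv->len;
	return bytes;
}

/* Segments still to be processed, as opposed to the raw vcnt. */
static unsigned int toy_segments(const struct toy_bio *bio)
{
	return bio->vcnt - bio->idx;
}
```

For a bio that arrives with idx already at 1, the first loop skips the
segment that was advanced past while the second still visits it, which
is why only the owner of the bio can know that starting from 0 is
meaningful.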

* Re: [PATCH v2 20/26] block: Add bio_for_each_segment_all()
       [not found]   ` <1347322957-25260-21-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-21  0:35     ` Tejun Heo
  0 siblings, 0 replies; 81+ messages in thread
From: Tejun Heo @ 2012-09-21  0:35 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	neilb-l3A5Bk7waGM

Hello,

On Mon, Sep 10, 2012 at 05:22:31PM -0700, Kent Overstreet wrote:
> This is part of the immutable bvec prep work; bio_for_each_segment() is
> going to have a different implementation so these need to be split
> apart.
>
> This change is also to better document the intent of code that's using
> it - bio_for_each_segment_all() is only legal to use for code that owns
> the bio.

How about something like the following?

 __bio_for_each_segment() iterates bvecs from the specified index
 instead of bio->bi_idx.  Currently, the only usage is to walk all the
 bvecs after the bio has been advanced by specifying 0 index.

 To help the immutable bvec implementation, replace it with
 bio_for_each_segment_all() which also better documents the intent of
 code that's using it.  Note that bio_for_each_segment_all() should
 only be used by the code which owns the bio.

-- 
tejun

* Re: [PATCH v2 21/26] block: Convert some code to bio_for_each_segment_all()
       [not found]     ` <1347322957-25260-22-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-21  0:38       ` Tejun Heo
       [not found]         ` <20120921003832.GY7264-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 81+ messages in thread
From: Tejun Heo @ 2012-09-21  0:38 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	neilb-l3A5Bk7waGM

On Mon, Sep 10, 2012 at 05:22:32PM -0700, Kent Overstreet wrote:
> A few places in the code were either open coding or using the wrong
> version - fix.
> 
> Signed-off-by: Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> CC: Jens Axboe <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>
> CC: NeilBrown <neilb-l3A5Bk7waGM@public.gmane.org>
> ---
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -921,7 +921,7 @@ static void alloc_behind_pages(struct bio *bio, struct r1bio *r1_bio)
>  	if (unlikely(!bvecs))
>  		return;
>  
> -	bio_for_each_segment(bvec, bio, i) {
> +	bio_for_each_segment_all(bvec, bio, i) {

I don't get this conversion.  Why is this necessary?

-- 
tejun

* Re: [PATCH v2 22/26] block: Add bio_alloc_pages()
       [not found]   ` <1347322957-25260-23-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-21  0:47     ` Tejun Heo
       [not found]       ` <20120921004711.GZ7264-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 81+ messages in thread
From: Tejun Heo @ 2012-09-21  0:47 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	neilb-l3A5Bk7waGM

On Mon, Sep 10, 2012 at 05:22:33PM -0700, Kent Overstreet wrote:
> +	bio_for_each_segment_all(bv, bio, i) {
> +		bv->bv_page = alloc_page(gfp_mask);
> +		if (!bv->bv_page) {
> +			while (bv-- != bio->bi_io_vec)
> +				__free_page(bv->bv_page);

I don't know.  I feel stupid.  I think it's because the loop variable
changes between loop condition test and actual body of loop.  How
about the following?  It is pointing to the member of the same array
so I think it's not even violating pointer comparison rules.

	while (--bv >= bio->bi_io_vec)
		__free_page(bv->bv_page);

-- 
tejun

* Re: [PATCH v2 23/26] raid1: use bio_alloc_pages()
       [not found]     ` <1347322957-25260-24-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-21  0:48       ` Tejun Heo
       [not found]         ` <20120921004827.GA7264-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 81+ messages in thread
From: Tejun Heo @ 2012-09-21  0:48 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	neilb-l3A5Bk7waGM

On Mon, Sep 10, 2012 at 05:22:34PM -0700, Kent Overstreet wrote:
> Signed-off-by: Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> CC: Jens Axboe <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>
> CC: NeilBrown <neilb-l3A5Bk7waGM@public.gmane.org>

I think it's better to merge this and the previous patch.  It's not
like we're converting a lot of users.

Thanks.

-- 
tejun

* Re: [PATCH v2 21/26] block: Convert some code to bio_for_each_segment_all()
       [not found]         ` <20120921003832.GY7264-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-21  0:50           ` Kent Overstreet
  0 siblings, 0 replies; 81+ messages in thread
From: Kent Overstreet @ 2012-09-21  0:50 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	neilb-l3A5Bk7waGM

On Thu, Sep 20, 2012 at 05:38:32PM -0700, Tejun Heo wrote:
> On Mon, Sep 10, 2012 at 05:22:32PM -0700, Kent Overstreet wrote:
> > A few places in the code were either open coding or using the wrong
> > version - fix.
> > 
> > Signed-off-by: Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> > CC: Jens Axboe <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>
> > CC: NeilBrown <neilb-l3A5Bk7waGM@public.gmane.org>
> > ---
> > --- a/drivers/md/raid1.c
> > +++ b/drivers/md/raid1.c
> > @@ -921,7 +921,7 @@ static void alloc_behind_pages(struct bio *bio, struct r1bio *r1_bio)
> >  	if (unlikely(!bvecs))
> >  		return;
> >  
> > -	bio_for_each_segment(bvec, bio, i) {
> > +	bio_for_each_segment_all(bvec, bio, i) {
> 
> I don't get this conversion.  Why is this necessary?

Not necessary, just a consistency thing - this bio is a clone that md
owns (and the clone was trimmed, so we know bi_idx is 0).

Also, it wasn't an issue here but after the patch that introduces the
bvec iter it's no longer possible to modify the biovec through
bio_for_each_segment() - it doesn't increment a pointer to the
current bvec, you pass in a struct bio_vec (not a pointer) which is
updated with what the current biovec would be (taking into account
bi_bvec_done and bi_size).

So because of that it is IMO more worthwhile to be consistent about
bio_for_each_segment()/bio_for_each_segment_all() usage.

Suppose I should stick all that in the patch description.
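
A small user-space sketch of what by-value iteration looks like,
assuming a toy iterator modelled loosely on the description above. The
names are hypothetical (struct toy_iter, toy_iter_vec(),
toy_iter_advance()); size, idx and done stand in for bi_size, bi_idx
and bi_bvec_done. The array is const throughout, so the loop body only
ever sees a computed copy and has nothing it could modify.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bio_vec (page pointer left out). */
struct toy_vec {
	size_t offset;
	size_t len;
};

/* Hypothetical stand-in for the iterator embedded in the bio. */
struct toy_iter {
	size_t size;	/* bytes remaining overall, like bi_size */
	size_t idx;	/* current array index, like bi_idx */
	size_t done;	/* bytes completed in vecs[idx], like bi_bvec_done */
};

/*
 * Compute the current segment as a value: the array entry adjusted for
 * the part already completed and clipped to the bytes remaining.
 */
static struct toy_vec toy_iter_vec(const struct toy_vec *vecs,
				   struct toy_iter it)
{
	struct toy_vec bv = vecs[it.idx];

	bv.offset += it.done;
	bv.len -= it.done;
	if (bv.len > it.size)
		bv.len = it.size;
	return bv;
}

/* Advance by at most the rest of the current array entry. */
static void toy_iter_advance(const struct toy_vec *vecs,
			     struct toy_iter *it, size_t bytes)
{
	assert(bytes <= it->size);
	assert(bytes <= vecs[it->idx].len - it->done);

	it->size -= bytes;
	it->done += bytes;
	if (it->done == vecs[it->idx].len) {
		it->idx++;
		it->done = 0;
	}
}

/*
 * Walk the remaining data and return the bytes visited.  The caller
 * must make sure it.size does not exceed what the array holds.
 */
static size_t toy_walk(const struct toy_vec *vecs, struct toy_iter it)
{
	size_t total = 0;

	while (it.size) {
		struct toy_vec bv = toy_iter_vec(vecs, it);

		total += bv.len;
		toy_iter_advance(vecs, &it, bv.len);
	}
	return total;
}
```

Code that does need to write to the array entries (allocating pages,
say) has to walk the array itself, which is the case the owner-only
iteration is meant for.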

* Re: [PATCH v2 22/26] block: Add bio_alloc_pages()
       [not found]       ` <20120921004711.GZ7264-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-21  4:50         ` Kent Overstreet
  0 siblings, 0 replies; 81+ messages in thread
From: Kent Overstreet @ 2012-09-21  4:50 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	neilb-l3A5Bk7waGM

On Thu, Sep 20, 2012 at 05:47:11PM -0700, Tejun Heo wrote:
> On Mon, Sep 10, 2012 at 05:22:33PM -0700, Kent Overstreet wrote:
> > +	bio_for_each_segment_all(bv, bio, i) {
> > +		bv->bv_page = alloc_page(gfp_mask);
> > +		if (!bv->bv_page) {
> > +			while (bv-- != bio->bi_io_vec)
> > +				__free_page(bv->bv_page);
> 
> I don't know.  I feel stupid.  I think it's because the loop variable
> changes between loop condition test and actual body of loop.  How
> about the following?  It is pointing to the member of the same array
> so I think it's not even violating pointer comparison rules.
> 
> 	while (--bv >= bio->bi_io_vec)
> 		__free_page(bv->bv_page);

I can't remember why I did it that way, but I think I like yours better
- I'll change it.
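
For comparison, a user-space sketch of the same unwind with the
decrement separated from the loop test. This is only an illustration
under stated assumptions: toy_alloc_pages() and struct toy_vec are
hypothetical, malloc()/free() stand in for alloc_page()/__free_page(),
and fail_at is a made-up knob for injecting a failure. Both loops
quoted above free the same entries; this form frees them too and, as
far as I can tell, is the only one of the three that never steps the
pointer to one before the start of the array, which strict ISO C
leaves undefined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct bio_vec: just the page pointer. */
struct toy_vec {
	void *page;
};

/*
 * Allocate a buffer for each of the n entries.  If an allocation
 * fails, free everything allocated so far and return -1.
 * 'fail_at' forces a failure at that index (pass n for no failure).
 */
static int toy_alloc_pages(struct toy_vec *vecs, size_t n, size_t fail_at)
{
	struct toy_vec *bv;

	for (bv = vecs; bv < vecs + n; bv++) {
		if ((size_t)(bv - vecs) == fail_at)
			bv->page = NULL;	/* injected failure */
		else
			bv->page = malloc(64);

		if (!bv->page) {
			/* Unwind: bv is the failed entry, free those before it. */
			while (bv != vecs) {
				--bv;
				free(bv->page);
				bv->page = NULL;
			}
			return -1;
		}
	}

	return 0;
}
```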

* Re: [PATCH v2 23/26] raid1: use bio_alloc_pages()
       [not found]         ` <20120921004827.GA7264-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2012-09-21  4:51           ` Kent Overstreet
  0 siblings, 0 replies; 81+ messages in thread
From: Kent Overstreet @ 2012-09-21  4:51 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	neilb-l3A5Bk7waGM

On Thu, Sep 20, 2012 at 05:48:27PM -0700, Tejun Heo wrote:
> On Mon, Sep 10, 2012 at 05:22:34PM -0700, Kent Overstreet wrote:
> > Signed-off-by: Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> > CC: Jens Axboe <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>
> > CC: NeilBrown <neilb-l3A5Bk7waGM@public.gmane.org>
> 
> I think it's better to merge this and the previous patch.  It's not
> like we're converting a lot of users.

Ok, will do.

* [PATCH v2 09/26] block: Remove some unnecessary bi_vcnt usage
  2012-10-15 20:08 [PATCH v4 00/24] " Kent Overstreet
@ 2012-10-15 20:09 ` Kent Overstreet
  0 siblings, 0 replies; 81+ messages in thread
From: Kent Overstreet @ 2012-10-15 20:09 UTC (permalink / raw)
  To: linux-bcache, linux-kernel, dm-devel; +Cc: Kent Overstreet, tj, axboe, neilb

More prep work for immutable bvecs/efficient bio splitting - usage of
bi_vcnt has to be audited, so getting rid of all the unnecessary usage
makes that easier.

Plus, bio_segments() is really what this code wanted, as it respects the
current value of bi_idx.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
---
 drivers/message/fusion/mptsas.c          |  6 +++---
 drivers/scsi/libsas/sas_expander.c       |  6 +++---
 drivers/scsi/mpt2sas/mpt2sas_transport.c | 10 +++++-----
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/message/fusion/mptsas.c b/drivers/message/fusion/mptsas.c
index 551262e..5406a9f 100644
--- a/drivers/message/fusion/mptsas.c
+++ b/drivers/message/fusion/mptsas.c
@@ -2235,10 +2235,10 @@ static int mptsas_smp_handler(struct Scsi_Host *shost, struct sas_rphy *rphy,
 	}
 
 	/* do we need to support multiple segments? */
-	if (req->bio->bi_vcnt > 1 || rsp->bio->bi_vcnt > 1) {
+	if (bio_segments(req->bio) > 1 || bio_segments(rsp->bio) > 1) {
 		printk(MYIOC_s_ERR_FMT "%s: multiple segments req %u %u, rsp %u %u\n",
-		    ioc->name, __func__, req->bio->bi_vcnt, blk_rq_bytes(req),
-		    rsp->bio->bi_vcnt, blk_rq_bytes(rsp));
+		    ioc->name, __func__, bio_segments(req->bio), blk_rq_bytes(req),
+		    bio_segments(rsp->bio), blk_rq_bytes(rsp));
 		return -EINVAL;
 	}
 
diff --git a/drivers/scsi/libsas/sas_expander.c b/drivers/scsi/libsas/sas_expander.c
index efc6e72..ee331a7 100644
--- a/drivers/scsi/libsas/sas_expander.c
+++ b/drivers/scsi/libsas/sas_expander.c
@@ -2151,10 +2151,10 @@ int sas_smp_handler(struct Scsi_Host *shost, struct sas_rphy *rphy,
 	}
 
 	/* do we need to support multiple segments? */
-	if (req->bio->bi_vcnt > 1 || rsp->bio->bi_vcnt > 1) {
+	if (bio_segments(req->bio) > 1 || bio_segments(rsp->bio) > 1) {
 		printk("%s: multiple segments req %u %u, rsp %u %u\n",
-		       __func__, req->bio->bi_vcnt, blk_rq_bytes(req),
-		       rsp->bio->bi_vcnt, blk_rq_bytes(rsp));
+		       __func__, bio_segments(req->bio), blk_rq_bytes(req),
+		       bio_segments(rsp->bio), blk_rq_bytes(rsp));
 		return -EINVAL;
 	}
 
diff --git a/drivers/scsi/mpt2sas/mpt2sas_transport.c b/drivers/scsi/mpt2sas/mpt2sas_transport.c
index c6cf20f..403a57b 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_transport.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_transport.c
@@ -1939,7 +1939,7 @@ _transport_smp_handler(struct Scsi_Host *shost, struct sas_rphy *rphy,
 	ioc->transport_cmds.status = MPT2_CMD_PENDING;
 
 	/* Check if the request is split across multiple segments */
-	if (req->bio->bi_vcnt > 1) {
+	if (bio_segments(req->bio) > 1) {
 		u32 offset = 0;
 
 		/* Allocate memory and copy the request */
@@ -1971,7 +1971,7 @@ _transport_smp_handler(struct Scsi_Host *shost, struct sas_rphy *rphy,
 
 	/* Check if the response needs to be populated across
 	 * multiple segments */
-	if (rsp->bio->bi_vcnt > 1) {
+	if (bio_segments(rsp->bio) > 1) {
 		pci_addr_in = pci_alloc_consistent(ioc->pdev, blk_rq_bytes(rsp),
 		    &pci_dma_in);
 		if (!pci_addr_in) {
@@ -2038,7 +2038,7 @@ _transport_smp_handler(struct Scsi_Host *shost, struct sas_rphy *rphy,
 	sgl_flags = (MPI2_SGE_FLAGS_SIMPLE_ELEMENT |
 	    MPI2_SGE_FLAGS_END_OF_BUFFER | MPI2_SGE_FLAGS_HOST_TO_IOC);
 	sgl_flags = sgl_flags << MPI2_SGE_FLAGS_SHIFT;
-	if (req->bio->bi_vcnt > 1) {
+	if (bio_segments(req->bio) > 1) {
 		ioc->base_add_sg_single(psge, sgl_flags |
 		    (blk_rq_bytes(req) - 4), pci_dma_out);
 	} else {
@@ -2054,7 +2054,7 @@ _transport_smp_handler(struct Scsi_Host *shost, struct sas_rphy *rphy,
 	    MPI2_SGE_FLAGS_LAST_ELEMENT | MPI2_SGE_FLAGS_END_OF_BUFFER |
 	    MPI2_SGE_FLAGS_END_OF_LIST);
 	sgl_flags = sgl_flags << MPI2_SGE_FLAGS_SHIFT;
-	if (rsp->bio->bi_vcnt > 1) {
+	if (bio_segments(rsp->bio) > 1) {
 		ioc->base_add_sg_single(psge, sgl_flags |
 		    (blk_rq_bytes(rsp) + 4), pci_dma_in);
 	} else {
@@ -2099,7 +2099,7 @@ _transport_smp_handler(struct Scsi_Host *shost, struct sas_rphy *rphy,
 		    le16_to_cpu(mpi_reply->ResponseDataLength);
 		/* check if the resp needs to be copied from the allocated
 		 * pci mem */
-		if (rsp->bio->bi_vcnt > 1) {
+		if (bio_segments(rsp->bio) > 1) {
 			u32 offset = 0;
 			u32 bytes_to_copy =
 			    le16_to_cpu(mpi_reply->ResponseDataLength);
-- 
1.7.12

Thread overview: 81+ messages
2012-09-11  0:22 [PATCH v2 00/26] Prep work for immutable bio vecs Kent Overstreet
2012-09-11  0:22 ` [PATCH v2 01/26] block: Convert integrity to bvec_alloc_bs(), and a bugfix Kent Overstreet
     [not found]   ` <1347322957-25260-2-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-11 20:36     ` [dm-devel] " Vivek Goyal
2012-09-11 20:48       ` Kent Overstreet
2012-09-11 22:07       ` Kent Overstreet
     [not found]         ` <20120911220750.GM19739-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-12 19:39           ` Martin K. Petersen
     [not found]             ` <yq1wqzzrpy1.fsf-+q57XtR/GgMb6DWv4sQWN6xOck334EZe@public.gmane.org>
2012-09-17 21:08               ` Kent Overstreet
2012-09-20 21:53     ` Tejun Heo
2012-09-11  0:22 ` [PATCH v2 02/26] block: Add bio_advance() Kent Overstreet
     [not found]   ` <1347322957-25260-3-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-20 21:58     ` Tejun Heo
2012-09-20 23:13       ` Kent Overstreet
     [not found]         ` <20120920231308.GB5519-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-20 23:25           ` Tejun Heo
     [not found]             ` <20120920232506.GI7264-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-20 23:38               ` Kent Overstreet
2012-09-11  0:22 ` [PATCH v2 03/26] block: Refactor blk_update_request() Kent Overstreet
     [not found]   ` <1347322957-25260-4-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-20 23:20     ` Tejun Heo
2012-09-20 23:36       ` Kent Overstreet
     [not found]         ` <20120920233632.GC5519-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-20 23:41           ` Tejun Heo
     [not found]             ` <20120920234133.GM7264-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-20 23:50               ` Kent Overstreet
2012-09-11  0:22 ` [PATCH v2 04/26] md: Convert md_trim_bio() to use bio_advance() Kent Overstreet
     [not found]   ` <1347322957-25260-5-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-20 23:27     ` Tejun Heo
2012-09-11  0:22 ` [PATCH v2 05/26] block: Add bio_end() Kent Overstreet
     [not found]   ` <1347322957-25260-6-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-17  9:17     ` Steven Whitehouse
2012-09-20 23:32     ` Tejun Heo
     [not found]       ` <20120920233225.GK7264-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-20 23:44         ` Kent Overstreet
2012-09-11  0:22 ` [PATCH v2 06/26] block: Use bio_sectors() more consistently Kent Overstreet
     [not found]   ` <1347322957-25260-7-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-20 23:36     ` Tejun Heo
     [not found]       ` <20120920233618.GL7264-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-20 23:47         ` Kent Overstreet
2012-09-11  0:22 ` [PATCH v2 07/26] block: Don't use bi_idx in bio_split() or require it to be 0 Kent Overstreet
     [not found]   ` <1347322957-25260-8-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-20 23:45     ` Tejun Heo
     [not found]       ` <20120920234544.GN7264-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-21  0:00         ` Kent Overstreet
2012-09-11  0:22 ` [PATCH v2 09/26] block: Remove some unnecessary bi_vcnt usage Kent Overstreet
     [not found]   ` <1347322957-25260-10-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-20 23:51     ` Tejun Heo
2012-09-11  0:22 ` [PATCH v2 10/26] block: Add submit_bio_wait(), remove from md Kent Overstreet
     [not found]   ` <1347322957-25260-11-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-20 23:56     ` Tejun Heo
     [not found]       ` <20120920235643.GQ7264-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-21  0:06         ` Kent Overstreet
2012-09-11  0:22 ` [PATCH v2 11/26] raid10: Use bio_reset() Kent Overstreet
2012-09-20 23:59   ` Tejun Heo
2012-09-11  0:22 ` [PATCH v2 13/26] raid5: use bio_reset() Kent Overstreet
     [not found]   ` <1347322957-25260-14-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-11  5:03     ` NeilBrown
     [not found]       ` <20120911150326.79f066c0-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2012-09-11 19:26         ` Kent Overstreet
2012-09-11  0:22 ` [PATCH v2 14/26] raid1: Refactor narrow_write_error() to not use bi_idx Kent Overstreet
2012-09-11  0:22 ` [PATCH v2 15/26] block: Add bio_copy_data() Kent Overstreet
     [not found]   ` <1347322957-25260-16-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-21  0:06     ` Tejun Heo
2012-09-21  0:09       ` Kent Overstreet
     [not found]         ` <20120921000945.GK5519-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-21  0:15           ` Tejun Heo
2012-09-21  0:09       ` Tejun Heo
     [not found]         ` <20120921000947.GT7264-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-21  0:13           ` Kent Overstreet
2012-09-11  0:22 ` [PATCH v2 16/26] pktcdvd: use bio_copy_data() Kent Overstreet
2012-09-11  0:22 ` [PATCH v2 17/26] pktcdvd: Use bio_reset() in disabled code to kill bi_idx usage Kent Overstreet
2012-09-11  0:22 ` [PATCH v2 18/26] raid1: use bio_copy_data() Kent Overstreet
2012-09-11  0:22 ` [PATCH v2 20/26] block: Add bio_for_each_segment_all() Kent Overstreet
     [not found]   ` <1347322957-25260-21-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-21  0:35     ` Tejun Heo
2012-09-11  0:22 ` [PATCH v2 22/26] block: Add bio_alloc_pages() Kent Overstreet
     [not found]   ` <1347322957-25260-23-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-21  0:47     ` Tejun Heo
     [not found]       ` <20120921004711.GZ7264-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-21  4:50         ` Kent Overstreet
2012-09-11  0:22 ` [PATCH v2 24/26] block: Add an explicit bio flag for bios that own their bvec Kent Overstreet
     [not found] ` <1347322957-25260-1-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-11  0:22   ` [PATCH v2 08/26] block: Remove bi_idx references Kent Overstreet
     [not found]     ` <1347322957-25260-9-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-20 23:49       ` Tejun Heo
2012-09-21  0:04         ` Kent Overstreet
2012-09-11  0:22   ` [PATCH v2 12/26] raid1: use bio_reset() Kent Overstreet
     [not found]     ` <1347322957-25260-13-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-11  4:59       ` NeilBrown
2012-09-11 18:28         ` Kent Overstreet
     [not found]           ` <20120911182825.GG19739-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-11 21:17             ` NeilBrown
2012-09-11  0:22   ` [PATCH v2 19/26] bounce: Refactor __blk_queue_bounce to not use bi_io_vec Kent Overstreet
     [not found]     ` <1347322957-25260-20-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-21  0:25       ` Tejun Heo
2012-09-21  0:29         ` Kent Overstreet
2012-09-21  0:27       ` Tejun Heo
2012-09-21  0:34         ` Kent Overstreet
2012-09-11  0:22   ` [PATCH v2 21/26] block: Convert some code to bio_for_each_segment_all() Kent Overstreet
     [not found]     ` <1347322957-25260-22-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-21  0:38       ` Tejun Heo
     [not found]         ` <20120921003832.GY7264-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-21  0:50           ` Kent Overstreet
2012-09-11  0:22   ` [PATCH v2 23/26] raid1: use bio_alloc_pages() Kent Overstreet
     [not found]     ` <1347322957-25260-24-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-21  0:48       ` Tejun Heo
     [not found]         ` <20120921004827.GA7264-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-21  4:51           ` Kent Overstreet
2012-09-11  0:22   ` [PATCH v2 25/26] bio-integrity: Add explicit field for owner of bip_buf Kent Overstreet
     [not found]     ` <1347322957-25260-26-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-12 19:41       ` Martin K. Petersen
     [not found]         ` <yq1sjanrpu7.fsf-+q57XtR/GgMb6DWv4sQWN6xOck334EZe@public.gmane.org>
2012-09-17 21:09           ` Kent Overstreet
2012-09-11  0:22   ` [PATCH v2 26/26] block: Add BIO_SUBMITTED flag, kill BIO_CLONED Kent Overstreet
2012-09-11  5:22   ` [PATCH v2 00/26] Prep work for immutable bio vecs NeilBrown
2012-09-20 23:22 ` Tejun Heo
2012-10-15 20:08 [PATCH v4 00/24] " Kent Overstreet
2012-10-15 20:09 ` [PATCH v2 09/26] block: Remove some unnecessary bi_vcnt usage Kent Overstreet
