linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/4] block: fix bio_will_gap()
@ 2016-02-26 15:40 Ming Lei
  2016-02-26 15:40 ` [PATCH v2 1/4] block: bio: introduce helpers to get the 1st and last bvec Ming Lei
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Ming Lei @ 2016-02-26 15:40 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel
  Cc: linux-block, Christoph Hellwig, Sagi Grimberg, Kent Overstreet,
	Keith Busch, Elliott Robert

Hi Guys,

The bio passed to bio_will_gap() may be fast cloned from upper
layer(dm, md, bcache, fs, ...), or from bio splitting in block
core. Unfortunately bio_will_gap() just figures out the last
bvec via 'bi_io_vec[prev->bi_vcnt - 1]' directly, and this way
is obviously wrong in case of fast-cloned bio.

It is observed that lots of BIOs are still merged even if
the virt boundary limit is violated by the merge, and the issue
was reported from Sagi Grimberg.

This patch introduces two helpers for getting the first and last
bvec of one bio and applys them to fix the issue. Sagi has confirmed
the fix.

Thanks for Sagi and Christoph's review.

V2:
	- remove unnecessary comment
	- add reviewed-by

V1:
	- get bvec directly for non-cloned bio
	- implement bio_get_last_bvec() with single bio_advance_iter(),
	and avoid to use bio_for_each_segment() which looks a bit inefficient
	- avoid to double check queue_virt_boundary() in bio_will_gap()

 block/blk-merge.c      |  8 ++------
 include/linux/bio.h    | 37 +++++++++++++++++++++++++++++++++++++
 include/linux/blkdev.h | 23 +++++++++++++++++------
 3 files changed, 56 insertions(+), 12 deletions(-)


Thanks,
Ming

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2 1/4] block: bio: introduce helpers to get the 1st and last bvec
  2016-02-26 15:40 [PATCH v2 0/4] block: fix bio_will_gap() Ming Lei
@ 2016-02-26 15:40 ` Ming Lei
  2016-02-26 15:40 ` [PATCH v2 2/4] block: check virt boundary in bio_will_gap() Ming Lei
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Ming Lei @ 2016-02-26 15:40 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel
  Cc: linux-block, Christoph Hellwig, Sagi Grimberg, Kent Overstreet,
	Keith Busch, Elliott Robert, Ming Lei, stable

The bio passed to bio_will_gap() may be fast cloned from upper
layer(dm, md, bcache, fs, ...), or from bio splitting in block
core.

Unfortunately bio_will_gap() just figures out the last bvec via
'bi_io_vec[prev->bi_vcnt - 1]' directly, and this way is obviously
wrong.

This patch introduces two helpers for getting the first and last
bvec of one bio for fixing the issue.

Cc: stable@vger.kernel.org
Reported-by: Sagi Grimberg <sagig@dev.mellanox.co.il>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 include/linux/bio.h | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/include/linux/bio.h b/include/linux/bio.h
index 5349e68..cb68888 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -310,6 +310,43 @@ static inline void bio_clear_flag(struct bio *bio, unsigned int bit)
 	bio->bi_flags &= ~(1U << bit);
 }
 
+static inline void bio_get_first_bvec(struct bio *bio, struct bio_vec *bv)
+{
+	*bv = bio_iovec(bio);
+}
+
+static inline void bio_get_last_bvec(struct bio *bio, struct bio_vec *bv)
+{
+	struct bvec_iter iter = bio->bi_iter;
+	int idx;
+
+	if (!bio_flagged(bio, BIO_CLONED)) {
+		*bv = bio->bi_io_vec[bio->bi_vcnt - 1];
+		return;
+	}
+
+	if (unlikely(!bio_multiple_segments(bio))) {
+		*bv = bio_iovec(bio);
+		return;
+	}
+
+	bio_advance_iter(bio, &iter, iter.bi_size);
+
+	if (!iter.bi_bvec_done)
+		idx = iter.bi_idx - 1;
+	else	/* in the middle of bvec */
+		idx = iter.bi_idx;
+
+	*bv = bio->bi_io_vec[idx];
+
+	/*
+	 * iter.bi_bvec_done records actual length of the last bvec
+	 * if this bio ends in the middle of one io vector
+	 */
+	if (iter.bi_bvec_done)
+		bv->bv_len = iter.bi_bvec_done;
+}
+
 enum bip_flags {
 	BIP_BLOCK_INTEGRITY	= 1 << 0, /* block layer owns integrity data */
 	BIP_MAPPED_INTEGRITY	= 1 << 1, /* ref tag has been remapped */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2 2/4] block: check virt boundary in bio_will_gap()
  2016-02-26 15:40 [PATCH v2 0/4] block: fix bio_will_gap() Ming Lei
  2016-02-26 15:40 ` [PATCH v2 1/4] block: bio: introduce helpers to get the 1st and last bvec Ming Lei
@ 2016-02-26 15:40 ` Ming Lei
  2016-02-26 15:40 ` [PATCH v2 3/4] block: get the 1st and last bvec via helpers Ming Lei
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Ming Lei @ 2016-02-26 15:40 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel
  Cc: linux-block, Christoph Hellwig, Sagi Grimberg, Kent Overstreet,
	Keith Busch, Elliott Robert, Ming Lei

In the following patch, the way for figuring out
the last bvec will be changed with a bit cost introduced,
so return immediately if the queue doesn't have virt
boundary limit. Actually most of devices have not
this limit.

Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 include/linux/blkdev.h | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 4571ef1..cd06a41 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1372,6 +1372,13 @@ static inline void put_dev_sector(Sector p)
 	page_cache_release(p.v);
 }
 
+static inline bool __bvec_gap_to_prev(struct request_queue *q,
+				struct bio_vec *bprv, unsigned int offset)
+{
+	return offset ||
+		((bprv->bv_offset + bprv->bv_len) & queue_virt_boundary(q));
+}
+
 /*
  * Check if adding a bio_vec after bprv with offset would create a gap in
  * the SG list. Most drivers don't care about this, but some do.
@@ -1381,18 +1388,17 @@ static inline bool bvec_gap_to_prev(struct request_queue *q,
 {
 	if (!queue_virt_boundary(q))
 		return false;
-	return offset ||
-		((bprv->bv_offset + bprv->bv_len) & queue_virt_boundary(q));
+	return __bvec_gap_to_prev(q, bprv, offset);
 }
 
 static inline bool bio_will_gap(struct request_queue *q, struct bio *prev,
 			 struct bio *next)
 {
-	if (!bio_has_data(prev))
+	if (!bio_has_data(prev) || !queue_virt_boundary(q))
 		return false;
 
-	return bvec_gap_to_prev(q, &prev->bi_io_vec[prev->bi_vcnt - 1],
-				next->bi_io_vec[0].bv_offset);
+	return __bvec_gap_to_prev(q, &prev->bi_io_vec[prev->bi_vcnt - 1],
+				  next->bi_io_vec[0].bv_offset);
 }
 
 static inline bool req_gap_back_merge(struct request *req, struct bio *bio)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2 3/4] block: get the 1st and last bvec via helpers
  2016-02-26 15:40 [PATCH v2 0/4] block: fix bio_will_gap() Ming Lei
  2016-02-26 15:40 ` [PATCH v2 1/4] block: bio: introduce helpers to get the 1st and last bvec Ming Lei
  2016-02-26 15:40 ` [PATCH v2 2/4] block: check virt boundary in bio_will_gap() Ming Lei
@ 2016-02-26 15:40 ` Ming Lei
  2016-02-26 15:40 ` [PATCH v2 4/4] block: merge: " Ming Lei
  2016-02-28  9:57 ` [PATCH v2 0/4] block: fix bio_will_gap() Sagi Grimberg
  4 siblings, 0 replies; 6+ messages in thread
From: Ming Lei @ 2016-02-26 15:40 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel
  Cc: linux-block, Christoph Hellwig, Sagi Grimberg, Kent Overstreet,
	Keith Busch, Elliott Robert, Ming Lei, stable

This patch applies the two introduced helpers to
figure out the 1st and last bvec, and fixes the
original way after bio splitting.

Cc: stable@vger.kernel.org
Reported-by: Sagi Grimberg <sagig@dev.mellanox.co.il>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 include/linux/blkdev.h | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index cd06a41..d7f6bca 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1394,11 +1394,16 @@ static inline bool bvec_gap_to_prev(struct request_queue *q,
 static inline bool bio_will_gap(struct request_queue *q, struct bio *prev,
 			 struct bio *next)
 {
-	if (!bio_has_data(prev) || !queue_virt_boundary(q))
-		return false;
+	if (bio_has_data(prev) && queue_virt_boundary(q)) {
+		struct bio_vec pb, nb;
+
+		bio_get_last_bvec(prev, &pb);
+		bio_get_first_bvec(next, &nb);
 
-	return __bvec_gap_to_prev(q, &prev->bi_io_vec[prev->bi_vcnt - 1],
-				  next->bi_io_vec[0].bv_offset);
+		return __bvec_gap_to_prev(q, &pb, nb.bv_offset);
+	}
+
+	return false;
 }
 
 static inline bool req_gap_back_merge(struct request *req, struct bio *bio)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2 4/4] block: merge: get the 1st and last bvec via helpers
  2016-02-26 15:40 [PATCH v2 0/4] block: fix bio_will_gap() Ming Lei
                   ` (2 preceding siblings ...)
  2016-02-26 15:40 ` [PATCH v2 3/4] block: get the 1st and last bvec via helpers Ming Lei
@ 2016-02-26 15:40 ` Ming Lei
  2016-02-28  9:57 ` [PATCH v2 0/4] block: fix bio_will_gap() Sagi Grimberg
  4 siblings, 0 replies; 6+ messages in thread
From: Ming Lei @ 2016-02-26 15:40 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel
  Cc: linux-block, Christoph Hellwig, Sagi Grimberg, Kent Overstreet,
	Keith Busch, Elliott Robert, Ming Lei

This patch applies the two introduced helpers to
figure out the 1st and last bvec.

Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 block/blk-merge.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 888a7fe..2613531 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -304,7 +304,6 @@ static int blk_phys_contig_segment(struct request_queue *q, struct bio *bio,
 				   struct bio *nxt)
 {
 	struct bio_vec end_bv = { NULL }, nxt_bv;
-	struct bvec_iter iter;
 
 	if (!blk_queue_cluster(q))
 		return 0;
@@ -316,11 +315,8 @@ static int blk_phys_contig_segment(struct request_queue *q, struct bio *bio,
 	if (!bio_has_data(bio))
 		return 1;
 
-	bio_for_each_segment(end_bv, bio, iter)
-		if (end_bv.bv_len == iter.bi_size)
-			break;
-
-	nxt_bv = bio_iovec(nxt);
+	bio_get_last_bvec(bio, &end_bv);
+	bio_get_first_bvec(nxt, &nxt_bv);
 
 	if (!BIOVEC_PHYS_MERGEABLE(&end_bv, &nxt_bv))
 		return 0;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 0/4] block: fix bio_will_gap()
  2016-02-26 15:40 [PATCH v2 0/4] block: fix bio_will_gap() Ming Lei
                   ` (3 preceding siblings ...)
  2016-02-26 15:40 ` [PATCH v2 4/4] block: merge: " Ming Lei
@ 2016-02-28  9:57 ` Sagi Grimberg
  4 siblings, 0 replies; 6+ messages in thread
From: Sagi Grimberg @ 2016-02-28  9:57 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe, linux-kernel
  Cc: linux-block, Christoph Hellwig, Kent Overstreet, Keith Busch,
	Elliott Robert

Still looks good, still want it for 4.5 :)

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2016-02-28  9:57 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-26 15:40 [PATCH v2 0/4] block: fix bio_will_gap() Ming Lei
2016-02-26 15:40 ` [PATCH v2 1/4] block: bio: introduce helpers to get the 1st and last bvec Ming Lei
2016-02-26 15:40 ` [PATCH v2 2/4] block: check virt boundary in bio_will_gap() Ming Lei
2016-02-26 15:40 ` [PATCH v2 3/4] block: get the 1st and last bvec via helpers Ming Lei
2016-02-26 15:40 ` [PATCH v2 4/4] block: merge: " Ming Lei
2016-02-28  9:57 ` [PATCH v2 0/4] block: fix bio_will_gap() Sagi Grimberg

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).