Linux-Block Archive on lore.kernel.org
 help / color / Atom feed
* fix nr_phys_segments vs iterators accounting v3
@ 2019-05-21  7:01 Christoph Hellwig
  2019-05-21  7:01 ` [PATCH 1/4] block: don't decrement nr_phys_segments for physically contigous segments Christoph Hellwig
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Christoph Hellwig @ 2019-05-21  7:01 UTC (permalink / raw)
  To: axboe; +Cc: ming.lei, linux-block

Hi all,

we have had a problem for a while where the number of segments that
the bvec iterators will iterate over don't match the value in
req->nr_phys_segments, causing problems for anyone looking at
nr_phys_segments and iterating over bvec directly instead of using
blk_rq_map_sg.  The first patch in this series fixes this by
making sure nr_phys_segments matches the actual number of segments.
Drivers using blk_rq_map_sg will still get the lower number returned
from function eventually, but the fact that we don't reduce the
value earlier will not allow some merges that we might otherwise
allow.

With that in place I also noticed that we do not properly account
segements sizes on devices with a virt_boundary, but it turns out that
segment sizes fundamentally don't make sense for such devices, as their
"segment" is a fixed size "device page", and not a variable sized
scatter/gather elements as in the block layer, so we make that fact
formal.

Once all that is sorted out it is pretty clear that there is no
good reason to have the front/back segement accounting to start
with.

Changes since v2:
  - fix some fixes tags

Changes since v1:
  - update a commit log
  - add fixes tags
  - drop the follow on patches not suitable for 5.2


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/4] block: don't decrement nr_phys_segments for physically contigous segments
  2019-05-21  7:01 fix nr_phys_segments vs iterators accounting v3 Christoph Hellwig
@ 2019-05-21  7:01 ` Christoph Hellwig
  2019-05-21  8:05   ` Ming Lei
  2019-05-21  7:01 ` [PATCH 2/4] block: force an unlimited segment size on queues with a virt boundary Christoph Hellwig
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2019-05-21  7:01 UTC (permalink / raw)
  To: axboe; +Cc: ming.lei, linux-block, Hannes Reinecke

Currently ll_merge_requests_fn, unlike all other merge functions,
reduces nr_phys_segments by one if the last segment of the previous,
and the first segment of the next segement are contigous.  While this
seems like a nice solution to avoid building smaller than possible
requests it causes a mismatch between the segments actually present
in the request and those iterated over by the bvec iterators, including
__rq_for_each_bio.  This can for example mistrigger the single segment
optimization in the nvme-pci driver, and might lead to mismatching
nr_phys_segments number when recalculating the number of request
when inserting a cloned request.

We could possibly work around this by making the bvec iterators take
the front and back segment size into account, but that would require
moving them from the bio to the bio_iter and spreading this mess
over all users of bvecs.  Or we could simply remove this optimization
under the assumption that most users already build good enough bvecs,
and that the bio merge patch never cared about this optimization
either.  The latter is what this patch does.

dff824b2aadb ("nvme-pci: optimize mapping of small single segment requests").
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
---
 block/blk-merge.c | 23 +----------------------
 1 file changed, 1 insertion(+), 22 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 21e87a714a73..80a5a0facb87 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -358,7 +358,6 @@ static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
 	unsigned front_seg_size;
 	struct bio *fbio, *bbio;
 	struct bvec_iter iter;
-	bool new_bio = false;
 
 	if (!bio)
 		return 0;
@@ -379,31 +378,12 @@ static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
 	nr_phys_segs = 0;
 	for_each_bio(bio) {
 		bio_for_each_bvec(bv, bio, iter) {
-			if (new_bio) {
-				if (seg_size + bv.bv_len
-				    > queue_max_segment_size(q))
-					goto new_segment;
-				if (!biovec_phys_mergeable(q, &bvprv, &bv))
-					goto new_segment;
-
-				seg_size += bv.bv_len;
-
-				if (nr_phys_segs == 1 && seg_size >
-						front_seg_size)
-					front_seg_size = seg_size;
-
-				continue;
-			}
-new_segment:
 			bvec_split_segs(q, &bv, &nr_phys_segs, &seg_size,
 					&front_seg_size, NULL, UINT_MAX);
-			new_bio = false;
 		}
 		bbio = bio;
-		if (likely(bio->bi_iter.bi_size)) {
+		if (likely(bio->bi_iter.bi_size))
 			bvprv = bv;
-			new_bio = true;
-		}
 	}
 
 	fbio->bi_seg_front_size = front_seg_size;
@@ -725,7 +705,6 @@ static int ll_merge_requests_fn(struct request_queue *q, struct request *req,
 			req->bio->bi_seg_front_size = seg_size;
 		if (next->nr_phys_segments == 1)
 			next->biotail->bi_seg_back_size = seg_size;
-		total_phys_segments--;
 	}
 
 	if (total_phys_segments > queue_max_segments(q))
-- 
2.20.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 2/4] block: force an unlimited segment size on queues with a virt boundary
  2019-05-21  7:01 fix nr_phys_segments vs iterators accounting v3 Christoph Hellwig
  2019-05-21  7:01 ` [PATCH 1/4] block: don't decrement nr_phys_segments for physically contigous segments Christoph Hellwig
@ 2019-05-21  7:01 ` Christoph Hellwig
  2019-05-21  7:01 ` [PATCH 3/4] block: remove the segment size check in bio_will_gap Christoph Hellwig
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2019-05-21  7:01 UTC (permalink / raw)
  To: axboe; +Cc: ming.lei, linux-block, Hannes Reinecke

We currently fail to update the front/back segment size in the bio when
deciding to allow an otherwise gappy segement to a device with a
virt boundary.  The reason why this did not cause problems is that
devices with a virt boundary fundamentally don't use segments as we
know it and thus don't care.  Make that assumption formal by forcing
an unlimited segement size in this case.

Fixes: f6970f83ef79 ("block: don't check if adjacent bvecs in one bio can be mergeable")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
---
 block/blk-settings.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 3facc41476be..2ae348c101a0 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -310,6 +310,9 @@ void blk_queue_max_segment_size(struct request_queue *q, unsigned int max_size)
 		       __func__, max_size);
 	}
 
+	/* see blk_queue_virt_boundary() for the explanation */
+	WARN_ON_ONCE(q->limits.virt_boundary_mask);
+
 	q->limits.max_segment_size = max_size;
 }
 EXPORT_SYMBOL(blk_queue_max_segment_size);
@@ -742,6 +745,14 @@ EXPORT_SYMBOL(blk_queue_segment_boundary);
 void blk_queue_virt_boundary(struct request_queue *q, unsigned long mask)
 {
 	q->limits.virt_boundary_mask = mask;
+
+	/*
+	 * Devices that require a virtual boundary do not support scatter/gather
+	 * I/O natively, but instead require a descriptor list entry for each
+	 * page (which might not be idential to the Linux PAGE_SIZE).  Because
+	 * of that they are not limited by our notion of "segment size".
+	 */
+	q->limits.max_segment_size = UINT_MAX;
 }
 EXPORT_SYMBOL(blk_queue_virt_boundary);
 
-- 
2.20.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 3/4] block: remove the segment size check in bio_will_gap
  2019-05-21  7:01 fix nr_phys_segments vs iterators accounting v3 Christoph Hellwig
  2019-05-21  7:01 ` [PATCH 1/4] block: don't decrement nr_phys_segments for physically contigous segments Christoph Hellwig
  2019-05-21  7:01 ` [PATCH 2/4] block: force an unlimited segment size on queues with a virt boundary Christoph Hellwig
@ 2019-05-21  7:01 ` Christoph Hellwig
  2019-05-21  7:01 ` [PATCH 4/4] block: remove the bi_seg_{front,back}_size fields in struct bio Christoph Hellwig
  2019-05-21 12:47 ` fix nr_phys_segments vs iterators accounting v3 Jens Axboe
  4 siblings, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2019-05-21  7:01 UTC (permalink / raw)
  To: axboe; +Cc: ming.lei, linux-block, Hannes Reinecke

We fundamentally do not have a maximum segement size for devices with a
virt boundary.  So don't bother checking it, especially given that the
existing checks didn't properly work to start with as we never fully
update the front/back segment size and miss the bi_seg_front_size that
wuld have been required for some cases.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
---
 block/blk-merge.c | 19 +------------------
 1 file changed, 1 insertion(+), 18 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 80a5a0facb87..eee2c02c50ce 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -12,23 +12,6 @@
 
 #include "blk.h"
 
-/*
- * Check if the two bvecs from two bios can be merged to one segment.  If yes,
- * no need to check gap between the two bios since the 1st bio and the 1st bvec
- * in the 2nd bio can be handled in one segment.
- */
-static inline bool bios_segs_mergeable(struct request_queue *q,
-		struct bio *prev, struct bio_vec *prev_last_bv,
-		struct bio_vec *next_first_bv)
-{
-	if (!biovec_phys_mergeable(q, prev_last_bv, next_first_bv))
-		return false;
-	if (prev->bi_seg_back_size + next_first_bv->bv_len >
-			queue_max_segment_size(q))
-		return false;
-	return true;
-}
-
 static inline bool bio_will_gap(struct request_queue *q,
 		struct request *prev_rq, struct bio *prev, struct bio *next)
 {
@@ -60,7 +43,7 @@ static inline bool bio_will_gap(struct request_queue *q,
 	 */
 	bio_get_last_bvec(prev, &pb);
 	bio_get_first_bvec(next, &nb);
-	if (bios_segs_mergeable(q, prev, &pb, &nb))
+	if (biovec_phys_mergeable(q, &pb, &nb))
 		return false;
 	return __bvec_gap_to_prev(q, &pb, nb.bv_offset);
 }
-- 
2.20.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 4/4] block: remove the bi_seg_{front,back}_size fields in struct bio
  2019-05-21  7:01 fix nr_phys_segments vs iterators accounting v3 Christoph Hellwig
                   ` (2 preceding siblings ...)
  2019-05-21  7:01 ` [PATCH 3/4] block: remove the segment size check in bio_will_gap Christoph Hellwig
@ 2019-05-21  7:01 ` Christoph Hellwig
  2019-05-21  8:06   ` Ming Lei
  2019-05-21 12:47 ` fix nr_phys_segments vs iterators accounting v3 Jens Axboe
  4 siblings, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2019-05-21  7:01 UTC (permalink / raw)
  To: axboe; +Cc: ming.lei, linux-block, Hannes Reinecke

At this point these fields aren't used for anything, so we can remove
them.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
---
 block/blk-merge.c         | 94 +++++----------------------------------
 include/linux/blk_types.h |  7 ---
 2 files changed, 12 insertions(+), 89 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index eee2c02c50ce..17713d7d98d5 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -162,8 +162,7 @@ static unsigned get_max_segment_size(struct request_queue *q,
  * variables.
  */
 static bool bvec_split_segs(struct request_queue *q, struct bio_vec *bv,
-		unsigned *nsegs, unsigned *last_seg_size,
-		unsigned *front_seg_size, unsigned *sectors, unsigned max_segs)
+		unsigned *nsegs, unsigned *sectors, unsigned max_segs)
 {
 	unsigned len = bv->bv_len;
 	unsigned total_len = 0;
@@ -185,28 +184,12 @@ static bool bvec_split_segs(struct request_queue *q, struct bio_vec *bv,
 			break;
 	}
 
-	if (!new_nsegs)
-		return !!len;
-
-	/* update front segment size */
-	if (!*nsegs) {
-		unsigned first_seg_size;
-
-		if (new_nsegs == 1)
-			first_seg_size = get_max_segment_size(q, bv->bv_offset);
-		else
-			first_seg_size = queue_max_segment_size(q);
-
-		if (*front_seg_size < first_seg_size)
-			*front_seg_size = first_seg_size;
+	if (new_nsegs) {
+		*nsegs += new_nsegs;
+		if (sectors)
+			*sectors += total_len >> 9;
 	}
 
-	/* update other varibles */
-	*last_seg_size = seg_size;
-	*nsegs += new_nsegs;
-	if (sectors)
-		*sectors += total_len >> 9;
-
 	/* split in the middle of the bvec if len != 0 */
 	return !!len;
 }
@@ -218,8 +201,7 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
 {
 	struct bio_vec bv, bvprv, *bvprvp = NULL;
 	struct bvec_iter iter;
-	unsigned seg_size = 0, nsegs = 0, sectors = 0;
-	unsigned front_seg_size = bio->bi_seg_front_size;
+	unsigned nsegs = 0, sectors = 0;
 	bool do_split = true;
 	struct bio *new = NULL;
 	const unsigned max_sectors = get_max_io_size(q, bio);
@@ -243,8 +225,6 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
 				/* split in the middle of bvec */
 				bv.bv_len = (max_sectors - sectors) << 9;
 				bvec_split_segs(q, &bv, &nsegs,
-						&seg_size,
-						&front_seg_size,
 						&sectors, max_segs);
 			}
 			goto split;
@@ -258,12 +238,9 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
 
 		if (bv.bv_offset + bv.bv_len <= PAGE_SIZE) {
 			nsegs++;
-			seg_size = bv.bv_len;
 			sectors += bv.bv_len >> 9;
-			if (nsegs == 1 && seg_size > front_seg_size)
-				front_seg_size = seg_size;
-		} else if (bvec_split_segs(q, &bv, &nsegs, &seg_size,
-				    &front_seg_size, &sectors, max_segs)) {
+		} else if (bvec_split_segs(q, &bv, &nsegs, &sectors,
+				max_segs)) {
 			goto split;
 		}
 	}
@@ -278,10 +255,6 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
 			bio = new;
 	}
 
-	bio->bi_seg_front_size = front_seg_size;
-	if (seg_size > bio->bi_seg_back_size)
-		bio->bi_seg_back_size = seg_size;
-
 	return do_split ? new : NULL;
 }
 
@@ -336,17 +309,13 @@ EXPORT_SYMBOL(blk_queue_split);
 static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
 					     struct bio *bio)
 {
-	struct bio_vec uninitialized_var(bv), bvprv = { NULL };
-	unsigned int seg_size, nr_phys_segs;
-	unsigned front_seg_size;
-	struct bio *fbio, *bbio;
+	unsigned int nr_phys_segs = 0;
 	struct bvec_iter iter;
+	struct bio_vec bv;
 
 	if (!bio)
 		return 0;
 
-	front_seg_size = bio->bi_seg_front_size;
-
 	switch (bio_op(bio)) {
 	case REQ_OP_DISCARD:
 	case REQ_OP_SECURE_ERASE:
@@ -356,23 +325,11 @@ static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
 		return 1;
 	}
 
-	fbio = bio;
-	seg_size = 0;
-	nr_phys_segs = 0;
 	for_each_bio(bio) {
-		bio_for_each_bvec(bv, bio, iter) {
-			bvec_split_segs(q, &bv, &nr_phys_segs, &seg_size,
-					&front_seg_size, NULL, UINT_MAX);
-		}
-		bbio = bio;
-		if (likely(bio->bi_iter.bi_size))
-			bvprv = bv;
+		bio_for_each_bvec(bv, bio, iter)
+			bvec_split_segs(q, &bv, &nr_phys_segs, NULL, UINT_MAX);
 	}
 
-	fbio->bi_seg_front_size = front_seg_size;
-	if (seg_size > bbio->bi_seg_back_size)
-		bbio->bi_seg_back_size = seg_size;
-
 	return nr_phys_segs;
 }
 
@@ -392,24 +349,6 @@ void blk_recount_segments(struct request_queue *q, struct bio *bio)
 	bio_set_flag(bio, BIO_SEG_VALID);
 }
 
-static int blk_phys_contig_segment(struct request_queue *q, struct bio *bio,
-				   struct bio *nxt)
-{
-	struct bio_vec end_bv = { NULL }, nxt_bv;
-
-	if (bio->bi_seg_back_size + nxt->bi_seg_front_size >
-	    queue_max_segment_size(q))
-		return 0;
-
-	if (!bio_has_data(bio))
-		return 1;
-
-	bio_get_last_bvec(bio, &end_bv);
-	bio_get_first_bvec(nxt, &nxt_bv);
-
-	return biovec_phys_mergeable(q, &end_bv, &nxt_bv);
-}
-
 static inline struct scatterlist *blk_next_sg(struct scatterlist **sg,
 		struct scatterlist *sglist)
 {
@@ -669,8 +608,6 @@ static int ll_merge_requests_fn(struct request_queue *q, struct request *req,
 				struct request *next)
 {
 	int total_phys_segments;
-	unsigned int seg_size =
-		req->biotail->bi_seg_back_size + next->bio->bi_seg_front_size;
 
 	if (req_gap_back_merge(req, next->bio))
 		return 0;
@@ -683,13 +620,6 @@ static int ll_merge_requests_fn(struct request_queue *q, struct request *req,
 		return 0;
 
 	total_phys_segments = req->nr_phys_segments + next->nr_phys_segments;
-	if (blk_phys_contig_segment(q, req->biotail, next->bio)) {
-		if (req->nr_phys_segments == 1)
-			req->bio->bi_seg_front_size = seg_size;
-		if (next->nr_phys_segments == 1)
-			next->biotail->bi_seg_back_size = seg_size;
-	}
-
 	if (total_phys_segments > queue_max_segments(q))
 		return 0;
 
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index be418275763c..95202f80676c 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -159,13 +159,6 @@ struct bio {
 	 */
 	unsigned int		bi_phys_segments;
 
-	/*
-	 * To keep track of the max segment size, we account for the
-	 * sizes of the first and last mergeable segments in this bio.
-	 */
-	unsigned int		bi_seg_front_size;
-	unsigned int		bi_seg_back_size;
-
 	struct bvec_iter	bi_iter;
 
 	atomic_t		__bi_remaining;
-- 
2.20.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/4] block: don't decrement nr_phys_segments for physically contigous segments
  2019-05-21  7:01 ` [PATCH 1/4] block: don't decrement nr_phys_segments for physically contigous segments Christoph Hellwig
@ 2019-05-21  8:05   ` Ming Lei
  0 siblings, 0 replies; 10+ messages in thread
From: Ming Lei @ 2019-05-21  8:05 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: axboe, linux-block, Hannes Reinecke

On Tue, May 21, 2019 at 09:01:40AM +0200, Christoph Hellwig wrote:
> Currently ll_merge_requests_fn, unlike all other merge functions,
> reduces nr_phys_segments by one if the last segment of the previous,
> and the first segment of the next segement are contigous.  While this
> seems like a nice solution to avoid building smaller than possible
> requests it causes a mismatch between the segments actually present
> in the request and those iterated over by the bvec iterators, including
> __rq_for_each_bio.  This can for example mistrigger the single segment
> optimization in the nvme-pci driver, and might lead to mismatching
> nr_phys_segments number when recalculating the number of request
> when inserting a cloned request.
> 
> We could possibly work around this by making the bvec iterators take
> the front and back segment size into account, but that would require
> moving them from the bio to the bio_iter and spreading this mess
> over all users of bvecs.  Or we could simply remove this optimization
> under the assumption that most users already build good enough bvecs,
> and that the bio merge patch never cared about this optimization
> either.  The latter is what this patch does.
> 
> dff824b2aadb ("nvme-pci: optimize mapping of small single segment requests").
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Hannes Reinecke <hare@suse.com>
> ---
>  block/blk-merge.c | 23 +----------------------
>  1 file changed, 1 insertion(+), 22 deletions(-)
> 
> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index 21e87a714a73..80a5a0facb87 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -358,7 +358,6 @@ static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
>  	unsigned front_seg_size;
>  	struct bio *fbio, *bbio;
>  	struct bvec_iter iter;
> -	bool new_bio = false;
>  
>  	if (!bio)
>  		return 0;
> @@ -379,31 +378,12 @@ static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
>  	nr_phys_segs = 0;
>  	for_each_bio(bio) {
>  		bio_for_each_bvec(bv, bio, iter) {
> -			if (new_bio) {
> -				if (seg_size + bv.bv_len
> -				    > queue_max_segment_size(q))
> -					goto new_segment;
> -				if (!biovec_phys_mergeable(q, &bvprv, &bv))
> -					goto new_segment;
> -
> -				seg_size += bv.bv_len;
> -
> -				if (nr_phys_segs == 1 && seg_size >
> -						front_seg_size)
> -					front_seg_size = seg_size;
> -
> -				continue;
> -			}
> -new_segment:
>  			bvec_split_segs(q, &bv, &nr_phys_segs, &seg_size,
>  					&front_seg_size, NULL, UINT_MAX);
> -			new_bio = false;
>  		}
>  		bbio = bio;
> -		if (likely(bio->bi_iter.bi_size)) {
> +		if (likely(bio->bi_iter.bi_size))
>  			bvprv = bv;
> -			new_bio = true;
> -		}
>  	}
>  
>  	fbio->bi_seg_front_size = front_seg_size;
> @@ -725,7 +705,6 @@ static int ll_merge_requests_fn(struct request_queue *q, struct request *req,
>  			req->bio->bi_seg_front_size = seg_size;
>  		if (next->nr_phys_segments == 1)
>  			next->biotail->bi_seg_back_size = seg_size;
> -		total_phys_segments--;
>  	}
>  
>  	if (total_phys_segments > queue_max_segments(q))
> -- 
> 2.20.1
> 

Reviewed-by: Ming Lei <ming.lei@redhat.com>

Thanks,
Ming

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 4/4] block: remove the bi_seg_{front,back}_size fields in struct bio
  2019-05-21  7:01 ` [PATCH 4/4] block: remove the bi_seg_{front,back}_size fields in struct bio Christoph Hellwig
@ 2019-05-21  8:06   ` Ming Lei
  0 siblings, 0 replies; 10+ messages in thread
From: Ming Lei @ 2019-05-21  8:06 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: axboe, linux-block, Hannes Reinecke

On Tue, May 21, 2019 at 09:01:43AM +0200, Christoph Hellwig wrote:
> At this point these fields aren't used for anything, so we can remove
> them.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Hannes Reinecke <hare@suse.com>
> ---
>  block/blk-merge.c         | 94 +++++----------------------------------
>  include/linux/blk_types.h |  7 ---
>  2 files changed, 12 insertions(+), 89 deletions(-)
> 
> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index eee2c02c50ce..17713d7d98d5 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -162,8 +162,7 @@ static unsigned get_max_segment_size(struct request_queue *q,
>   * variables.
>   */
>  static bool bvec_split_segs(struct request_queue *q, struct bio_vec *bv,
> -		unsigned *nsegs, unsigned *last_seg_size,
> -		unsigned *front_seg_size, unsigned *sectors, unsigned max_segs)
> +		unsigned *nsegs, unsigned *sectors, unsigned max_segs)
>  {
>  	unsigned len = bv->bv_len;
>  	unsigned total_len = 0;
> @@ -185,28 +184,12 @@ static bool bvec_split_segs(struct request_queue *q, struct bio_vec *bv,
>  			break;
>  	}
>  
> -	if (!new_nsegs)
> -		return !!len;
> -
> -	/* update front segment size */
> -	if (!*nsegs) {
> -		unsigned first_seg_size;
> -
> -		if (new_nsegs == 1)
> -			first_seg_size = get_max_segment_size(q, bv->bv_offset);
> -		else
> -			first_seg_size = queue_max_segment_size(q);
> -
> -		if (*front_seg_size < first_seg_size)
> -			*front_seg_size = first_seg_size;
> +	if (new_nsegs) {
> +		*nsegs += new_nsegs;
> +		if (sectors)
> +			*sectors += total_len >> 9;
>  	}
>  
> -	/* update other varibles */
> -	*last_seg_size = seg_size;
> -	*nsegs += new_nsegs;
> -	if (sectors)
> -		*sectors += total_len >> 9;
> -
>  	/* split in the middle of the bvec if len != 0 */
>  	return !!len;
>  }
> @@ -218,8 +201,7 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
>  {
>  	struct bio_vec bv, bvprv, *bvprvp = NULL;
>  	struct bvec_iter iter;
> -	unsigned seg_size = 0, nsegs = 0, sectors = 0;
> -	unsigned front_seg_size = bio->bi_seg_front_size;
> +	unsigned nsegs = 0, sectors = 0;
>  	bool do_split = true;
>  	struct bio *new = NULL;
>  	const unsigned max_sectors = get_max_io_size(q, bio);
> @@ -243,8 +225,6 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
>  				/* split in the middle of bvec */
>  				bv.bv_len = (max_sectors - sectors) << 9;
>  				bvec_split_segs(q, &bv, &nsegs,
> -						&seg_size,
> -						&front_seg_size,
>  						&sectors, max_segs);
>  			}
>  			goto split;
> @@ -258,12 +238,9 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
>  
>  		if (bv.bv_offset + bv.bv_len <= PAGE_SIZE) {
>  			nsegs++;
> -			seg_size = bv.bv_len;
>  			sectors += bv.bv_len >> 9;
> -			if (nsegs == 1 && seg_size > front_seg_size)
> -				front_seg_size = seg_size;
> -		} else if (bvec_split_segs(q, &bv, &nsegs, &seg_size,
> -				    &front_seg_size, &sectors, max_segs)) {
> +		} else if (bvec_split_segs(q, &bv, &nsegs, &sectors,
> +				max_segs)) {
>  			goto split;
>  		}
>  	}
> @@ -278,10 +255,6 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
>  			bio = new;
>  	}
>  
> -	bio->bi_seg_front_size = front_seg_size;
> -	if (seg_size > bio->bi_seg_back_size)
> -		bio->bi_seg_back_size = seg_size;
> -
>  	return do_split ? new : NULL;
>  }
>  
> @@ -336,17 +309,13 @@ EXPORT_SYMBOL(blk_queue_split);
>  static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
>  					     struct bio *bio)
>  {
> -	struct bio_vec uninitialized_var(bv), bvprv = { NULL };
> -	unsigned int seg_size, nr_phys_segs;
> -	unsigned front_seg_size;
> -	struct bio *fbio, *bbio;
> +	unsigned int nr_phys_segs = 0;
>  	struct bvec_iter iter;
> +	struct bio_vec bv;
>  
>  	if (!bio)
>  		return 0;
>  
> -	front_seg_size = bio->bi_seg_front_size;
> -
>  	switch (bio_op(bio)) {
>  	case REQ_OP_DISCARD:
>  	case REQ_OP_SECURE_ERASE:
> @@ -356,23 +325,11 @@ static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
>  		return 1;
>  	}
>  
> -	fbio = bio;
> -	seg_size = 0;
> -	nr_phys_segs = 0;
>  	for_each_bio(bio) {
> -		bio_for_each_bvec(bv, bio, iter) {
> -			bvec_split_segs(q, &bv, &nr_phys_segs, &seg_size,
> -					&front_seg_size, NULL, UINT_MAX);
> -		}
> -		bbio = bio;
> -		if (likely(bio->bi_iter.bi_size))
> -			bvprv = bv;
> +		bio_for_each_bvec(bv, bio, iter)
> +			bvec_split_segs(q, &bv, &nr_phys_segs, NULL, UINT_MAX);
>  	}
>  
> -	fbio->bi_seg_front_size = front_seg_size;
> -	if (seg_size > bbio->bi_seg_back_size)
> -		bbio->bi_seg_back_size = seg_size;
> -
>  	return nr_phys_segs;
>  }
>  
> @@ -392,24 +349,6 @@ void blk_recount_segments(struct request_queue *q, struct bio *bio)
>  	bio_set_flag(bio, BIO_SEG_VALID);
>  }
>  
> -static int blk_phys_contig_segment(struct request_queue *q, struct bio *bio,
> -				   struct bio *nxt)
> -{
> -	struct bio_vec end_bv = { NULL }, nxt_bv;
> -
> -	if (bio->bi_seg_back_size + nxt->bi_seg_front_size >
> -	    queue_max_segment_size(q))
> -		return 0;
> -
> -	if (!bio_has_data(bio))
> -		return 1;
> -
> -	bio_get_last_bvec(bio, &end_bv);
> -	bio_get_first_bvec(nxt, &nxt_bv);
> -
> -	return biovec_phys_mergeable(q, &end_bv, &nxt_bv);
> -}
> -
>  static inline struct scatterlist *blk_next_sg(struct scatterlist **sg,
>  		struct scatterlist *sglist)
>  {
> @@ -669,8 +608,6 @@ static int ll_merge_requests_fn(struct request_queue *q, struct request *req,
>  				struct request *next)
>  {
>  	int total_phys_segments;
> -	unsigned int seg_size =
> -		req->biotail->bi_seg_back_size + next->bio->bi_seg_front_size;
>  
>  	if (req_gap_back_merge(req, next->bio))
>  		return 0;
> @@ -683,13 +620,6 @@ static int ll_merge_requests_fn(struct request_queue *q, struct request *req,
>  		return 0;
>  
>  	total_phys_segments = req->nr_phys_segments + next->nr_phys_segments;
> -	if (blk_phys_contig_segment(q, req->biotail, next->bio)) {
> -		if (req->nr_phys_segments == 1)
> -			req->bio->bi_seg_front_size = seg_size;
> -		if (next->nr_phys_segments == 1)
> -			next->biotail->bi_seg_back_size = seg_size;
> -	}
> -
>  	if (total_phys_segments > queue_max_segments(q))
>  		return 0;
>  
> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
> index be418275763c..95202f80676c 100644
> --- a/include/linux/blk_types.h
> +++ b/include/linux/blk_types.h
> @@ -159,13 +159,6 @@ struct bio {
>  	 */
>  	unsigned int		bi_phys_segments;
>  
> -	/*
> -	 * To keep track of the max segment size, we account for the
> -	 * sizes of the first and last mergeable segments in this bio.
> -	 */
> -	unsigned int		bi_seg_front_size;
> -	unsigned int		bi_seg_back_size;
> -
>  	struct bvec_iter	bi_iter;
>  
>  	atomic_t		__bi_remaining;
> -- 
> 2.20.1
> 

Reviewed-by: Ming Lei <ming.lei@redhat.com>

Thanks,
Ming

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: fix nr_phys_segments vs iterators accounting v3
  2019-05-21  7:01 fix nr_phys_segments vs iterators accounting v3 Christoph Hellwig
                   ` (3 preceding siblings ...)
  2019-05-21  7:01 ` [PATCH 4/4] block: remove the bi_seg_{front,back}_size fields in struct bio Christoph Hellwig
@ 2019-05-21 12:47 ` Jens Axboe
  4 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2019-05-21 12:47 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: ming.lei, linux-block

On 5/21/19 1:01 AM, Christoph Hellwig wrote:
> Hi all,
> 
> we have had a problem for a while where the number of segments that
> the bvec iterators will iterate over don't match the value in
> req->nr_phys_segments, causing problems for anyone looking at
> nr_phys_segments and iterating over bvec directly instead of using
> blk_rq_map_sg.  The first patch in this series fixes this by
> making sure nr_phys_segments matches the actual number of segments.
> Drivers using blk_rq_map_sg will still get the lower number returned
> from function eventually, but the fact that we don't reduce the
> value earlier will not allow some merges that we might otherwise
> allow.
> 
> With that in place I also noticed that we do not properly account
> segements sizes on devices with a virt_boundary, but it turns out that
> segment sizes fundamentally don't make sense for such devices, as their
> "segment" is a fixed size "device page", and not a variable sized
> scatter/gather elements as in the block layer, so we make that fact
> formal.
> 
> Once all that is sorted out it is pretty clear that there is no
> good reason to have the front/back segement accounting to start
> with.

Applied, thanks.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 4/4] block: remove the bi_seg_{front,back}_size fields in struct bio
  2019-05-16  8:40 ` [PATCH 4/4] block: remove the bi_seg_{front,back}_size fields in struct bio Christoph Hellwig
@ 2019-05-16  8:50   ` Hannes Reinecke
  0 siblings, 0 replies; 10+ messages in thread
From: Hannes Reinecke @ 2019-05-16  8:50 UTC (permalink / raw)
  To: Christoph Hellwig, axboe; +Cc: ming.lei, linux-block

On 5/16/19 10:40 AM, Christoph Hellwig wrote:
> At this point these fields aren't used for anything, so we can remove
> them.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   block/blk-merge.c         | 94 +++++----------------------------------
>   include/linux/blk_types.h |  7 ---
>   2 files changed, 12 insertions(+), 89 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 4/4] block: remove the bi_seg_{front,back}_size fields in struct bio
  2019-05-16  8:40 fix nr_phys_segments vs iterators accounting v2 Christoph Hellwig
@ 2019-05-16  8:40 ` Christoph Hellwig
  2019-05-16  8:50   ` Hannes Reinecke
  0 siblings, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2019-05-16  8:40 UTC (permalink / raw)
  To: axboe; +Cc: ming.lei, linux-block

At this point these fields aren't used for anything, so we can remove
them.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-merge.c         | 94 +++++----------------------------------
 include/linux/blk_types.h |  7 ---
 2 files changed, 12 insertions(+), 89 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index eee2c02c50ce..17713d7d98d5 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -162,8 +162,7 @@ static unsigned get_max_segment_size(struct request_queue *q,
  * variables.
  */
 static bool bvec_split_segs(struct request_queue *q, struct bio_vec *bv,
-		unsigned *nsegs, unsigned *last_seg_size,
-		unsigned *front_seg_size, unsigned *sectors, unsigned max_segs)
+		unsigned *nsegs, unsigned *sectors, unsigned max_segs)
 {
 	unsigned len = bv->bv_len;
 	unsigned total_len = 0;
@@ -185,28 +184,12 @@ static bool bvec_split_segs(struct request_queue *q, struct bio_vec *bv,
 			break;
 	}
 
-	if (!new_nsegs)
-		return !!len;
-
-	/* update front segment size */
-	if (!*nsegs) {
-		unsigned first_seg_size;
-
-		if (new_nsegs == 1)
-			first_seg_size = get_max_segment_size(q, bv->bv_offset);
-		else
-			first_seg_size = queue_max_segment_size(q);
-
-		if (*front_seg_size < first_seg_size)
-			*front_seg_size = first_seg_size;
+	if (new_nsegs) {
+		*nsegs += new_nsegs;
+		if (sectors)
+			*sectors += total_len >> 9;
 	}
 
-	/* update other varibles */
-	*last_seg_size = seg_size;
-	*nsegs += new_nsegs;
-	if (sectors)
-		*sectors += total_len >> 9;
-
 	/* split in the middle of the bvec if len != 0 */
 	return !!len;
 }
@@ -218,8 +201,7 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
 {
 	struct bio_vec bv, bvprv, *bvprvp = NULL;
 	struct bvec_iter iter;
-	unsigned seg_size = 0, nsegs = 0, sectors = 0;
-	unsigned front_seg_size = bio->bi_seg_front_size;
+	unsigned nsegs = 0, sectors = 0;
 	bool do_split = true;
 	struct bio *new = NULL;
 	const unsigned max_sectors = get_max_io_size(q, bio);
@@ -243,8 +225,6 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
 				/* split in the middle of bvec */
 				bv.bv_len = (max_sectors - sectors) << 9;
 				bvec_split_segs(q, &bv, &nsegs,
-						&seg_size,
-						&front_seg_size,
 						&sectors, max_segs);
 			}
 			goto split;
@@ -258,12 +238,9 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
 
 		if (bv.bv_offset + bv.bv_len <= PAGE_SIZE) {
 			nsegs++;
-			seg_size = bv.bv_len;
 			sectors += bv.bv_len >> 9;
-			if (nsegs == 1 && seg_size > front_seg_size)
-				front_seg_size = seg_size;
-		} else if (bvec_split_segs(q, &bv, &nsegs, &seg_size,
-				    &front_seg_size, &sectors, max_segs)) {
+		} else if (bvec_split_segs(q, &bv, &nsegs, &sectors,
+				max_segs)) {
 			goto split;
 		}
 	}
@@ -278,10 +255,6 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
 			bio = new;
 	}
 
-	bio->bi_seg_front_size = front_seg_size;
-	if (seg_size > bio->bi_seg_back_size)
-		bio->bi_seg_back_size = seg_size;
-
 	return do_split ? new : NULL;
 }
 
@@ -336,17 +309,13 @@ EXPORT_SYMBOL(blk_queue_split);
 static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
 					     struct bio *bio)
 {
-	struct bio_vec uninitialized_var(bv), bvprv = { NULL };
-	unsigned int seg_size, nr_phys_segs;
-	unsigned front_seg_size;
-	struct bio *fbio, *bbio;
+	unsigned int nr_phys_segs = 0;
 	struct bvec_iter iter;
+	struct bio_vec bv;
 
 	if (!bio)
 		return 0;
 
-	front_seg_size = bio->bi_seg_front_size;
-
 	switch (bio_op(bio)) {
 	case REQ_OP_DISCARD:
 	case REQ_OP_SECURE_ERASE:
@@ -356,23 +325,11 @@ static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
 		return 1;
 	}
 
-	fbio = bio;
-	seg_size = 0;
-	nr_phys_segs = 0;
 	for_each_bio(bio) {
-		bio_for_each_bvec(bv, bio, iter) {
-			bvec_split_segs(q, &bv, &nr_phys_segs, &seg_size,
-					&front_seg_size, NULL, UINT_MAX);
-		}
-		bbio = bio;
-		if (likely(bio->bi_iter.bi_size))
-			bvprv = bv;
+		bio_for_each_bvec(bv, bio, iter)
+			bvec_split_segs(q, &bv, &nr_phys_segs, NULL, UINT_MAX);
 	}
 
-	fbio->bi_seg_front_size = front_seg_size;
-	if (seg_size > bbio->bi_seg_back_size)
-		bbio->bi_seg_back_size = seg_size;
-
 	return nr_phys_segs;
 }
 
@@ -392,24 +349,6 @@ void blk_recount_segments(struct request_queue *q, struct bio *bio)
 	bio_set_flag(bio, BIO_SEG_VALID);
 }
 
-static int blk_phys_contig_segment(struct request_queue *q, struct bio *bio,
-				   struct bio *nxt)
-{
-	struct bio_vec end_bv = { NULL }, nxt_bv;
-
-	if (bio->bi_seg_back_size + nxt->bi_seg_front_size >
-	    queue_max_segment_size(q))
-		return 0;
-
-	if (!bio_has_data(bio))
-		return 1;
-
-	bio_get_last_bvec(bio, &end_bv);
-	bio_get_first_bvec(nxt, &nxt_bv);
-
-	return biovec_phys_mergeable(q, &end_bv, &nxt_bv);
-}
-
 static inline struct scatterlist *blk_next_sg(struct scatterlist **sg,
 		struct scatterlist *sglist)
 {
@@ -669,8 +608,6 @@ static int ll_merge_requests_fn(struct request_queue *q, struct request *req,
 				struct request *next)
 {
 	int total_phys_segments;
-	unsigned int seg_size =
-		req->biotail->bi_seg_back_size + next->bio->bi_seg_front_size;
 
 	if (req_gap_back_merge(req, next->bio))
 		return 0;
@@ -683,13 +620,6 @@ static int ll_merge_requests_fn(struct request_queue *q, struct request *req,
 		return 0;
 
 	total_phys_segments = req->nr_phys_segments + next->nr_phys_segments;
-	if (blk_phys_contig_segment(q, req->biotail, next->bio)) {
-		if (req->nr_phys_segments == 1)
-			req->bio->bi_seg_front_size = seg_size;
-		if (next->nr_phys_segments == 1)
-			next->biotail->bi_seg_back_size = seg_size;
-	}
-
 	if (total_phys_segments > queue_max_segments(q))
 		return 0;
 
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index be418275763c..95202f80676c 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -159,13 +159,6 @@ struct bio {
 	 */
 	unsigned int		bi_phys_segments;
 
-	/*
-	 * To keep track of the max segment size, we account for the
-	 * sizes of the first and last mergeable segments in this bio.
-	 */
-	unsigned int		bi_seg_front_size;
-	unsigned int		bi_seg_back_size;
-
 	struct bvec_iter	bi_iter;
 
 	atomic_t		__bi_remaining;
-- 
2.20.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, back to index

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-21  7:01 fix nr_phys_segments vs iterators accounting v3 Christoph Hellwig
2019-05-21  7:01 ` [PATCH 1/4] block: don't decrement nr_phys_segments for physically contigous segments Christoph Hellwig
2019-05-21  8:05   ` Ming Lei
2019-05-21  7:01 ` [PATCH 2/4] block: force an unlimited segment size on queues with a virt boundary Christoph Hellwig
2019-05-21  7:01 ` [PATCH 3/4] block: remove the segment size check in bio_will_gap Christoph Hellwig
2019-05-21  7:01 ` [PATCH 4/4] block: remove the bi_seg_{front,back}_size fields in struct bio Christoph Hellwig
2019-05-21  8:06   ` Ming Lei
2019-05-21 12:47 ` fix nr_phys_segments vs iterators accounting v3 Jens Axboe
  -- strict thread matches above, loose matches on Subject: below --
2019-05-16  8:40 fix nr_phys_segments vs iterators accounting v2 Christoph Hellwig
2019-05-16  8:40 ` [PATCH 4/4] block: remove the bi_seg_{front,back}_size fields in struct bio Christoph Hellwig
2019-05-16  8:50   ` Hannes Reinecke

Linux-Block Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-block/0 linux-block/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-block linux-block/ https://lore.kernel.org/linux-block \
		linux-block@vger.kernel.org linux-block@archiver.kernel.org
	public-inbox-index linux-block


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-block


AGPL code for this site: git clone https://public-inbox.org/ public-inbox