* [RFC 0/2] optimise split bio
@ 2021-01-05 19:43 Pavel Begunkov
  2021-01-05 19:43 ` [RFC 1/2] block: add a function for *segment_split fast path Pavel Begunkov
  2021-01-05 19:43 ` [RFC 2/2] block: add a fast path for seg split of large bio Pavel Begunkov
  0 siblings, 2 replies; 10+ messages in thread
From: Pavel Begunkov @ 2021-01-05 19:43 UTC
  To: Jens Axboe, linux-block; +Cc: Ming Lei, Christoph Hellwig

With the no-copy bvec patches applied and running against nullb,
blk_bio_segment_split() eats >7% of CPU time. This series adds yet
another fast path for it.

        8K              16K             32K             64K
before: 932             904             868             788
after:  934             919             902             862

I'd appreciate it if anyone knows off the bat typical queue_max_segments,
etc. numbers for NVMe.

Pavel Begunkov (2):
  block: add a function for *segment_split fast path
  block: add a fast path for seg split of large bio

 block/blk-merge.c | 107 +++++++++++++++++++++++++++++-----------------
 1 file changed, 68 insertions(+), 39 deletions(-)

-- 
2.24.0



* [RFC 1/2] block: add a function for *segment_split fast path
  2021-01-05 19:43 [RFC 0/2] optimise split bio Pavel Begunkov
@ 2021-01-05 19:43 ` Pavel Begunkov
  2021-01-05 19:43 ` [RFC 2/2] block: add a fast path for seg split of large bio Pavel Begunkov
  1 sibling, 0 replies; 10+ messages in thread
From: Pavel Begunkov @ 2021-01-05 19:43 UTC
  To: Jens Axboe, linux-block; +Cc: Ming Lei, Christoph Hellwig

Don't keep blk_bio_segment_split()'s fast path hand-coded in
__blk_queue_split(); extract it into a function. It gets inlined
perfectly well.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 block/blk-merge.c | 84 ++++++++++++++++++++++++++---------------------
 1 file changed, 46 insertions(+), 38 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 808768f6b174..84b9635b5d57 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -223,29 +223,10 @@ static bool bvec_split_segs(const struct request_queue *q,
 	return len > 0 || bv->bv_len > max_len;
 }
 
-/**
- * blk_bio_segment_split - split a bio in two bios
- * @q:    [in] request queue pointer
- * @bio:  [in] bio to be split
- * @bs:	  [in] bio set to allocate the clone from
- * @segs: [out] number of segments in the bio with the first half of the sectors
- *
- * Clone @bio, update the bi_iter of the clone to represent the first sectors
- * of @bio and update @bio->bi_iter to represent the remaining sectors. The
- * following is guaranteed for the cloned bio:
- * - That it has at most get_max_io_size(@q, @bio) sectors.
- * - That it has at most queue_max_segments(@q) segments.
- *
- * Except for discard requests the cloned bio will point at the bi_io_vec of
- * the original bio. It is the responsibility of the caller to ensure that the
- * original bio is not freed before the cloned bio. The caller is also
- * responsible for ensuring that @bs is only destroyed after processing of the
- * split bio has finished.
- */
-static struct bio *blk_bio_segment_split(struct request_queue *q,
-					 struct bio *bio,
-					 struct bio_set *bs,
-					 unsigned *segs)
+static struct bio *__blk_bio_segment_split(struct request_queue *q,
+					   struct bio *bio,
+					   struct bio_set *bs,
+					   unsigned *segs)
 {
 	struct bio_vec bv, bvprv, *bvprvp = NULL;
 	struct bvec_iter iter;
@@ -290,6 +271,48 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
 	return bio_split(bio, sectors, GFP_NOIO, bs);
 }
 
+/**
+ * blk_bio_segment_split - split a bio in two bios
+ * @q:    [in] request queue pointer
+ * @bio:  [in] bio to be split
+ * @bs:	  [in] bio set to allocate the clone from
+ * @segs: [out] number of segments in the bio with the first half of the sectors
+ *
+ * Clone @bio, update the bi_iter of the clone to represent the first sectors
+ * of @bio and update @bio->bi_iter to represent the remaining sectors. The
+ * following is guaranteed for the cloned bio:
+ * - That it has at most get_max_io_size(@q, @bio) sectors.
+ * - That it has at most queue_max_segments(@q) segments.
+ *
+ * Except for discard requests the cloned bio will point at the bi_io_vec of
+ * the original bio. It is the responsibility of the caller to ensure that the
+ * original bio is not freed before the cloned bio. The caller is also
+ * responsible for ensuring that @bs is only destroyed after processing of the
+ * split bio has finished.
+ */
+static inline struct bio *blk_bio_segment_split(struct request_queue *q,
+						struct bio *bio,
+						struct bio_set *bs,
+						unsigned *nr_segs)
+{
+	/*
+	 * All drivers must accept single-segment bios that are <=
+	 * PAGE_SIZE.  This is a quick and dirty check that relies on
+	 * the fact that bi_io_vec[0] is always valid if a bio has data.
+	 * The check might lead to occasional false negatives when bios
+	 * are cloned, but compared to the performance impact of cloned
+	 * bios themselves the loop below doesn't matter anyway.
+	 */
+	if (!q->limits.chunk_sectors && bio->bi_vcnt == 1 &&
+	    (bio->bi_io_vec[0].bv_len +
+	     bio->bi_io_vec[0].bv_offset) <= PAGE_SIZE) {
+		*nr_segs = 1;
+		return NULL;
+	}
+
+	return __blk_bio_segment_split(q, bio, bs, nr_segs);
+}
+
 /**
  * __blk_queue_split - split a bio and submit the second half
  * @bio:     [in, out] bio to be split
@@ -322,21 +345,6 @@ void __blk_queue_split(struct bio **bio, unsigned int *nr_segs)
 				nr_segs);
 		break;
 	default:
-		/*
-		 * All drivers must accept single-segments bios that are <=
-		 * PAGE_SIZE.  This is a quick and dirty check that relies on
-		 * the fact that bi_io_vec[0] is always valid if a bio has data.
-		 * The check might lead to occasional false negatives when bios
-		 * are cloned, but compared to the performance impact of cloned
-		 * bios themselves the loop below doesn't matter anyway.
-		 */
-		if (!q->limits.chunk_sectors &&
-		    (*bio)->bi_vcnt == 1 &&
-		    ((*bio)->bi_io_vec[0].bv_len +
-		     (*bio)->bi_io_vec[0].bv_offset) <= PAGE_SIZE) {
-			*nr_segs = 1;
-			break;
-		}
 		split = blk_bio_segment_split(q, *bio, &q->bio_split, nr_segs);
 		break;
 	}
-- 
2.24.0



* [RFC 2/2] block: add a fast path for seg split of large bio
  2021-01-05 19:43 [RFC 0/2] optimise split bio Pavel Begunkov
  2021-01-05 19:43 ` [RFC 1/2] block: add a function for *segment_split fast path Pavel Begunkov
@ 2021-01-05 19:43 ` Pavel Begunkov
  2021-01-05 20:36   ` Pavel Begunkov
                     ` (2 more replies)
  1 sibling, 3 replies; 10+ messages in thread
From: Pavel Begunkov @ 2021-01-05 19:43 UTC
  To: Jens Axboe, linux-block; +Cc: Ming Lei, Christoph Hellwig

blk_bio_segment_split() is very heavy, but the current fast path covers
only single-segment bios under PAGE_SIZE. Add another fast path that
estimates an upper bound on the number of sectors a bio can contain.

One restricting factor here is queue_max_segment_size(), which is
compared against the full iter size to avoid digging into the bvecs. By
default it's 64KB, so the path only covers requests under 64KB, but for
those falling under the conditions it's much faster.
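
As a hedged illustration of the bound (the numbers here are made up): a
bio with 3 segments covering 12KB in total gives

	max_sectors = 3 * 2 + (12288 >> 9) = 6 + 24 = 30

so such a bio skips the split entirely whenever get_max_io_size()
reports more than 30 sectors.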

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 block/blk-merge.c | 29 +++++++++++++++++++++++++----
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 84b9635b5d57..15d75f3ffc30 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -226,12 +226,12 @@ static bool bvec_split_segs(const struct request_queue *q,
 static struct bio *__blk_bio_segment_split(struct request_queue *q,
 					   struct bio *bio,
 					   struct bio_set *bs,
-					   unsigned *segs)
+					   unsigned *segs,
+					   const unsigned max_sectors)
 {
 	struct bio_vec bv, bvprv, *bvprvp = NULL;
 	struct bvec_iter iter;
 	unsigned nsegs = 0, sectors = 0;
-	const unsigned max_sectors = get_max_io_size(q, bio);
 	const unsigned max_segs = queue_max_segments(q);
 
 	bio_for_each_bvec(bv, bio, iter) {
@@ -295,6 +295,9 @@ static inline struct bio *blk_bio_segment_split(struct request_queue *q,
 						struct bio_set *bs,
 						unsigned *nr_segs)
 {
+	unsigned int max_sectors, q_max_sectors;
+	unsigned int bio_segs = bio->bi_vcnt;
+
 	/*
 	 * All drivers must accept single-segment bios that are <=
 	 * PAGE_SIZE.  This is a quick and dirty check that relies on
@@ -303,14 +306,32 @@ static inline struct bio *blk_bio_segment_split(struct request_queue *q,
 	 * are cloned, but compared to the performance impact of cloned
 	 * bios themselves the loop below doesn't matter anyway.
 	 */
-	if (!q->limits.chunk_sectors && bio->bi_vcnt == 1 &&
+	if (!q->limits.chunk_sectors && bio_segs == 1 &&
 	    (bio->bi_io_vec[0].bv_len +
 	     bio->bi_io_vec[0].bv_offset) <= PAGE_SIZE) {
 		*nr_segs = 1;
 		return NULL;
 	}
 
-	return __blk_bio_segment_split(q, bio, bs, nr_segs);
+	q_max_sectors = get_max_io_size(q, bio);
+	if (!queue_virt_boundary(q) && bio_segs < queue_max_segments(q) &&
+	    bio->bi_iter.bi_size <= queue_max_segment_size(q)) {
+		/*
+		 * Segments are contiguous, so only their ends may be partial.
+		 * An upper bound is to assume that each end takes 1B but adds
+		 * a whole sector, and everything else is full sectors.
+		 * Note: it's ok to round the size down because all non-full
+		 * sectors are accounted for by the first term.
+		 */
+		max_sectors = bio_segs * 2;
+		max_sectors += bio->bi_iter.bi_size >> 9;
+
+		if (max_sectors < q_max_sectors) {
+			*nr_segs = bio_segs;
+			return NULL;
+		}
+	}
+	return __blk_bio_segment_split(q, bio, bs, nr_segs, q_max_sectors);
 }
 
 /**
-- 
2.24.0



* Re: [RFC 2/2] block: add a fast path for seg split of large bio
  2021-01-05 19:43 ` [RFC 2/2] block: add a fast path for seg split of large bio Pavel Begunkov
@ 2021-01-05 20:36   ` Pavel Begunkov
  2021-01-27 17:16   ` Christoph Hellwig
  2021-01-28 12:10   ` Ming Lei
  2 siblings, 0 replies; 10+ messages in thread
From: Pavel Begunkov @ 2021-01-05 20:36 UTC
  To: Jens Axboe, linux-block; +Cc: Ming Lei, Christoph Hellwig

On 05/01/2021 19:43, Pavel Begunkov wrote:
> blk_bio_segment_split() is very heavy, but the current fast path covers
> only single-segment bios under PAGE_SIZE. Add another fast path that
> estimates an upper bound on the number of sectors a bio can contain.
> 
> One restricting factor here is queue_max_segment_size(), which is
> compared against the full iter size to avoid digging into the bvecs. By
> default it's 64KB, so the path only covers requests under 64KB, but for
> those falling under the conditions it's much faster.
> 
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---
>  block/blk-merge.c | 29 +++++++++++++++++++++++++----
>  1 file changed, 25 insertions(+), 4 deletions(-)
> 
[...]
>  
> -	return __blk_bio_segment_split(q, bio, bs, nr_segs);
> +	q_max_sectors = get_max_io_size(q, bio);
> +	if (!queue_virt_boundary(q) && bio_segs < queue_max_segments(q) &&
> +	    bio->bi_iter.bi_size <= queue_max_segment_size(q)) {

I think I'm missing a seg_boundary_mask check here. Any insights on how
to skip it? Or why is it 2^31-1 by default, and not, say,
((unsigned long)-1)?


-- 
Pavel Begunkov


* Re: [RFC 2/2] block: add a fast path for seg split of large bio
  2021-01-05 19:43 ` [RFC 2/2] block: add a fast path for seg split of large bio Pavel Begunkov
  2021-01-05 20:36   ` Pavel Begunkov
@ 2021-01-27 17:16   ` Christoph Hellwig
  2021-01-28 11:56     ` Pavel Begunkov
  2021-01-28 12:10   ` Ming Lei
  2 siblings, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2021-01-27 17:16 UTC
  To: Pavel Begunkov; +Cc: Jens Axboe, linux-block, Ming Lei, Christoph Hellwig

On Tue, Jan 05, 2021 at 07:43:38PM +0000, Pavel Begunkov wrote:
> blk_bio_segment_split() is very heavy, but the current fast path covers
> only single-segment bios under PAGE_SIZE. Add another fast path that
> estimates an upper bound on the number of sectors a bio can contain.
> 
> One restricting factor here is queue_max_segment_size(), which is
> compared against the full iter size to avoid digging into the bvecs. By
> default it's 64KB, so the path only covers requests under 64KB, but for
> those falling under the conditions it's much faster.

I think this works, but it is a pretty gross heuristic, which also
doesn't help us with NVMe, which is the I/O fast path of choice for
most people.  What is your use/test case?

> +		/*
> +		 * Segments are contiguous, so only their ends may be partial.
> +		 * An upper bound is to assume that each end takes 1B but adds
> +		 * a whole sector, and everything else is full sectors.
> +		 * Note: it's ok to round the size down because all non-full
> +		 * sectors are accounted for by the first term.
> +		 */
> +		max_sectors = bio_segs * 2;
> +		max_sectors += bio->bi_iter.bi_size >> 9;
> +
> +		if (max_sectors < q_max_sectors) {

I don't think we need the max_sectors variable here.


* Re: [RFC 2/2] block: add a fast path for seg split of large bio
  2021-01-27 17:16   ` Christoph Hellwig
@ 2021-01-28 11:56     ` Pavel Begunkov
  0 siblings, 0 replies; 10+ messages in thread
From: Pavel Begunkov @ 2021-01-28 11:56 UTC
  To: Christoph Hellwig; +Cc: Jens Axboe, linux-block, Ming Lei

On 27/01/2021 17:16, Christoph Hellwig wrote:
> On Tue, Jan 05, 2021 at 07:43:38PM +0000, Pavel Begunkov wrote:
>> blk_bio_segment_split() is very heavy, but the current fast path covers
>> only single-segment bios under PAGE_SIZE. Add another fast path that
>> estimates an upper bound on the number of sectors a bio can contain.
>>
>> One restricting factor here is queue_max_segment_size(), which is
>> compared against the full iter size to avoid digging into the bvecs. By
>> default it's 64KB, so the path only covers requests under 64KB, but for
>> those falling under the conditions it's much faster.
> 
> I think this works, but it is a pretty gross heuristic, which also

bio->bi_iter.bi_size <= queue_max_segment_size(q)

You mean this, right? I wouldn't say it's gross, but it is _very_ loose.

> doesn't help us with NVMe, which is the I/O fast path of choice for
> most people.  What is your use/test case?

Yeah, the idea is to make it work for NVMe. I don't remember it
restricting segment size or whatever, maybe only atomicity. Which
condition do you see as problematic? I can't recall without opening the
spec.

> 
>> +		/*
>> +		 * Segments are contiguous, so only their ends may be partial.
>> +		 * An upper bound is to assume that each end takes 1B but adds
>> +		 * a whole sector, and everything else is full sectors.
>> +		 * Note: it's ok to round the size down because all non-full
>> +		 * sectors are accounted for by the first term.
>> +		 */
>> +		max_sectors = bio_segs * 2;
>> +		max_sectors += bio->bi_iter.bi_size >> 9;
>> +
>> +		if (max_sectors < q_max_sectors) {
> 
> I don't think we need the max_sectors variable here.

Even more: considering that it's already sector-aligned, we can kill off
all of this section and just use (bi_iter.bi_size >> 9).
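
Something like this, as a rough and untested sketch of my own (keeping
the surrounding checks from the patch):

	if (!queue_virt_boundary(q) && bio_segs < queue_max_segments(q) &&
	    bio->bi_iter.bi_size <= queue_max_segment_size(q) &&
	    (bio->bi_iter.bi_size >> 9) <= get_max_io_size(q, bio)) {
		/* bi_size is sector aligned, so this count is exact */
		*nr_segs = bio_segs;
		return NULL;
	}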

-- 
Pavel Begunkov


* Re: [RFC 2/2] block: add a fast path for seg split of large bio
  2021-01-05 19:43 ` [RFC 2/2] block: add a fast path for seg split of large bio Pavel Begunkov
  2021-01-05 20:36   ` Pavel Begunkov
  2021-01-27 17:16   ` Christoph Hellwig
@ 2021-01-28 12:10   ` Ming Lei
  2021-01-28 12:27     ` Pavel Begunkov
  2 siblings, 1 reply; 10+ messages in thread
From: Ming Lei @ 2021-01-28 12:10 UTC
  To: Pavel Begunkov; +Cc: Jens Axboe, linux-block, Christoph Hellwig

On Tue, Jan 05, 2021 at 07:43:38PM +0000, Pavel Begunkov wrote:
> blk_bio_segment_split() is very heavy, but the current fast path covers
> only single-segment bios under PAGE_SIZE. Add another fast path that
> estimates an upper bound on the number of sectors a bio can contain.
> 
> One restricting factor here is queue_max_segment_size(), which is
> compared against the full iter size to avoid digging into the bvecs. By
> default it's 64KB, so the path only covers requests under 64KB, but for
> those falling under the conditions it's much faster.
> 
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---
>  block/blk-merge.c | 29 +++++++++++++++++++++++++----
>  1 file changed, 25 insertions(+), 4 deletions(-)
> 
> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index 84b9635b5d57..15d75f3ffc30 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -226,12 +226,12 @@ static bool bvec_split_segs(const struct request_queue *q,
>  static struct bio *__blk_bio_segment_split(struct request_queue *q,
>  					   struct bio *bio,
>  					   struct bio_set *bs,
> -					   unsigned *segs)
> +					   unsigned *segs,
> +					   const unsigned max_sectors)
>  {
>  	struct bio_vec bv, bvprv, *bvprvp = NULL;
>  	struct bvec_iter iter;
>  	unsigned nsegs = 0, sectors = 0;
> -	const unsigned max_sectors = get_max_io_size(q, bio);
>  	const unsigned max_segs = queue_max_segments(q);
>  
>  	bio_for_each_bvec(bv, bio, iter) {
> @@ -295,6 +295,9 @@ static inline struct bio *blk_bio_segment_split(struct request_queue *q,
>  						struct bio_set *bs,
>  						unsigned *nr_segs)
>  {
> +	unsigned int max_sectors, q_max_sectors;
> +	unsigned int bio_segs = bio->bi_vcnt;
> +
>  	/*
>  	 * All drivers must accept single-segment bios that are <=
>  	 * PAGE_SIZE.  This is a quick and dirty check that relies on
> @@ -303,14 +306,32 @@ static inline struct bio *blk_bio_segment_split(struct request_queue *q,
>  	 * are cloned, but compared to the performance impact of cloned
>  	 * bios themselves the loop below doesn't matter anyway.
>  	 */
> -	if (!q->limits.chunk_sectors && bio->bi_vcnt == 1 &&
> +	if (!q->limits.chunk_sectors && bio_segs == 1 &&
>  	    (bio->bi_io_vec[0].bv_len +
>  	     bio->bi_io_vec[0].bv_offset) <= PAGE_SIZE) {
>  		*nr_segs = 1;
>  		return NULL;
>  	}
>  
> -	return __blk_bio_segment_split(q, bio, bs, nr_segs);
> +	q_max_sectors = get_max_io_size(q, bio);
> +	if (!queue_virt_boundary(q) && bio_segs < queue_max_segments(q) &&
> +	    bio->bi_iter.bi_size <= queue_max_segment_size(q)) {

.bi_vcnt is 0 for fast cloned bio, so the above check may become true
when real nr_segment is > queue_max_segments(), especially in case that
max segments limit is small and segment size is big.


-- 
Ming



* Re: [RFC 2/2] block: add a fast path for seg split of large bio
  2021-01-28 12:10   ` Ming Lei
@ 2021-01-28 12:27     ` Pavel Begunkov
  2021-01-29  2:00       ` Ming Lei
  0 siblings, 1 reply; 10+ messages in thread
From: Pavel Begunkov @ 2021-01-28 12:27 UTC
  To: Ming Lei; +Cc: Jens Axboe, linux-block, Christoph Hellwig

On 28/01/2021 12:10, Ming Lei wrote:
> On Tue, Jan 05, 2021 at 07:43:38PM +0000, Pavel Begunkov wrote:
>> blk_bio_segment_split() is very heavy, but the current fast path covers
>> only single-segment bios under PAGE_SIZE. Add another fast path that
>> estimates an upper bound on the number of sectors a bio can contain.
>>
>> One restricting factor here is queue_max_segment_size(), which is
>> compared against the full iter size to avoid digging into the bvecs. By
>> default it's 64KB, so the path only covers requests under 64KB, but for
>> those falling under the conditions it's much faster.
>>
>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
>> ---
>>  block/blk-merge.c | 29 +++++++++++++++++++++++++----
>>  1 file changed, 25 insertions(+), 4 deletions(-)
>>
>> diff --git a/block/blk-merge.c b/block/blk-merge.c
>> index 84b9635b5d57..15d75f3ffc30 100644
>> --- a/block/blk-merge.c
>> +++ b/block/blk-merge.c
>> @@ -226,12 +226,12 @@ static bool bvec_split_segs(const struct request_queue *q,
>>  static struct bio *__blk_bio_segment_split(struct request_queue *q,
>>  					   struct bio *bio,
>>  					   struct bio_set *bs,
>> -					   unsigned *segs)
>> +					   unsigned *segs,
>> +					   const unsigned max_sectors)
>>  {
>>  	struct bio_vec bv, bvprv, *bvprvp = NULL;
>>  	struct bvec_iter iter;
>>  	unsigned nsegs = 0, sectors = 0;
>> -	const unsigned max_sectors = get_max_io_size(q, bio);
>>  	const unsigned max_segs = queue_max_segments(q);
>>  
>>  	bio_for_each_bvec(bv, bio, iter) {
>> @@ -295,6 +295,9 @@ static inline struct bio *blk_bio_segment_split(struct request_queue *q,
>>  						struct bio_set *bs,
>>  						unsigned *nr_segs)
>>  {
>> +	unsigned int max_sectors, q_max_sectors;
>> +	unsigned int bio_segs = bio->bi_vcnt;
>> +
>>  	/*
>>  	 * All drivers must accept single-segment bios that are <=
>>  	 * PAGE_SIZE.  This is a quick and dirty check that relies on
>> @@ -303,14 +306,32 @@ static inline struct bio *blk_bio_segment_split(struct request_queue *q,
>>  	 * are cloned, but compared to the performance impact of cloned
>>  	 * bios themselves the loop below doesn't matter anyway.
>>  	 */
>> -	if (!q->limits.chunk_sectors && bio->bi_vcnt == 1 &&
>> +	if (!q->limits.chunk_sectors && bio_segs == 1 &&
>>  	    (bio->bi_io_vec[0].bv_len +
>>  	     bio->bi_io_vec[0].bv_offset) <= PAGE_SIZE) {
>>  		*nr_segs = 1;
>>  		return NULL;
>>  	}
>>  
>> -	return __blk_bio_segment_split(q, bio, bs, nr_segs);
>> +	q_max_sectors = get_max_io_size(q, bio);
>> +	if (!queue_virt_boundary(q) && bio_segs < queue_max_segments(q) &&
>> +	    bio->bi_iter.bi_size <= queue_max_segment_size(q)) {
> 
> .bi_vcnt is 0 for a fast-cloned bio, so the above check may become true
> when the real segment count is > queue_max_segments(), especially when
> the max segments limit is small and the segment size is big.

I guess we can skip the fast path for them (i.e. bi_vcnt == 0)

I'm curious, why is it 0 and not the real number?

-- 
Pavel Begunkov


* Re: [RFC 2/2] block: add a fast path for seg split of large bio
  2021-01-28 12:27     ` Pavel Begunkov
@ 2021-01-29  2:00       ` Ming Lei
  2021-02-01 10:59         ` Pavel Begunkov
  0 siblings, 1 reply; 10+ messages in thread
From: Ming Lei @ 2021-01-29  2:00 UTC
  To: Pavel Begunkov; +Cc: Jens Axboe, linux-block, Christoph Hellwig

On Thu, Jan 28, 2021 at 12:27:39PM +0000, Pavel Begunkov wrote:
> On 28/01/2021 12:10, Ming Lei wrote:
> > On Tue, Jan 05, 2021 at 07:43:38PM +0000, Pavel Begunkov wrote:
> >> blk_bio_segment_split() is very heavy, but the current fast path covers
> >> only single-segment bios under PAGE_SIZE. Add another fast path that
> >> estimates an upper bound on the number of sectors a bio can contain.
> >>
> >> One restricting factor here is queue_max_segment_size(), which is
> >> compared against the full iter size to avoid digging into the bvecs. By
> >> default it's 64KB, so the path only covers requests under 64KB, but for
> >> those falling under the conditions it's much faster.
> >>
> >> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> >> ---
> >>  block/blk-merge.c | 29 +++++++++++++++++++++++++----
> >>  1 file changed, 25 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/block/blk-merge.c b/block/blk-merge.c
> >> index 84b9635b5d57..15d75f3ffc30 100644
> >> --- a/block/blk-merge.c
> >> +++ b/block/blk-merge.c
> >> @@ -226,12 +226,12 @@ static bool bvec_split_segs(const struct request_queue *q,
> >>  static struct bio *__blk_bio_segment_split(struct request_queue *q,
> >>  					   struct bio *bio,
> >>  					   struct bio_set *bs,
> >> -					   unsigned *segs)
> >> +					   unsigned *segs,
> >> +					   const unsigned max_sectors)
> >>  {
> >>  	struct bio_vec bv, bvprv, *bvprvp = NULL;
> >>  	struct bvec_iter iter;
> >>  	unsigned nsegs = 0, sectors = 0;
> >> -	const unsigned max_sectors = get_max_io_size(q, bio);
> >>  	const unsigned max_segs = queue_max_segments(q);
> >>  
> >>  	bio_for_each_bvec(bv, bio, iter) {
> >> @@ -295,6 +295,9 @@ static inline struct bio *blk_bio_segment_split(struct request_queue *q,
> >>  						struct bio_set *bs,
> >>  						unsigned *nr_segs)
> >>  {
> >> +	unsigned int max_sectors, q_max_sectors;
> >> +	unsigned int bio_segs = bio->bi_vcnt;
> >> +
> >>  	/*
> >>  	 * All drivers must accept single-segment bios that are <=
> >>  	 * PAGE_SIZE.  This is a quick and dirty check that relies on
> >> @@ -303,14 +306,32 @@ static inline struct bio *blk_bio_segment_split(struct request_queue *q,
> >>  	 * are cloned, but compared to the performance impact of cloned
> >>  	 * bios themselves the loop below doesn't matter anyway.
> >>  	 */
> >> -	if (!q->limits.chunk_sectors && bio->bi_vcnt == 1 &&
> >> +	if (!q->limits.chunk_sectors && bio_segs == 1 &&
> >>  	    (bio->bi_io_vec[0].bv_len +
> >>  	     bio->bi_io_vec[0].bv_offset) <= PAGE_SIZE) {
> >>  		*nr_segs = 1;
> >>  		return NULL;
> >>  	}
> >>  
> >> -	return __blk_bio_segment_split(q, bio, bs, nr_segs);
> >> +	q_max_sectors = get_max_io_size(q, bio);
> >> +	if (!queue_virt_boundary(q) && bio_segs < queue_max_segments(q) &&
> >> +	    bio->bi_iter.bi_size <= queue_max_segment_size(q)) {
> > 
> > .bi_vcnt is 0 for a fast-cloned bio, so the above check may become true
> > when the real segment count is > queue_max_segments(), especially when
> > the max segments limit is small and the segment size is big.
> 
> I guess we can skip the fast path for them (i.e. bi_vcnt == 0)

But bi_vcnt can't represent the real segment count, which can be bigger
or smaller than .bi_vcnt.

> I'm curious, why is it 0 and not the real number?

a fast-cloned bio shares the bvec table of the original bio, so it
doesn't have .bi_vcnt set.
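
For reference, a simplified sketch of what the fast-clone path does
(abridged from memory, illustrative rather than verbatim source):

	void __bio_clone_fast(struct bio *bio, struct bio *bio_src)
	{
		bio_set_flag(bio, BIO_CLONED);
		bio->bi_opf = bio_src->bi_opf;
		bio->bi_iter = bio_src->bi_iter;	/* position/size only */
		bio->bi_io_vec = bio_src->bi_io_vec;	/* shared bvec table */
		/* note: ->bi_vcnt is not copied, it stays 0 in the clone */
	}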


-- 
Ming



* Re: [RFC 2/2] block: add a fast path for seg split of large bio
  2021-01-29  2:00       ` Ming Lei
@ 2021-02-01 10:59         ` Pavel Begunkov
  0 siblings, 0 replies; 10+ messages in thread
From: Pavel Begunkov @ 2021-02-01 10:59 UTC
  To: Ming Lei; +Cc: Jens Axboe, linux-block, Christoph Hellwig

On 29/01/2021 02:00, Ming Lei wrote:
> On Thu, Jan 28, 2021 at 12:27:39PM +0000, Pavel Begunkov wrote:
>> On 28/01/2021 12:10, Ming Lei wrote:
>>> On Tue, Jan 05, 2021 at 07:43:38PM +0000, Pavel Begunkov wrote:
>>> .bi_vcnt is 0 for a fast-cloned bio, so the above check may become true
>>> when the real segment count is > queue_max_segments(), especially when
>>> the max segments limit is small and the segment size is big.
>>
>> I guess we can skip the fast path for them (i.e. bi_vcnt == 0)
> 
> But bi_vcnt can't represent the real segment count, which can be bigger
> or smaller than .bi_vcnt.

Sounds like "go slow path on bi_vcnt==0" won't work. Ok, I need to
go look what is the real purpose of .bi_vcnt

> 
>> I'm curious, why is it 0 and not the real number?
> 
> a fast-cloned bio shares the bvec table of the original bio, so it
> doesn't have .bi_vcnt set.

Got it, thanks


-- 
Pavel Begunkov

