All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] block: fix bio_will_gap()
@ 2016-02-15  7:01 Ming Lei
  2016-02-15  7:01 ` [PATCH 1/4] block: bio: introduce helpers to get the 1st and last bvec Ming Lei
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Ming Lei @ 2016-02-15  7:01 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel; +Cc: linux-block, Christoph Hellwig, Sagi Grimberg

Hi,

After bio splitting is introduced, the splitted bio can be fast-cloned,
which is correct because biovecs has become immutable since v3.13.
    
Unfortunately bio_will_gap() isn't ready for this kind of change,
because it figures out the last bvec via 'bi_io_vec[prev->bi_vcnt - 1]'
directly.

It is observed that lots of BIOs are merges even the virt boundary
limit is violated, and the issue is reported from Sagi Grimberg.
    
This patchset try to fix the issue by introducing two helpers for getting
the first and last bvec of one bio by the bio iterator helpers.

 block/blk-merge.c      |  8 ++------
 include/linux/bio.h    | 20 ++++++++++++++++++++
 include/linux/blkdev.h | 11 ++++++++---
 3 files changed, 30 insertions(+), 9 deletions(-)

Thanks,
Ming

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/4] block: bio: introduce helpers to get the 1st and last bvec
  2016-02-15  7:01 [PATCH 0/4] block: fix bio_will_gap() Ming Lei
@ 2016-02-15  7:01 ` Ming Lei
  2016-02-15  8:19   ` Sagi Grimberg
  2016-02-15  7:01 ` [PATCH 2/4] block: check virt boundary in bio_will_gap() Ming Lei
  2016-02-15  8:33 ` [PATCH 0/4] block: fix bio_will_gap() Sagi Grimberg
  2 siblings, 1 reply; 17+ messages in thread
From: Ming Lei @ 2016-02-15  7:01 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel
  Cc: linux-block, Christoph Hellwig, Sagi Grimberg, Ming Lei, stable

After bio splitting is introduced, the splitted bio can
be fast-cloned, which is correct because biovecs has become
immutable since v3.13.

Unfortunately bio_will_gap() isn't ready for this kind of change,
because it figures out the last bvec via 'bi_io_vec[prev->bi_vcnt - 1]'.

This patch introduces two helpers for getting the first and last
bvec of one bio.

Cc: stable@vger.kernel.org # v4.3+
Reported-by: Sagi Grimberg <sagig@dev.mellanox.co.il>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 include/linux/bio.h | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/include/linux/bio.h b/include/linux/bio.h
index 5349e68..56d2db8 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -265,6 +265,26 @@ static inline unsigned bio_segments(struct bio *bio)
 	return segs;
 }
 
+static inline void bio_get_first_bvec(struct bio *bio, struct bio_vec *bv)
+{
+	*bv = bio_iovec(bio);
+}
+
+/*
+ * bio_get_last_bvec() is introduced to get the last bvec of one
+ * bio for bio_will_gap().
+ *
+ * TODO: make it more efficient.
+ */
+static inline void bio_get_last_bvec(struct bio *bio, struct bio_vec *bv)
+{
+	struct bvec_iter iter;
+
+	bio_for_each_segment(*bv, bio, iter)
+		if (bv->bv_len == iter.bi_size)
+			break;
+}
+
 /*
  * get a reference to a bio, so it won't disappear. the intended use is
  * something like:
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/4] block: check virt boundary in bio_will_gap()
  2016-02-15  7:01 [PATCH 0/4] block: fix bio_will_gap() Ming Lei
  2016-02-15  7:01 ` [PATCH 1/4] block: bio: introduce helpers to get the 1st and last bvec Ming Lei
@ 2016-02-15  7:01 ` Ming Lei
  2016-02-15  8:22   ` Sagi Grimberg
  2016-02-15  8:33 ` [PATCH 0/4] block: fix bio_will_gap() Sagi Grimberg
  2 siblings, 1 reply; 17+ messages in thread
From: Ming Lei @ 2016-02-15  7:01 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel
  Cc: linux-block, Christoph Hellwig, Sagi Grimberg, Ming Lei

In the following patch, the way for figuring out
the last bvec will be changed with a bit cost introduced,
so return immediately if the queue doesn't have virt
boundary limit. Actually most of devices have not
this limit.

Cc: Sagi Grimberg <sagig@dev.mellanox.co.il>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 include/linux/blkdev.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 4571ef1..b8ff6a3 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1388,7 +1388,7 @@ static inline bool bvec_gap_to_prev(struct request_queue *q,
 static inline bool bio_will_gap(struct request_queue *q, struct bio *prev,
 			 struct bio *next)
 {
-	if (!bio_has_data(prev))
+	if (!bio_has_data(prev) || !queue_virt_boundary(q))
 		return false;
 
 	return bvec_gap_to_prev(q, &prev->bi_io_vec[prev->bi_vcnt - 1],
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] block: bio: introduce helpers to get the 1st and last bvec
  2016-02-15  7:01 ` [PATCH 1/4] block: bio: introduce helpers to get the 1st and last bvec Ming Lei
@ 2016-02-15  8:19   ` Sagi Grimberg
  2016-02-15  9:42     ` Ming Lei
  0 siblings, 1 reply; 17+ messages in thread
From: Sagi Grimberg @ 2016-02-15  8:19 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe, linux-kernel; +Cc: linux-block, Christoph Hellwig, stable


> +/*
> + * bio_get_last_bvec() is introduced to get the last bvec of one
> + * bio for bio_will_gap().
> + *
> + * TODO: make it more efficient.
> + */
> +static inline void bio_get_last_bvec(struct bio *bio, struct bio_vec *bv)
> +{
> +	struct bvec_iter iter;
> +
> +	bio_for_each_segment(*bv, bio, iter)
> +		if (bv->bv_len == iter.bi_size)
> +			break;
> +}

This helper is used for each req/bio once or more. I'd say
it's critical to make it efficient and not settle for
a quick bail for drivers that don't have a virt_boundary
like you did in patch #2.

However, given that it's a regression bug fix I'm not sure it's the best
idea to add logic here.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] block: check virt boundary in bio_will_gap()
  2016-02-15  7:01 ` [PATCH 2/4] block: check virt boundary in bio_will_gap() Ming Lei
@ 2016-02-15  8:22   ` Sagi Grimberg
  2016-02-15 10:27     ` Ming Lei
  0 siblings, 1 reply; 17+ messages in thread
From: Sagi Grimberg @ 2016-02-15  8:22 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe, linux-kernel; +Cc: linux-block, Christoph Hellwig


> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 4571ef1..b8ff6a3 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -1388,7 +1388,7 @@ static inline bool bvec_gap_to_prev(struct request_queue *q,
>   static inline bool bio_will_gap(struct request_queue *q, struct bio *prev,
>   			 struct bio *next)
>   {
> -	if (!bio_has_data(prev))
> +	if (!bio_has_data(prev) || !queue_virt_boundary(q))
>   		return false;

Can we not do that?

bvec_gap_to_prev is already checking the virt_boundary and I'd sorta
like to keep the motivation to optimize bio_get_last_bvec() to be O(1).

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/4] block: fix bio_will_gap()
  2016-02-15  7:01 [PATCH 0/4] block: fix bio_will_gap() Ming Lei
  2016-02-15  7:01 ` [PATCH 1/4] block: bio: introduce helpers to get the 1st and last bvec Ming Lei
  2016-02-15  7:01 ` [PATCH 2/4] block: check virt boundary in bio_will_gap() Ming Lei
@ 2016-02-15  8:33 ` Sagi Grimberg
  2 siblings, 0 replies; 17+ messages in thread
From: Sagi Grimberg @ 2016-02-15  8:33 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe, linux-kernel; +Cc: linux-block, Christoph Hellwig


> Hi,
>
> After bio splitting is introduced, the splitted bio can be fast-cloned,
> which is correct because biovecs has become immutable since v3.13.
>
> Unfortunately bio_will_gap() isn't ready for this kind of change,
> because it figures out the last bvec via 'bi_io_vec[prev->bi_vcnt - 1]'
> directly.
>
> It is observed that lots of BIOs are merges even the virt boundary
> limit is violated, and the issue is reported from Sagi Grimberg.

This set makes the virt_boundary violation go away...

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] block: bio: introduce helpers to get the 1st and last bvec
  2016-02-15  8:19   ` Sagi Grimberg
@ 2016-02-15  9:42     ` Ming Lei
  2016-02-15 20:06       ` Sagi Grimberg
                         ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Ming Lei @ 2016-02-15  9:42 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Jens Axboe, linux-kernel, linux-block, Christoph Hellwig, stable,
	tom.leiming, Keith Busch, Kent Overstreet


Hi,

On Mon, 15 Feb 2016 10:19:49 +0200
Sagi Grimberg <sagig@dev.mellanox.co.il> wrote:

> 
> > +/*
> > + * bio_get_last_bvec() is introduced to get the last bvec of one
> > + * bio for bio_will_gap().
> > + *
> > + * TODO: make it more efficient.
> > + */
> > +static inline void bio_get_last_bvec(struct bio *bio, struct bio_vec *bv)
> > +{
> > +	struct bvec_iter iter;
> > +
> > +	bio_for_each_segment(*bv, bio, iter)
> > +		if (bv->bv_len == iter.bi_size)
> > +			break;
> > +}
> 
> This helper is used for each req/bio once or more. I'd say

No, the helper is only used for the non-splitted BIO, and all
splitted BIO is marked as non-merge.

> it's critical to make it efficient and not settle for
> a quick bail for drivers that don't have a virt_boundary
> like you did in patch #2.

Cc Kent and Keith.

Follows another version which should be more efficient.
Kent and Keith, I appreciate much if you may give a review on it.

diff --git a/include/linux/bio.h b/include/linux/bio.h
index 56d2db8..ef45fec 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -278,11 +278,21 @@ static inline void bio_get_first_bvec(struct bio *bio, struct bio_vec *bv)
  */
 static inline void bio_get_last_bvec(struct bio *bio, struct bio_vec *bv)
 {
-	struct bvec_iter iter;
+	struct bvec_iter iter = bio->bi_iter;
+	int idx;
+
+	bio_advance_iter(bio, &iter, iter.bi_size);
+
+	WARN_ON(!iter.bi_idx && !iter.bi_bvec_done);
+
+	if (!iter.bi_bvec_done)
+		idx = iter.bi_idx - 1;
+	else	/* in the middle of bvec */
+		idx = iter.bi_idx;
 
-	bio_for_each_segment(*bv, bio, iter)
-		if (bv->bv_len == iter.bi_size)
-			break;
+	*bv = bio->bi_io_vec[idx];
+	if (iter.bi_bvec_done)
+		bv->bv_len = iter.bi_bvec_done;
 }
 
 /*


> 
> However, given that it's a regression bug fix I'm not sure it's the best
> idea to add logic here.

But the issue is obviously in bio_will_gap(), isn't it?

Simply reverting 52cc6eead9095(block: blk-merge: fast-clone bio when splitting rw bios)
still might cause performance regression too.

> --
> To unsubscribe from this list: send the line "unsubscribe linux-block" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] block: check virt boundary in bio_will_gap()
  2016-02-15  8:22   ` Sagi Grimberg
@ 2016-02-15 10:27     ` Ming Lei
  2016-02-15 20:27       ` Sagi Grimberg
  0 siblings, 1 reply; 17+ messages in thread
From: Ming Lei @ 2016-02-15 10:27 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Jens Axboe, Linux Kernel Mailing List, linux-block, Christoph Hellwig

On Mon, Feb 15, 2016 at 4:22 PM, Sagi Grimberg <sagig@dev.mellanox.co.il> wrote:
>
>> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
>> index 4571ef1..b8ff6a3 100644
>> --- a/include/linux/blkdev.h
>> +++ b/include/linux/blkdev.h
>> @@ -1388,7 +1388,7 @@ static inline bool bvec_gap_to_prev(struct
>> request_queue *q,
>>   static inline bool bio_will_gap(struct request_queue *q, struct bio
>> *prev,
>>                          struct bio *next)
>>   {
>> -       if (!bio_has_data(prev))
>> +       if (!bio_has_data(prev) || !queue_virt_boundary(q))
>>                 return false;
>
>
> Can we not do that?

Given there are only 3 drivers which set virt boundary, I think
it is reasonable to do that.

>
> bvec_gap_to_prev is already checking the virt_boundary and I'd sorta
> like to keep the motivation to optimize bio_get_last_bvec() to be O(1).

Currently the approaches I thought of still need to iterate bvec by bvec,
not sure if O(1) can be reached easily, but I am happy to discuss the
optimized implementation.

Thanks,
Ming

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] block: bio: introduce helpers to get the 1st and last bvec
  2016-02-15  9:42     ` Ming Lei
@ 2016-02-15 20:06       ` Sagi Grimberg
  2016-02-16 13:03         ` Ming Lei
  2016-02-17  3:08       ` Elliott, Robert (Persistent Memory)
  2016-02-18  4:24       ` Kent Overstreet
  2 siblings, 1 reply; 17+ messages in thread
From: Sagi Grimberg @ 2016-02-15 20:06 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-kernel, linux-block, Christoph Hellwig, stable,
	tom.leiming, Keith Busch, Kent Overstreet


> Cc Kent and Keith.
>
> Follows another version which should be more efficient.
> Kent and Keith, I appreciate much if you may give a review on it.
>
> diff --git a/include/linux/bio.h b/include/linux/bio.h
> index 56d2db8..ef45fec 100644
> --- a/include/linux/bio.h
> +++ b/include/linux/bio.h
> @@ -278,11 +278,21 @@ static inline void bio_get_first_bvec(struct bio *bio, struct bio_vec *bv)
>    */
>   static inline void bio_get_last_bvec(struct bio *bio, struct bio_vec *bv)
>   {
> -	struct bvec_iter iter;
> +	struct bvec_iter iter = bio->bi_iter;
> +	int idx;
> +
> +	bio_advance_iter(bio, &iter, iter.bi_size);
> +
> +	WARN_ON(!iter.bi_idx && !iter.bi_bvec_done);
> +
> +	if (!iter.bi_bvec_done)
> +		idx = iter.bi_idx - 1;
> +	else	/* in the middle of bvec */
> +		idx = iter.bi_idx;
>
> -	bio_for_each_segment(*bv, bio, iter)
> -		if (bv->bv_len == iter.bi_size)
> -			break;
> +	*bv = bio->bi_io_vec[idx];
> +	if (iter.bi_bvec_done)
> +		bv->bv_len = iter.bi_bvec_done;
>   }
>
>   /*
>

This looks good too.

>
>>
>> However, given that it's a regression bug fix I'm not sure it's the best
>> idea to add logic here.
>
> But the issue is obviously in bio_will_gap(), isn't it?
>
> Simply reverting 52cc6eead9095(block: blk-merge: fast-clone bio when splitting rw bios)
> still might cause performance regression too.

That's correct. I assume that the bio splitting code affects
specific I/O pattern (gappy), however bio_will_gap is also tested
for bio merges (even if the bios won't merge eventually). This means
that each merge check will invoke bio_advance_iter() which is something
I'd like to avoid...

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] block: check virt boundary in bio_will_gap()
  2016-02-15 10:27     ` Ming Lei
@ 2016-02-15 20:27       ` Sagi Grimberg
  2016-02-16 13:05         ` Ming Lei
  2016-02-16 13:08         ` Ming Lei
  0 siblings, 2 replies; 17+ messages in thread
From: Sagi Grimberg @ 2016-02-15 20:27 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, Linux Kernel Mailing List, linux-block, Christoph Hellwig


>>> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
>>> index 4571ef1..b8ff6a3 100644
>>> --- a/include/linux/blkdev.h
>>> +++ b/include/linux/blkdev.h
>>> @@ -1388,7 +1388,7 @@ static inline bool bvec_gap_to_prev(struct
>>> request_queue *q,
>>>    static inline bool bio_will_gap(struct request_queue *q, struct bio
>>> *prev,
>>>                           struct bio *next)
>>>    {
>>> -       if (!bio_has_data(prev))
>>> +       if (!bio_has_data(prev) || !queue_virt_boundary(q))
>>>    bio_integrity_add_page              return false;
>>
>>
>> Can we not do that?
>
> Given there are only 3 drivers which set virt boundary, I think
> it is reasonable to do that.

3 drivers that are really performance critical. I don't think we
should add optimized branching for some of the drivers especially
when the drivers that do set virt_boundary *really* care about latency.

>> bvec_gap_to_prev is already checking the virt_boundary and I'd sorta
>> like to keep the motivation to optimize bio_get_last_bvec() to be O(1).
>
> Currently the approaches I thought of still need to iterate bvec by bvec,
> not sure if O(1) can be reached easily, but I am happy to discuss the
> optimized implementation.

Me too. Note that I don't mind if the bio split code won't be optimized,
but I do want req_gap_back_merge/req_gap_front_merge to be...

Also, are the bvec_gap_to_prev usages in bio_add_pc_page and
bio_integrity_add_page safe? I didn't test this stuff with integrity
payloads...

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] block: bio: introduce helpers to get the 1st and last bvec
  2016-02-15 20:06       ` Sagi Grimberg
@ 2016-02-16 13:03         ` Ming Lei
  0 siblings, 0 replies; 17+ messages in thread
From: Ming Lei @ 2016-02-16 13:03 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Jens Axboe, Linux Kernel Mailing List, linux-block,
	Christoph Hellwig, stable, Keith Busch, Kent Overstreet

On Tue, Feb 16, 2016 at 4:06 AM, Sagi Grimberg <sagig@dev.mellanox.co.il> wrote:
>
>> Cc Kent and Keith.
>>
>> Follows another version which should be more efficient.
>> Kent and Keith, I appreciate much if you may give a review on it.
>>
>> diff --git a/include/linux/bio.h b/include/linux/bio.h
>> index 56d2db8..ef45fec 100644
>> --- a/include/linux/bio.h
>> +++ b/include/linux/bio.h
>> @@ -278,11 +278,21 @@ static inline void bio_get_first_bvec(struct bio
>> *bio, struct bio_vec *bv)
>>    */
>>   static inline void bio_get_last_bvec(struct bio *bio, struct bio_vec
>> *bv)
>>   {
>> -       struct bvec_iter iter;
>> +       struct bvec_iter iter = bio->bi_iter;
>> +       int idx;
>> +
>> +       bio_advance_iter(bio, &iter, iter.bi_size);
>> +
>> +       WARN_ON(!iter.bi_idx && !iter.bi_bvec_done);
>> +
>> +       if (!iter.bi_bvec_done)
>> +               idx = iter.bi_idx - 1;
>> +       else    /* in the middle of bvec */
>> +               idx = iter.bi_idx;
>>
>> -       bio_for_each_segment(*bv, bio, iter)
>> -               if (bv->bv_len == iter.bi_size)
>> -                       break;
>> +       *bv = bio->bi_io_vec[idx];
>> +       if (iter.bi_bvec_done)
>> +               bv->bv_len = iter.bi_bvec_done;
>>   }
>>
>>   /*
>>
>
> This looks good too.
>
>>
>>>
>>> However, given that it's a regression bug fix I'm not sure it's the best
>>> idea to add logic here.
>>
>>
>> But the issue is obviously in bio_will_gap(), isn't it?
>>
>> Simply reverting 52cc6eead9095(block: blk-merge: fast-clone bio when
>> splitting rw bios)
>> still might cause performance regression too.
>
>
> That's correct. I assume that the bio splitting code affects
> specific I/O pattern (gappy), however bio_will_gap is also tested

I don't understand why bio splitting affects specific I/O pattern, could you
explain a bit?

>From commit b54ffb73c(block: remove bio_get_nr_vecs()), the upper
layer(fs, dm, dio,...) creates bio with its max size, and splitting should
be triggered easily.

> for bio merges (even if the bios won't merge eventually). This means

As I mentioned, bio_will_gap() is only called for non-splitted bio.

> that each merge check will invoke bio_advance_iter() which is something
> I'd like to avoid...

One idea is to use original way to compute the last bvec for non-cloned
bio, and use the approach in this patch for cloned bio(often splitted bio).
I will take this way in v1 if no one objects.

thanks,
Ming

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] block: check virt boundary in bio_will_gap()
  2016-02-15 20:27       ` Sagi Grimberg
@ 2016-02-16 13:05         ` Ming Lei
  2016-02-16 13:08         ` Ming Lei
  1 sibling, 0 replies; 17+ messages in thread
From: Ming Lei @ 2016-02-16 13:05 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Jens Axboe, Linux Kernel Mailing List, linux-block, Christoph Hellwig

On Tue, Feb 16, 2016 at 4:27 AM, Sagi Grimberg <sagig@dev.mellanox.co.il> wrote:
>
>>>> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
>>>> index 4571ef1..b8ff6a3 100644
>>>> --- a/include/linux/blkdev.h
>>>> +++ b/include/linux/blkdev.h
>>>> @@ -1388,7 +1388,7 @@ static inline bool bvec_gap_to_prev(struct
>>>> request_queue *q,
>>>>    static inline bool bio_will_gap(struct request_queue *q, struct bio
>>>> *prev,
>>>>                           struct bio *next)
>>>>    {
>>>> -       if (!bio_has_data(prev))
>>>> +       if (!bio_has_data(prev) || !queue_virt_boundary(q))
>>>>    bio_integrity_add_page              return false;
>>>
>>>
>>>
>>> Can we not do that?
>>
>>
>> Given there are only 3 drivers which set virt boundary, I think
>> it is reasonable to do that.
>
>
> 3 drivers that are really performance critical. I don't think we
> should add optimized branching for some of the drivers especially
> when the drivers that do set virt_boundary *really* care about latency.
>
>>> bvec_gap_to_prev is already checking the virt_boundary and I'd sorta
>>> like to keep the motivation to optimize bio_get_last_bvec() to be O(1).
>>
>>
>> Currently the approaches I thought of still need to iterate bvec by bvec,
>> not sure if O(1) can be reached easily, but I am happy to discuss the
>> optimized implementation.
>
>
> Me too. Note that I don't mind if the bio split code won't be optimized,
> but I do want req_gap_back_merge/req_gap_front_merge to be...
>
> Also, are the bvec_gap_to_prev usages in bio_add_pc_page and
> bio_integrity_add_page safe? I didn't test this stuff with integrity

Yes, because both are non-cloned bvec table.

> payloads...

Thanks,

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] block: check virt boundary in bio_will_gap()
  2016-02-15 20:27       ` Sagi Grimberg
  2016-02-16 13:05         ` Ming Lei
@ 2016-02-16 13:08         ` Ming Lei
  1 sibling, 0 replies; 17+ messages in thread
From: Ming Lei @ 2016-02-16 13:08 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Jens Axboe, Linux Kernel Mailing List, linux-block, Christoph Hellwig

On Tue, Feb 16, 2016 at 4:27 AM, Sagi Grimberg <sagig@dev.mellanox.co.il> wrote:
>
>>>> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
>>>> index 4571ef1..b8ff6a3 100644
>>>> --- a/include/linux/blkdev.h
>>>> +++ b/include/linux/blkdev.h
>>>> @@ -1388,7 +1388,7 @@ static inline bool bvec_gap_to_prev(struct
>>>> request_queue *q,
>>>>    static inline bool bio_will_gap(struct request_queue *q, struct bio
>>>> *prev,
>>>>                           struct bio *next)
>>>>    {
>>>> -       if (!bio_has_data(prev))
>>>> +       if (!bio_has_data(prev) || !queue_virt_boundary(q))
>>>>    bio_integrity_add_page              return false;
>>>
>>>
>>>
>>> Can we not do that?
>>
>>
>> Given there are only 3 drivers which set virt boundary, I think
>> it is reasonable to do that.
>
>
> 3 drivers that are really performance critical. I don't think we
> should add optimized branching for some of the drivers especially
> when the drivers that do set virt_boundary *really* care about latency.

I don't think the extra check on bvec_gap_to_prev() can make any
difference, but if you do care we can introduce __bvec_gap_to_prev()
in which the check is moved into bio_will_gap().

Thanks,

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH 1/4] block: bio: introduce helpers to get the 1st and last bvec
  2016-02-15  9:42     ` Ming Lei
  2016-02-15 20:06       ` Sagi Grimberg
@ 2016-02-17  3:08       ` Elliott, Robert (Persistent Memory)
  2016-02-18  4:24       ` Kent Overstreet
  2 siblings, 0 replies; 17+ messages in thread
From: Elliott, Robert (Persistent Memory) @ 2016-02-17  3:08 UTC (permalink / raw)
  To: Ming Lei, Sagi Grimberg
  Cc: Jens Axboe, linux-kernel, linux-block, Christoph Hellwig, stable,
	tom.leiming, Keith Busch, Kent Overstreet

> -----Original Message-----
> From: linux-block-owner@vger.kernel.org [mailto:linux-block-
> owner@vger.kernel.org] On Behalf Of Ming Lei
> Sent: Monday, February 15, 2016 3:42 AM
> Subject: Re: [PATCH 1/4] block: bio: introduce helpers to get the 1st and
> last bvec
...
> diff --git a/include/linux/bio.h b/include/linux/bio.h
...
>  static inline void bio_get_last_bvec(struct bio *bio, struct bio_vec *bv)
>  {
> -	struct bvec_iter iter;
> +	struct bvec_iter iter = bio->bi_iter;
> +	int idx;
> +
> +	bio_advance_iter(bio, &iter, iter.bi_size);
> +
> +	WARN_ON(!iter.bi_idx && !iter.bi_bvec_done);

If this ever did trigger, I don't think you'd want it for every bio
with a problem.  WARN_ONCE would be safer.

---
Robert Elliott, HPE Persistent Memory

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] block: bio: introduce helpers to get the 1st and last bvec
  2016-02-15  9:42     ` Ming Lei
  2016-02-15 20:06       ` Sagi Grimberg
  2016-02-17  3:08       ` Elliott, Robert (Persistent Memory)
@ 2016-02-18  4:24       ` Kent Overstreet
  2016-02-18  6:16         ` Ming Lei
  2 siblings, 1 reply; 17+ messages in thread
From: Kent Overstreet @ 2016-02-18  4:24 UTC (permalink / raw)
  To: Ming Lei
  Cc: Sagi Grimberg, Jens Axboe, linux-kernel, linux-block,
	Christoph Hellwig, stable, tom.leiming, Keith Busch

On Mon, Feb 15, 2016 at 05:42:12PM +0800, Ming Lei wrote:
> Cc Kent and Keith.
> 
> Follows another version which should be more efficient.
> Kent and Keith, I appreciate much if you may give a review on it.
> 
> diff --git a/include/linux/bio.h b/include/linux/bio.h
> index 56d2db8..ef45fec 100644
> --- a/include/linux/bio.h
> +++ b/include/linux/bio.h
> @@ -278,11 +278,21 @@ static inline void bio_get_first_bvec(struct bio *bio, struct bio_vec *bv)
>   */
>  static inline void bio_get_last_bvec(struct bio *bio, struct bio_vec *bv)
>  {
> -	struct bvec_iter iter;
> +	struct bvec_iter iter = bio->bi_iter;
> +	int idx;
> +
> +	bio_advance_iter(bio, &iter, iter.bi_size);
> +
> +	WARN_ON(!iter.bi_idx && !iter.bi_bvec_done);
> +
> +	if (!iter.bi_bvec_done)
> +		idx = iter.bi_idx - 1;
> +	else	/* in the middle of bvec */
> +		idx = iter.bi_idx;
>  
> -	bio_for_each_segment(*bv, bio, iter)
> -		if (bv->bv_len == iter.bi_size)
> -			break;
> +	*bv = bio->bi_io_vec[idx];
> +	if (iter.bi_bvec_done)
> +		bv->bv_len = iter.bi_bvec_done;
>  }

It can't be done correctly without a loop.

The reason is that if the bio was split in the middle of a segment, bv->bv_len
on the last biovec will be larger than what's actually used by the bio (it's
being shared between the two splits!).

You have to iterate over all the biovecs so that you can see where
bi_iter->bi_size ends.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] block: bio: introduce helpers to get the 1st and last bvec
  2016-02-18  4:24       ` Kent Overstreet
@ 2016-02-18  6:16         ` Ming Lei
  2016-02-19  1:47           ` Ming Lei
  0 siblings, 1 reply; 17+ messages in thread
From: Ming Lei @ 2016-02-18  6:16 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: Sagi Grimberg, Jens Axboe, Linux Kernel Mailing List,
	linux-block, Christoph Hellwig, stable, Keith Busch

Hi Kent,

Thanks for your review.

On Thu, Feb 18, 2016 at 12:24 PM, Kent Overstreet
<kent.overstreet@gmail.com> wrote:
> On Mon, Feb 15, 2016 at 05:42:12PM +0800, Ming Lei wrote:
>> Cc Kent and Keith.
>>
>> Follows another version which should be more efficient.
>> Kent and Keith, I appreciate much if you may give a review on it.
>>
>> diff --git a/include/linux/bio.h b/include/linux/bio.h
>> index 56d2db8..ef45fec 100644
>> --- a/include/linux/bio.h
>> +++ b/include/linux/bio.h
>> @@ -278,11 +278,21 @@ static inline void bio_get_first_bvec(struct bio *bio, struct bio_vec *bv)
>>   */
>>  static inline void bio_get_last_bvec(struct bio *bio, struct bio_vec *bv)
>>  {
>> -     struct bvec_iter iter;
>> +     struct bvec_iter iter = bio->bi_iter;
>> +     int idx;
>> +
>> +     bio_advance_iter(bio, &iter, iter.bi_size);
>> +
>> +     WARN_ON(!iter.bi_idx && !iter.bi_bvec_done);
>> +
>> +     if (!iter.bi_bvec_done)
>> +             idx = iter.bi_idx - 1;
>> +     else    /* in the middle of bvec */
>> +             idx = iter.bi_idx;
>>
>> -     bio_for_each_segment(*bv, bio, iter)
>> -             if (bv->bv_len == iter.bi_size)
>> -                     break;
>> +     *bv = bio->bi_io_vec[idx];
>> +     if (iter.bi_bvec_done)
>> +             bv->bv_len = iter.bi_bvec_done;
>>  }
>
> It can't be done correctly without a loop.

As we discussed in gtalk, the only case this patch can't cope with
is that one single bvec doesn't use up the remained io vector,
but it can be handled by putting the following code at the
function entry:

if (!bio_multiple_segments(bio)) {
       *bv = bio_iovec(bio);
       return;
}

>
> The reason is that if the bio was split in the middle of a segment, bv->bv_len
> on the last biovec will be larger than what's actually used by the bio (it's
> being shared between the two splits!).

The last two lines in this helper should handle the situation.

>
> You have to iterate over all the biovecs so that you can see where
> bi_iter->bi_size ends.

I understand your concern is that this patch may not be much more
efficient than bio_for_each_segment().

IMO, one win of the patch is that 16bytes bvec copy is saved for all
vectors, and another 'win' is to just run bvec_iter_advance() once(
like move the outside for loop inside).

I will run some benchmark to see if there is any performance
difference between the two patches.

Thanks,
Ming

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] block: bio: introduce helpers to get the 1st and last bvec
  2016-02-18  6:16         ` Ming Lei
@ 2016-02-19  1:47           ` Ming Lei
  0 siblings, 0 replies; 17+ messages in thread
From: Ming Lei @ 2016-02-19  1:47 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: Sagi Grimberg, Jens Axboe, Linux Kernel Mailing List,
	linux-block, Christoph Hellwig, stable, Keith Busch

Hi Guys,

On Thu, Feb 18, 2016 at 2:16 PM, Ming Lei <ming.lei@canonical.com> wrote:
> Hi Kent,
>
> Thanks for your review.
>
> On Thu, Feb 18, 2016 at 12:24 PM, Kent Overstreet
> <kent.overstreet@gmail.com> wrote:
>> On Mon, Feb 15, 2016 at 05:42:12PM +0800, Ming Lei wrote:
>>> Cc Kent and Keith.
>>>
>>> Follows another version which should be more efficient.
>>> Kent and Keith, I appreciate much if you may give a review on it.
>>>
>>> diff --git a/include/linux/bio.h b/include/linux/bio.h
>>> index 56d2db8..ef45fec 100644
>>> --- a/include/linux/bio.h
>>> +++ b/include/linux/bio.h
>>> @@ -278,11 +278,21 @@ static inline void bio_get_first_bvec(struct bio *bio, struct bio_vec *bv)
>>>   */
>>>  static inline void bio_get_last_bvec(struct bio *bio, struct bio_vec *bv)
>>>  {
>>> -     struct bvec_iter iter;
>>> +     struct bvec_iter iter = bio->bi_iter;
>>> +     int idx;
>>> +
>>> +     bio_advance_iter(bio, &iter, iter.bi_size);
>>> +
>>> +     WARN_ON(!iter.bi_idx && !iter.bi_bvec_done);
>>> +
>>> +     if (!iter.bi_bvec_done)
>>> +             idx = iter.bi_idx - 1;
>>> +     else    /* in the middle of bvec */
>>> +             idx = iter.bi_idx;
>>>
>>> -     bio_for_each_segment(*bv, bio, iter)
>>> -             if (bv->bv_len == iter.bi_size)
>>> -                     break;
>>> +     *bv = bio->bi_io_vec[idx];
>>> +     if (iter.bi_bvec_done)
>>> +             bv->bv_len = iter.bi_bvec_done;
>>>  }
>>
>> It can't be done correctly without a loop.
>
> As we discussed in gtalk, the only case this patch can't cope with
> is that one single bvec doesn't use up the remained io vector,
> but it can be handled by putting the following code at the
> function entry:
>
> if (!bio_multiple_segments(bio)) {
>        *bv = bio_iovec(bio);
>        return;
> }
>
>>
>> The reason is that if the bio was split in the middle of a segment, bv->bv_len
>> on the last biovec will be larger than what's actually used by the bio (it's
>> being shared between the two splits!).
>
> The last two lines in this helper should handle the situation.
>
>>
>> You have to iterate over all the biovecs so that you can see where
>> bi_iter->bi_size ends.
>
> I understand your concern is that this patch may not be much more
> efficient than bio_for_each_segment().
>
> IMO, one win of the patch is that 16bytes bvec copy is saved for all
> vectors, and another 'win' is to just run bvec_iter_advance() once(
> like move the outside for loop inside).
>
> I will run some benchmark to see if there is any performance
> difference between the two patches.

When I call bench_last_bvec()(see below) from null_queue_rq():
drivers/block/null_blk.c, IOPS with the 2nd patch in fio test(libaio,
randread, null_blk with default mod parameter) is better than the
1st one by > ~2%:

-------------------------
|BS      | IOPS          |
------------------------
|64K    |  +2%           |
-----------------------
|512K  |  +3%          |
------------------------

the 1st patch: use bio_for_each_segment()
the 2nd patch: use single bio_advance_iter()

static void bench_last_bvec(struct request *rq)
{
        static unsigned long total = 0;
        struct bio *bio;
        struct bio_vec bv = {0};

        __rq_for_each_bio(bio, rq) {
                bio_get_last_bvec(bio, &bv);
                total += bv.bv_len;
        }
}

Thanks
Ming

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2016-02-19  1:47 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-15  7:01 [PATCH 0/4] block: fix bio_will_gap() Ming Lei
2016-02-15  7:01 ` [PATCH 1/4] block: bio: introduce helpers to get the 1st and last bvec Ming Lei
2016-02-15  8:19   ` Sagi Grimberg
2016-02-15  9:42     ` Ming Lei
2016-02-15 20:06       ` Sagi Grimberg
2016-02-16 13:03         ` Ming Lei
2016-02-17  3:08       ` Elliott, Robert (Persistent Memory)
2016-02-18  4:24       ` Kent Overstreet
2016-02-18  6:16         ` Ming Lei
2016-02-19  1:47           ` Ming Lei
2016-02-15  7:01 ` [PATCH 2/4] block: check virt boundary in bio_will_gap() Ming Lei
2016-02-15  8:22   ` Sagi Grimberg
2016-02-15 10:27     ` Ming Lei
2016-02-15 20:27       ` Sagi Grimberg
2016-02-16 13:05         ` Ming Lei
2016-02-16 13:08         ` Ming Lei
2016-02-15  8:33 ` [PATCH 0/4] block: fix bio_will_gap() Sagi Grimberg

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.