linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] bcache: dont reset bio opf in bch_data_insert_start
@ 2021-01-25  4:29 Dongsheng Yang
  2021-01-25  4:53 ` Coly Li
  2021-01-27 17:37 ` Christoph Hellwig
  0 siblings, 2 replies; 8+ messages in thread
From: Dongsheng Yang @ 2021-01-25  4:29 UTC (permalink / raw)
  To: colyli, linux-bcache, linux-kernel, mchristi; +Cc: Dongsheng Yang

commit ad0d9e76(bcache: use bio op accessors) makes the bi_opf
modified by bio_set_op_attrs(). But there is a logical
problem in this commit:

                trace_bcache_cache_insert(k);
                bch_keylist_push(&op->insert_keys);

-               n->bi_rw |= REQ_WRITE;
+               bio_set_op_attrs(n, REQ_OP_WRITE, 0);
                bch_submit_bbio(n, op->c, k, 0);
        } while (n != bio);

The old code add REQ_WRITE into bio n and keep other flags; the
new code set REQ_OP_WRITE to bi_opf, but reset all other flags.

This problem is discoverd in our performance testing:
(1) start a fio with 1M x 128depth for read in /dev/nvme0n1p1
(2) start a fio with 1M x 128depth for write in /dev/escache0 (cache
device is /dev/nvme0n1p2)

We found the BW of reading is 2000+M/s, but the BW of writing is
0-100M/s. After some debugging, we found the problem is io submit in
writting is very slow.

bch_data_insert_start() insert a bio to /dev/nvme0n1p1, but as
cached_dev submit stack bio will be added into current->bio_list, and
return.Then __submit_bio_noacct() will submit the new bio in bio_list
into /dev/nvme0n1p1. This operation would be slow in
blk_mq_submit_bio() -> rq_qos_throttle(q, bio);

The rq_qos_throttle() will call wbt_should_throttle(),
static inline bool wbt_should_throttle(struct rq_wb *rwb, struct bio *bio)
{
        switch (bio_op(bio)) {
        case REQ_OP_WRITE:
                /*
                 * Don't throttle WRITE_ODIRECT
                 */
                if ((bio->bi_opf & (REQ_SYNC | REQ_IDLE)) ==
                    (REQ_SYNC | REQ_IDLE))
                        return false;
... ...
}

As the bio_set_op_attrs() reset the (REQ_SYNC | REQ_IDLE), so this write
bio will be considered as non-direct write.

After this fix, bio to nvme will flaged as (REQ_SYNC | REQ_IDLE),
then fio for writing will get about 1000M/s bandwidth.

Fixes: ad0d9e76a4124708dddd00c04fc4b56fc86c02d6
Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
---
 drivers/md/bcache/request.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index c7cadaafa947..eb734f7ddaac 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -244,7 +244,7 @@ static void bch_data_insert_start(struct closure *cl)
 		trace_bcache_cache_insert(k);
 		bch_keylist_push(&op->insert_keys);
 
-		bio_set_op_attrs(n, REQ_OP_WRITE, 0);
+		n->bi_opf |= REQ_OP_WRITE;
 		bch_submit_bbio(n, op->c, k, 0);
 	} while (n != bio);
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] bcache: dont reset bio opf in bch_data_insert_start
  2021-01-25  4:29 [PATCH] bcache: dont reset bio opf in bch_data_insert_start Dongsheng Yang
@ 2021-01-25  4:53 ` Coly Li
  2021-01-26  4:32   ` Dongsheng Yang
  2021-01-27 17:37 ` Christoph Hellwig
  1 sibling, 1 reply; 8+ messages in thread
From: Coly Li @ 2021-01-25  4:53 UTC (permalink / raw)
  To: Dongsheng Yang, linux-bcache, linux-kernel, mchristi

On 1/25/21 12:29 PM, Dongsheng Yang wrote:
> commit ad0d9e76(bcache: use bio op accessors) makes the bi_opf
> modified by bio_set_op_attrs(). But there is a logical
> problem in this commit:
> 
>                 trace_bcache_cache_insert(k);
>                 bch_keylist_push(&op->insert_keys);
> 
> -               n->bi_rw |= REQ_WRITE;
> +               bio_set_op_attrs(n, REQ_OP_WRITE, 0);
>                 bch_submit_bbio(n, op->c, k, 0);
>         } while (n != bio);
> 
> The old code add REQ_WRITE into bio n and keep other flags; the
> new code set REQ_OP_WRITE to bi_opf, but reset all other flags.
> 
> This problem is discoverd in our performance testing:
> (1) start a fio with 1M x 128depth for read in /dev/nvme0n1p1
> (2) start a fio with 1M x 128depth for write in /dev/escache0 (cache
> device is /dev/nvme0n1p2)
> 
> We found the BW of reading is 2000+M/s, but the BW of writing is
> 0-100M/s. After some debugging, we found the problem is io submit in
> writting is very slow.
> 
> bch_data_insert_start() insert a bio to /dev/nvme0n1p1, but as
> cached_dev submit stack bio will be added into current->bio_list, and
> return.Then __submit_bio_noacct() will submit the new bio in bio_list
> into /dev/nvme0n1p1. This operation would be slow in
> blk_mq_submit_bio() -> rq_qos_throttle(q, bio);
> 
> The rq_qos_throttle() will call wbt_should_throttle(),
> static inline bool wbt_should_throttle(struct rq_wb *rwb, struct bio *bio)
> {
>         switch (bio_op(bio)) {
>         case REQ_OP_WRITE:
>                 /*
>                  * Don't throttle WRITE_ODIRECT
>                  */
>                 if ((bio->bi_opf & (REQ_SYNC | REQ_IDLE)) ==
>                     (REQ_SYNC | REQ_IDLE))
>                         return false;
> ... ...
> }
> 
> As the bio_set_op_attrs() reset the (REQ_SYNC | REQ_IDLE), so this write
> bio will be considered as non-direct write.
> 
> After this fix, bio to nvme will flaged as (REQ_SYNC | REQ_IDLE),
> then fio for writing will get about 1000M/s bandwidth.
> 
> Fixes: ad0d9e76a4124708dddd00c04fc4b56fc86c02d6

It should be,
Fixes: ad0d9e76a412 ("bcache: use bio op accessors")

> Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>

Please CC the fixed patch author  Mike Christie <mchristi@redhat.com>.

> ---
>  drivers/md/bcache/request.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
> index c7cadaafa947..eb734f7ddaac 100644
> --- a/drivers/md/bcache/request.c
> +++ b/drivers/md/bcache/request.c
> @@ -244,7 +244,7 @@ static void bch_data_insert_start(struct closure *cl)
>  		trace_bcache_cache_insert(k);
>  		bch_keylist_push(&op->insert_keys);
>  
> -		bio_set_op_attrs(n, REQ_OP_WRITE, 0);
> +		n->bi_opf |= REQ_OP_WRITE;
>  		bch_submit_bbio(n, op->c, k, 0);
>  	} while (n != bio);

The fix is OK to me, I'd like to see opinion from Mike Christie too.

Thanks for the catch.

Coly Li

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] bcache: dont reset bio opf in bch_data_insert_start
  2021-01-25  4:53 ` Coly Li
@ 2021-01-26  4:32   ` Dongsheng Yang
  2021-01-26  4:34     ` Coly Li
  0 siblings, 1 reply; 8+ messages in thread
From: Dongsheng Yang @ 2021-01-26  4:32 UTC (permalink / raw)
  To: Coly Li, linux-bcache, linux-kernel, mchristi


在 2021/1/25 星期一 下午 12:53, Coly Li 写道:
> On 1/25/21 12:29 PM, Dongsheng Yang wrote:
>> commit ad0d9e76(bcache: use bio op accessors) makes the bi_opf
>> modified by bio_set_op_attrs(). But there is a logical
>> problem in this commit:
>>
>>                  trace_bcache_cache_insert(k);
>>                  bch_keylist_push(&op->insert_keys);
>>
>> -               n->bi_rw |= REQ_WRITE;
>> +               bio_set_op_attrs(n, REQ_OP_WRITE, 0);
>>                  bch_submit_bbio(n, op->c, k, 0);
>>          } while (n != bio);
>>
>> The old code add REQ_WRITE into bio n and keep other flags; the
>> new code set REQ_OP_WRITE to bi_opf, but reset all other flags.
>>
>> This problem is discoverd in our performance testing:
>> (1) start a fio with 1M x 128depth for read in /dev/nvme0n1p1
>> (2) start a fio with 1M x 128depth for write in /dev/escache0 (cache
>> device is /dev/nvme0n1p2)
>>
>> We found the BW of reading is 2000+M/s, but the BW of writing is
>> 0-100M/s. After some debugging, we found the problem is io submit in
>> writting is very slow.
>>
>> bch_data_insert_start() insert a bio to /dev/nvme0n1p1, but as
>> cached_dev submit stack bio will be added into current->bio_list, and
>> return.Then __submit_bio_noacct() will submit the new bio in bio_list
>> into /dev/nvme0n1p1. This operation would be slow in
>> blk_mq_submit_bio() -> rq_qos_throttle(q, bio);
>>
>> The rq_qos_throttle() will call wbt_should_throttle(),
>> static inline bool wbt_should_throttle(struct rq_wb *rwb, struct bio *bio)
>> {
>>          switch (bio_op(bio)) {
>>          case REQ_OP_WRITE:
>>                  /*
>>                   * Don't throttle WRITE_ODIRECT
>>                   */
>>                  if ((bio->bi_opf & (REQ_SYNC | REQ_IDLE)) ==
>>                      (REQ_SYNC | REQ_IDLE))
>>                          return false;
>> ... ...
>> }
>>
>> As the bio_set_op_attrs() reset the (REQ_SYNC | REQ_IDLE), so this write
>> bio will be considered as non-direct write.
>>
>> After this fix, bio to nvme will flaged as (REQ_SYNC | REQ_IDLE),
>> then fio for writing will get about 1000M/s bandwidth.
>>
>> Fixes: ad0d9e76a4124708dddd00c04fc4b56fc86c02d6
> It should be,
> Fixes: ad0d9e76a412 ("bcache: use bio op accessors")
>
>> Signed-off-by: Dongsheng Yang<dongsheng.yang@easystack.cn>
> Please CC the fixed patch author  Mike Christie<mchristi@redhat.com>.


Hi Coly,

     Should I send a V2 for commit message update?

Or you can help to fix it when you take it from maillist?


Thanx

Yang

>> ---
>>   drivers/md/bcache/request.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
>> index c7cadaafa947..eb734f7ddaac 100644
>> --- a/drivers/md/bcache/request.c
>> +++ b/drivers/md/bcache/request.c
>> @@ -244,7 +244,7 @@ static void bch_data_insert_start(struct closure *cl)
>>   		trace_bcache_cache_insert(k);
>>   		bch_keylist_push(&op->insert_keys);
>>   
>> -		bio_set_op_attrs(n, REQ_OP_WRITE, 0);
>> +		n->bi_opf |= REQ_OP_WRITE;
>>   		bch_submit_bbio(n, op->c, k, 0);
>>   	} while (n != bio);
> The fix is OK to me, I'd like to see opinion from Mike Christie too.
>
> Thanks for the catch.
>
> Coly Li

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] bcache: dont reset bio opf in bch_data_insert_start
  2021-01-26  4:32   ` Dongsheng Yang
@ 2021-01-26  4:34     ` Coly Li
  2021-01-26  4:41       ` Dongsheng Yang
  0 siblings, 1 reply; 8+ messages in thread
From: Coly Li @ 2021-01-26  4:34 UTC (permalink / raw)
  To: Dongsheng Yang; +Cc: linux-bcache, linux-kernel, mchristi

On 1/26/21 12:32 PM, Dongsheng Yang wrote:
> 
> 在 2021/1/25 星期一 下午 12:53, Coly Li 写道:
>> On 1/25/21 12:29 PM, Dongsheng Yang wrote:
>>> commit ad0d9e76(bcache: use bio op accessors) makes the bi_opf
>>> modified by bio_set_op_attrs(). But there is a logical
>>> problem in this commit:
>>>
>>>                  trace_bcache_cache_insert(k);
>>>                  bch_keylist_push(&op->insert_keys);
>>>
>>> -               n->bi_rw |= REQ_WRITE;
>>> +               bio_set_op_attrs(n, REQ_OP_WRITE, 0);
>>>                  bch_submit_bbio(n, op->c, k, 0);
>>>          } while (n != bio);
>>>
>>> The old code add REQ_WRITE into bio n and keep other flags; the
>>> new code set REQ_OP_WRITE to bi_opf, but reset all other flags.
>>>
>>> This problem is discoverd in our performance testing:
>>> (1) start a fio with 1M x 128depth for read in /dev/nvme0n1p1
>>> (2) start a fio with 1M x 128depth for write in /dev/escache0 (cache
>>> device is /dev/nvme0n1p2)
>>>
>>> We found the BW of reading is 2000+M/s, but the BW of writing is
>>> 0-100M/s. After some debugging, we found the problem is io submit in
>>> writting is very slow.
>>>
>>> bch_data_insert_start() insert a bio to /dev/nvme0n1p1, but as
>>> cached_dev submit stack bio will be added into current->bio_list, and
>>> return.Then __submit_bio_noacct() will submit the new bio in bio_list
>>> into /dev/nvme0n1p1. This operation would be slow in
>>> blk_mq_submit_bio() -> rq_qos_throttle(q, bio);
>>>
>>> The rq_qos_throttle() will call wbt_should_throttle(),
>>> static inline bool wbt_should_throttle(struct rq_wb *rwb, struct bio
>>> *bio)
>>> {
>>>          switch (bio_op(bio)) {
>>>          case REQ_OP_WRITE:
>>>                  /*
>>>                   * Don't throttle WRITE_ODIRECT
>>>                   */
>>>                  if ((bio->bi_opf & (REQ_SYNC | REQ_IDLE)) ==
>>>                      (REQ_SYNC | REQ_IDLE))
>>>                          return false;
>>> ... ...
>>> }
>>>
>>> As the bio_set_op_attrs() reset the (REQ_SYNC | REQ_IDLE), so this write
>>> bio will be considered as non-direct write.
>>>
>>> After this fix, bio to nvme will flaged as (REQ_SYNC | REQ_IDLE),
>>> then fio for writing will get about 1000M/s bandwidth.
>>>
>>> Fixes: ad0d9e76a4124708dddd00c04fc4b56fc86c02d6
>> It should be,
>> Fixes: ad0d9e76a412 ("bcache: use bio op accessors")
>>
>>> Signed-off-by: Dongsheng Yang<dongsheng.yang@easystack.cn>
>> Please CC the fixed patch author  Mike Christie<mchristi@redhat.com>.
> 
> 
> Hi Coly,
> 
>     Should I send a V2 for commit message update?
> 
> Or you can help to fix it when you take it from maillist?
> 

Yes, please fix it in v2 version. And let's wait for response from Mike,
maybe he has better suggestion to fix.

Thanks.

Coly Li


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] bcache: dont reset bio opf in bch_data_insert_start
  2021-01-26  4:34     ` Coly Li
@ 2021-01-26  4:41       ` Dongsheng Yang
  0 siblings, 0 replies; 8+ messages in thread
From: Dongsheng Yang @ 2021-01-26  4:41 UTC (permalink / raw)
  To: Coly Li; +Cc: linux-bcache, linux-kernel, mchristi


在 2021/1/26 星期二 下午 12:34, Coly Li 写道:
> On 1/26/21 12:32 PM, Dongsheng Yang wrote:
>> 在 2021/1/25 星期一 下午 12:53, Coly Li 写道:
>>> On 1/25/21 12:29 PM, Dongsheng Yang wrote:
>>>> commit ad0d9e76(bcache: use bio op accessors) makes the bi_opf
>>>> modified by bio_set_op_attrs(). But there is a logical
>>>> problem in this commit:
>>>>
>>>>                   trace_bcache_cache_insert(k);
>>>>                   bch_keylist_push(&op->insert_keys);
>>>>
>>>> -               n->bi_rw |= REQ_WRITE;
>>>> +               bio_set_op_attrs(n, REQ_OP_WRITE, 0);
>>>>                   bch_submit_bbio(n, op->c, k, 0);
>>>>           } while (n != bio);
>>>>
>>>> The old code add REQ_WRITE into bio n and keep other flags; the
>>>> new code set REQ_OP_WRITE to bi_opf, but reset all other flags.
>>>>
>>>> This problem is discoverd in our performance testing:
>>>> (1) start a fio with 1M x 128depth for read in /dev/nvme0n1p1
>>>> (2) start a fio with 1M x 128depth for write in /dev/escache0 (cache
>>>> device is /dev/nvme0n1p2)
>>>>
>>>> We found the BW of reading is 2000+M/s, but the BW of writing is
>>>> 0-100M/s. After some debugging, we found the problem is io submit in
>>>> writting is very slow.
>>>>
>>>> bch_data_insert_start() insert a bio to /dev/nvme0n1p1, but as
>>>> cached_dev submit stack bio will be added into current->bio_list, and
>>>> return.Then __submit_bio_noacct() will submit the new bio in bio_list
>>>> into /dev/nvme0n1p1. This operation would be slow in
>>>> blk_mq_submit_bio() -> rq_qos_throttle(q, bio);
>>>>
>>>> The rq_qos_throttle() will call wbt_should_throttle(),
>>>> static inline bool wbt_should_throttle(struct rq_wb *rwb, struct bio
>>>> *bio)
>>>> {
>>>>           switch (bio_op(bio)) {
>>>>           case REQ_OP_WRITE:
>>>>                   /*
>>>>                    * Don't throttle WRITE_ODIRECT
>>>>                    */
>>>>                   if ((bio->bi_opf & (REQ_SYNC | REQ_IDLE)) ==
>>>>                       (REQ_SYNC | REQ_IDLE))
>>>>                           return false;
>>>> ... ...
>>>> }
>>>>
>>>> As the bio_set_op_attrs() reset the (REQ_SYNC | REQ_IDLE), so this write
>>>> bio will be considered as non-direct write.
>>>>
>>>> After this fix, bio to nvme will flaged as (REQ_SYNC | REQ_IDLE),
>>>> then fio for writing will get about 1000M/s bandwidth.
>>>>
>>>> Fixes: ad0d9e76a4124708dddd00c04fc4b56fc86c02d6
>>> It should be,
>>> Fixes: ad0d9e76a412 ("bcache: use bio op accessors")
>>>
>>>> Signed-off-by: Dongsheng Yang<dongsheng.yang@easystack.cn>
>>> Please CC the fixed patch author  Mike Christie<mchristi@redhat.com>.
>>
>> Hi Coly,
>>
>>      Should I send a V2 for commit message update?
>>
>> Or you can help to fix it when you take it from maillist?
>>
> Yes, please fix it in v2 version. And let's wait for response from Mike,
> maybe he has better suggestion to fix.


okey,actually, Mike is in my cc list of first mail (but not note in 
commit message), so he can receive my patch.

But anyway, I will send a v2

>
> Thanks.
>
> Coly Li
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] bcache: dont reset bio opf in bch_data_insert_start
  2021-01-25  4:29 [PATCH] bcache: dont reset bio opf in bch_data_insert_start Dongsheng Yang
  2021-01-25  4:53 ` Coly Li
@ 2021-01-27 17:37 ` Christoph Hellwig
  2021-01-28  9:10   ` Dongsheng Yang
  1 sibling, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2021-01-27 17:37 UTC (permalink / raw)
  To: Dongsheng Yang; +Cc: colyli, linux-bcache, linux-kernel, mchristi

But the old code is also completely broken.  We can't just OR in
the op, as that implicitly assumes the old op was 0 (REQ_OP_READ).

Please fix this to explicitly set the exact op and flags that you want
instead of this fragile magic.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] bcache: dont reset bio opf in bch_data_insert_start
  2021-01-27 17:37 ` Christoph Hellwig
@ 2021-01-28  9:10   ` Dongsheng Yang
  2021-01-28 10:45     ` Dongsheng Yang
  0 siblings, 1 reply; 8+ messages in thread
From: Dongsheng Yang @ 2021-01-28  9:10 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: colyli, linux-bcache, linux-kernel, mchristi

Hi Christop:

在 2021/1/28 星期四 上午 1:37, Christoph Hellwig 写道:
> But the old code is also completely broken.  We can't just OR in
> the op, as that implicitly assumes the old op was 0 (REQ_OP_READ).


Yes, indeed, there is an assume that the op is just possible to be 0 
(REQ_OP_READ) or 1 (REQ_OP_WRITE).

REQ_OP_WRITE is from cached_dev_submit_bio() which would be submitted by 
upper user.

REQ_OP_READ is from bcache itself, such as cached_dev_read_done() (when 
we found cache miss, we will read

data from backing and then we want to insert it into cache device. then 
there is a read bio with data reach here, we

need to set the bio_op to REQ_OP_WRITE, and send this bio to cache device).

> Please fix this to explicitly set the exact op and flags that you want
> instead of this fragile magic.blk_rq_map_kern

This commit only want to fix the logic bug introduced in ad0d9e76a412 
("bcache: use bio op accessors"),

that's more likely a partial revert.


I agree that we can make it more clearly and explicitly.

But I found there is no accessor to set op only, besides, the 
bio_set_op_attrs() was marked as obsolete.

There are some others doing similar things as below:

blk_rq_map_kern():

bio->bi_opf &= ~REQ_OP_MASK;

bio->bi_opf |= req_op(rq);


So what about below:

diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index c7cadaafa947..bacc7366002f 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -244,7 +244,14 @@ static void bch_data_insert_start(struct closure *cl)
                 trace_bcache_cache_insert(k);
                 bch_keylist_push(&op->insert_keys);

-               bio_set_op_attrs(n, REQ_OP_WRITE, 0);
+               /*
+                * n here would be REQ_OP_READ, if
+                * we are inserting data read from
+                * backing device in cache miss or
+                * inserting data in movinggc.
+                */
+               n->bi_opf &= ~REQ_OP_MASK;
+               n->bi_opf |= REQ_OP_WRITE;
                 bch_submit_bbio(n, op->c, k, 0);
         } while (n != bio);


Thanx

Yang


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] bcache: dont reset bio opf in bch_data_insert_start
  2021-01-28  9:10   ` Dongsheng Yang
@ 2021-01-28 10:45     ` Dongsheng Yang
  0 siblings, 0 replies; 8+ messages in thread
From: Dongsheng Yang @ 2021-01-28 10:45 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: colyli, linux-bcache, linux-kernel, mchristi

Hi Christoph,

在 2021/1/28 星期四 下午 5:10, Dongsheng Yang 写道:
> Hi Christop:
> 
> 在 2021/1/28 星期四 上午 1:37, Christoph Hellwig 写道:
>> But the old code is also completely broken.  We can't just OR in
>> the op, as that implicitly assumes the old op was 0 (REQ_OP_READ).
> 
> 
> Yes, indeed, there is an assume that the op is just possible to be 0 
> (REQ_OP_READ) or 1 (REQ_OP_WRITE).
> 
> REQ_OP_WRITE is from cached_dev_submit_bio() which would be submitted by 
> upper user.
> 
> REQ_OP_READ is from bcache itself, such as cached_dev_read_done() (when 
> we found cache miss, we will read
> 
> data from backing and then we want to insert it into cache device. then 
> there is a read bio with data reach here, we
> 
> need to set the bio_op to REQ_OP_WRITE, and send this bio to cache device).
> 
>> Please fix this to explicitly set the exact op and flags that you want
>> instead of this fragile magic.blk_rq_map_kern
> 
> This commit only want to fix the logic bug introduced in ad0d9e76a412 
> ("bcache: use bio op accessors"),
> 
> that's more likely a partial revert.
> 
> 
> I agree that we can make it more clearly and explicitly.
> 
> But I found there is no accessor to set op only, besides, the 
> bio_set_op_attrs() was marked as obsolete.
> 
> There are some others doing similar things as below:
> 
> blk_rq_map_kern():
> 
> bio->bi_opf &= ~REQ_OP_MASK;
> 
> bio->bi_opf |= req_op(rq);
> 
> 
> So what about below:
> 
> diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
> index c7cadaafa947..bacc7366002f 100644
> --- a/drivers/md/bcache/request.c
> +++ b/drivers/md/bcache/request.c
> @@ -244,7 +244,14 @@ static void bch_data_insert_start(struct closure *cl)
>                  trace_bcache_cache_insert(k);
>                  bch_keylist_push(&op->insert_keys);
> 
> -               bio_set_op_attrs(n, REQ_OP_WRITE, 0);
> +               /*
> +                * n here would be REQ_OP_READ, if
> +                * we are inserting data read from
> +                * backing device in cache miss or
> +                * inserting data in movinggc.
> +                */
> +               n->bi_opf &= ~REQ_OP_MASK;
> +               n->bi_opf |= REQ_OP_WRITE;
>                  bch_submit_bbio(n, op->c, k, 0);
>          } while (n != bio);

Another solution is introducing an accessor to set op only, something 
like bio_set_op(). Then we should keep the bcache patch as what it was 
to fix the bug.

And send another patch to introduce bio_set_op():

diff --git a/block/blk-map.c b/block/blk-map.c
index 6e804892d5ec..83bc33a59fa5 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -587,9 +587,7 @@ static int __blk_rq_map_user_iov(struct request *rq,
         if (IS_ERR(bio))
                 return PTR_ERR(bio);

-       bio->bi_opf &= ~REQ_OP_MASK;
-       bio->bi_opf |= req_op(rq);
-
+       bio_set_op(bio, req_op(rq));
         orig_bio = bio;

         /*
diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index eb734f7ddaac..d8839300805e 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -244,7 +244,13 @@ static void bch_data_insert_start(struct closure *cl)
                 trace_bcache_cache_insert(k);
                 bch_keylist_push(&op->insert_keys);

-               n->bi_opf |= REQ_OP_WRITE;
+               /*
+                * n here would be REQ_OP_READ, if
+                * we are inserting data read from
+                * backing device in cache miss or
+                * inserting data in movinggc.
+                */
+               bio_set_op(n, REQ_OP_WRITE);
                 bch_submit_bbio(n, op->c, k, 0);
         } while (n != bio);

diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index b3fc5d3dd8ea..2affd3269bdc 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -439,6 +439,12 @@ static inline void bio_set_op_attrs(struct bio 
*bio, unsigned op,
         bio->bi_opf = op | op_flags;
  }

+static inline void bio_set_op(struct bio *bio, unsigned op)
+{
+       bio->bi_opf &= ~REQ_OP_MASK;
+       bio->bi_opf |= op;
+}
+
  static inline bool op_is_write(unsigned int op)
  {
         return (op & 1);

> 
> 
> Thanx
> 
> Yang
> 

^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-01-28 11:12 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-25  4:29 [PATCH] bcache: dont reset bio opf in bch_data_insert_start Dongsheng Yang
2021-01-25  4:53 ` Coly Li
2021-01-26  4:32   ` Dongsheng Yang
2021-01-26  4:34     ` Coly Li
2021-01-26  4:41       ` Dongsheng Yang
2021-01-27 17:37 ` Christoph Hellwig
2021-01-28  9:10   ` Dongsheng Yang
2021-01-28 10:45     ` Dongsheng Yang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).