All of lore.kernel.org
 help / color / mirror / Atom feed
From: JeffleXu <jefflexu@linux.alibaba.com>
To: Mike Snitzer <snitzer@redhat.com>
Cc: axboe@kernel.dk, caspar@linux.alibaba.com,
	linux-block@vger.kernel.org, joseph.qi@linux.alibaba.com,
	dm-devel@redhat.com, mpatocka@redhat.com,
	io-uring@vger.kernel.org
Subject: Re: [dm-devel] [PATCH v5 10/12] block: fastpath for bio-based polling
Date: Fri, 12 Mar 2021 10:26:24 +0800	[thread overview]
Message-ID: <99a31346-0720-d586-4e9f-ba6ee855f9b5@linux.alibaba.com> (raw)
In-Reply-To: <b6f0799a-1a11-3778-9b8a-702c3c103db5@linux.alibaba.com>



On 3/11/21 2:36 PM, JeffleXu wrote:
> 
> 
> On 3/11/21 7:18 AM, Mike Snitzer wrote:
>> On Wed, Mar 03 2021 at  6:57am -0500,
>> Jeffle Xu <jefflexu@linux.alibaba.com> wrote:
>>
>>> Offer one fastpath for bio-based polling when bio submitted to dm
>>> device is not split.
>>>
>>> In this case, there will be only one bio submitted to only one polling
>>> hw queue of one underlying mq device, and thus we don't need to track
>>> all split bios or iterate through all polling hw queues. The pointer to
>>> the polling hw queue the bio submitted to is returned here as the
>>> returned cookie. In this case, the polling routine will call
>>> mq_ops->poll() directly with the hw queue converted from the input
>>> cookie.
>>>
>>> If the original bio submitted to dm device is split to multiple bios and
>>> thus submitted to multiple polling hw queues, the polling routine will
>>> fall back to iterating all hw queues (in polling mode) of all underlying
>>> mq devices.
>>>
>>> Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
>>> ---
>>>  block/blk-core.c          | 73 +++++++++++++++++++++++++++++++++++++--
>>>  include/linux/blk_types.h | 66 +++++++++++++++++++++++++++++++++--
>>>  include/linux/types.h     |  2 +-
>>>  3 files changed, 135 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>> index 6d7d53030d7c..e5cd4ff08f5c 100644
>>> --- a/block/blk-core.c
>>> +++ b/block/blk-core.c
>>> @@ -947,14 +947,22 @@ static blk_qc_t __submit_bio_noacct(struct bio *bio)
>>>  {
>>>  	struct bio_list bio_list_on_stack[2];
>>>  	blk_qc_t ret = BLK_QC_T_NONE;
>>> +	struct request_queue *top_q;
>>> +	bool poll_on;
>>>  
>>>  	BUG_ON(bio->bi_next);
>>>  
>>>  	bio_list_init(&bio_list_on_stack[0]);
>>>  	current->bio_list = bio_list_on_stack;
>>>  
>>> +	top_q = bio->bi_bdev->bd_disk->queue;
>>> +	poll_on = test_bit(QUEUE_FLAG_POLL, &top_q->queue_flags) &&
>>> +		  (bio->bi_opf & REQ_HIPRI);
>>> +
>>>  	do {
>>> -		struct request_queue *q = bio->bi_bdev->bd_disk->queue;
>>> +		blk_qc_t cookie;
>>> +		struct block_device *bdev = bio->bi_bdev;
>>> +		struct request_queue *q = bdev->bd_disk->queue;
>>>  		struct bio_list lower, same;
>>>  
>>>  		if (unlikely(bio_queue_enter(bio) != 0))
>>> @@ -966,7 +974,23 @@ static blk_qc_t __submit_bio_noacct(struct bio *bio)
>>>  		bio_list_on_stack[1] = bio_list_on_stack[0];
>>>  		bio_list_init(&bio_list_on_stack[0]);
>>>  
>>> -		ret = __submit_bio(bio);
>>> +		cookie = __submit_bio(bio);
>>> +
>>> +		if (poll_on && blk_qc_t_valid(cookie)) {
>>> +			unsigned int queue_num = blk_qc_t_to_queue_num(cookie);
>>> +			unsigned int devt = bdev_whole(bdev)->bd_dev;
>>> +
>>> +			cookie = blk_qc_t_get_by_devt(devt, queue_num);
>>
>> The need to rebuild the cookie here is pretty awkward.  This
>> optimization living in block core may be worthwhile but the duality of
>> block core conditionally overriding the driver's returned cookie (that
>> is meant to be opaque to upper layer) is not great.
> 
> I agree that currently this design pattern has caused blk-core and dm
> being tightly coupled together. Maybe it's the most serous problem of
> this patchset, personally.
> 
> I can explain it though... Since the nature of the IO path of dm, dm
> itself doesn't know if the original bio be split to multiple split bios
> and thus submitted to multiple underlying devices or not. Currently I
> can only get this information in __submit_bio_noacct(), and that's why
> there's so much logic specific to dm is coupled with blk-core here.
> 

I didn't point out the most important part of that. dm can't get the
(really valid) cookie returned by mq. Suppose one dm device is built
upon one nvme, then dm_submit_bio() only remaps and the remapped bio is
actually *buffered* in the bio_list. In fact, he remapped bio is later
submitted in __submit_bio_noacct(). So dm doesn't know the cookie
(returned by underlying mq), while blk-core does.

dm could "predict" the cookie of following submitted IO to mq (dev_t and
index of hw queue), and return it (built by dev_t and hw queue index) in
advance, but this "prediction" is quite fragile, since the IO submitting
process could be migrated to another CPU, or the hw queue mapping of the
underlying mq device could change before __submit_bio_noacct() really
submits the IO.


-- 
Thanks,
Jeffle

WARNING: multiple messages have this Message-ID (diff)
From: JeffleXu <jefflexu@linux.alibaba.com>
To: Mike Snitzer <snitzer@redhat.com>
Cc: axboe@kernel.dk, caspar@linux.alibaba.com,
	linux-block@vger.kernel.org, joseph.qi@linux.alibaba.com,
	dm-devel@redhat.com, mpatocka@redhat.com,
	io-uring@vger.kernel.org
Subject: Re: [dm-devel] [PATCH v5 10/12] block: fastpath for bio-based polling
Date: Fri, 12 Mar 2021 10:26:24 +0800	[thread overview]
Message-ID: <99a31346-0720-d586-4e9f-ba6ee855f9b5@linux.alibaba.com> (raw)
In-Reply-To: <b6f0799a-1a11-3778-9b8a-702c3c103db5@linux.alibaba.com>



On 3/11/21 2:36 PM, JeffleXu wrote:
> 
> 
> On 3/11/21 7:18 AM, Mike Snitzer wrote:
>> On Wed, Mar 03 2021 at  6:57am -0500,
>> Jeffle Xu <jefflexu@linux.alibaba.com> wrote:
>>
>>> Offer one fastpath for bio-based polling when bio submitted to dm
>>> device is not split.
>>>
>>> In this case, there will be only one bio submitted to only one polling
>>> hw queue of one underlying mq device, and thus we don't need to track
>>> all split bios or iterate through all polling hw queues. The pointer to
>>> the polling hw queue the bio submitted to is returned here as the
>>> returned cookie. In this case, the polling routine will call
>>> mq_ops->poll() directly with the hw queue converted from the input
>>> cookie.
>>>
>>> If the original bio submitted to dm device is split to multiple bios and
>>> thus submitted to multiple polling hw queues, the polling routine will
>>> fall back to iterating all hw queues (in polling mode) of all underlying
>>> mq devices.
>>>
>>> Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
>>> ---
>>>  block/blk-core.c          | 73 +++++++++++++++++++++++++++++++++++++--
>>>  include/linux/blk_types.h | 66 +++++++++++++++++++++++++++++++++--
>>>  include/linux/types.h     |  2 +-
>>>  3 files changed, 135 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>> index 6d7d53030d7c..e5cd4ff08f5c 100644
>>> --- a/block/blk-core.c
>>> +++ b/block/blk-core.c
>>> @@ -947,14 +947,22 @@ static blk_qc_t __submit_bio_noacct(struct bio *bio)
>>>  {
>>>  	struct bio_list bio_list_on_stack[2];
>>>  	blk_qc_t ret = BLK_QC_T_NONE;
>>> +	struct request_queue *top_q;
>>> +	bool poll_on;
>>>  
>>>  	BUG_ON(bio->bi_next);
>>>  
>>>  	bio_list_init(&bio_list_on_stack[0]);
>>>  	current->bio_list = bio_list_on_stack;
>>>  
>>> +	top_q = bio->bi_bdev->bd_disk->queue;
>>> +	poll_on = test_bit(QUEUE_FLAG_POLL, &top_q->queue_flags) &&
>>> +		  (bio->bi_opf & REQ_HIPRI);
>>> +
>>>  	do {
>>> -		struct request_queue *q = bio->bi_bdev->bd_disk->queue;
>>> +		blk_qc_t cookie;
>>> +		struct block_device *bdev = bio->bi_bdev;
>>> +		struct request_queue *q = bdev->bd_disk->queue;
>>>  		struct bio_list lower, same;
>>>  
>>>  		if (unlikely(bio_queue_enter(bio) != 0))
>>> @@ -966,7 +974,23 @@ static blk_qc_t __submit_bio_noacct(struct bio *bio)
>>>  		bio_list_on_stack[1] = bio_list_on_stack[0];
>>>  		bio_list_init(&bio_list_on_stack[0]);
>>>  
>>> -		ret = __submit_bio(bio);
>>> +		cookie = __submit_bio(bio);
>>> +
>>> +		if (poll_on && blk_qc_t_valid(cookie)) {
>>> +			unsigned int queue_num = blk_qc_t_to_queue_num(cookie);
>>> +			unsigned int devt = bdev_whole(bdev)->bd_dev;
>>> +
>>> +			cookie = blk_qc_t_get_by_devt(devt, queue_num);
>>
>> The need to rebuild the cookie here is pretty awkward.  This
>> optimization living in block core may be worthwhile but the duality of
>> block core conditionally overriding the driver's returned cookie (that
>> is meant to be opaque to upper layer) is not great.
> 
> I agree that currently this design pattern has caused blk-core and dm
> being tightly coupled together. Maybe it's the most serous problem of
> this patchset, personally.
> 
> I can explain it though... Since the nature of the IO path of dm, dm
> itself doesn't know if the original bio be split to multiple split bios
> and thus submitted to multiple underlying devices or not. Currently I
> can only get this information in __submit_bio_noacct(), and that's why
> there's so much logic specific to dm is coupled with blk-core here.
> 

I didn't point out the most important part of that. dm can't get the
(really valid) cookie returned by mq. Suppose one dm device is built
upon one nvme, then dm_submit_bio() only remaps and the remapped bio is
actually *buffered* in the bio_list. In fact, he remapped bio is later
submitted in __submit_bio_noacct(). So dm doesn't know the cookie
(returned by underlying mq), while blk-core does.

dm could "predict" the cookie of following submitted IO to mq (dev_t and
index of hw queue), and return it (built by dev_t and hw queue index) in
advance, but this "prediction" is quite fragile, since the IO submitting
process could be migrated to another CPU, or the hw queue mapping of the
underlying mq device could change before __submit_bio_noacct() really
submits the IO.


-- 
Thanks,
Jeffle

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


  reply	other threads:[~2021-03-12  2:27 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-03 11:57 [PATCH v5 00/12] dm: support polling Jeffle Xu
2021-03-03 11:57 ` [dm-devel] " Jeffle Xu
2021-03-03 11:57 ` [PATCH v5 01/12] block: move definition of blk_qc_t to types.h Jeffle Xu
2021-03-03 11:57   ` [dm-devel] " Jeffle Xu
2021-03-03 11:57 ` [PATCH v5 02/12] block: add queue_to_disk() to get gendisk from request_queue Jeffle Xu
2021-03-03 11:57   ` [dm-devel] " Jeffle Xu
2021-03-03 11:57 ` [PATCH v5 03/12] block: add poll method to support bio-based IO polling Jeffle Xu
2021-03-03 11:57   ` [dm-devel] " Jeffle Xu
2021-03-10 22:01   ` Mike Snitzer
2021-03-10 22:01     ` [dm-devel] " Mike Snitzer
2021-03-11  5:31     ` JeffleXu
2021-03-11  5:31       ` JeffleXu
2021-03-03 11:57 ` [PATCH v5 04/12] block: add poll_capable " Jeffle Xu
2021-03-03 11:57   ` [dm-devel] " Jeffle Xu
2021-03-10 22:21   ` Mike Snitzer
2021-03-10 22:21     ` [dm-devel] " Mike Snitzer
2021-03-11  5:43     ` JeffleXu
2021-03-11  5:43       ` JeffleXu
2021-03-03 11:57 ` [PATCH v5 05/12] blk-mq: extract one helper function polling hw queue Jeffle Xu
2021-03-03 11:57   ` [dm-devel] " Jeffle Xu
2021-03-03 11:57 ` [PATCH v5 06/12] blk-mq: add iterator for polling hw queues Jeffle Xu
2021-03-03 11:57   ` [dm-devel] " Jeffle Xu
2021-03-03 11:57 ` [PATCH v5 07/12] blk-mq: add one helper function getting hw queue Jeffle Xu
2021-03-03 11:57   ` [dm-devel] " Jeffle Xu
2021-03-03 11:57 ` [PATCH v5 08/12] dm: always return BLK_QC_T_NONE for bio-based device Jeffle Xu
2021-03-03 11:57   ` [dm-devel] " Jeffle Xu
2021-03-03 11:57 ` [PATCH v5 09/12] nvme/pci: don't wait for locked polling queue Jeffle Xu
2021-03-03 11:57   ` [dm-devel] " Jeffle Xu
2021-03-10 21:57   ` Mike Snitzer
2021-03-10 21:57     ` [dm-devel] " Mike Snitzer
2021-03-03 11:57 ` [PATCH v5 10/12] block: fastpath for bio-based polling Jeffle Xu
2021-03-03 11:57   ` [dm-devel] " Jeffle Xu
2021-03-10 23:18   ` Mike Snitzer
2021-03-10 23:18     ` [dm-devel] " Mike Snitzer
2021-03-11  6:36     ` JeffleXu
2021-03-11  6:36       ` JeffleXu
2021-03-12  2:26       ` JeffleXu [this message]
2021-03-12  2:26         ` JeffleXu
2021-03-11 13:56   ` Ming Lei
2021-03-11 13:56     ` [dm-devel] " Ming Lei
2021-03-12  1:56     ` JeffleXu
2021-03-12  1:56       ` JeffleXu
2021-03-03 11:57 ` [PATCH v5 11/12] block: sub-fastpath " Jeffle Xu
2021-03-03 11:57   ` [dm-devel] " Jeffle Xu
2021-03-03 11:57 ` [PATCH v5 12/12] dm: support IO polling for bio-based dm device Jeffle Xu
2021-03-03 11:57   ` [dm-devel] " Jeffle Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=99a31346-0720-d586-4e9f-ba6ee855f9b5@linux.alibaba.com \
    --to=jefflexu@linux.alibaba.com \
    --cc=axboe@kernel.dk \
    --cc=caspar@linux.alibaba.com \
    --cc=dm-devel@redhat.com \
    --cc=io-uring@vger.kernel.org \
    --cc=joseph.qi@linux.alibaba.com \
    --cc=linux-block@vger.kernel.org \
    --cc=mpatocka@redhat.com \
    --cc=snitzer@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.