All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ming Lei <ming.lei@redhat.com>
To: JeffleXu <jefflexu@linux.alibaba.com>
Cc: snitzer@redhat.com, axboe@kernel.dk, linux-block@vger.kernel.org,
	dm-devel@redhat.com
Subject: Re: [PATCH] block: introduce QUEUE_FLAG_POLL_CAP flag
Date: Mon, 19 Apr 2021 21:36:18 +0800	[thread overview]
Message-ID: <YH2HUs8FV0pNDluk@T590> (raw)
In-Reply-To: <c0085cb4-2396-b0c2-c880-c6fa8fb7e491@linux.alibaba.com>

On Mon, Apr 19, 2021 at 01:40:21PM +0800, JeffleXu wrote:
> 
> 
> On 4/19/21 10:21 AM, Ming Lei wrote:
> > On Sat, Apr 17, 2021 at 10:06:53PM +0800, JeffleXu wrote:
> >>
> >>
> >> On 4/16/21 5:07 PM, Ming Lei wrote:
> >>> On Fri, Apr 16, 2021 at 04:00:37PM +0800, Jeffle Xu wrote:
> >>>> Hi,
> >>>> How about this patch to remove the extra poll_capable() method?
> >>>>
> >>>> And the following 'dm: support IO polling for bio-based dm device' needs
> >>>> following change.
> >>>>
> >>>> ```
> >>>> +       /*
> >>>> +        * Check for request-based device is remained to
> >>>> +        * dm_mq_init_request_queue()->blk_mq_init_allocated_queue().
> >>>> +        * For bio-based device, only set QUEUE_FLAG_POLL when all underlying
> >>>> +        * devices supporting polling.
> >>>> +        */
> >>>> +       if (__table_type_bio_based(t->type)) {
> >>>> +               if (dm_table_supports_poll(t)) {
> >>>> +                       blk_queue_flag_set(QUEUE_FLAG_POLL_CAP, q);
> >>>> +                       blk_queue_flag_set(QUEUE_FLAG_POLL, q);
> >>>> +               }
> >>>> +               else {
> >>>> +                       blk_queue_flag_clear(QUEUE_FLAG_POLL, q);
> >>>> +                       blk_queue_flag_clear(QUEUE_FLAG_POLL_CAP, q);
> >>>> +               }
> >>>> +       }
> >>>> ```
> >>>
> >>> Frankly speaking, I don't see any value of using QUEUE_FLAG_POLL_CAP for
> >>> DM, and the result is basically subset of treating DM as always being capable
> >>> of polling.
> >>>
> >>> Also underlying queue change(either limits or flag) won't be propagated
> >>> to DM/MD automatically. Strictly speaking it doesn't matter if all underlying
> >>> queues are capable of supporting polling at the exact time of 'write sysfs/poll',
> >>> cause any of them may change in future.
> >>>
> >>> So why not start with the simplest approach(always capable of polling)
> >>> which does meet normal bio based polling requirement?
> >>>
> >>
> >> I find one scenario where this issue may matter. Consider the scenario
> >> where HIPRI bios are submitted to DM device though **all** underlying
> >> devices has been disabled for polling. In this case, a **valid** cookie
> >> (pid of current submitting process) is still returned. Then if @spin of
> >> the following blk_poll() is true, blk_poll() will get stuck in dead loop
> >> because blk_mq_poll() always returns 0, since previously submitted bios
> >> are all enqueued into IRQ hw queue.
> >>
> >> Maybe you need to re-remove the bio from the poll context if the
> >> returned cookie is BLK_QC_T_NONE?
> > 
> > It won't be one issue, see blk_bio_poll_preprocess() which is called
> > from submit_bio_checks(), so any bio's HIPRI will be cleared if the
> > queue doesn't support POLL, that code does cover underlying bios.
> 
> Sorry there may be some confusion in my description. Let's discuss in
> the following scenario: MD/DM advertise QUEUE_FLAG_POLL, though **all**
> underlying devices are without QUEUE_FLAG_POLL. This scenario is
> possible, if you want to enable MD/DM's polling without checking the
> capability of underlying devices.
> 
> In this case, it seems that REQ_HIPRI is kept for both MD/DM and
> underlying blk-mq devices. I used to think that REQ_HIPRI will be
> cleared for underlying blk-mq deivces, but now it seems that REQ_HIPRI
> of bios submitted to underlying blk-mq deivces won't be cleared, since
> submit_bio_checks() is only called in the entry of submit_bio(), not in
> the while() loop of __submit_bio_noacct_ctx(). Though these underlying
> blk-mq devices don't support IO polling at all, or they all have been
> disabled for polling, REQ_HIPRI bios are finally submitted down.
> 
> Or do I miss something?

No matter the loop, the bios are actually submitted to the
current->bio_list via submit_bio_noacct() or submit_bio().
'grep -r submit_bio drivers/md' will show you the point.

Also it is a bug if one underlying bio is submitted without being checked.

You can observe it by the following bpftrace when you run io_uring on dm
disk:

#include <linux/blkdev.h>

kprobe:blk_mq_submit_bio
/strncmp(((struct bio *)arg0)->bi_bdev->bd_disk->disk_name, "nvme", 4) == 0/
{
	$b = (struct bio *)arg0;
	$hipri = $b->bi_opf & (1 << __REQ_HIPRI);

	printf("%s %d: %s %lu %lu high prio %d\n", comm, tid, $b->bi_bdev->bd_disk->disk_name,
		$b->bi_iter.bi_sector, $b->bi_iter.bi_size, $hipri);
}


> 
> 
> > 
> >>
> >>
> >> Something like:
> >>
> >> -static blk_qc_t __submit_bio_noacct(struct bio *bio)
> >> +static blk_qc_t __submit_bio_noacct_ctx(struct bio *bio, struct
> >> io_context *ioc)
> >>  {
> >>  	struct bio_list bio_list_on_stack[2];
> >>  	blk_qc_t ret = BLK_QC_T_NONE;
> >> @@ -1047,7 +1163,15 @@ static blk_qc_t __submit_bio_noacct(struct bio *bio)
> >>  		bio_list_on_stack[1] = bio_list_on_stack[0];
> >>  		bio_list_init(&bio_list_on_stack[0]);
> >>
> >> 		if (ioc && queue_is_mq(q) && (bio->bi_opf & REQ_HIPRI)) {
> > 
> > REQ_HIPRI won't be set for underlying bios which queue doesn't support
> > poll, so this branch won't be reached. 
> 
> Sorry I missed the '(bio->bi_opf & REQ_HIPRI)' condition here. Indeed
> bio without REQ_HIPRI won't be enqueued into the poll_context.

Even though these bios are queued, blk_poll() still can handle them
easily by just ignoring queues which aren't POLL_TYPE. However, I still
think their HIPRI will be cleared.

Thanks,
Ming


WARNING: multiple messages have this Message-ID (diff)
From: Ming Lei <ming.lei@redhat.com>
To: JeffleXu <jefflexu@linux.alibaba.com>
Cc: axboe@kernel.dk, linux-block@vger.kernel.org,
	dm-devel@redhat.com, snitzer@redhat.com
Subject: Re: [dm-devel] [PATCH] block: introduce QUEUE_FLAG_POLL_CAP flag
Date: Mon, 19 Apr 2021 21:36:18 +0800	[thread overview]
Message-ID: <YH2HUs8FV0pNDluk@T590> (raw)
In-Reply-To: <c0085cb4-2396-b0c2-c880-c6fa8fb7e491@linux.alibaba.com>

On Mon, Apr 19, 2021 at 01:40:21PM +0800, JeffleXu wrote:
> 
> 
> On 4/19/21 10:21 AM, Ming Lei wrote:
> > On Sat, Apr 17, 2021 at 10:06:53PM +0800, JeffleXu wrote:
> >>
> >>
> >> On 4/16/21 5:07 PM, Ming Lei wrote:
> >>> On Fri, Apr 16, 2021 at 04:00:37PM +0800, Jeffle Xu wrote:
> >>>> Hi,
> >>>> How about this patch to remove the extra poll_capable() method?
> >>>>
> >>>> And the following 'dm: support IO polling for bio-based dm device' needs
> >>>> following change.
> >>>>
> >>>> ```
> >>>> +       /*
> >>>> +        * Check for request-based device is remained to
> >>>> +        * dm_mq_init_request_queue()->blk_mq_init_allocated_queue().
> >>>> +        * For bio-based device, only set QUEUE_FLAG_POLL when all underlying
> >>>> +        * devices supporting polling.
> >>>> +        */
> >>>> +       if (__table_type_bio_based(t->type)) {
> >>>> +               if (dm_table_supports_poll(t)) {
> >>>> +                       blk_queue_flag_set(QUEUE_FLAG_POLL_CAP, q);
> >>>> +                       blk_queue_flag_set(QUEUE_FLAG_POLL, q);
> >>>> +               }
> >>>> +               else {
> >>>> +                       blk_queue_flag_clear(QUEUE_FLAG_POLL, q);
> >>>> +                       blk_queue_flag_clear(QUEUE_FLAG_POLL_CAP, q);
> >>>> +               }
> >>>> +       }
> >>>> ```
> >>>
> >>> Frankly speaking, I don't see any value of using QUEUE_FLAG_POLL_CAP for
> >>> DM, and the result is basically subset of treating DM as always being capable
> >>> of polling.
> >>>
> >>> Also underlying queue change(either limits or flag) won't be propagated
> >>> to DM/MD automatically. Strictly speaking it doesn't matter if all underlying
> >>> queues are capable of supporting polling at the exact time of 'write sysfs/poll',
> >>> cause any of them may change in future.
> >>>
> >>> So why not start with the simplest approach(always capable of polling)
> >>> which does meet normal bio based polling requirement?
> >>>
> >>
> >> I find one scenario where this issue may matter. Consider the scenario
> >> where HIPRI bios are submitted to DM device though **all** underlying
> >> devices has been disabled for polling. In this case, a **valid** cookie
> >> (pid of current submitting process) is still returned. Then if @spin of
> >> the following blk_poll() is true, blk_poll() will get stuck in dead loop
> >> because blk_mq_poll() always returns 0, since previously submitted bios
> >> are all enqueued into IRQ hw queue.
> >>
> >> Maybe you need to re-remove the bio from the poll context if the
> >> returned cookie is BLK_QC_T_NONE?
> > 
> > It won't be one issue, see blk_bio_poll_preprocess() which is called
> > from submit_bio_checks(), so any bio's HIPRI will be cleared if the
> > queue doesn't support POLL, that code does cover underlying bios.
> 
> Sorry there may be some confusion in my description. Let's discuss in
> the following scenario: MD/DM advertise QUEUE_FLAG_POLL, though **all**
> underlying devices are without QUEUE_FLAG_POLL. This scenario is
> possible, if you want to enable MD/DM's polling without checking the
> capability of underlying devices.
> 
> In this case, it seems that REQ_HIPRI is kept for both MD/DM and
> underlying blk-mq devices. I used to think that REQ_HIPRI will be
> cleared for underlying blk-mq deivces, but now it seems that REQ_HIPRI
> of bios submitted to underlying blk-mq deivces won't be cleared, since
> submit_bio_checks() is only called in the entry of submit_bio(), not in
> the while() loop of __submit_bio_noacct_ctx(). Though these underlying
> blk-mq devices don't support IO polling at all, or they all have been
> disabled for polling, REQ_HIPRI bios are finally submitted down.
> 
> Or do I miss something?

No matter the loop, the bios are actually submitted to the
current->bio_list via submit_bio_noacct() or submit_bio().
'grep -r submit_bio drivers/md' will show you the point.

Also it is a bug if one underlying bio is submitted without being checked.

You can observe it by the following bpftrace when you run io_uring on dm
disk:

#include <linux/blkdev.h>

kprobe:blk_mq_submit_bio
/strncmp(((struct bio *)arg0)->bi_bdev->bd_disk->disk_name, "nvme", 4) == 0/
{
	$b = (struct bio *)arg0;
	$hipri = $b->bi_opf & (1 << __REQ_HIPRI);

	printf("%s %d: %s %lu %lu high prio %d\n", comm, tid, $b->bi_bdev->bd_disk->disk_name,
		$b->bi_iter.bi_sector, $b->bi_iter.bi_size, $hipri);
}


> 
> 
> > 
> >>
> >>
> >> Something like:
> >>
> >> -static blk_qc_t __submit_bio_noacct(struct bio *bio)
> >> +static blk_qc_t __submit_bio_noacct_ctx(struct bio *bio, struct
> >> io_context *ioc)
> >>  {
> >>  	struct bio_list bio_list_on_stack[2];
> >>  	blk_qc_t ret = BLK_QC_T_NONE;
> >> @@ -1047,7 +1163,15 @@ static blk_qc_t __submit_bio_noacct(struct bio *bio)
> >>  		bio_list_on_stack[1] = bio_list_on_stack[0];
> >>  		bio_list_init(&bio_list_on_stack[0]);
> >>
> >> 		if (ioc && queue_is_mq(q) && (bio->bi_opf & REQ_HIPRI)) {
> > 
> > REQ_HIPRI won't be set for underlying bios which queue doesn't support
> > poll, so this branch won't be reached. 
> 
> Sorry I missed the '(bio->bi_opf & REQ_HIPRI)' condition here. Indeed
> bio without REQ_HIPRI won't be enqueued into the poll_context.

Even though these bios are queued, blk_poll() still can handle them
easily by just ignoring queues which aren't POLL_TYPE. However, I still
think their HIPRI will be cleared.

Thanks,
Ming

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


  reply	other threads:[~2021-04-19 13:37 UTC|newest]

Thread overview: 92+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-01  2:19 [PATCH V5 00/12] block: support bio based io polling Ming Lei
2021-04-01  2:19 ` [dm-devel] " Ming Lei
2021-04-01  2:19 ` [PATCH V5 01/12] block: add helper of blk_queue_poll Ming Lei
2021-04-01  2:19   ` [dm-devel] " Ming Lei
2021-04-01  2:19 ` [PATCH V5 02/12] block: add one helper to free io_context Ming Lei
2021-04-01  2:19   ` [dm-devel] " Ming Lei
2021-04-01  2:19 ` [PATCH V5 03/12] block: create io poll context for submission and poll task Ming Lei
2021-04-01  2:19   ` [dm-devel] " Ming Lei
2021-04-12 10:19   ` Christoph Hellwig
2021-04-12 10:19     ` [dm-devel] " Christoph Hellwig
2021-04-01  2:19 ` [PATCH V5 04/12] block: add req flag of REQ_POLL_CTX Ming Lei
2021-04-01  2:19   ` [dm-devel] " Ming Lei
2021-04-01  2:19 ` [PATCH V5 05/12] block: add new field into 'struct bvec_iter' Ming Lei
2021-04-01  2:19   ` [dm-devel] " Ming Lei
2021-04-12  9:26   ` Christoph Hellwig
2021-04-12  9:26     ` [dm-devel] " Christoph Hellwig
2021-04-13  9:36     ` Ming Lei
2021-04-13  9:36       ` [dm-devel] " Ming Lei
2021-04-01  2:19 ` [PATCH V5 06/12] block/mq: extract one helper function polling hw queue Ming Lei
2021-04-01  2:19   ` [dm-devel] " Ming Lei
2021-04-12  9:29   ` Christoph Hellwig
2021-04-12  9:29     ` [dm-devel] " Christoph Hellwig
2021-04-01  2:19 ` [PATCH V5 07/12] block: prepare for supporting bio_list via other link Ming Lei
2021-04-01  2:19   ` [dm-devel] " Ming Lei
2021-04-12 10:18   ` Christoph Hellwig
2021-04-12 10:18     ` [dm-devel] " Christoph Hellwig
2021-04-12 11:37     ` Ming Lei
2021-04-12 11:37       ` [dm-devel] " Ming Lei
2021-04-01  2:19 ` [PATCH V5 08/12] block: use per-task poll context to implement bio based io polling Ming Lei
2021-04-01  2:19   ` [dm-devel] " Ming Lei
2021-04-12  9:54   ` Christoph Hellwig
2021-04-12  9:54     ` [dm-devel] " Christoph Hellwig
2021-04-12 10:20     ` Ming Lei
2021-04-12 10:20       ` [dm-devel] " Ming Lei
2021-04-12 10:29       ` Christoph Hellwig
2021-04-12 10:29         ` [dm-devel] " Christoph Hellwig
2021-04-12 11:42         ` Ming Lei
2021-04-12 11:42           ` [dm-devel] " Ming Lei
2021-04-12 10:16   ` Christoph Hellwig
2021-04-12 10:16     ` [dm-devel] " Christoph Hellwig
2021-04-12 10:37     ` Ming Lei
2021-04-12 10:37       ` [dm-devel] " Ming Lei
2021-04-01  2:19 ` [PATCH V5 09/12] blk-mq: limit hw queues to be polled in each blk_poll() Ming Lei
2021-04-01  2:19   ` [dm-devel] " Ming Lei
2021-04-01  2:19 ` [PATCH V5 10/12] block: add queue_to_disk() to get gendisk from request_queue Ming Lei
2021-04-01  2:19   ` [dm-devel] " Ming Lei
2021-04-12 12:52   ` Jens Axboe
2021-04-12 12:52     ` [dm-devel] " Jens Axboe
2021-04-01  2:19 ` [PATCH V5 11/12] block: add poll_capable method to support bio-based IO polling Ming Lei
2021-04-01  2:19   ` [dm-devel] " Ming Lei
2021-04-12  9:38   ` Christoph Hellwig
2021-04-12  9:38     ` [dm-devel] " Christoph Hellwig
2021-04-14  8:38     ` JeffleXu
2021-04-14  8:38       ` [dm-devel] " JeffleXu
2021-04-14 11:24       ` Ming Lei
2021-04-14 11:24         ` [dm-devel] " Ming Lei
2021-04-15  1:34         ` JeffleXu
2021-04-15  1:34           ` [dm-devel] " JeffleXu
2021-04-15  7:43           ` Ming Lei
2021-04-15  7:43             ` [dm-devel] " Ming Lei
2021-04-15  9:21             ` JeffleXu
2021-04-15  9:21               ` [dm-devel] " JeffleXu
2021-04-15 10:06               ` Ming Lei
2021-04-15 10:06                 ` [dm-devel] " Ming Lei
2021-04-15 11:21                 ` JeffleXu
2021-04-15 11:21                   ` [dm-devel] " JeffleXu
2021-04-15 13:08                   ` Ming Lei
2021-04-15 13:08                     ` [dm-devel] " Ming Lei
2021-04-16  8:00   ` [PATCH] block: introduce QUEUE_FLAG_POLL_CAP flag Jeffle Xu
2021-04-16  8:00     ` [dm-devel] " Jeffle Xu
2021-04-16  8:42     ` JeffleXu
2021-04-16  8:42       ` [dm-devel] " JeffleXu
2021-04-16  9:07     ` Ming Lei
2021-04-16  9:07       ` [dm-devel] " Ming Lei
2021-04-16 10:20       ` JeffleXu
2021-04-16 10:20         ` [dm-devel] " JeffleXu
2021-04-17 14:06       ` JeffleXu
2021-04-17 14:06         ` [dm-devel] " JeffleXu
2021-04-19  2:21         ` Ming Lei
2021-04-19  2:21           ` [dm-devel] " Ming Lei
2021-04-19  5:40           ` JeffleXu
2021-04-19  5:40             ` [dm-devel] " JeffleXu
2021-04-19 13:36             ` Ming Lei [this message]
2021-04-19 13:36               ` Ming Lei
2021-04-20  7:25               ` JeffleXu
2021-04-20  7:25                 ` [dm-devel] " JeffleXu
2021-04-01  2:19 ` [PATCH V5 12/12] dm: support IO polling for bio-based dm device Ming Lei
2021-04-01  2:19   ` [dm-devel] " Ming Lei
2021-04-09 15:39 ` [PATCH V5 00/12] block: support bio based io polling Ming Lei
2021-04-09 15:39   ` [dm-devel] " Ming Lei
2021-04-12  9:46 ` Christoph Hellwig
2021-04-12  9:46   ` [dm-devel] " Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YH2HUs8FV0pNDluk@T590 \
    --to=ming.lei@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=dm-devel@redhat.com \
    --cc=jefflexu@linux.alibaba.com \
    --cc=linux-block@vger.kernel.org \
    --cc=snitzer@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.