All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ming Lei <ming.lei@redhat.com>
To: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: John Garry <john.garry@huawei.com>,
	axboe@kernel.dk, jejb@linux.ibm.com, martin.petersen@oracle.com,
	don.brace@microsemi.com, Sumit Saxena <sumit.saxena@broadcom.com>,
	bvanassche@acm.org, hare@suse.com, hch@lst.de,
	Shivasharan Srikanteshwara 
	<shivasharan.srikanteshwara@broadcom.com>,
	linux-block@vger.kernel.org, linux-scsi@vger.kernel.org,
	esc.storagedev@microsemi.com, chenxiang66@hisilicon.com,
	"PDL,MEGARAIDLINUX" <megaraidlinux.pdl@broadcom.com>
Subject: Re: [PATCH RFC v7 10/12] megaraid_sas: switch fusion adapters to MQ
Date: Wed, 29 Jul 2020 23:36:48 +0800	[thread overview]
Message-ID: <20200729153648.GA1698748@T590> (raw)
In-Reply-To: <965cf22eea98c00618570da8424d0d94@mail.gmail.com>

On Wed, Jul 29, 2020 at 10:55:22AM +0530, Kashyap Desai wrote:
> > On Tue, Jul 28, 2020 at 08:54:27AM +0100, John Garry wrote:
> > > On 24/07/2020 03:47, Ming Lei wrote:
> > > > On Thu, Jul 23, 2020 at 06:29:01PM +0100, John Garry wrote:
> > > > > > > As I see, since megaraid will have 1:1 mapping of CPU to hw
> > > > > > > queue, will there only ever possibly a single bit set in
> > > > > > > ctx_map? If so, it seems a waste to always check every sbitmap
> > > > > > > map. But adding logic for this may negate any possible gains.
> > > > > >
> > > > > > It really depends on min and max cpu id in the map, then sbitmap
> > > > > > depth can be reduced to (max - min + 1). I'd suggest to double
> > > > > > check that cost of sbitmap_any_bit_set() really matters.
> > > > >
> > > > > Hi Ming,
> > > > >
> > > > > I'm not sure that reducing the search range would help much, as we
> > > > > still need to load some indexes of map[], and at best this may be
> > > > > reduced from 2/3
> > > > > -> 1 elements, depending on nr_cpus.
> > > >
> > > > I believe you misunderstood my idea, and you have to think it from
> > > > implementation viewpoint.
> > > >
> > > > The only workable way is to store the min cpu id as 'offset' and set
> > > > the sbitmap depth as (max - min + 1), isn't it? Then the actual cpu
> > > > id can be figured out via 'offset' + nr_bit. And the whole indexes
> > > > are just spread on the actual depth. BTW, max & min is the max / min
> > > > cpu id in hctx->cpu_map. So we can improve not only on 1:1, and I
> > > > guess most of MQ cases can benefit from the change, since it
> shouldn't be
> > usual for one ctx_map to cover both 0 & nr_cpu_id - 1.
> > > >
> > > > Meantime, we need to allocate the sbitmap dynamically.
> > >
> > > OK, so dynamically allocating the sbitmap could be good. I was
> > > thinking previously that we still allocate for nr_cpus size, and
> > > search a limited range - but this would have heavier runtime overhead.
> > >
> > > So if you really think that this may have some value, then let me
> > > know, so we can look to take it forward.
> >
> > Forget to mention, the in-tree code has been this shape for long time,
> please
> > see sbitmap_resize() called from blk_mq_map_swqueue().
> >
> > Another update is that V4 of 'scsi: core: only re-run queue in
> > scsi_end_request() if device queue is busy' is quite hard to implement
> since
> > commit b4fd63f42647110c9 ("Revert "scsi: core: run queue if SCSI device
> > queue isn't ready and queue is idle").
> 
> Ming -
> 
> Update from my testing. I found only one case of IO stall. I can discuss
> this specific topic if you like to send separate patch. It is too much
> interleaved discussion in this thread.
> 
> I noted you mentioned that V4 of 'scsi: core: only re-run queue in
> scsi_end_request() if device queue is busy' need underlying support of
> "scsi: core: run queue if SCSI device queue isn't ready and queue is idle"
> patch which is already reverted in mainline.

Right.

> Overall idea of running h/w queues conditionally in your patch " scsi:
> core: only re-run queue in scsi_end_request" is still worth. There can be

I agree.

> some race if we use this patch and for that you have concern. Am I
> correct. ?

If the patch of "scsi: core: run queue if SCSI device queue isn't ready and
queue is idle" is re-added, the approach should work. However, it looks a bit
complicated, and I was thinking if one simpler approach can be figured out.

> 
> One of the race I found in my testing is fixed by below patch -
> 
> diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
> index 54f9015..bcfd33a 100644
> --- a/block/blk-mq-sched.c
> +++ b/block/blk-mq-sched.c
> @@ -173,8 +173,10 @@ static int blk_mq_do_dispatch_ctx(struct
> blk_mq_hw_ctx *hctx)
>                 if (!sbitmap_any_bit_set(&hctx->ctx_map))
>                         break;
> 
> -               if (!blk_mq_get_dispatch_budget(hctx))
> +               if (!blk_mq_get_dispatch_budget(hctx)) {
> +                       blk_mq_delay_run_hw_queue(hctx,
> BLK_MQ_BUDGET_DELAY);
>                         break;
> +               }

Actually all hw queues need to be run, instead of this hctx, cause
the budget stuff is request queue wide.

> 
>                 rq = blk_mq_dequeue_from_ctx(hctx, ctx);
>                 if (!rq) {
> 
> 
> In my test setup, I have your V3 'scsi: core: only re-run queue in
> scsi_end_request() if device queue is busy' rebased on 5.8 which does not
> have
> "scsi: core: run queue if SCSI device queue isn't ready and queue is idle"
> since it is already reverted in mainline.

If you added the above patch, I believe you can remove the run queue in
scsi_end_request() unconditionally. However, the delay run queue may
degrade io performance.

Actually the re-run queue in scsi_end_request() is only for dequeuing
request from sw/scheduler queue. And no such issue if request stays in
hctx->dispatch list.

> 
> I have 24 SAS SSD and I reduced QD = 16 so that I hit budget contention
> frequently.  I am running ioscheduler=none.
> If hctx0 has 16 IO inflight (All those IO will be posted to h/q queue
> directly). Next IO (17th IO) will see budget contention and it will be
> queued into s/w queue.
> It is expected that queue will be kicked from scsi_end_request. It is
> possible that one of the  IO completed and it reduce sdev->device_busy,
> but it has not yet reach the code which will kicked the h/w queue.
> Releasing budget and restarting h/w queue is not atomic. At the same time,
> another IO (18th IO) from submission path get the budget and it will be
> posted from below path. This IO will reset sdev->restart and it will not
> allow h/w queue to be restarted from completion path. This will lead one

Maybe re-run queue is needed before resetting sdev->restart if sdev->restart is 1.


Thanks, 
Ming


  reply	other threads:[~2020-07-29 15:37 UTC|newest]

Thread overview: 123+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-10 17:29 [PATCH RFC v7 00/12] blk-mq/scsi: Provide hostwide shared tags for SCSI HBAs John Garry
2020-06-10 17:29 ` [PATCH RFC v7 01/12] blk-mq: rename BLK_MQ_F_TAG_SHARED as BLK_MQ_F_TAG_QUEUE_SHARED John Garry
2020-06-10 17:29 ` [PATCH RFC v7 02/12] blk-mq: rename blk_mq_update_tag_set_depth() John Garry
2020-06-11  2:57   ` Ming Lei
2020-06-11  8:26     ` John Garry
2020-06-23 11:25       ` John Garry
2020-06-23 14:23         ` Hannes Reinecke
2020-06-24  8:13           ` Kashyap Desai
2020-06-29 16:18             ` John Garry
2020-08-10 16:51           ` Kashyap Desai
2020-08-11  8:01             ` John Garry
2020-08-11 16:34               ` Kashyap Desai
2020-06-10 17:29 ` [PATCH RFC v7 03/12] blk-mq: Use pointers for blk_mq_tags bitmap tags John Garry
2020-06-10 17:29 ` [PATCH RFC v7 04/12] blk-mq: Facilitate a shared sbitmap per tagset John Garry
2020-06-11  3:37   ` Ming Lei
2020-06-11 10:09     ` John Garry
2020-06-10 17:29 ` [PATCH RFC v7 05/12] blk-mq: Record nr_active_requests per queue for when using shared sbitmap John Garry
2020-06-11  4:04   ` Ming Lei
2020-06-11 10:22     ` John Garry
2020-06-10 17:29 ` [PATCH RFC v7 06/12] blk-mq: Record active_queues_shared_sbitmap per tag_set " John Garry
2020-06-11 13:16   ` Hannes Reinecke
2020-06-11 14:22     ` John Garry
2020-06-10 17:29 ` [PATCH RFC v7 07/12] blk-mq: Add support in hctx_tags_bitmap_show() for a " John Garry
2020-06-11 13:19   ` Hannes Reinecke
2020-06-11 14:33     ` John Garry
2020-06-12  6:06       ` Hannes Reinecke
2020-06-29 15:32         ` About sbitmap_bitmap_show() and cleared bits (was Re: [PATCH RFC v7 07/12] blk-mq: Add support in hctx_tags_bitmap_show() for a shared sbitmap) John Garry
2020-06-30  6:33           ` Hannes Reinecke
2020-06-30  7:30             ` John Garry
2020-06-30 11:36               ` John Garry
2020-06-30 14:55           ` Bart Van Assche
2020-07-13  9:41         ` [PATCH RFC v7 07/12] blk-mq: Add support in hctx_tags_bitmap_show() for a shared sbitmap John Garry
2020-07-13 12:20           ` Hannes Reinecke
2020-06-10 17:29 ` [PATCH RFC v7 08/12] scsi: Add template flag 'host_tagset' John Garry
2020-06-10 17:29 ` [PATCH RFC v7 09/12] scsi: hisi_sas: Switch v3 hw to MQ John Garry
2020-06-10 17:29 ` [PATCH RFC v7 10/12] megaraid_sas: switch fusion adapters " John Garry
2020-07-02 10:23   ` Kashyap Desai
2020-07-06  8:23     ` John Garry
2020-07-06  8:45       ` Hannes Reinecke
2020-07-06  9:26         ` John Garry
2020-07-06  9:40           ` Hannes Reinecke
2020-07-06 19:19       ` Kashyap Desai
2020-07-07  7:58         ` John Garry
2020-07-07 14:45           ` Kashyap Desai
2020-07-07 16:17             ` John Garry
2020-07-09 19:01               ` Kashyap Desai
2020-07-10  8:10                 ` John Garry
2020-07-13  7:55                   ` Kashyap Desai
2020-07-13  8:42                     ` John Garry
2020-07-19 19:07                       ` Kashyap Desai
2020-07-20  7:23                       ` Kashyap Desai
2020-07-20  9:18                         ` John Garry
2020-07-21  1:13                         ` Ming Lei
2020-07-21  6:53                           ` Kashyap Desai
2020-07-22  4:12                             ` Ming Lei
2020-07-22  5:30                               ` Kashyap Desai
2020-07-22  8:04                                 ` Ming Lei
2020-07-22  9:32                                   ` John Garry
2020-07-23 14:07                                     ` Ming Lei
2020-07-23 17:29                                       ` John Garry
2020-07-24  2:47                                         ` Ming Lei
2020-07-28  7:54                                           ` John Garry
2020-07-28  8:45                                             ` Ming Lei
2020-07-29  5:25                                               ` Kashyap Desai
2020-07-29 15:36                                                 ` Ming Lei [this message]
2020-07-29 18:31                                                   ` Kashyap Desai
2020-08-04  8:36                                                     ` Ming Lei
2020-08-04  9:27                                                       ` Kashyap Desai
2020-08-05  8:40                                                         ` Ming Lei
2020-08-06 10:25                                                           ` Kashyap Desai
2020-08-06 13:38                                                             ` Ming Lei
2020-08-06 14:37                                                               ` Kashyap Desai
2020-08-06 15:29                                                                 ` Ming Lei
2020-08-08 19:05                                                                   ` Kashyap Desai
2020-08-09  2:16                                                                     ` Ming Lei
2020-08-10 16:38                                                                       ` Kashyap Desai
2020-08-11  8:09                                                                         ` John Garry
2020-08-04 17:00                                               ` John Garry
2020-08-05  2:56                                                 ` Ming Lei
2020-07-28  8:01                                   ` Kashyap Desai
2020-07-08 11:31         ` John Garry
2020-06-10 17:29 ` [PATCH RFC v7 11/12] smartpqi: enable host tagset John Garry
2020-07-14 13:16   ` John Garry
2020-07-14 13:31     ` John Garry
2020-07-14 18:16       ` Don.Brace
2020-07-15  7:28         ` John Garry
2020-07-14 14:02     ` Hannes Reinecke
2020-08-18  8:33       ` John Garry
2020-06-10 17:29 ` [PATCH RFC v7 12/12] hpsa: enable host_tagset and switch to MQ John Garry
2020-07-14  7:37   ` John Garry
2020-07-14  7:41     ` Hannes Reinecke
2020-07-14  7:52       ` John Garry
2020-07-14  8:06         ` Ming Lei
2020-07-14  9:53           ` John Garry
2020-07-14 10:14             ` Ming Lei
2020-07-14 10:43               ` Hannes Reinecke
2020-07-14 10:19             ` Hannes Reinecke
2020-07-14 10:35               ` John Garry
2020-07-14 10:44               ` Ming Lei
2020-07-14 10:52                 ` John Garry
2020-07-14 12:04                   ` Ming Lei
2020-08-03 20:39         ` Don.Brace
2020-08-04  9:27           ` John Garry
2020-08-04 15:18             ` Don.Brace
2020-08-05 11:21               ` John Garry
2020-08-14 21:04                 ` Don.Brace
2020-08-17  8:00                   ` John Garry
2020-08-17 18:39                     ` Don.Brace
2020-08-18  7:14                       ` Hannes Reinecke
2020-07-16 16:14     ` Don.Brace
2020-07-16 19:45     ` Don.Brace
2020-07-17 10:11       ` John Garry
2020-06-11  3:07 ` [PATCH RFC v7 00/12] blk-mq/scsi: Provide hostwide shared tags for SCSI HBAs Ming Lei
2020-06-11  9:35   ` John Garry
2020-06-12 18:47     ` Kashyap Desai
2020-06-15  2:13       ` Ming Lei
2020-06-15  6:57         ` Kashyap Desai
2020-06-16  1:00           ` Ming Lei
2020-06-17 11:26             ` Kashyap Desai
2020-06-22  6:24               ` Hannes Reinecke
2020-06-23  0:55                 ` Ming Lei
2020-06-23 11:50                   ` Kashyap Desai
2020-06-23 12:11                   ` Kashyap Desai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200729153648.GA1698748@T590 \
    --to=ming.lei@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=bvanassche@acm.org \
    --cc=chenxiang66@hisilicon.com \
    --cc=don.brace@microsemi.com \
    --cc=esc.storagedev@microsemi.com \
    --cc=hare@suse.com \
    --cc=hch@lst.de \
    --cc=jejb@linux.ibm.com \
    --cc=john.garry@huawei.com \
    --cc=kashyap.desai@broadcom.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=megaraidlinux.pdl@broadcom.com \
    --cc=shivasharan.srikanteshwara@broadcom.com \
    --cc=sumit.saxena@broadcom.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.