From: John Garry <john.garry@huawei.com>
To: Ming Lei <tom.leiming@gmail.com>
Cc: Ming Lei <ming.lei@redhat.com>, Jens Axboe <axboe@kernel.dk>,
	linux-block <linux-block@vger.kernel.org>,
	Bart Van Assche <bvanassche@acm.org>,
	"Hannes Reinecke" <hare@suse.com>, Christoph Hellwig <hch@lst.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	Keith Busch <keith.busch@intel.com>
Subject: Re: [PATCH V3 0/5] blk-mq: improvement on handling IO during CPU hotplug
Date: Fri, 11 Oct 2019 15:10:03 +0100	[thread overview]
Message-ID: <b1a561c1-9594-cc25-dcab-bad5c342264f@huawei.com> (raw)
In-Reply-To: <CACVXFVN2K-GYTdSwXZ2fZ9=Kgq+jXa3RCkqw+v_DcvaFBvgpew@mail.gmail.com>

On 11/10/2019 12:55, Ming Lei wrote:
> On Fri, Oct 11, 2019 at 4:54 PM John Garry <john.garry@huawei.com> wrote:
>>
>> On 10/10/2019 12:21, John Garry wrote:
>>>
>>>>
>>>> As discussed before, the tags of hisilicon V3 are HBA-wide. If you switch
>>>> to real hw queues, each hw queue has to own its own independent tags.
>>>> However, that isn't supported by the V3 hardware.
>>>
>>> I am generating the tags internally in the driver now, so the host-wide
>>> tags limitation should not be an issue.
>>>
>>> And, to be clear, I am not paying too much attention to performance, but
>>> rather just hotplugging while running IO.
>>>
>>> An update on testing:
>>> I did some scripted overnight testing. The script essentially loops like
>>> this:
>>> - online all CPUs
>>> - run fio bound to a limited set of CPUs covering one hctx's CPU mask
>>> for 1 minute
>>> - offline those CPUs
>>> - wait 1 minute (> SCSI or NVMe timeout)
>>> - and repeat
>>>
>>> SCSI is actually quite stable, but NVMe isn't. For NVMe I am finding
>>> that some fio processes never exit, with IOPS stuck at 0. I don't see
>>> any NVMe timeout reported. Did you do any NVMe testing of this sort?
>>>
>>
>> Yeah, so for NVMe, I see some sort of regression, like this:
>> Jobs: 1 (f=1): [_R] [0.0% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta
>> 1158037877d:17h:18m:22s]
>
> I can reproduce this issue, and it looks like there are requests in ->dispatch.

OK, that may match with what I see:
- the problem occurring coincides with this call path being taken with
BLK_MQ_S_INTERNAL_STOPPED set:

blk_mq_request_bypass_insert
(__)blk_mq_try_issue_list_directly
blk_mq_sched_insert_requests
blk_mq_flush_plug_list
blk_flush_plug_list
blk_finish_plug
blkdev_direct_IO
generic_file_read_iter
blkdev_read_iter
aio_read
io_submit_one
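
For reference, the innermost frame above - blk_mq_request_bypass_insert() -
boils down to something like this (simplified from my reading of the current
tree, so treat it as a sketch rather than the exact code):

void blk_mq_request_bypass_insert(struct request *rq, bool run_queue)
{
	struct blk_mq_hw_ctx *hctx = rq->mq_hctx;

	spin_lock(&hctx->lock);
	/* Park the request on the hctx dispatch list... */
	list_add_tail(&rq->queuelist, &hctx->dispatch);
	spin_unlock(&hctx->lock);

	/*
	 * ...and only a queue run ever takes it off again. If all of the
	 * hctx's mapped CPUs have just gone offline, that run may never
	 * happen.
	 */
	if (run_queue)
		blk_mq_run_hw_queue(hctx, false);
}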

So blk_mq_request_bypass_insert() adds the request to the hctx dispatch
list, and looking at debugfs, could this be that request still sitting there:
root@(none)$ more /sys/kernel/debug/block/nvme0n1/hctx18/dispatch
00000000ac28511d {.op=READ, .cmd_flags=, .rq_flags=IO_STAT, .state=idle, 
.tag=56, .internal_tag=-1}

So could there be some race here?
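
One way I could try to confirm that (purely a throwaway local debug hack on
my side, not something for the series) is a trace_printk() right before the
request is parked on ->dispatch, flagging any bypass insert that happens
while the hctx is already marked internally stopped:

/*
 * Hypothetical debug helper, called just before the request is added
 * to hctx->dispatch: log any bypass insert racing with the hctx being
 * marked BLK_MQ_S_INTERNAL_STOPPED.
 */
static void debug_bypass_insert_on_stopped_hctx(struct blk_mq_hw_ctx *hctx,
						struct request *rq)
{
	if (test_bit(BLK_MQ_S_INTERNAL_STOPPED, &hctx->state))
		trace_printk("bypass insert on stopped hctx%u: tag=%d internal_tag=%d\n",
			     hctx->queue_num, rq->tag, rq->internal_tag);
}

If that fires right around the time a fio job gets stuck, it would at least
confirm the ordering between the insert and the hctx being stopped.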

> I am a bit busy this week; please feel free to investigate it, and debugfs
> can help you a lot. I may have time next week to look into this issue.
>

OK, appreciated

John

> Thanks,
> Ming Lei
>
>




Thread overview: 18+ messages
2019-10-08  4:18 [PATCH V3 0/5] blk-mq: improvement on handling IO during CPU hotplug Ming Lei
2019-10-08  4:18 ` [PATCH V3 1/5] blk-mq: add new state of BLK_MQ_S_INTERNAL_STOPPED Ming Lei
2019-10-08  4:18 ` [PATCH V3 2/5] blk-mq: prepare for draining IO when hctx's all CPUs are offline Ming Lei
2019-10-08  4:18 ` [PATCH V3 3/5] blk-mq: stop to handle IO and drain IO before hctx becomes dead Ming Lei
2019-10-08 17:03   ` John Garry
2019-10-08  4:18 ` [PATCH V3 4/5] blk-mq: re-submit IO in case that hctx is dead Ming Lei
2019-10-08  4:18 ` [PATCH V3 5/5] blk-mq: handle requests dispatched from IO scheduler " Ming Lei
2019-10-08  9:06 ` [PATCH V3 0/5] blk-mq: improvement on handling IO during CPU hotplug John Garry
2019-10-08 17:15   ` John Garry
2019-10-09  8:39     ` Ming Lei
2019-10-09  8:49       ` John Garry
2019-10-10 10:30         ` Ming Lei
2019-10-10 11:21           ` John Garry
2019-10-11  8:51             ` John Garry
2019-10-11 11:55               ` Ming Lei
2019-10-11 14:10                 ` John Garry [this message]
2019-10-14  1:25                   ` Ming Lei
2019-10-14  8:29                     ` John Garry
