From: Ming Lei <email@example.com>
To: John Garry <firstname.lastname@example.org>
Cc: Jens Axboe <email@example.com>, firstname.lastname@example.org,
	Bart Van Assche <email@example.com>,
	Hannes Reinecke <firstname.lastname@example.org>,
	Christoph Hellwig <email@example.com>,
	Thomas Gleixner <firstname.lastname@example.org>,
	Keith Busch <email@example.com>
Subject: Re: [PATCH V4 0/5] blk-mq: improvement on handling IO during CPU hotplug
Date: Mon, 21 Oct 2019 17:34:49 +0800
Message-ID: <20191021093448.GA22635@ming.t460p>
In-Reply-To: <firstname.lastname@example.org>

On Mon, Oct 21, 2019 at 10:19:18AM +0100, John Garry wrote:
> On 20/10/2019 11:14, Ming Lei wrote:
> > > > > > ght? If so, I need to find some simple sysfs entry to
> > > > > > tell me of this occurrence, to trigger the capture. Or add something. My
> > > > > > script is pretty dumb.
> > > > > >
> > > > > > BTW, I did notice that we hit the dump_stack in __blk_mq_run_hw_queue()
> > > > > > pretty soon before the problem happens - maybe a clue or maybe coincidence.
> > > >
> > > > I finally got to capture that debugfs dump at the point the SCSI IOs
> > > > timeout, as attached. Let me know if there is any problem receiving it.
> > > >
> > > > Here's a kernel log snippet at that point (I added some prints for the
> > > > timeout):
> > > >
> > > > 609] psci: CPU6 killed.
> > > > [ 547.722217] CPU5: shutdown
> > > > [ 547.724926] psci: CPU5 killed.
> > > > [ 547.749951] irq_shutdown
> > > > [ 547.752701] IRQ 800: no longer affine to CPU4
> > > > [ 547.757265] CPU4: shutdown
> > > > [ 547.759971] psci: CPU4 killed.
> > > > [ 547.790348] CPU3: shutdown
> > > > [ 547.793052] psci: CPU3 killed.
> > > > [ 547.818330] CPU2: shutdown
> > > > [ 547.821033] psci: CPU2 killed.
> > > > [ 547.854285] CPU1: shutdown
> > > > [ 547.856989] psci: CPU1 killed.
> > > > [ 575.925307] scsi_timeout req=0xffff0023b0dd9c00 reserved=0
> > > > [ 575.930794] scsi_timeout req=0xffff0023b0df2700 reserved=0
> >
> > From the debugfs log, 66 requests are dumped; 63 of them have been
> > submitted to the device, and the other 3 are in the ->dispatch list
> > via requeue after the timeout is handled.
>
> Hi Ming,
>
> > You mentioned that:
> >
> > " - I added some debug prints in blk_mq_hctx_drain_inflight_rqs() for when
> > inflight rqs != 0, and I don't see them for this timeout"
> >
> > There might be two reasons:
> >
> > 1) You are still testing a multiple reply-queue device?
>
> As before, I am testing by exposing multiple queues to the SCSI midlayer. I
> had to make this change locally, as on mainline we still only expose a
> single queue and use the internal reply queue when enabling managed
> interrupts.
>
> > As I mentioned last time, it is hard to map reply-queue into blk-mq
> > hctx correctly.
>
> Here's my branch, if you want to check:
>
> https://github.com/hisilicon/kernel-dev/commits/private-topic-sas-5.4-mq-v4
>
> It's a bit messy (sorry), but you can see that the reply-queue in the LLDD
> is removed in commit 087b95af374.
>
> I am now thinking of actually making this change to the LLDD in mainline to
> avoid any doubt in future.

As I mentioned last time, you do share tags among all MQ queues on your
hardware, given that your hardware is actually an SQ HBA, so commit
087b95af374 is definitely wrong, isn't it?

It can be very hard to partition the single tag space among multiple hctx.

Thanks,
Ming
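[Editorial illustration] The point about partitioning a single tag space among multiple hctx can be sketched in a few lines. This is a hypothetical userspace model, not kernel code; `DEPTH`, `NR_HCTX`, and both helper names are invented for the example. It only shows the arithmetic behind the objection: a static split caps a single busy hctx at a fraction of the queue depth the single-queue HBA can actually accept.

```python
# Hypothetical model (not kernel code): a single-queue HBA has one tag
# space of depth DEPTH, shared by NR_HCTX software hardware-queues (hctx).

DEPTH = 16    # total tags the single hardware queue supports (invented)
NR_HCTX = 4   # hctx instances sharing that tag space (invented)

def partitioned_capacity(busy_hctx_count: int) -> int:
    """Tags usable when only `busy_hctx_count` hctx have IO queued,
    with a static even split of the tag space across all hctx."""
    per_hctx = DEPTH // NR_HCTX
    return per_hctx * busy_hctx_count

def shared_capacity(busy_hctx_count: int) -> int:
    """Tags usable with one shared tag space, which is what the
    hardware actually provides: any hctx may take any free tag."""
    return DEPTH if busy_hctx_count > 0 else 0

# With IO on only one hctx, a static partition strands 3/4 of the depth.
print(partitioned_capacity(1))  # 4
print(shared_capacity(1))       # 16
```

Under this model, an IO pattern touching one hctx gets only `DEPTH // NR_HCTX` tags under static partitioning, while the shared tag space serves the full depth; sizing the partitions for every workload mix up front is the hard part.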
Thread overview: 32+ messages
2019-10-14  1:50 Ming Lei
2019-10-14  1:50 ` [PATCH V4 1/5] blk-mq: add new state of BLK_MQ_S_INTERNAL_STOPPED Ming Lei
2019-10-14  1:50 ` [PATCH V4 2/5] blk-mq: prepare for draining IO when hctx's all CPUs are offline Ming Lei
2019-10-14  1:50 ` [PATCH V4 3/5] blk-mq: stop to handle IO and drain IO before hctx becomes dead Ming Lei
2019-11-28  9:29   ` John Garry
2019-10-14  1:50 ` [PATCH V4 4/5] blk-mq: re-submit IO in case that hctx is dead Ming Lei
2019-10-14  1:50 ` [PATCH V4 5/5] blk-mq: handle requests dispatched from IO scheduler " Ming Lei
2019-10-16  8:58 ` [PATCH V4 0/5] blk-mq: improvement on handling IO during CPU hotplug John Garry
2019-10-16 12:07   ` Ming Lei
2019-10-16 16:19     ` John Garry
     [not found]       ` <email@example.com>
2019-10-20 10:14         ` Ming Lei
2019-10-21  9:19           ` John Garry
2019-10-21  9:34             ` Ming Lei [this message]
2019-10-21  9:47               ` John Garry
2019-10-21 10:24                 ` Ming Lei
2019-10-21 11:49                   ` John Garry
2019-10-21 12:53                     ` Ming Lei
2019-10-21 14:02                       ` John Garry
2019-10-22  0:16                         ` Ming Lei
2019-10-22 11:19                           ` John Garry
2019-10-22 13:45                             ` Ming Lei
2019-10-25 16:33                               ` John Garry
2019-10-28 10:42                                 ` Ming Lei
2019-10-28 11:55                                   ` John Garry
2019-10-29  1:50                                     ` Ming Lei
2019-10-29  9:22                                       ` John Garry
2019-10-29 10:05                                         ` Ming Lei
2019-10-29 17:54                                           ` John Garry
2019-10-31 16:28                                             ` John Garry
2019-11-28  1:09 ` chenxiang (M)
2019-11-28  2:02   ` Ming Lei
2019-11-28 10:45     ` John Garry