From: Hannes Reinecke <hare@suse.de>
To: Balsundar.P@microchip.com, martin.petersen@oracle.com
Cc: hch@lst.de, james.bottomley@hansenpartnership.com,
linux-scsi@vger.kernel.org, balsundar.p@microsemi.com,
aacraid@microsemi.com
Subject: Re: [PATCH 08/11] aacraid: use scsi_host_quiesce() to wait for I/O to complete
Date: Wed, 4 Dec 2019 16:02:06 +0100
Message-ID: <6696e47b-9f72-4034-a724-16e39e531409@suse.de>
In-Reply-To: <4ec87e61-2e2f-a3b5-00f6-1e6abf9cb261@suse.de>
On 11/28/19 1:09 PM, Hannes Reinecke wrote:
> On 11/28/19 12:45 PM, Balsundar.P@microchip.com wrote:
>> NAK
>>
>> After applying this patch, while IOs were running on physical drive,
>> issued controller reset from management utility.
>> Observed below call trace. It is from scsi_device_quiesce().
>>
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.799311] INFO: task arcconf:2386 blocked for more than 120 seconds.
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.799841] Not tainted 5.4.0-rc1+ #2
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800235] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800678] arcconf D 0 2386 2173 0x00004000
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800682] Call Trace:
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800699] __schedule+0x291/0x6f0
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800703] schedule+0x33/0xa0
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800710] blk_mq_freeze_queue_wait+0x4b/0xb0
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800717] ? wait_woken+0x80/0x80
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800721] blk_mq_freeze_queue+0x1a/0x20
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800727] scsi_device_quiesce+0x5d/0xb0
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800730] scsi_host_quiesce+0x41/0x60
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800742] aac_send_shutdown+0x7c/0x180 [aacraid]
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800749] aac_reset_adapter+0x29f/0x760 [aacraid]
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800757] ? security_capable+0x3f/0x60
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800762] aac_store_reset_adapter+0x41/0x60 [aacraid]
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800770] dev_attr_store+0x17/0x30
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800777] sysfs_kf_write+0x3c/0x50
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800779] kernfs_fop_write+0x125/0x1a0
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800785] __vfs_write+0x1b/0x40
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800789] vfs_write+0xb1/0x1a0
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800792] ksys_write+0xa7/0xe0
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800795] __x64_sys_write+0x1a/0x20
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800802] do_syscall_64+0x57/0x190
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800806] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800810] RIP: 0033:0x7f67cfb7c2b7
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800819] Code: Bad RIP value.
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800821] RSP: 002b:00007ffeb23ca8c0 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800823] RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 00007f67cfb7c2b7
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800825] RDX: 0000000000000002 RSI: 00007ffeb23ca8f0 RDI: 0000000000000006
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800826] RBP: 00007ffeb23ca8f0 R08: 0000000000000000 R09: 0000000000000000
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800828] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000002
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800829] R13: 00007ffeb23cad20 R14: 00007ffeb23cadf0 R15: 00007ffeb23cac60
>>
> Thanks for testing.
> I'll have a look at the call trace and will come back to you with an
> updated version.
>
After testing I've discovered that we can't use blk_mq_freeze_queue() here.
The point is that when the HBA is reset there will be commands outstanding
(which keep q_usage_counter non-zero), but we should _not_ terminate those
commands, as I/O processing will be resumed after the reset.
Hence blk_mq_freeze_queue_wait() will never complete.
So for the next iteration I've gone back to using a busy iterator, as
we just need to complete the commands currently held by the firmware;
all other commands can (and should) be left alone.
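To illustrate the difference, here is a minimal user-space sketch. It is
not the aacraid patch and uses no kernel API: every type and function name
below (model_host, model_busy_iter, complete_fw_cmd, ...) is hypothetical,
and the usage count only stands in for q_usage_counter. It shows why a
wait for the count to reach zero cannot finish while commands are left
alone, whereas an iterator over busy commands can complete just the
firmware-held ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; none of these exist in the kernel. */
enum cmd_owner { OWNER_NONE = 0, OWNER_MIDLAYER, OWNER_FIRMWARE };

struct model_cmd {
	int tag;
	enum cmd_owner owner;
};

#define NR_TAGS 8

struct model_host {
	struct model_cmd cmds[NR_TAGS];
};

/* Stand-in for q_usage_counter: each outstanding command holds one ref. */
static int model_usage_count(const struct model_host *h)
{
	int n = 0;

	for (size_t i = 0; i < NR_TAGS; i++)
		if (h->cmds[i].owner != OWNER_NONE)
			n++;
	return n;
}

/* A freeze-style wait can only return once the count has dropped to zero. */
static bool model_freeze_would_complete(const struct model_host *h)
{
	return model_usage_count(h) == 0;
}

typedef bool (*busy_fn)(struct model_cmd *cmd, void *priv);

/* Call fn for every outstanding command; stop early if fn returns false. */
static void model_busy_iter(struct model_host *h, busy_fn fn, void *priv)
{
	assert(fn != NULL);
	for (size_t i = 0; i < NR_TAGS; i++) {
		if (h->cmds[i].owner == OWNER_NONE)
			continue;
		if (!fn(&h->cmds[i], priv))
			break;
	}
}

/* Complete only firmware-held commands; leave everything else alone. */
static bool complete_fw_cmd(struct model_cmd *cmd, void *priv)
{
	int *completed = priv;

	if (cmd->owner == OWNER_FIRMWARE) {
		cmd->owner = OWNER_NONE;
		(*completed)++;
	}
	return true;
}
```

In this model, a host with two firmware-held commands and one command
owned elsewhere never satisfies model_freeze_would_complete(), while
model_busy_iter(&host, complete_fw_cmd, &count) completes exactly the two
firmware-held commands and leaves the third outstanding.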
Cheers,
Hannes
--
Dr. Hannes Reinecke Teamlead Storage & Networking
hare@suse.de +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer