From: Hannes Reinecke <hare@suse.de>
To: Balsundar.P@microchip.com, martin.petersen@oracle.com
Cc: hch@lst.de, james.bottomley@hansenpartnership.com,
	linux-scsi@vger.kernel.org, balsundar.p@microsemi.com,
	aacraid@microsemi.com
Subject: Re: [PATCH 08/11] aacraid: use scsi_host_quiesce() to wait for I/O to complete
Date: Wed, 4 Dec 2019 16:02:06 +0100
Message-ID: <6696e47b-9f72-4034-a724-16e39e531409@suse.de>
In-Reply-To: <4ec87e61-2e2f-a3b5-00f6-1e6abf9cb261@suse.de>

On 11/28/19 1:09 PM, Hannes Reinecke wrote:
> On 11/28/19 12:45 PM, Balsundar.P@microchip.com wrote:
>> NAK
>>
>> After applying this patch, while IOs were running on physical drive, 
>> issued controller reset from management utility.
>> Observed below call trace. It is from scsi_device_quiesce().
>>
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.799311] INFO: task arcconf:2386 blocked for more than 120 seconds.
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.799841]       Not tainted 5.4.0-rc1+ #2
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800235] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800678] arcconf         D    0  2386   2173 0x00004000
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800682] Call Trace:
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800699]  __schedule+0x291/0x6f0
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800703]  schedule+0x33/0xa0
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800710]  blk_mq_freeze_queue_wait+0x4b/0xb0
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800717]  ? wait_woken+0x80/0x80
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800721]  blk_mq_freeze_queue+0x1a/0x20
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800727]  scsi_device_quiesce+0x5d/0xb0
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800730]  scsi_host_quiesce+0x41/0x60
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800742]  aac_send_shutdown+0x7c/0x180 [aacraid]
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800749]  aac_reset_adapter+0x29f/0x760 [aacraid]
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800757]  ? security_capable+0x3f/0x60
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800762]  aac_store_reset_adapter+0x41/0x60 [aacraid]
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800770]  dev_attr_store+0x17/0x30
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800777]  sysfs_kf_write+0x3c/0x50
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800779]  kernfs_fop_write+0x125/0x1a0
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800785]  __vfs_write+0x1b/0x40
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800789]  vfs_write+0xb1/0x1a0
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800792]  ksys_write+0xa7/0xe0
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800795]  __x64_sys_write+0x1a/0x20
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800802]  do_syscall_64+0x57/0x190
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800806]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800810] RIP: 0033:0x7f67cfb7c2b7
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800819] Code: Bad RIP value.
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800821] RSP: 002b:00007ffeb23ca8c0 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800823] RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 00007f67cfb7c2b7
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800825] RDX: 0000000000000002 RSI: 00007ffeb23ca8f0 RDI: 0000000000000006
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800826] RBP: 00007ffeb23ca8f0 R08: 0000000000000000 R09: 0000000000000000
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800828] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000002
>> Nov 27 19:24:21 ubuntu kernel: [ 1330.800829] R13: 00007ffeb23cad20 R14: 00007ffeb23cadf0 R15: 00007ffeb23cac60
>>
> Thanks for testing.
> I'll have a look at the call trace and will come back to you with an
> updated version.
> 
After testing I've discovered that we can't use freeze_queue here.
The point is that when the HBA is reset there will be commands outstanding
(which keep q_usage_counter non-zero), but we should _not_ terminate
those commands, as I/O processing will be resumed after the reset.
Hence blk_mq_freeze_queue_wait() will never complete.

So for the next iteration I've reverted to using a busy iterator, as
we just need to complete the commands currently held by the firmware;
all other commands can (and should) be left alone.
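
To illustrate the difference, here is a minimal userspace sketch. It is a
toy model only, not the aacraid code and not the block layer: every name in
it (toy_cmd, toy_host, toy_usage_count(), toy_freeze_wait(),
toy_busy_iter(), toy_complete_fw_cmd()) is hypothetical, and the "usage
count" is just a stand-in for q_usage_counter under the assumption that
every outstanding command holds one reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ownership states for a command in this toy model. */
enum cmd_owner { CMD_FREE, CMD_QUEUED, CMD_FIRMWARE };

struct toy_cmd {
	enum cmd_owner owner;
	bool completed;
};

struct toy_host {
	struct toy_cmd *cmds;
	size_t nr_cmds;
};

/* Stand-in for q_usage_counter: each outstanding command holds a reference. */
size_t toy_usage_count(const struct toy_host *h)
{
	size_t n = 0;

	for (size_t i = 0; i < h->nr_cmds; i++)
		if (h->cmds[i].owner != CMD_FREE)
			n++;
	return n;
}

/*
 * Freeze-style wait: succeeds only once the usage count reaches zero.
 * The poll is bounded so the sketch terminates; nothing changes the
 * command state while it polls, which is the situation during a reset.
 */
bool toy_freeze_wait(const struct toy_host *h, unsigned int max_polls)
{
	for (unsigned int i = 0; i < max_polls; i++)
		if (toy_usage_count(h) == 0)
			return true;
	return false;
}

typedef bool (*toy_iter_fn)(struct toy_cmd *cmd, void *data);

/* Busy-iterator style walk: call fn for each outstanding command. */
void toy_busy_iter(struct toy_host *h, toy_iter_fn fn, void *data)
{
	assert(h != NULL && fn != NULL);

	for (size_t i = 0; i < h->nr_cmds; i++) {
		if (h->cmds[i].owner == CMD_FREE)
			continue;
		if (!fn(&h->cmds[i], data))
			break;	/* callback asked to stop the walk */
	}
}

/* Callback: complete only commands held by the firmware, skip the rest. */
bool toy_complete_fw_cmd(struct toy_cmd *cmd, void *data)
{
	size_t *done = data;

	if (cmd->owner == CMD_FIRMWARE) {
		cmd->completed = true;
		cmd->owner = CMD_FREE;
		(*done)++;
	}
	return true;
}
```

With one firmware-held and one queued command, toy_freeze_wait() returns
false no matter how often it polls, while toy_busy_iter() with
toy_complete_fw_cmd() completes just the firmware-held command and leaves
the queued one untouched.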

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      Teamlead Storage & Networking
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer

Thread overview: 30+ messages
2019-11-20 10:31 [PATCHv5 00/11] scsi: remove legacy cmd_list implementation Hannes Reinecke
2019-11-20 10:31 ` [PATCH 01/11] dpt_i2o: rename adpt_i2o_to_scsi() to adpt_i2o_scsi_complete() Hannes Reinecke
2019-11-26 16:54   ` Christoph Hellwig
2019-11-20 10:31 ` [PATCH 02/11] scsi: add scsi_host_flush_commands() helper Hannes Reinecke
2019-11-20 16:19   ` Bart Van Assche
2019-11-21 15:46     ` Hannes Reinecke
2019-11-26 16:55   ` Christoph Hellwig
2019-11-26 18:25     ` Hannes Reinecke
2019-11-20 10:31 ` [PATCH 03/11] dpt_i2o: use scsi_host_flush_commands() to abort outstanding commands Hannes Reinecke
2019-11-26 16:55   ` Christoph Hellwig
2019-11-20 10:31 ` [PATCH 04/11] aacraid: Do not wait for outstanding write commands on synchronize_cache Hannes Reinecke
2019-11-26 17:02   ` Christoph Hellwig
2019-11-28 11:40   ` Balsundar.P
2019-11-20 10:31 ` [PATCH 05/11] aacraid: use midlayer helper to terminate outstanding commands Hannes Reinecke
2019-11-26 17:02   ` Christoph Hellwig
2019-11-28 11:41   ` Balsundar.P
2019-11-20 10:31 ` [PATCH 06/11] aacraid: replace aac_flush_ios() with midlayer helper Hannes Reinecke
2019-11-20 16:14   ` Bart Van Assche
2019-11-28 11:41   ` Balsundar.P
2019-11-20 10:31 ` [PATCH 07/11] scsi: add scsi_host_quiesce()/scsi_host_resume() helper Hannes Reinecke
2019-11-20 10:31 ` [PATCH 08/11] aacraid: use scsi_host_quiesce() to wait for I/O to complete Hannes Reinecke
2019-11-20 16:23   ` Bart Van Assche
2019-11-26  8:29   ` Balsundar.P
2019-11-28 11:45   ` Balsundar.P
2019-11-28 12:09     ` Hannes Reinecke
2019-12-04 15:02       ` Hannes Reinecke [this message]
2019-11-20 10:31 ` [PATCH 09/11] scsi: add scsi_host_busy_iter() Hannes Reinecke
2019-11-20 10:31 ` [PATCH 10/11] aacraid: use scsi_host_busy_iter() in get_num_of_incomplete_fibs() Hannes Reinecke
2019-11-28 11:42   ` Balsundar.P
2019-11-20 10:31 ` [PATCH 11/11] scsi: Remove cmd_list functionality Hannes Reinecke
