From: Guenter Roeck <linux@roeck-us.net>
To: Bart Van Assche <bvanassche@acm.org>
Cc: "Martin K . Petersen" <martin.petersen@oracle.com>,
	Jaegeuk Kim <jaegeuk@kernel.org>,
	linux-scsi@vger.kernel.org,
	Adrian Hunter <adrian.hunter@intel.com>,
	Ming Lei <ming.lei@redhat.com>, Christoph Hellwig <hch@lst.de>,
	Mike Christie <michael.christie@oracle.com>,
	Hannes Reinecke <hare@suse.de>,
	John Garry <john.garry@huawei.com>,
	"James E.J. Bottomley" <jejb@linux.ibm.com>
Subject: Re: [PATCH v5 2/4] scsi: core: Make sure that hosts outlive targets
Date: Mon, 5 Sep 2022 10:40:47 -0700
Message-ID: <20220905173905.GA3405134@roeck-us.net>
In-Reply-To: <20220728221851.1822295-3-bvanassche@acm.org>

On Thu, Jul 28, 2022 at 03:18:49PM -0700, Bart Van Assche wrote:
> From: Ming Lei <ming.lei@redhat.com>
> 
> Fix the race conditions between SCSI LLD kernel module unloading and SCSI
> device and target removal by making sure that SCSI hosts are destroyed after
> all associated target and device objects have been freed.
> 
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Ming Lei <ming.lei@redhat.com>
> Cc: Mike Christie <michael.christie@oracle.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Cc: John Garry <john.garry@huawei.com>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> [ bvanassche: Reworked Ming's patch and split it ]

I know this has been reported before, but it is still seen in the
upstream kernel, so:

This patch results in a deadlock if a USB storage device is removed.

[   29.291148] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[   29.300064] ci_hdrc ci_hdrc.1: remove, state 4
[   29.300317] usb usb2: USB disconnect, device number 1
[   29.305090] ci_hdrc ci_hdrc.1: USB bus 2 deregistered
[   29.307052] ci_hdrc ci_hdrc.0: remove, state 1
[   29.307214] usb usb1: USB disconnect, device number 1
[   29.307321] usb 1-1: USB disconnect, device number 2
[   29.344575] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[   29.345323] sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[   63.358569] INFO: task init:347 blocked for more than 30 seconds.
[   63.358928]       Tainted: G        W        N 6.0.0-rc4-00017-gcec18aa4b63a #1
[   63.359200] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[   63.359600] task:init            state:D stack:    0 pid:  347 ppid:     1 flags:0x00000000
[   63.360104]  __schedule from schedule+0x60/0xbc
[   63.360368]  schedule from scsi_remove_host+0x154/0x1c0
[   63.360602]  scsi_remove_host from usb_stor_disconnect+0x4c/0xac
[   63.360852]  usb_stor_disconnect from usb_unbind_interface+0x74/0x268
[   63.361100]  usb_unbind_interface from device_release_driver_internal+0x1a0/0x22c
[   63.361383]  device_release_driver_internal from bus_remove_device+0xcc/0xfc
[   63.361651]  bus_remove_device from device_del+0x16c/0x3f8
[   63.361877]  device_del from usb_disable_device+0xcc/0x178
[   63.362097]  usb_disable_device from usb_disconnect+0xd0/0x230
[   63.362325]  usb_disconnect from usb_disconnect+0x9c/0x230
[   63.362536]  usb_disconnect from usb_remove_hcd+0xd0/0x16c
[   63.362741]  usb_remove_hcd from host_stop+0x38/0xa8
[   63.362946]  host_stop from ci_hdrc_remove+0x44/0x120
[   63.363148]  ci_hdrc_remove from platform_remove+0x20/0x4c
[   63.363367]  platform_remove from device_release_driver_internal+0x1a0/0x22c
[   63.363635]  device_release_driver_internal from bus_remove_device+0xcc/0xfc
[   63.363897]  bus_remove_device from device_del+0x16c/0x3f8
[   63.364117]  device_del from platform_device_del.part.0+0x10/0x74
[   63.364353]  platform_device_del.part.0 from platform_device_unregister+0x18/0x24
[   63.364623]  platform_device_unregister from ci_hdrc_remove_device+0xc/0x20
[   63.364886]  ci_hdrc_remove_device from ci_hdrc_imx_remove+0x28/0x110
[   63.365131]  ci_hdrc_imx_remove from device_shutdown+0x174/0x250
[   63.365372]  device_shutdown from __do_sys_reboot+0x124/0x270
[   63.365616]  __do_sys_reboot from ret_fast_syscall+0x0/0x1c
[   63.365849] Exception stack(0xd1859fa8 to 0xd1859ff0)
[   63.366054] 9fa0:                   01234567 000c623f fee1dead 28121969 01234567 00000000
[   63.366343] 9fc0: 01234567 000c623f 00000001 00000058 000d85c0 00000000 00000000 00000000
[   63.366620] 9fe0: 000d8298 bef49de4 000918bc b6e8cedc
[   63.366881] INFO: lockdep is turned off.
[   63.367069] Kernel panic - not syncing: hung_task: blocked tasks

I understand that it looks like the problem is caused by the shutdown
function in the imx driver calling remove_device, but that is not really
the problem.

As can be seen in the backtrace, usb_stor_disconnect() calls
scsi_remove_host(). Thanks to this patch, scsi_remove_host() now
waits for the scsi release function to be called. However,
usb_stor_disconnect() only calls release_everything() and with it
scsi_host_put() _after_ scsi_remove_host() has returned. Since
scsi_remove_host() now waits for the resource which is released
by calling scsi_host_put(), this causes a deadlock.

If my analysis is correct, any USB storage device removal should
result in the deadlock. My analysis may of course be wrong. If so,
please let me know what I missed.

Thanks,
Guenter

Thread overview: 10+ messages
2022-07-28 22:18 [PATCH v5 0/4] Call blk_mq_free_tag_set() earlier Bart Van Assche
2022-07-28 22:18 ` [PATCH v5 1/4] scsi: core: Make sure that targets outlive devices Bart Van Assche
2022-07-28 22:18 ` [PATCH v5 2/4] scsi: core: Make sure that hosts outlive targets Bart Van Assche
2022-09-05 17:40   ` Guenter Roeck [this message]
2022-09-06 14:16     ` Bart Van Assche
2022-09-06 14:23       ` Guenter Roeck
2022-07-28 22:18 ` [PATCH v5 3/4] scsi: core: Simplify LLD module reference counting Bart Van Assche
2022-07-28 22:18 ` [PATCH v5 4/4] scsi: core: Call blk_mq_free_tag_set() earlier Bart Van Assche
2022-07-29 15:59 ` [PATCH v5 0/4] " Mike Christie
2022-08-01 23:45 ` Martin K. Petersen