All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 0/4] Call blk_mq_free_tag_set() earlier
@ 2022-07-28 22:18 Bart Van Assche
  2022-07-28 22:18 ` [PATCH v5 1/4] scsi: core: Make sure that targets outlive devices Bart Van Assche
                   ` (5 more replies)
  0 siblings, 6 replies; 10+ messages in thread
From: Bart Van Assche @ 2022-07-28 22:18 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, Adrian Hunter, Bart Van Assche

Hi Martin,

This patch series fixes a recently reported use-after-free in the SRP driver.
Please consider this patch series for kernel v5.20.

Changes compared to v4:
- Left out the scsi_mq_destroy_tags() changes.

Changes compared to v3:
- Added a patch to delay scsi_remove_target() until dependent devices have been
  removed.
- Split a patch into two patches.

Changes compared to v2:
- Dropped the patch that simplifies scsi_forget_host().
- Replaced patch 2/3 with a patch from Ming Lei.

Changes compared to v1:
- Expanded this series from one to three patches.
- Fixed the description of patch 3/3.

Thanks,

Bart.

Bart Van Assche (2):
  scsi: core: Make sure that targets outlive devices
  scsi: core: Call blk_mq_free_tag_set() earlier

Ming Lei (2):
  scsi: core: Make sure that hosts outlive targets
  scsi: core: Simplify LLD module reference counting

 drivers/scsi/hosts.c       | 18 +++++++++++++-----
 drivers/scsi/scsi.c        |  9 ++++++---
 drivers/scsi/scsi_scan.c   |  9 +++++++++
 drivers/scsi/scsi_sysfs.c  | 29 +++++++++++++++++------------
 include/scsi/scsi_device.h |  2 ++
 include/scsi/scsi_host.h   |  3 +++
 6 files changed, 50 insertions(+), 20 deletions(-)


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v5 1/4] scsi: core: Make sure that targets outlive devices
  2022-07-28 22:18 [PATCH v5 0/4] Call blk_mq_free_tag_set() earlier Bart Van Assche
@ 2022-07-28 22:18 ` Bart Van Assche
  2022-07-28 22:18 ` [PATCH v5 2/4] scsi: core: Make sure that hosts outlive targets Bart Van Assche
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Bart Van Assche @ 2022-07-28 22:18 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, Adrian Hunter, Bart Van Assche,
	Mike Christie, Ming Lei, Christoph Hellwig, Hannes Reinecke,
	John Garry, Li Zhijian, James E.J. Bottomley

This patch prevents that the following sequence triggers a kernel crash:
* Deletion of a SCSI device is requested via sysfs. Device removal takes
  some time because blk_cleanup_queue() is waiting for the SCSI error
  handler.
* The SCSI target associated with that SCSI device is removed.
* scsi_remove_target() returns and its caller frees the resources
  associated with the SCSI target.
* The error handler makes progress and invokes an LLD callback that
  dereferences the SCSI target pointer.

Reported-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: John Garry <john.garry@huawei.com>
Cc: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_scan.c   |  2 ++
 drivers/scsi/scsi_sysfs.c  | 20 +++++++++++++++++---
 include/scsi/scsi_device.h |  2 ++
 3 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 91ac901a6682..4c1efd6a3b0c 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -521,6 +521,8 @@ static struct scsi_target *scsi_alloc_target(struct device *parent,
 	starget->state = STARGET_CREATED;
 	starget->scsi_level = SCSI_2;
 	starget->max_target_blocked = SCSI_DEFAULT_TARGET_BLOCKED;
+	init_waitqueue_head(&starget->sdev_wq);
+
  retry:
 	spin_lock_irqsave(shost->host_lock, flags);
 
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 43949798a2e4..1bc9c26fe1d4 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -443,7 +443,9 @@ static void scsi_device_cls_release(struct device *class_dev)
 
 static void scsi_device_dev_release_usercontext(struct work_struct *work)
 {
-	struct scsi_device *sdev;
+	struct scsi_device *sdev = container_of(work, struct scsi_device,
+						ew.work);
+	struct scsi_target *starget = sdev->sdev_target;
 	struct device *parent;
 	struct list_head *this, *tmp;
 	struct scsi_vpd *vpd_pg80 = NULL, *vpd_pg83 = NULL;
@@ -452,8 +454,6 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)
 	unsigned long flags;
 	struct module *mod;
 
-	sdev = container_of(work, struct scsi_device, ew.work);
-
 	mod = sdev->host->hostt->module;
 
 	scsi_dh_release_device(sdev);
@@ -516,6 +516,9 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)
 	kfree(sdev->inquiry);
 	kfree(sdev);
 
+	if (starget && atomic_dec_return(&starget->sdev_count) == 0)
+		wake_up(&starget->sdev_wq);
+
 	if (parent)
 		put_device(parent);
 	module_put(mod);
@@ -1535,6 +1538,14 @@ static void __scsi_remove_target(struct scsi_target *starget)
 		goto restart;
 	}
 	spin_unlock_irqrestore(shost->host_lock, flags);
+
+	/*
+	 * After scsi_remove_target() returns its caller can remove resources
+	 * associated with @starget, e.g. an rport or session. Wait until all
+	 * devices associated with @starget have been removed to prevent that
+	 * a SCSI error handling callback function triggers a use-after-free.
+	 */
+	wait_event(starget->sdev_wq, atomic_read(&starget->sdev_count) == 0);
 }
 
 /**
@@ -1645,6 +1656,9 @@ void scsi_sysfs_device_initialize(struct scsi_device *sdev)
 	list_add_tail(&sdev->same_target_siblings, &starget->devices);
 	list_add_tail(&sdev->siblings, &shost->__devices);
 	spin_unlock_irqrestore(shost->host_lock, flags);
+
+	atomic_inc(&starget->sdev_count);
+
 	/*
 	 * device can now only be removed via __scsi_remove_device() so hold
 	 * the target.  Target will be held in CREATED state until something
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index 7cf5f3b7589f..190d2081f4c6 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -309,6 +309,8 @@ struct scsi_target {
 	struct list_head	devices;
 	struct device		dev;
 	struct kref		reap_ref; /* last put renders target invisible */
+	atomic_t		sdev_count;
+	wait_queue_head_t	sdev_wq;
 	unsigned int		channel;
 	unsigned int		id; /* target id ... replace
 				     * scsi_device.id eventually */

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v5 2/4] scsi: core: Make sure that hosts outlive targets
  2022-07-28 22:18 [PATCH v5 0/4] Call blk_mq_free_tag_set() earlier Bart Van Assche
  2022-07-28 22:18 ` [PATCH v5 1/4] scsi: core: Make sure that targets outlive devices Bart Van Assche
@ 2022-07-28 22:18 ` Bart Van Assche
  2022-09-05 17:40   ` Guenter Roeck
  2022-07-28 22:18 ` [PATCH v5 3/4] scsi: core: Simplify LLD module reference counting Bart Van Assche
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 10+ messages in thread
From: Bart Van Assche @ 2022-07-28 22:18 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, Adrian Hunter, Bart Van Assche,
	Ming Lei, Christoph Hellwig, Mike Christie, Hannes Reinecke,
	John Garry, James E.J. Bottomley

From: Ming Lei <ming.lei@redhat.com>

Fix the race conditions between SCSI LLD kernel module unloading and SCSI
device and target removal by making sure that SCSI hosts are destroyed after
all associated target and device objects have been freed.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: John Garry <john.garry@huawei.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
[ bvanassche: Reworked Ming's patch and split it ]
---
 drivers/scsi/hosts.c     | 8 ++++++++
 drivers/scsi/scsi_scan.c | 7 +++++++
 include/scsi/scsi_host.h | 3 +++
 3 files changed, 18 insertions(+)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index ef6c0e37acce..8fa98c8d0ee0 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -190,6 +190,13 @@ void scsi_remove_host(struct Scsi_Host *shost)
 	transport_unregister_device(&shost->shost_gendev);
 	device_unregister(&shost->shost_dev);
 	device_del(&shost->shost_gendev);
+
+	/*
+	 * After scsi_remove_host() has returned the scsi LLD module can be
+	 * unloaded and/or the host resources can be released. Hence wait until
+	 * the dependent SCSI targets and devices are gone before returning.
+	 */
+	wait_event(shost->targets_wq, atomic_read(&shost->target_count) == 0);
 }
 EXPORT_SYMBOL(scsi_remove_host);
 
@@ -394,6 +401,7 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
 	INIT_LIST_HEAD(&shost->starved_list);
 	init_waitqueue_head(&shost->host_wait);
 	mutex_init(&shost->scan_mutex);
+	init_waitqueue_head(&shost->targets_wq);
 
 	index = ida_alloc(&host_index_ida, GFP_KERNEL);
 	if (index < 0) {
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 4c1efd6a3b0c..ac6059702d13 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -406,9 +406,14 @@ static void scsi_target_destroy(struct scsi_target *starget)
 static void scsi_target_dev_release(struct device *dev)
 {
 	struct device *parent = dev->parent;
+	struct Scsi_Host *shost = dev_to_shost(parent);
 	struct scsi_target *starget = to_scsi_target(dev);
 
 	kfree(starget);
+
+	if (atomic_dec_return(&shost->target_count) == 0)
+		wake_up(&shost->targets_wq);
+
 	put_device(parent);
 }
 
@@ -523,6 +528,8 @@ static struct scsi_target *scsi_alloc_target(struct device *parent,
 	starget->max_target_blocked = SCSI_DEFAULT_TARGET_BLOCKED;
 	init_waitqueue_head(&starget->sdev_wq);
 
+	atomic_inc(&shost->target_count);
+
  retry:
 	spin_lock_irqsave(shost->host_lock, flags);
 
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 667d889b92b5..339f975d356e 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -689,6 +689,9 @@ struct Scsi_Host {
 	/* ldm bits */
 	struct device		shost_gendev, shost_dev;
 
+	atomic_t		target_count;
+	wait_queue_head_t	targets_wq;
+
 	/*
 	 * Points to the transport data (if any) which is allocated
 	 * separately

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v5 3/4] scsi: core: Simplify LLD module reference counting
  2022-07-28 22:18 [PATCH v5 0/4] Call blk_mq_free_tag_set() earlier Bart Van Assche
  2022-07-28 22:18 ` [PATCH v5 1/4] scsi: core: Make sure that targets outlive devices Bart Van Assche
  2022-07-28 22:18 ` [PATCH v5 2/4] scsi: core: Make sure that hosts outlive targets Bart Van Assche
@ 2022-07-28 22:18 ` Bart Van Assche
  2022-07-28 22:18 ` [PATCH v5 4/4] scsi: core: Call blk_mq_free_tag_set() earlier Bart Van Assche
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Bart Van Assche @ 2022-07-28 22:18 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, Adrian Hunter, Bart Van Assche,
	Ming Lei, Christoph Hellwig, Mike Christie, Hannes Reinecke,
	John Garry, James E.J. Bottomley

From: Ming Lei <ming.lei@redhat.com>

Swap two statements in scsi_device_put() now that it is guaranteed that
SCSI hosts outlive SCSI devices. Remove the reference counting code from
scsi_sysfs.c that became superfluous because SCSI hosts now outlive SCSI
devices.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: John Garry <john.garry@huawei.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
[ bvanassche: Extracted this patch from a larger patch ]
---
 drivers/scsi/scsi.c       | 9 ++++++---
 drivers/scsi/scsi_sysfs.c | 9 ---------
 2 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index c59eac7a32f2..086ec5b5862d 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -586,10 +586,13 @@ EXPORT_SYMBOL(scsi_device_get);
  */
 void scsi_device_put(struct scsi_device *sdev)
 {
-	struct module *mod = sdev->host->hostt->module;
-
+	/*
+	 * Decreasing the module reference count before the device reference
+	 * count is safe since scsi_remove_host() only returns after all
+	 * devices have been removed.
+	 */
+	module_put(sdev->host->hostt->module);
 	put_device(&sdev->sdev_gendev);
-	module_put(mod);
 }
 EXPORT_SYMBOL(scsi_device_put);
 
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 1bc9c26fe1d4..213ebc88f76a 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -452,9 +452,6 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)
 	struct scsi_vpd *vpd_pg0 = NULL, *vpd_pg89 = NULL;
 	struct scsi_vpd *vpd_pgb0 = NULL, *vpd_pgb1 = NULL, *vpd_pgb2 = NULL;
 	unsigned long flags;
-	struct module *mod;
-
-	mod = sdev->host->hostt->module;
 
 	scsi_dh_release_device(sdev);
 
@@ -521,17 +518,11 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)
 
 	if (parent)
 		put_device(parent);
-	module_put(mod);
 }
 
 static void scsi_device_dev_release(struct device *dev)
 {
 	struct scsi_device *sdp = to_scsi_device(dev);
-
-	/* Set module pointer as NULL in case of module unloading */
-	if (!try_module_get(sdp->host->hostt->module))
-		sdp->host->hostt->module = NULL;
-
 	execute_in_process_context(scsi_device_dev_release_usercontext,
 				   &sdp->ew);
 }

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v5 4/4] scsi: core: Call blk_mq_free_tag_set() earlier
  2022-07-28 22:18 [PATCH v5 0/4] Call blk_mq_free_tag_set() earlier Bart Van Assche
                   ` (2 preceding siblings ...)
  2022-07-28 22:18 ` [PATCH v5 3/4] scsi: core: Simplify LLD module reference counting Bart Van Assche
@ 2022-07-28 22:18 ` Bart Van Assche
  2022-07-29 15:59 ` [PATCH v5 0/4] " Mike Christie
  2022-08-01 23:45 ` Martin K. Petersen
  5 siblings, 0 replies; 10+ messages in thread
From: Bart Van Assche @ 2022-07-28 22:18 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, Adrian Hunter, Bart Van Assche,
	Ming Lei, Christoph Hellwig, Mike Christie, Hannes Reinecke,
	John Garry, Li Zhijian, James E.J. Bottomley

There are two .exit_cmd_priv implementations. Both implementations use
resources associated with the SCSI host. Make sure that these resources are
still available when .exit_cmd_priv is called by moving the .exit_cmd_priv
calls from scsi_host_dev_release() to scsi_forget_host(). Moving
blk_mq_free_tag_set() from scsi_host_dev_release() to scsi_remove_host() is
safe because scsi_forget_host() waits until all SCSI devices associated
with the host have been removed.

This patch fixes the following use-after-free:

==================================================================
BUG: KASAN: use-after-free in srp_exit_cmd_priv+0x27/0xd0 [ib_srp]
Read of size 8 at addr ffff888100337000 by task multipathd/16727
Call Trace:
 <TASK>
 dump_stack_lvl+0x34/0x44
 print_report.cold+0x5e/0x5db
 kasan_report+0xab/0x120
 srp_exit_cmd_priv+0x27/0xd0 [ib_srp]
 scsi_mq_exit_request+0x4d/0x70
 blk_mq_free_rqs+0x143/0x410
 __blk_mq_free_map_and_rqs+0x6e/0x100
 blk_mq_free_tag_set+0x2b/0x160
 scsi_host_dev_release+0xf3/0x1a0
 device_release+0x54/0xe0
 kobject_put+0xa5/0x120
 device_release+0x54/0xe0
 kobject_put+0xa5/0x120
 scsi_device_dev_release_usercontext+0x4c1/0x4e0
 execute_in_process_context+0x23/0x90
 device_release+0x54/0xe0
 kobject_put+0xa5/0x120
 scsi_disk_release+0x3f/0x50
 device_release+0x54/0xe0
 kobject_put+0xa5/0x120
 disk_release+0x17f/0x1b0
 device_release+0x54/0xe0
 kobject_put+0xa5/0x120
 dm_put_table_device+0xa3/0x160 [dm_mod]
 dm_put_device+0xd0/0x140 [dm_mod]
 free_priority_group+0xd8/0x110 [dm_multipath]
 free_multipath+0x94/0xe0 [dm_multipath]
 dm_table_destroy+0xa2/0x1e0 [dm_mod]
 __dm_destroy+0x196/0x350 [dm_mod]
 dev_remove+0x10c/0x160 [dm_mod]
 ctl_ioctl+0x2c2/0x590 [dm_mod]
 dm_ctl_ioctl+0x5/0x10 [dm_mod]
 __x64_sys_ioctl+0xb4/0xf0
 dm_ctl_ioctl+0x5/0x10 [dm_mod]
 __x64_sys_ioctl+0xb4/0xf0
 do_syscall_64+0x3b/0x90
 entry_SYSCALL_64_after_hwframe+0x46/0xb0

Reviewed-by: Ming Lei <ming.lei@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: John Garry <john.garry@huawei.com>
Cc: Li Zhijian <lizhijian@fujitsu.com>
Reported-by: Li Zhijian <lizhijian@fujitsu.com>
Tested-by: Li Zhijian <lizhijian@fujitsu.com>
Fixes: 65ca846a5314 ("scsi: core: Introduce {init,exit}_cmd_priv()")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/hosts.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 8fa98c8d0ee0..6c63672971f1 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -197,6 +197,8 @@ void scsi_remove_host(struct Scsi_Host *shost)
 	 * the dependent SCSI targets and devices are gone before returning.
 	 */
 	wait_event(shost->targets_wq, atomic_read(&shost->target_count) == 0);
+
+	scsi_mq_destroy_tags(shost);
 }
 EXPORT_SYMBOL(scsi_remove_host);
 
@@ -302,8 +304,8 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev,
 	return error;
 
 	/*
-	 * Any host allocation in this function will be freed in
-	 * scsi_host_dev_release().
+	 * Any resources associated with the SCSI host in this function except
+	 * the tag set will be freed by scsi_host_dev_release().
 	 */
  out_del_dev:
 	device_del(&shost->shost_dev);
@@ -319,6 +321,7 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev,
 	pm_runtime_disable(&shost->shost_gendev);
 	pm_runtime_set_suspended(&shost->shost_gendev);
 	pm_runtime_put_noidle(&shost->shost_gendev);
+	scsi_mq_destroy_tags(shost);
  fail:
 	return error;
 }
@@ -352,9 +355,6 @@ static void scsi_host_dev_release(struct device *dev)
 		kfree(dev_name(&shost->shost_dev));
 	}
 
-	if (shost->tag_set.tags)
-		scsi_mq_destroy_tags(shost);
-
 	kfree(shost->shost_data);
 
 	ida_free(&host_index_ida, shost->host_no);

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v5 0/4] Call blk_mq_free_tag_set() earlier
  2022-07-28 22:18 [PATCH v5 0/4] Call blk_mq_free_tag_set() earlier Bart Van Assche
                   ` (3 preceding siblings ...)
  2022-07-28 22:18 ` [PATCH v5 4/4] scsi: core: Call blk_mq_free_tag_set() earlier Bart Van Assche
@ 2022-07-29 15:59 ` Mike Christie
  2022-08-01 23:45 ` Martin K. Petersen
  5 siblings, 0 replies; 10+ messages in thread
From: Mike Christie @ 2022-07-29 15:59 UTC (permalink / raw)
  To: Bart Van Assche, Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, Adrian Hunter

On 7/28/22 5:18 PM, Bart Van Assche wrote:
> Hi Martin,
> 
> This patch series fixes a recently reported use-after-free in the SRP driver.
> Please consider this patch series for kernel v5.20.
> 

Reviewed-by: Mike Christie <michael.christie@oracle.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v5 0/4] Call blk_mq_free_tag_set() earlier
  2022-07-28 22:18 [PATCH v5 0/4] Call blk_mq_free_tag_set() earlier Bart Van Assche
                   ` (4 preceding siblings ...)
  2022-07-29 15:59 ` [PATCH v5 0/4] " Mike Christie
@ 2022-08-01 23:45 ` Martin K. Petersen
  5 siblings, 0 replies; 10+ messages in thread
From: Martin K. Petersen @ 2022-08-01 23:45 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Martin K . Petersen, Jaegeuk Kim, linux-scsi, Adrian Hunter


Bart,

> This patch series fixes a recently reported use-after-free in the SRP
> driver.  Please consider this patch series for kernel v5.20.

Applied to 5.20/scsi-staging, thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v5 2/4] scsi: core: Make sure that hosts outlive targets
  2022-07-28 22:18 ` [PATCH v5 2/4] scsi: core: Make sure that hosts outlive targets Bart Van Assche
@ 2022-09-05 17:40   ` Guenter Roeck
  2022-09-06 14:16     ` Bart Van Assche
  0 siblings, 1 reply; 10+ messages in thread
From: Guenter Roeck @ 2022-09-05 17:40 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Martin K . Petersen, Jaegeuk Kim, linux-scsi, Adrian Hunter,
	Ming Lei, Christoph Hellwig, Mike Christie, Hannes Reinecke,
	John Garry, James E.J. Bottomley

On Thu, Jul 28, 2022 at 03:18:49PM -0700, Bart Van Assche wrote:
> From: Ming Lei <ming.lei@redhat.com>
> 
> Fix the race conditions between SCSI LLD kernel module unloading and SCSI
> device and target removal by making sure that SCSI hosts are destroyed after
> all associated target and device objects have been freed.
> 
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Ming Lei <ming.lei@redhat.com>
> Cc: Mike Christie <michael.christie@oracle.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Cc: John Garry <john.garry@huawei.com>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> [ bvanassche: Reworked Ming's patch and split it ]

I know this has been reported before, but it is still seen in the
upstream kernel, so:

This patch results in a deadlock if a USB storage device is removed.

[   29.291148] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[   29.300064] ci_hdrc ci_hdrc.1: remove, state 4
[   29.300317] usb usb2: USB disconnect, device number 1
[   29.305090] ci_hdrc ci_hdrc.1: USB bus 2 deregistered
[   29.307052] ci_hdrc ci_hdrc.0: remove, state 1
[   29.307214] usb usb1: USB disconnect, device number 1
[   29.307321] usb 1-1: USB disconnect, device number 2
[   29.344575] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[   29.345323] sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[   63.358569] INFO: task init:347 blocked for more than 30 seconds.
[   63.358928]       Tainted: G        W        N 6.0.0-rc4-00017-gcec18aa4b63a #1
[   63.359200] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[   63.359600] task:init            state:D stack:    0 pid:  347 ppid:     1 flags:0x00000000
[   63.360104]  __schedule from schedule+0x60/0xbc
[   63.360368]  schedule from scsi_remove_host+0x154/0x1c0
[   63.360602]  scsi_remove_host from usb_stor_disconnect+0x4c/0xac
[   63.360852]  usb_stor_disconnect from usb_unbind_interface+0x74/0x268
[   63.361100]  usb_unbind_interface from device_release_driver_internal+0x1a0/0x22c
[   63.361383]  device_release_driver_internal from bus_remove_device+0xcc/0xfc
[   63.361651]  bus_remove_device from device_del+0x16c/0x3f8
[   63.361877]  device_del from usb_disable_device+0xcc/0x178
[   63.362097]  usb_disable_device from usb_disconnect+0xd0/0x230
[   63.362325]  usb_disconnect from usb_disconnect+0x9c/0x230
[   63.362536]  usb_disconnect from usb_remove_hcd+0xd0/0x16c
[   63.362741]  usb_remove_hcd from host_stop+0x38/0xa8
[   63.362946]  host_stop from ci_hdrc_remove+0x44/0x120
[   63.363148]  ci_hdrc_remove from platform_remove+0x20/0x4c
[   63.363367]  platform_remove from device_release_driver_internal+0x1a0/0x22c
[   63.363635]  device_release_driver_internal from bus_remove_device+0xcc/0xfc
[   63.363897]  bus_remove_device from device_del+0x16c/0x3f8
[   63.364117]  device_del from platform_device_del.part.0+0x10/0x74
[   63.364353]  platform_device_del.part.0 from platform_device_unregister+0x18/0x24
[   63.364623]  platform_device_unregister from ci_hdrc_remove_device+0xc/0x20
[   63.364886]  ci_hdrc_remove_device from ci_hdrc_imx_remove+0x28/0x110
[   63.365131]  ci_hdrc_imx_remove from device_shutdown+0x174/0x250
[   63.365372]  device_shutdown from __do_sys_reboot+0x124/0x270
[   63.365616]  __do_sys_reboot from ret_fast_syscall+0x0/0x1c
[   63.365849] Exception stack(0xd1859fa8 to 0xd1859ff0)
[   63.366054] 9fa0:                   01234567 000c623f fee1dead 28121969 01234567 00000000
[   63.366343] 9fc0: 01234567 000c623f 00000001 00000058 000d85c0 00000000 00000000 00000000
[   63.366620] 9fe0: 000d8298 bef49de4 000918bc b6e8cedc
[   63.366881] INFO: lockdep is turned off.
[   63.367069] Kernel panic - not syncing: hung_task: blocked tasks

I understand that it looks like the problem is caused by the shutdown
function in the imx driver calling remove_device, but that is not really
the problem.

As can be seen in the backtrace, usb_stor_disconnect() calls
scsi_remove_host(). Thanks to this patch, scsi_remove_host() now
waits for the scsi release function to be called. However,
usb_stor_disconnect() only calls release_everything() and with it
scsi_host_put() _after_ scsi_remove_host() has returned. Since
scsi_remove_host() now waits for the resource which is released
by calling scsi_host_put(), this causes a deadlock.

If my analysis is correct, any USB storage device removal should
result in the deadlock. My analysis may of course be wrong. If so,
please let me know what I missed.

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v5 2/4] scsi: core: Make sure that hosts outlive targets
  2022-09-05 17:40   ` Guenter Roeck
@ 2022-09-06 14:16     ` Bart Van Assche
  2022-09-06 14:23       ` Guenter Roeck
  0 siblings, 1 reply; 10+ messages in thread
From: Bart Van Assche @ 2022-09-06 14:16 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Martin K . Petersen, Jaegeuk Kim, linux-scsi, Adrian Hunter,
	Ming Lei, Christoph Hellwig, Mike Christie, Hannes Reinecke,
	John Garry, James E.J. Bottomley

On 9/5/22 10:40, Guenter Roeck wrote:
> I understand that it looks like the problem is caused by the shutdown
> function in the imx driver calling remove_device, but that is not really
> the problem.

Hi Guenter,

A revert for this patch series has been queued and is expected to be sent
soon to Linus. See also
https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/log/?h=6.0/scsi-fixes

Bart.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v5 2/4] scsi: core: Make sure that hosts outlive targets
  2022-09-06 14:16     ` Bart Van Assche
@ 2022-09-06 14:23       ` Guenter Roeck
  0 siblings, 0 replies; 10+ messages in thread
From: Guenter Roeck @ 2022-09-06 14:23 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Martin K . Petersen, Jaegeuk Kim, linux-scsi, Adrian Hunter,
	Ming Lei, Christoph Hellwig, Mike Christie, Hannes Reinecke,
	John Garry, James E.J. Bottomley

On Tue, Sep 06, 2022 at 07:16:15AM -0700, Bart Van Assche wrote:
> On 9/5/22 10:40, Guenter Roeck wrote:
> > I understand that it looks like the problem is caused by the shutdown
> > function in the imx driver calling remove_device, but that is not really
> > the problem.
> 
> Hi Guenter,
> 
> A revert for this patch series has been queued and is expected to be sent
> soon to Linus. See also
> https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/log/?h=6.0/scsi-fixes
> 

Excellent, thanks!

Guenter

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-09-06 15:16 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-28 22:18 [PATCH v5 0/4] Call blk_mq_free_tag_set() earlier Bart Van Assche
2022-07-28 22:18 ` [PATCH v5 1/4] scsi: core: Make sure that targets outlive devices Bart Van Assche
2022-07-28 22:18 ` [PATCH v5 2/4] scsi: core: Make sure that hosts outlive targets Bart Van Assche
2022-09-05 17:40   ` Guenter Roeck
2022-09-06 14:16     ` Bart Van Assche
2022-09-06 14:23       ` Guenter Roeck
2022-07-28 22:18 ` [PATCH v5 3/4] scsi: core: Simplify LLD module reference counting Bart Van Assche
2022-07-28 22:18 ` [PATCH v5 4/4] scsi: core: Call blk_mq_free_tag_set() earlier Bart Van Assche
2022-07-29 15:59 ` [PATCH v5 0/4] " Mike Christie
2022-08-01 23:45 ` Martin K. Petersen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.