linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] scsi: fix failure handling of alloc/add host
@ 2021-06-02 13:30 Ming Lei
  2021-06-02 13:30 ` [PATCH 1/4] scsi: core: fix error handling of scsi_host_alloc Ming Lei
                   ` (5 more replies)
  0 siblings, 6 replies; 23+ messages in thread
From: Ming Lei @ 2021-06-02 13:30 UTC (permalink / raw)
  To: Martin K . Petersen, linux-scsi
  Cc: Ming Lei, Bart Van Assche, John Garry, Hannes Reinecke

Hello Martin,

Fix failure handling of alloc/add host code, and related device release
handling.


Ming Lei (4):
  scsi: core: fix error handling of scsi_host_alloc
  scsi: core: fix failure handling of scsi_add_host_with_dma
  scsi: core: put .shost_dev in failure path if host state becomes
    running
  scsi: core: only put parent device if host state isn't in
    SHOST_CREATED

 drivers/scsi/hosts.c | 47 ++++++++++++++++++++++++--------------------
 1 file changed, 26 insertions(+), 21 deletions(-)

Cc: Bart Van Assche <bvanassche@acm.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Hannes Reinecke <hare@suse.de>

-- 
2.29.2


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 1/4] scsi: core: fix error handling of scsi_host_alloc
  2021-06-02 13:30 [PATCH 0/4] scsi: fix failure handling of alloc/add host Ming Lei
@ 2021-06-02 13:30 ` Ming Lei
  2021-06-03  2:26   ` Bart Van Assche
                     ` (3 more replies)
  2021-06-02 13:30 ` [PATCH 2/4] scsi: core: fix failure handling of scsi_add_host_with_dma Ming Lei
                   ` (4 subsequent siblings)
  5 siblings, 4 replies; 23+ messages in thread
From: Ming Lei @ 2021-06-02 13:30 UTC (permalink / raw)
  To: Martin K . Petersen, linux-scsi
  Cc: Ming Lei, Bart Van Assche, John Garry, Hannes Reinecke

After device is initialized via device_initialize(), or its name is
set via dev_set_name(), the device has to be freed via put_device(),
otherwise device name will be leaked because it is allocated
dynamically in dev_set_name().

Fixes the issue by replacing kfree(shost) via put_device(&shost->shost_gendev)
which can help to free dev_name(&shost->shost_dev) when host state is
in SHOST_CREATED. Meantime needn't to remove IDA and stop the kthread of
shost->ehandler in the error handling code.

Cc: Bart Van Assche <bvanassche@acm.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/scsi/hosts.c | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 624e2582c3df..25cf76e73595 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -391,8 +391,10 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
 	mutex_init(&shost->scan_mutex);
 
 	index = ida_simple_get(&host_index_ida, 0, 0, GFP_KERNEL);
-	if (index < 0)
-		goto fail_kfree;
+	if (index < 0) {
+		kfree(shost);
+		return NULL;
+	}
 	shost->host_no = index;
 
 	shost->dma_channel = 0xff;
@@ -484,7 +486,7 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
 		shost_printk(KERN_WARNING, shost,
 			"error handler thread failed to spawn, error = %ld\n",
 			PTR_ERR(shost->ehandler));
-		goto fail_index_remove;
+		goto fail;
 	}
 
 	shost->tmf_work_q = alloc_workqueue("scsi_tmf_%d",
@@ -493,17 +495,18 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
 	if (!shost->tmf_work_q) {
 		shost_printk(KERN_WARNING, shost,
 			     "failed to create tmf workq\n");
-		goto fail_kthread;
+		goto fail;
 	}
 	scsi_proc_hostdir_add(shost->hostt);
 	return shost;
+ fail:
+	/*
+	 * host state is still SHOST_CREATED, and it is enough to release
+	 * ->shost_gendev since scsi_host_dev_release() can help to free
+	 * dev_name(&shost->shost_dev)
+	 */
+	put_device(&shost->shost_gendev);
 
- fail_kthread:
-	kthread_stop(shost->ehandler);
- fail_index_remove:
-	ida_simple_remove(&host_index_ida, shost->host_no);
- fail_kfree:
-	kfree(shost);
 	return NULL;
 }
 EXPORT_SYMBOL(scsi_host_alloc);
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 2/4] scsi: core: fix failure handling of scsi_add_host_with_dma
  2021-06-02 13:30 [PATCH 0/4] scsi: fix failure handling of alloc/add host Ming Lei
  2021-06-02 13:30 ` [PATCH 1/4] scsi: core: fix error handling of scsi_host_alloc Ming Lei
@ 2021-06-02 13:30 ` Ming Lei
  2021-06-03  2:32   ` Bart Van Assche
                     ` (2 more replies)
  2021-06-02 13:30 ` [PATCH 3/4] scsi: core: put .shost_dev in failure path if host state becomes running Ming Lei
                   ` (3 subsequent siblings)
  5 siblings, 3 replies; 23+ messages in thread
From: Ming Lei @ 2021-06-02 13:30 UTC (permalink / raw)
  To: Martin K . Petersen, linux-scsi
  Cc: Ming Lei, Bart Van Assche, John Garry, Hannes Reinecke

When scsi_add_host_with_dma() return failure, the caller will call
scsi_host_put(shost) to release everything allocated for this host
instance. So we can't free allocated stuff in scsi_add_host_with_dma(),
otherwise double free will be caused.

Strictly speaking, these host resources allocation should have been
moved to scsi_host_alloc(), but the allocation may need driver's
info which can be built between calling scsi_host_alloc() and
scsi_add_host(), so just keep the allocations in
scsi_add_host_with_dma().

Fixes the problem by relying on host device's release handler to
release everything.

Cc: Bart Van Assche <bvanassche@acm.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/scsi/hosts.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 25cf76e73595..796736e47764 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -281,23 +281,22 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev,
 
 		if (!shost->work_q) {
 			error = -EINVAL;
-			goto out_free_shost_data;
+			goto out_del_dev;
 		}
 	}
 
 	error = scsi_sysfs_add_host(shost);
 	if (error)
-		goto out_destroy_host;
+		goto out_del_dev;
 
 	scsi_proc_host_add(shost);
 	scsi_autopm_put_host(shost);
 	return error;
 
- out_destroy_host:
-	if (shost->work_q)
-		destroy_workqueue(shost->work_q);
- out_free_shost_data:
-	kfree(shost->shost_data);
+	/*
+	 * any host allocation in this function will be freed in
+	 * scsi_host_dev_release, so don't free them in the failure path
+	 */
  out_del_dev:
 	device_del(&shost->shost_dev);
  out_del_gendev:
@@ -307,7 +306,6 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev,
 	pm_runtime_disable(&shost->shost_gendev);
 	pm_runtime_set_suspended(&shost->shost_gendev);
 	pm_runtime_put_noidle(&shost->shost_gendev);
-	scsi_mq_destroy_tags(shost);
  fail:
 	return error;
 }
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 3/4] scsi: core: put .shost_dev in failure path if host state becomes running
  2021-06-02 13:30 [PATCH 0/4] scsi: fix failure handling of alloc/add host Ming Lei
  2021-06-02 13:30 ` [PATCH 1/4] scsi: core: fix error handling of scsi_host_alloc Ming Lei
  2021-06-02 13:30 ` [PATCH 2/4] scsi: core: fix failure handling of scsi_add_host_with_dma Ming Lei
@ 2021-06-02 13:30 ` Ming Lei
  2021-06-03  3:06   ` Bart Van Assche
                     ` (2 more replies)
  2021-06-02 13:30 ` [PATCH 4/4] scsi: core: only put parent device if host state isn't in SHOST_CREATED Ming Lei
                   ` (2 subsequent siblings)
  5 siblings, 3 replies; 23+ messages in thread
From: Ming Lei @ 2021-06-02 13:30 UTC (permalink / raw)
  To: Martin K . Petersen, linux-scsi
  Cc: Ming Lei, John Garry, Bart Van Assche, Hannes Reinecke

scsi_host_dev_release() only works around for us by freeing
dev_name(&shost->shost_dev) when host state is SHOST_CREATED. After host
state is changed to SHOST_RUNNING, scsi_host_dev_release() doesn't do
that any more.

So fix the issue by put .shost_dev in failure path if host state becomes
running, meantime move get_device(&shost->shost_gendev) before
device_add(&shost->shost_dev), so that scsi_host_cls_release() can put
this reference.

Reported-by: John Garry <john.garry@huawei.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/scsi/hosts.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 796736e47764..7049844adb6b 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -257,12 +257,11 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev,
 
 	device_enable_async_suspend(&shost->shost_dev);
 
+	get_device(&shost->shost_gendev);
 	error = device_add(&shost->shost_dev);
 	if (error)
 		goto out_del_gendev;
 
-	get_device(&shost->shost_gendev);
-
 	if (shost->transportt->host_size) {
 		shost->shost_data = kzalloc(shost->transportt->host_size,
 					 GFP_KERNEL);
@@ -300,6 +299,11 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev,
  out_del_dev:
 	device_del(&shost->shost_dev);
  out_del_gendev:
+	/*
+	 * host state has become SHOST_RUNNING, so we have to release
+	 * ->shost_dev explicitly
+	 */
+	put_device(&shost->shost_dev);
 	device_del(&shost->shost_gendev);
  out_disable_runtime_pm:
 	device_disable_async_suspend(&shost->shost_gendev);
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 4/4] scsi: core: only put parent device if host state isn't in SHOST_CREATED
  2021-06-02 13:30 [PATCH 0/4] scsi: fix failure handling of alloc/add host Ming Lei
                   ` (2 preceding siblings ...)
  2021-06-02 13:30 ` [PATCH 3/4] scsi: core: put .shost_dev in failure path if host state becomes running Ming Lei
@ 2021-06-02 13:30 ` Ming Lei
  2021-06-03  3:08   ` Bart Van Assche
                     ` (2 more replies)
  2021-06-03 15:43 ` [PATCH 0/4] scsi: fix failure handling of alloc/add host John Garry
  2021-06-08  3:04 ` Martin K. Petersen
  5 siblings, 3 replies; 23+ messages in thread
From: Ming Lei @ 2021-06-02 13:30 UTC (permalink / raw)
  To: Martin K . Petersen, linux-scsi
  Cc: Ming Lei, Bart Van Assche, John Garry, Hannes Reinecke

get_device(shost->shost_gendev.parent) is called after host state is
changed to SHOST_RUNNING. So scsi_host_dev_release() shouldn't release
the parent device if host state is still SHOST_CREATED.

Cc: Bart Van Assche <bvanassche@acm.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/scsi/hosts.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 7049844adb6b..34db5be7a562 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -350,7 +350,7 @@ static void scsi_host_dev_release(struct device *dev)
 
 	ida_simple_remove(&host_index_ida, shost->host_no);
 
-	if (parent)
+	if (shost->shost_state != SHOST_CREATED)
 		put_device(parent);
 	kfree(shost);
 }
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/4] scsi: core: fix error handling of scsi_host_alloc
  2021-06-02 13:30 ` [PATCH 1/4] scsi: core: fix error handling of scsi_host_alloc Ming Lei
@ 2021-06-03  2:26   ` Bart Van Assche
  2021-06-03 15:40   ` John Garry
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 23+ messages in thread
From: Bart Van Assche @ 2021-06-03  2:26 UTC (permalink / raw)
  To: Ming Lei, Martin K . Petersen, linux-scsi; +Cc: John Garry, Hannes Reinecke

On 6/2/21 6:30 AM, Ming Lei wrote:
> After device is initialized via device_initialize(), or its name is
> set via dev_set_name(), the device has to be freed via put_device(),
> otherwise device name will be leaked because it is allocated
> dynamically in dev_set_name().

dev_set_name() must be called after device_initialize() so I think the
reference to dev_set_name() can be left out from the above sentence.

>  	return shost;
> + fail:

Please leave a blank line above labels. Otherwise this patch looks good
to me hence:

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/4] scsi: core: fix failure handling of scsi_add_host_with_dma
  2021-06-02 13:30 ` [PATCH 2/4] scsi: core: fix failure handling of scsi_add_host_with_dma Ming Lei
@ 2021-06-03  2:32   ` Bart Van Assche
  2021-06-03 15:40   ` John Garry
  2021-06-07 11:37   ` Hannes Reinecke
  2 siblings, 0 replies; 23+ messages in thread
From: Bart Van Assche @ 2021-06-03  2:32 UTC (permalink / raw)
  To: Ming Lei, Martin K . Petersen, linux-scsi; +Cc: John Garry, Hannes Reinecke

On 6/2/21 6:30 AM, Ming Lei wrote:
> When scsi_add_host_with_dma() return failure, the caller will call
> scsi_host_put(shost) to release everything allocated for this host
> instance. So we can't free allocated stuff in scsi_add_host_with_dma(),
> otherwise double free will be caused.
> 
> Strictly speaking, these host resources allocation should have been
> moved to scsi_host_alloc(), but the allocation may need driver's
> info which can be built between calling scsi_host_alloc() and
> scsi_add_host(), so just keep the allocations in
> scsi_add_host_with_dma().
> 
> Fixes the problem by relying on host device's release handler to
> release everything.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 3/4] scsi: core: put .shost_dev in failure path if host state becomes running
  2021-06-02 13:30 ` [PATCH 3/4] scsi: core: put .shost_dev in failure path if host state becomes running Ming Lei
@ 2021-06-03  3:06   ` Bart Van Assche
  2021-06-03  3:22     ` Ming Lei
  2021-06-03 15:41   ` John Garry
  2021-06-07 11:40   ` Hannes Reinecke
  2 siblings, 1 reply; 23+ messages in thread
From: Bart Van Assche @ 2021-06-03  3:06 UTC (permalink / raw)
  To: Ming Lei, Martin K . Petersen, linux-scsi; +Cc: John Garry, Hannes Reinecke

On 6/2/21 6:30 AM, Ming Lei wrote:
> scsi_host_dev_release() only works around for us by freeing
> dev_name(&shost->shost_dev) when host state is SHOST_CREATED. After host
> state is changed to SHOST_RUNNING, scsi_host_dev_release() doesn't do
> that any more.
> 
> So fix the issue by put .shost_dev in failure path if host state becomes
> running, meantime move get_device(&shost->shost_gendev) before
> device_add(&shost->shost_dev), so that scsi_host_cls_release() can put
> this reference.
> 
> Reported-by: John Garry <john.garry@huawei.com>
> Cc: Bart Van Assche <bvanassche@acm.org>
> Cc: Hannes Reinecke <hare@suse.de>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  drivers/scsi/hosts.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
> index 796736e47764..7049844adb6b 100644
> --- a/drivers/scsi/hosts.c
> +++ b/drivers/scsi/hosts.c
> @@ -257,12 +257,11 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev,
>  
>  	device_enable_async_suspend(&shost->shost_dev);
>  
> +	get_device(&shost->shost_gendev);
>  	error = device_add(&shost->shost_dev);
>  	if (error)
>  		goto out_del_gendev;
>  
> -	get_device(&shost->shost_gendev);
> -
>  	if (shost->transportt->host_size) {
>  		shost->shost_data = kzalloc(shost->transportt->host_size,
>  					 GFP_KERNEL);
> @@ -300,6 +299,11 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev,
>   out_del_dev:
>  	device_del(&shost->shost_dev);
>   out_del_gendev:
> +	/*
> +	 * host state has become SHOST_RUNNING, so we have to release
> +	 * ->shost_dev explicitly
> +	 */
> +	put_device(&shost->shost_dev);
>  	device_del(&shost->shost_gendev);
>   out_disable_runtime_pm:
>  	device_disable_async_suspend(&shost->shost_gendev);

Shouldn't this change be merged into patch 2/4 since both patches touch
the same function? Anyway, this patch also looks good to me.

Bart.



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/4] scsi: core: only put parent device if host state isn't in SHOST_CREATED
  2021-06-02 13:30 ` [PATCH 4/4] scsi: core: only put parent device if host state isn't in SHOST_CREATED Ming Lei
@ 2021-06-03  3:08   ` Bart Van Assche
  2021-06-03 15:38   ` John Garry
  2021-06-07 11:41   ` Hannes Reinecke
  2 siblings, 0 replies; 23+ messages in thread
From: Bart Van Assche @ 2021-06-03  3:08 UTC (permalink / raw)
  To: Ming Lei, Martin K . Petersen, linux-scsi; +Cc: John Garry, Hannes Reinecke

On 6/2/21 6:30 AM, Ming Lei wrote:
> get_device(shost->shost_gendev.parent) is called after host state is
> changed to SHOST_RUNNING. So scsi_host_dev_release() shouldn't release
> the parent device if host state is still SHOST_CREATED.
> 
> Cc: Bart Van Assche <bvanassche@acm.org>
> Cc: John Garry <john.garry@huawei.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  drivers/scsi/hosts.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
> index 7049844adb6b..34db5be7a562 100644
> --- a/drivers/scsi/hosts.c
> +++ b/drivers/scsi/hosts.c
> @@ -350,7 +350,7 @@ static void scsi_host_dev_release(struct device *dev)
>  
>  	ida_simple_remove(&host_index_ida, shost->host_no);
>  
> -	if (parent)
> +	if (shost->shost_state != SHOST_CREATED)
>  		put_device(parent);
>  	kfree(shost);
>  }

Not sure what others think but I prefer that this patch would also be
merged into patch 2/4.

Thanks,

Bart.



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 3/4] scsi: core: put .shost_dev in failure path if host state becomes running
  2021-06-03  3:06   ` Bart Van Assche
@ 2021-06-03  3:22     ` Ming Lei
  0 siblings, 0 replies; 23+ messages in thread
From: Ming Lei @ 2021-06-03  3:22 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Martin K . Petersen, linux-scsi, John Garry, Hannes Reinecke

On Wed, Jun 02, 2021 at 08:06:31PM -0700, Bart Van Assche wrote:
> On 6/2/21 6:30 AM, Ming Lei wrote:
> > scsi_host_dev_release() only works around for us by freeing
> > dev_name(&shost->shost_dev) when host state is SHOST_CREATED. After host
> > state is changed to SHOST_RUNNING, scsi_host_dev_release() doesn't do
> > that any more.
> > 
> > So fix the issue by put .shost_dev in failure path if host state becomes
> > running, meantime move get_device(&shost->shost_gendev) before
> > device_add(&shost->shost_dev), so that scsi_host_cls_release() can put
> > this reference.
> > 
> > Reported-by: John Garry <john.garry@huawei.com>
> > Cc: Bart Van Assche <bvanassche@acm.org>
> > Cc: Hannes Reinecke <hare@suse.de>
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >  drivers/scsi/hosts.c | 8 ++++++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
> > index 796736e47764..7049844adb6b 100644
> > --- a/drivers/scsi/hosts.c
> > +++ b/drivers/scsi/hosts.c
> > @@ -257,12 +257,11 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev,
> >  
> >  	device_enable_async_suspend(&shost->shost_dev);
> >  
> > +	get_device(&shost->shost_gendev);
> >  	error = device_add(&shost->shost_dev);
> >  	if (error)
> >  		goto out_del_gendev;
> >  
> > -	get_device(&shost->shost_gendev);
> > -
> >  	if (shost->transportt->host_size) {
> >  		shost->shost_data = kzalloc(shost->transportt->host_size,
> >  					 GFP_KERNEL);
> > @@ -300,6 +299,11 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev,
> >   out_del_dev:
> >  	device_del(&shost->shost_dev);
> >   out_del_gendev:
> > +	/*
> > +	 * host state has become SHOST_RUNNING, so we have to release
> > +	 * ->shost_dev explicitly
> > +	 */
> > +	put_device(&shost->shost_dev);
> >  	device_del(&shost->shost_gendev);
> >   out_disable_runtime_pm:
> >  	device_disable_async_suspend(&shost->shost_gendev);
> 
> Shouldn't this change be merged into patch 2/4 since both patches touch
> the same function? Anyway, this patch also looks good to me.

2/4 address double-free, this one fixes memory leak. Not mention this
one isn't trivial to find & figuring out, so it will be easier to review by
splitting them out.
 

Thanks,
Ming


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/4] scsi: core: only put parent device if host state isn't in SHOST_CREATED
  2021-06-02 13:30 ` [PATCH 4/4] scsi: core: only put parent device if host state isn't in SHOST_CREATED Ming Lei
  2021-06-03  3:08   ` Bart Van Assche
@ 2021-06-03 15:38   ` John Garry
  2021-06-07 11:41   ` Hannes Reinecke
  2 siblings, 0 replies; 23+ messages in thread
From: John Garry @ 2021-06-03 15:38 UTC (permalink / raw)
  To: Ming Lei, Martin K . Petersen, linux-scsi
  Cc: Bart Van Assche, Hannes Reinecke

On 02/06/2021 14:30, Ming Lei wrote:
> get_device(shost->shost_gendev.parent) is called after host state is
> changed to SHOST_RUNNING. So scsi_host_dev_release() shouldn't release
> the parent device if host state is still SHOST_CREATED.
> 
> Cc: Bart Van Assche <bvanassche@acm.org>
> Cc: John Garry <john.garry@huawei.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>   drivers/scsi/hosts.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
> index 7049844adb6b..34db5be7a562 100644
> --- a/drivers/scsi/hosts.c
> +++ b/drivers/scsi/hosts.c
> @@ -350,7 +350,7 @@ static void scsi_host_dev_release(struct device *de >
>   	ida_simple_remove(&host_index_ida, shost->host_no);
>   
> -	if (parent)
> +	if (shost->shost_state != SHOST_CREATED)
>   		put_device(parent);
>   	kfree(shost);
>   }
> 

Reviewed-by: John Garry <john.garry@huawei.com>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/4] scsi: core: fix error handling of scsi_host_alloc
  2021-06-02 13:30 ` [PATCH 1/4] scsi: core: fix error handling of scsi_host_alloc Ming Lei
  2021-06-03  2:26   ` Bart Van Assche
@ 2021-06-03 15:40   ` John Garry
  2021-06-07 11:39   ` Hannes Reinecke
  2021-06-29 19:23   ` Tyrel Datwyler
  3 siblings, 0 replies; 23+ messages in thread
From: John Garry @ 2021-06-03 15:40 UTC (permalink / raw)
  To: Ming Lei, Martin K . Petersen, linux-scsi
  Cc: Bart Van Assche, Hannes Reinecke

On 02/06/2021 14:30, Ming Lei wrote:
> After device is initialized via device_initialize(), or its name is
> set via dev_set_name(), the device has to be freed via put_device(),
> otherwise device name will be leaked because it is allocated
> dynamically in dev_set_name().
> 
> Fixes the issue by replacing kfree(shost) via put_device(&shost->shost_gendev)
> which can help to free dev_name(&shost->shost_dev) when host state is
> in SHOST_CREATED. Meantime needn't to remove IDA and stop the kthread of
> shost->ehandler in the error handling code.
> 
> Cc: Bart Van Assche<bvanassche@acm.org>
> Cc: John Garry<john.garry@huawei.com>
> Cc: Hannes Reinecke<hare@suse.de>
> Signed-off-by: Ming Lei<ming.lei@redhat.com>

Reviewed-by: John Garry <john.garry@huawei.com>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/4] scsi: core: fix failure handling of scsi_add_host_with_dma
  2021-06-02 13:30 ` [PATCH 2/4] scsi: core: fix failure handling of scsi_add_host_with_dma Ming Lei
  2021-06-03  2:32   ` Bart Van Assche
@ 2021-06-03 15:40   ` John Garry
  2021-06-07 11:37   ` Hannes Reinecke
  2 siblings, 0 replies; 23+ messages in thread
From: John Garry @ 2021-06-03 15:40 UTC (permalink / raw)
  To: Ming Lei, Martin K . Petersen, linux-scsi
  Cc: Bart Van Assche, Hannes Reinecke

On 02/06/2021 14:30, Ming Lei wrote:
> When scsi_add_host_with_dma() return failure, the caller will call
> scsi_host_put(shost) to release everything allocated for this host
> instance. So we can't free allocated stuff in scsi_add_host_with_dma(),
> otherwise double free will be caused.
> 
> Strictly speaking, these host resources allocation should have been
> moved to scsi_host_alloc(), but the allocation may need driver's
> info which can be built between calling scsi_host_alloc() and
> scsi_add_host(), so just keep the allocations in
> scsi_add_host_with_dma().
> 
> Fixes the problem by relying on host device's release handler to
> release everything.
> 
> Cc: Bart Van Assche<bvanassche@acm.org>
> Cc: John Garry<john.garry@huawei.com>
> Cc: Hannes Reinecke<hare@suse.de>
> Signed-off-by: Ming Lei<ming.lei@redhat.com>

Reviewed-by: John Garry <john.garry@huawei.com>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 3/4] scsi: core: put .shost_dev in failure path if host state becomes running
  2021-06-02 13:30 ` [PATCH 3/4] scsi: core: put .shost_dev in failure path if host state becomes running Ming Lei
  2021-06-03  3:06   ` Bart Van Assche
@ 2021-06-03 15:41   ` John Garry
  2021-06-07 11:40   ` Hannes Reinecke
  2 siblings, 0 replies; 23+ messages in thread
From: John Garry @ 2021-06-03 15:41 UTC (permalink / raw)
  To: Ming Lei, Martin K . Petersen, linux-scsi
  Cc: Bart Van Assche, Hannes Reinecke

On 02/06/2021 14:30, Ming Lei wrote:
> scsi_host_dev_release() only works around for us by freeing
> dev_name(&shost->shost_dev) when host state is SHOST_CREATED. After host
> state is changed to SHOST_RUNNING, scsi_host_dev_release() doesn't do
> that any more.
> 
> So fix the issue by put .shost_dev in failure path if host state becomes
> running, meantime move get_device(&shost->shost_gendev) before
> device_add(&shost->shost_dev), so that scsi_host_cls_release() can put
> this reference.
> 
> Reported-by: John Garry<john.garry@huawei.com>
> Cc: Bart Van Assche<bvanassche@acm.org>
> Cc: Hannes Reinecke<hare@suse.de>
> Signed-off-by: Ming Lei<ming.lei@redhat.com>

Reviewed-by: John Garry <john.garry@huawei.com>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/4] scsi: fix failure handling of alloc/add host
  2021-06-02 13:30 [PATCH 0/4] scsi: fix failure handling of alloc/add host Ming Lei
                   ` (3 preceding siblings ...)
  2021-06-02 13:30 ` [PATCH 4/4] scsi: core: only put parent device if host state isn't in SHOST_CREATED Ming Lei
@ 2021-06-03 15:43 ` John Garry
  2021-06-08  3:04 ` Martin K. Petersen
  5 siblings, 0 replies; 23+ messages in thread
From: John Garry @ 2021-06-03 15:43 UTC (permalink / raw)
  To: Ming Lei, Martin K . Petersen, linux-scsi
  Cc: Bart Van Assche, Hannes Reinecke

On 02/06/2021 14:30, Ming Lei wrote:
> Hello Martin,
> 
> Fix failure handling of alloc/add host code, and related device release
> handling.
> 
> 
> Ming Lei (4):
>    scsi: core: fix error handling of scsi_host_alloc
>    scsi: core: fix failure handling of scsi_add_host_with_dma
>    scsi: core: put .shost_dev in failure path if host state becomes
>      running
>    scsi: core: only put parent device if host state isn't in
>      SHOST_CREATED
> 
>   drivers/scsi/hosts.c | 47 ++++++++++++++++++++++++--------------------
>   1 file changed, 26 insertions(+), 21 deletions(-)
> 
> Cc: Bart Van Assche <bvanassche@acm.org>
> Cc: John Garry <john.garry@huawei.com>
> Cc: Hannes Reinecke <hare@suse.de>
> 

I tested a single error path in each of scsi_host_alloc() and 
scsi_add_host_with_dma(), and it looked ok, so:

Tested-by: John Garry <john.garry@huawei.com>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/4] scsi: core: fix failure handling of scsi_add_host_with_dma
  2021-06-02 13:30 ` [PATCH 2/4] scsi: core: fix failure handling of scsi_add_host_with_dma Ming Lei
  2021-06-03  2:32   ` Bart Van Assche
  2021-06-03 15:40   ` John Garry
@ 2021-06-07 11:37   ` Hannes Reinecke
  2 siblings, 0 replies; 23+ messages in thread
From: Hannes Reinecke @ 2021-06-07 11:37 UTC (permalink / raw)
  To: Ming Lei, Martin K . Petersen, linux-scsi; +Cc: Bart Van Assche, John Garry

On 6/2/21 3:30 PM, Ming Lei wrote:
> When scsi_add_host_with_dma() return failure, the caller will call
> scsi_host_put(shost) to release everything allocated for this host
> instance. So we can't free allocated stuff in scsi_add_host_with_dma(),
> otherwise double free will be caused.
> 
> Strictly speaking, these host resources allocation should have been
> moved to scsi_host_alloc(), but the allocation may need driver's
> info which can be built between calling scsi_host_alloc() and
> scsi_add_host(), so just keep the allocations in
> scsi_add_host_with_dma().
> 
> Fixes the problem by relying on host device's release handler to
> release everything.
> 
> Cc: Bart Van Assche <bvanassche@acm.org>
> Cc: John Garry <john.garry@huawei.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  drivers/scsi/hosts.c | 14 ++++++--------
>  1 file changed, 6 insertions(+), 8 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		        Kernel Storage Architect
hare@suse.de			               +49 911 74053 688
SUSE Software Solutions Germany GmbH, 90409 Nürnberg
GF: F. Imendörffer, HRB 36809 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/4] scsi: core: fix error handling of scsi_host_alloc
  2021-06-02 13:30 ` [PATCH 1/4] scsi: core: fix error handling of scsi_host_alloc Ming Lei
  2021-06-03  2:26   ` Bart Van Assche
  2021-06-03 15:40   ` John Garry
@ 2021-06-07 11:39   ` Hannes Reinecke
  2021-06-29 19:23   ` Tyrel Datwyler
  3 siblings, 0 replies; 23+ messages in thread
From: Hannes Reinecke @ 2021-06-07 11:39 UTC (permalink / raw)
  To: Ming Lei, Martin K . Petersen, linux-scsi; +Cc: Bart Van Assche, John Garry

On 6/2/21 3:30 PM, Ming Lei wrote:
> After device is initialized via device_initialize(), or its name is
> set via dev_set_name(), the device has to be freed via put_device(),
> otherwise device name will be leaked because it is allocated
> dynamically in dev_set_name().
> 
> Fixes the issue by replacing kfree(shost) via put_device(&shost->shost_gendev)
> which can help to free dev_name(&shost->shost_dev) when host state is
> in SHOST_CREATED. Meantime needn't to remove IDA and stop the kthread of
> shost->ehandler in the error handling code.
> 
> Cc: Bart Van Assche <bvanassche@acm.org>
> Cc: John Garry <john.garry@huawei.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  drivers/scsi/hosts.c | 23 +++++++++++++----------
>  1 file changed, 13 insertions(+), 10 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		        Kernel Storage Architect
hare@suse.de			               +49 911 74053 688
SUSE Software Solutions Germany GmbH, 90409 Nürnberg
GF: F. Imendörffer, HRB 36809 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 3/4] scsi: core: put .shost_dev in failure path if host state becomes running
  2021-06-02 13:30 ` [PATCH 3/4] scsi: core: put .shost_dev in failure path if host state becomes running Ming Lei
  2021-06-03  3:06   ` Bart Van Assche
  2021-06-03 15:41   ` John Garry
@ 2021-06-07 11:40   ` Hannes Reinecke
  2 siblings, 0 replies; 23+ messages in thread
From: Hannes Reinecke @ 2021-06-07 11:40 UTC (permalink / raw)
  To: Ming Lei, Martin K . Petersen, linux-scsi; +Cc: John Garry, Bart Van Assche

On 6/2/21 3:30 PM, Ming Lei wrote:
> scsi_host_dev_release() only works around for us by freeing
> dev_name(&shost->shost_dev) when host state is SHOST_CREATED. After host
> state is changed to SHOST_RUNNING, scsi_host_dev_release() doesn't do
> that any more.
> 
> So fix the issue by put .shost_dev in failure path if host state becomes
> running, meantime move get_device(&shost->shost_gendev) before
> device_add(&shost->shost_dev), so that scsi_host_cls_release() can put
> this reference.
> 
> Reported-by: John Garry <john.garry@huawei.com>
> Cc: Bart Van Assche <bvanassche@acm.org>
> Cc: Hannes Reinecke <hare@suse.de>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  drivers/scsi/hosts.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		        Kernel Storage Architect
hare@suse.de			               +49 911 74053 688
SUSE Software Solutions Germany GmbH, 90409 Nürnberg
GF: F. Imendörffer, HRB 36809 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/4] scsi: core: only put parent device if host state isn't in SHOST_CREATED
  2021-06-02 13:30 ` [PATCH 4/4] scsi: core: only put parent device if host state isn't in SHOST_CREATED Ming Lei
  2021-06-03  3:08   ` Bart Van Assche
  2021-06-03 15:38   ` John Garry
@ 2021-06-07 11:41   ` Hannes Reinecke
  2021-06-07 11:56     ` Ming Lei
  2 siblings, 1 reply; 23+ messages in thread
From: Hannes Reinecke @ 2021-06-07 11:41 UTC (permalink / raw)
  To: Ming Lei, Martin K . Petersen, linux-scsi; +Cc: Bart Van Assche, John Garry

On 6/2/21 3:30 PM, Ming Lei wrote:
> get_device(shost->shost_gendev.parent) is called after host state is
> changed to SHOST_RUNNING. So scsi_host_dev_release() shouldn't release
> the parent device if host state is still SHOST_CREATED.
> 
> Cc: Bart Van Assche <bvanassche@acm.org>
> Cc: John Garry <john.garry@huawei.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  drivers/scsi/hosts.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
> index 7049844adb6b..34db5be7a562 100644
> --- a/drivers/scsi/hosts.c
> +++ b/drivers/scsi/hosts.c
> @@ -350,7 +350,7 @@ static void scsi_host_dev_release(struct device *dev)
>  
>  	ida_simple_remove(&host_index_ida, shost->host_no);
>  
> -	if (parent)
> +	if (shost->shost_state != SHOST_CREATED)
>  		put_device(parent);
>  	kfree(shost);
>  }
> 
What happened to the check 'if (parent)'?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		        Kernel Storage Architect
hare@suse.de			               +49 911 74053 688
SUSE Software Solutions Germany GmbH, 90409 Nürnberg
GF: F. Imendörffer, HRB 36809 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/4] scsi: core: only put parent device if host state isn't in SHOST_CREATED
  2021-06-07 11:41   ` Hannes Reinecke
@ 2021-06-07 11:56     ` Ming Lei
  0 siblings, 0 replies; 23+ messages in thread
From: Ming Lei @ 2021-06-07 11:56 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Martin K . Petersen, linux-scsi, Bart Van Assche, John Garry

On Mon, Jun 07, 2021 at 01:41:36PM +0200, Hannes Reinecke wrote:
> On 6/2/21 3:30 PM, Ming Lei wrote:
> > get_device(shost->shost_gendev.parent) is called after host state is
> > changed to SHOST_RUNNING. So scsi_host_dev_release() shouldn't release
> > the parent device if host state is still SHOST_CREATED.
> > 
> > Cc: Bart Van Assche <bvanassche@acm.org>
> > Cc: John Garry <john.garry@huawei.com>
> > Cc: Hannes Reinecke <hare@suse.de>
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >  drivers/scsi/hosts.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
> > index 7049844adb6b..34db5be7a562 100644
> > --- a/drivers/scsi/hosts.c
> > +++ b/drivers/scsi/hosts.c
> > @@ -350,7 +350,7 @@ static void scsi_host_dev_release(struct device *dev)
> >  
> >  	ida_simple_remove(&host_index_ida, shost->host_no);
> >  
> > -	if (parent)
> > +	if (shost->shost_state != SHOST_CREATED)
> >  		put_device(parent);
> >  	kfree(shost);
> >  }
> > 
> What happened to the check 'if (parent)'?

put_device() is just like kfree(), and it will handle null 'parent'.

Thanks,
Ming


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/4] scsi: fix failure handling of alloc/add host
  2021-06-02 13:30 [PATCH 0/4] scsi: fix failure handling of alloc/add host Ming Lei
                   ` (4 preceding siblings ...)
  2021-06-03 15:43 ` [PATCH 0/4] scsi: fix failure handling of alloc/add host John Garry
@ 2021-06-08  3:04 ` Martin K. Petersen
  5 siblings, 0 replies; 23+ messages in thread
From: Martin K. Petersen @ 2021-06-08  3:04 UTC (permalink / raw)
  To: Ming Lei, linux-scsi
  Cc: Martin K . Petersen, John Garry, Hannes Reinecke, Bart Van Assche

On Wed, 2 Jun 2021 21:30:25 +0800, Ming Lei wrote:

> Fix failure handling of alloc/add host code, and related device release
> handling.
> 
> 
> Ming Lei (4):
>   scsi: core: fix error handling of scsi_host_alloc
>   scsi: core: fix failure handling of scsi_add_host_with_dma
>   scsi: core: put .shost_dev in failure path if host state becomes
>     running
>   scsi: core: only put parent device if host state isn't in
>     SHOST_CREATED
> 
> [...]

Applied to 5.13/scsi-fixes, thanks!

[1/4] scsi: core: fix error handling of scsi_host_alloc
      https://git.kernel.org/mkp/scsi/c/66a834d09293
[2/4] scsi: core: fix failure handling of scsi_add_host_with_dma
      https://git.kernel.org/mkp/scsi/c/3719f4ff047e
[3/4] scsi: core: put .shost_dev in failure path if host state becomes running
      https://git.kernel.org/mkp/scsi/c/11714026c02d
[4/4] scsi: core: only put parent device if host state isn't in SHOST_CREATED
      https://git.kernel.org/mkp/scsi/c/1e0d4e622599

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/4] scsi: core: fix error handling of scsi_host_alloc
  2021-06-02 13:30 ` [PATCH 1/4] scsi: core: fix error handling of scsi_host_alloc Ming Lei
                     ` (2 preceding siblings ...)
  2021-06-07 11:39   ` Hannes Reinecke
@ 2021-06-29 19:23   ` Tyrel Datwyler
  2021-06-30  0:11     ` Ming Lei
  3 siblings, 1 reply; 23+ messages in thread
From: Tyrel Datwyler @ 2021-06-29 19:23 UTC (permalink / raw)
  To: Ming Lei, Martin K . Petersen, linux-scsi
  Cc: Bart Van Assche, John Garry, Hannes Reinecke

On 6/2/21 6:30 AM, Ming Lei wrote:
> After device is initialized via device_initialize(), or its name is
> set via dev_set_name(), the device has to be freed via put_device(),
> otherwise device name will be leaked because it is allocated
> dynamically in dev_set_name().
> 
> Fixes the issue by replacing kfree(shost) via put_device(&shost->shost_gendev)
> which can help to free dev_name(&shost->shost_dev) when host state is
> in SHOST_CREATED. Meantime needn't to remove IDA and stop the kthread of
> shost->ehandler in the error handling code.

This statement is incorrect for kthread. If error handler thread failed to spawn
the value of shost->ehandler will be ERR_PTR(-ENOMEM) which will pass the "if
(shost->ehandler)" check in scsi_host_dev_release() resulting in a
kthread_stop() call for a non-existant kthread which triggers a bad pointer
dereference. Here is an example splat:

scsi host11: error handler thread failed to spawn, error = -4
Kernel attempted to read user page (10c) - exploit attempt? (uid: 0)
BUG: Kernel NULL pointer dereference on read at 0x0000010c
Faulting instruction address: 0xc00000000818e9a8
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: ibmvscsi(+) scsi_transport_srp dm_multipath dm_mirror dm_region
 hash dm_log dm_mod fuse overlay squashfs loop
CPU: 12 PID: 274 Comm: systemd-udevd Not tainted 5.13.0-rc7 #1
NIP:  c00000000818e9a8 LR: c0000000089846e8 CTR: 0000000000007ee8
REGS: c000000037d12ea0 TRAP: 0300   Not tainted  (5.13.0-rc7)
MSR:  800000000280b033 &lt;SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE&gt;  CR: 28228228
XER: 20040001
CFAR: c0000000089846e4 DAR: 000000000000010c DSISR: 40000000 IRQMASK: 0
GPR00: c0000000089846e8 c000000037d13140 c000000009cc1100 fffffffffffffffc
GPR04: 0000000000000001 0000000000000000 0000000000000000 c000000037dc0000
GPR08: 0000000000000000 c000000037dc0000 0000000000000001 00000000fffff7ff
GPR12: 0000000000008000 c00000000a049000 c000000037d13d00 000000011134d5a0
GPR16: 0000000000001740 c0080000190d0000 c0080000190d1740 c000000009129288
GPR20: c000000037d13bc0 0000000000000001 c000000037d13bc0 c0080000190b7898
GPR24: c0080000190b7708 0000000000000000 c000000033bb2c48 0000000000000000
GPR28: c000000046b28280 0000000000000000 000000000000010c fffffffffffffffc
NIP [c00000000818e9a8] kthread_stop+0x38/0x230
LR [c0000000089846e8] scsi_host_dev_release+0x98/0x160
Call Trace:
[c000000033bb2c48] 0xc000000033bb2c48 (unreliable)
[c0000000089846e8] scsi_host_dev_release+0x98/0x160
[c00000000891e960] device_release+0x60/0x100
[c0000000087e55c4] kobject_release+0x84/0x210
[c00000000891ec78] put_device+0x28/0x40
[c000000008984ea4] scsi_host_alloc+0x314/0x430
[c0080000190b38bc] ibmvscsi_probe+0x54/0xad0 [ibmvscsi]
[c000000008110104] vio_bus_probe+0xa4/0x4b0
[c00000000892a860] really_probe+0x140/0x680
[c00000000892aefc] driver_probe_device+0x15c/0x200
[c00000000892b63c] device_driver_attach+0xcc/0xe0
[c00000000892b740] __driver_attach+0xf0/0x200
[c000000008926f28] bus_for_each_dev+0xa8/0x130
[c000000008929ce4] driver_attach+0x34/0x50
[c000000008928fc0] bus_add_driver+0x1b0/0x300
[c00000000892c798] driver_register+0x98/0x1a0
[c00000000810eb60] __vio_register_driver+0x80/0xe0
[c0080000190b4a30] ibmvscsi_module_init+0x9c/0xdc [ibmvscsi]
[c0000000080121d0] do_one_initcall+0x60/0x2d0
[c000000008261abc] do_init_module+0x7c/0x320
[c000000008265700] load_module+0x2350/0x25b0
[c000000008265cb4] __do_sys_finit_module+0xd4/0x160
[c000000008031110] system_call_exception+0x150/0x2d0
[c00000000800d35c] system_call_common+0xec/0x278


I'm happy to send a fix, but I see two possible approaches.

1.) Set shost->ehandler = NULL if kthread_run() fails in scsi_host_alloc()

or

2.) Test that (shost->ehandler && !IS_ERR(shost->ehandler)) before calling
kthread_stop in scsi_host_dev_release()

-Tyrel

> 
> Cc: Bart Van Assche <bvanassche@acm.org>
> Cc: John Garry <john.garry@huawei.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  drivers/scsi/hosts.c | 23 +++++++++++++----------
>  1 file changed, 13 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
> index 624e2582c3df..25cf76e73595 100644
> --- a/drivers/scsi/hosts.c
> +++ b/drivers/scsi/hosts.c
> @@ -391,8 +391,10 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
>  	mutex_init(&shost->scan_mutex);
>  
>  	index = ida_simple_get(&host_index_ida, 0, 0, GFP_KERNEL);
> -	if (index < 0)
> -		goto fail_kfree;
> +	if (index < 0) {
> +		kfree(shost);
> +		return NULL;
> +	}
>  	shost->host_no = index;
>  
>  	shost->dma_channel = 0xff;
> @@ -484,7 +486,7 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
>  		shost_printk(KERN_WARNING, shost,
>  			"error handler thread failed to spawn, error = %ld\n",
>  			PTR_ERR(shost->ehandler));
> -		goto fail_index_remove;
> +		goto fail;
>  	}
>  
>  	shost->tmf_work_q = alloc_workqueue("scsi_tmf_%d",
> @@ -493,17 +495,18 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
>  	if (!shost->tmf_work_q) {
>  		shost_printk(KERN_WARNING, shost,
>  			     "failed to create tmf workq\n");
> -		goto fail_kthread;
> +		goto fail;
>  	}
>  	scsi_proc_hostdir_add(shost->hostt);
>  	return shost;
> + fail:
> +	/*
> +	 * host state is still SHOST_CREATED, and it is enough to release
> +	 * ->shost_gendev since scsi_host_dev_release() can help to free
> +	 * dev_name(&shost->shost_dev)
> +	 */
> +	put_device(&shost->shost_gendev);
>  
> - fail_kthread:
> -	kthread_stop(shost->ehandler);
> - fail_index_remove:
> -	ida_simple_remove(&host_index_ida, shost->host_no);
> - fail_kfree:
> -	kfree(shost);
>  	return NULL;
>  }
>  EXPORT_SYMBOL(scsi_host_alloc);
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/4] scsi: core: fix error handling of scsi_host_alloc
  2021-06-29 19:23   ` Tyrel Datwyler
@ 2021-06-30  0:11     ` Ming Lei
  0 siblings, 0 replies; 23+ messages in thread
From: Ming Lei @ 2021-06-30  0:11 UTC (permalink / raw)
  To: Tyrel Datwyler
  Cc: Martin K . Petersen, linux-scsi, Bart Van Assche, John Garry,
	Hannes Reinecke

On Tue, Jun 29, 2021 at 12:23:04PM -0700, Tyrel Datwyler wrote:
> On 6/2/21 6:30 AM, Ming Lei wrote:
> > After device is initialized via device_initialize(), or its name is
> > set via dev_set_name(), the device has to be freed via put_device(),
> > otherwise device name will be leaked because it is allocated
> > dynamically in dev_set_name().
> > 
> > Fixes the issue by replacing kfree(shost) via put_device(&shost->shost_gendev)
> > which can help to free dev_name(&shost->shost_dev) when host state is
> > in SHOST_CREATED. Meantime needn't to remove IDA and stop the kthread of
> > shost->ehandler in the error handling code.
> 
> This statement is incorrect for kthread. If error handler thread failed to spawn
> the value of shost->ehandler will be ERR_PTR(-ENOMEM) which will pass the "if
> (shost->ehandler)" check in scsi_host_dev_release() resulting in a
> kthread_stop() call for a non-existant kthread which triggers a bad pointer
> dereference. Here is an example splat:
> 
> scsi host11: error handler thread failed to spawn, error = -4
> Kernel attempted to read user page (10c) - exploit attempt? (uid: 0)
> BUG: Kernel NULL pointer dereference on read at 0x0000010c
> Faulting instruction address: 0xc00000000818e9a8
> Oops: Kernel access of bad area, sig: 11 [#1]
> LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
> Modules linked in: ibmvscsi(+) scsi_transport_srp dm_multipath dm_mirror dm_region
>  hash dm_log dm_mod fuse overlay squashfs loop
> CPU: 12 PID: 274 Comm: systemd-udevd Not tainted 5.13.0-rc7 #1
> NIP:  c00000000818e9a8 LR: c0000000089846e8 CTR: 0000000000007ee8
> REGS: c000000037d12ea0 TRAP: 0300   Not tainted  (5.13.0-rc7)
> MSR:  800000000280b033 &lt;SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE&gt;  CR: 28228228
> XER: 20040001
> CFAR: c0000000089846e4 DAR: 000000000000010c DSISR: 40000000 IRQMASK: 0
> GPR00: c0000000089846e8 c000000037d13140 c000000009cc1100 fffffffffffffffc
> GPR04: 0000000000000001 0000000000000000 0000000000000000 c000000037dc0000
> GPR08: 0000000000000000 c000000037dc0000 0000000000000001 00000000fffff7ff
> GPR12: 0000000000008000 c00000000a049000 c000000037d13d00 000000011134d5a0
> GPR16: 0000000000001740 c0080000190d0000 c0080000190d1740 c000000009129288
> GPR20: c000000037d13bc0 0000000000000001 c000000037d13bc0 c0080000190b7898
> GPR24: c0080000190b7708 0000000000000000 c000000033bb2c48 0000000000000000
> GPR28: c000000046b28280 0000000000000000 000000000000010c fffffffffffffffc
> NIP [c00000000818e9a8] kthread_stop+0x38/0x230
> LR [c0000000089846e8] scsi_host_dev_release+0x98/0x160
> Call Trace:
> [c000000033bb2c48] 0xc000000033bb2c48 (unreliable)
> [c0000000089846e8] scsi_host_dev_release+0x98/0x160
> [c00000000891e960] device_release+0x60/0x100
> [c0000000087e55c4] kobject_release+0x84/0x210
> [c00000000891ec78] put_device+0x28/0x40
> [c000000008984ea4] scsi_host_alloc+0x314/0x430
> [c0080000190b38bc] ibmvscsi_probe+0x54/0xad0 [ibmvscsi]
> [c000000008110104] vio_bus_probe+0xa4/0x4b0
> [c00000000892a860] really_probe+0x140/0x680
> [c00000000892aefc] driver_probe_device+0x15c/0x200
> [c00000000892b63c] device_driver_attach+0xcc/0xe0
> [c00000000892b740] __driver_attach+0xf0/0x200
> [c000000008926f28] bus_for_each_dev+0xa8/0x130
> [c000000008929ce4] driver_attach+0x34/0x50
> [c000000008928fc0] bus_add_driver+0x1b0/0x300
> [c00000000892c798] driver_register+0x98/0x1a0
> [c00000000810eb60] __vio_register_driver+0x80/0xe0
> [c0080000190b4a30] ibmvscsi_module_init+0x9c/0xdc [ibmvscsi]
> [c0000000080121d0] do_one_initcall+0x60/0x2d0
> [c000000008261abc] do_init_module+0x7c/0x320
> [c000000008265700] load_module+0x2350/0x25b0
> [c000000008265cb4] __do_sys_finit_module+0xd4/0x160
> [c000000008031110] system_call_exception+0x150/0x2d0
> [c00000000800d35c] system_call_common+0xec/0x278
> 
> 
> I'm happy to send a fix, but I see two possible approaches.
> 
> 1.) Set shost->ehandler = NULL if kthread_run() fails in scsi_host_alloc()
> 
> or
> 
> 2.) Test that (shost->ehandler && !IS_ERR(shost->ehandler)) before calling
> kthread_stop in scsi_host_dev_release()

Either one looks fine for me, please go ahead to make a patch, and thanks for
the catch!

-- 
Ming


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2021-06-30  0:11 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-02 13:30 [PATCH 0/4] scsi: fix failure handling of alloc/add host Ming Lei
2021-06-02 13:30 ` [PATCH 1/4] scsi: core: fix error handling of scsi_host_alloc Ming Lei
2021-06-03  2:26   ` Bart Van Assche
2021-06-03 15:40   ` John Garry
2021-06-07 11:39   ` Hannes Reinecke
2021-06-29 19:23   ` Tyrel Datwyler
2021-06-30  0:11     ` Ming Lei
2021-06-02 13:30 ` [PATCH 2/4] scsi: core: fix failure handling of scsi_add_host_with_dma Ming Lei
2021-06-03  2:32   ` Bart Van Assche
2021-06-03 15:40   ` John Garry
2021-06-07 11:37   ` Hannes Reinecke
2021-06-02 13:30 ` [PATCH 3/4] scsi: core: put .shost_dev in failure path if host state becomes running Ming Lei
2021-06-03  3:06   ` Bart Van Assche
2021-06-03  3:22     ` Ming Lei
2021-06-03 15:41   ` John Garry
2021-06-07 11:40   ` Hannes Reinecke
2021-06-02 13:30 ` [PATCH 4/4] scsi: core: only put parent device if host state isn't in SHOST_CREATED Ming Lei
2021-06-03  3:08   ` Bart Van Assche
2021-06-03 15:38   ` John Garry
2021-06-07 11:41   ` Hannes Reinecke
2021-06-07 11:56     ` Ming Lei
2021-06-03 15:43 ` [PATCH 0/4] scsi: fix failure handling of alloc/add host John Garry
2021-06-08  3:04 ` Martin K. Petersen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).