linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] scsi: core: don't start concurrent async scan on same host
@ 2020-10-10  3:25 Ming Lei
  2020-10-10 17:31 ` Lee Duncan
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Ming Lei @ 2020-10-10  3:25 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen
  Cc: Ming Lei, Christoph Hellwig, Ewan D . Milne, Hannes Reinecke,
	Bart Van Assche

Current scsi host scan mechanism supposes to fallback into sync host
scan if async scan is in-progress. However, this rule isn't strictly
respected, because scsi_prep_async_scan() doesn't hold scan_mutex when
checking shost->async_scan. When scsi_scan_host() is called
concurrently, two async scan on same host may be started, and hang in
do_scan_async() is observed.

Fixes this issue by checking & setting shost->async_scan atomically
with shost->scan_mutex.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Ewan D. Milne <emilne@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/scsi/scsi_scan.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index f2437a7570ce..9af50e6f94c4 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -1714,15 +1714,16 @@ static void scsi_sysfs_add_devices(struct Scsi_Host *shost)
  */
 static struct async_scan_data *scsi_prep_async_scan(struct Scsi_Host *shost)
 {
-	struct async_scan_data *data;
+	struct async_scan_data *data = NULL;
 	unsigned long flags;
 
 	if (strncmp(scsi_scan_type, "sync", 4) == 0)
 		return NULL;
 
+	mutex_lock(&shost->scan_mutex);
 	if (shost->async_scan) {
 		shost_printk(KERN_DEBUG, shost, "%s called twice\n", __func__);
-		return NULL;
+		goto err;
 	}
 
 	data = kmalloc(sizeof(*data), GFP_KERNEL);
@@ -1733,7 +1734,6 @@ static struct async_scan_data *scsi_prep_async_scan(struct Scsi_Host *shost)
 		goto err;
 	init_completion(&data->prev_finished);
 
-	mutex_lock(&shost->scan_mutex);
 	spin_lock_irqsave(shost->host_lock, flags);
 	shost->async_scan = 1;
 	spin_unlock_irqrestore(shost->host_lock, flags);
@@ -1748,6 +1748,7 @@ static struct async_scan_data *scsi_prep_async_scan(struct Scsi_Host *shost)
 	return data;
 
  err:
+	mutex_unlock(&shost->scan_mutex);
 	kfree(data);
 	return NULL;
 }
-- 
2.25.2


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] scsi: core: don't start concurrent async scan on same host
  2020-10-10  3:25 [PATCH] scsi: core: don't start concurrent async scan on same host Ming Lei
@ 2020-10-10 17:31 ` Lee Duncan
  2020-10-17  6:23 ` Ming Lei
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Lee Duncan @ 2020-10-10 17:31 UTC (permalink / raw)
  To: Ming Lei, linux-scsi, Martin K . Petersen
  Cc: Christoph Hellwig, Ewan D . Milne, Hannes Reinecke, Bart Van Assche

On 10/9/20 8:25 PM, Ming Lei wrote:
> Current scsi host scan mechanism supposes to fallback into sync host
> scan if async scan is in-progress. However, this rule isn't strictly
> respected, because scsi_prep_async_scan() doesn't hold scan_mutex when
> checking shost->async_scan. When scsi_scan_host() is called
> concurrently, two async scan on same host may be started, and hang in
> do_scan_async() is observed.
> 
> Fixes this issue by checking & setting shost->async_scan atomically
> with shost->scan_mutex.
> 
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Ewan D. Milne <emilne@redhat.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Cc: Bart Van Assche <bvanassche@acm.org>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  drivers/scsi/scsi_scan.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index f2437a7570ce..9af50e6f94c4 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -1714,15 +1714,16 @@ static void scsi_sysfs_add_devices(struct Scsi_Host *shost)
>   */
>  static struct async_scan_data *scsi_prep_async_scan(struct Scsi_Host *shost)
>  {
> -	struct async_scan_data *data;
> +	struct async_scan_data *data = NULL;
>  	unsigned long flags;
>  
>  	if (strncmp(scsi_scan_type, "sync", 4) == 0)
>  		return NULL;
>  
> +	mutex_lock(&shost->scan_mutex);
>  	if (shost->async_scan) {
>  		shost_printk(KERN_DEBUG, shost, "%s called twice\n", __func__);
> -		return NULL;
> +		goto err;
>  	}
>  
>  	data = kmalloc(sizeof(*data), GFP_KERNEL);
> @@ -1733,7 +1734,6 @@ static struct async_scan_data *scsi_prep_async_scan(struct Scsi_Host *shost)
>  		goto err;
>  	init_completion(&data->prev_finished);
>  
> -	mutex_lock(&shost->scan_mutex);
>  	spin_lock_irqsave(shost->host_lock, flags);
>  	shost->async_scan = 1;
>  	spin_unlock_irqrestore(shost->host_lock, flags);
> @@ -1748,6 +1748,7 @@ static struct async_scan_data *scsi_prep_async_scan(struct Scsi_Host *shost)
>  	return data;
>  
>   err:
> +	mutex_unlock(&shost->scan_mutex);
>  	kfree(data);
>  	return NULL;
>  }
> 
Reviewed-by: Lee Duncan <lduncan@suse.com>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] scsi: core: don't start concurrent async scan on same host
  2020-10-10  3:25 [PATCH] scsi: core: don't start concurrent async scan on same host Ming Lei
  2020-10-10 17:31 ` Lee Duncan
@ 2020-10-17  6:23 ` Ming Lei
  2020-10-19 14:17 ` Bart Van Assche
  2020-10-26 20:08 ` Martin K. Petersen
  3 siblings, 0 replies; 6+ messages in thread
From: Ming Lei @ 2020-10-17  6:23 UTC (permalink / raw)
  To: Ming Lei
  Cc: Linux SCSI List, Martin K . Petersen, Christoph Hellwig,
	Ewan D . Milne, Hannes Reinecke, Bart Van Assche

On Sat, Oct 10, 2020 at 4:23 PM Ming Lei <ming.lei@redhat.com> wrote:
>
> Current scsi host scan mechanism supposes to fallback into sync host
> scan if async scan is in-progress. However, this rule isn't strictly
> respected, because scsi_prep_async_scan() doesn't hold scan_mutex when
> checking shost->async_scan. When scsi_scan_host() is called
> concurrently, two async scan on same host may be started, and hang in
> do_scan_async() is observed.
>
> Fixes this issue by checking & setting shost->async_scan atomically
> with shost->scan_mutex.
>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Ewan D. Milne <emilne@redhat.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Cc: Bart Van Assche <bvanassche@acm.org>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  drivers/scsi/scsi_scan.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index f2437a7570ce..9af50e6f94c4 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -1714,15 +1714,16 @@ static void scsi_sysfs_add_devices(struct Scsi_Host *shost)
>   */
>  static struct async_scan_data *scsi_prep_async_scan(struct Scsi_Host *shost)
>  {
> -       struct async_scan_data *data;
> +       struct async_scan_data *data = NULL;
>         unsigned long flags;
>
>         if (strncmp(scsi_scan_type, "sync", 4) == 0)
>                 return NULL;
>
> +       mutex_lock(&shost->scan_mutex);
>         if (shost->async_scan) {
>                 shost_printk(KERN_DEBUG, shost, "%s called twice\n", __func__);
> -               return NULL;
> +               goto err;
>         }
>
>         data = kmalloc(sizeof(*data), GFP_KERNEL);
> @@ -1733,7 +1734,6 @@ static struct async_scan_data *scsi_prep_async_scan(struct Scsi_Host *shost)
>                 goto err;
>         init_completion(&data->prev_finished);
>
> -       mutex_lock(&shost->scan_mutex);
>         spin_lock_irqsave(shost->host_lock, flags);
>         shost->async_scan = 1;
>         spin_unlock_irqrestore(shost->host_lock, flags);
> @@ -1748,6 +1748,7 @@ static struct async_scan_data *scsi_prep_async_scan(struct Scsi_Host *shost)
>         return data;
>
>   err:
> +       mutex_unlock(&shost->scan_mutex);
>         kfree(data);
>         return NULL;
>  }
> --
> 2.25.2
>

Hello Guys,

Ping...

Thanks,
Ming Lei

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] scsi: core: don't start concurrent async scan on same host
  2020-10-10  3:25 [PATCH] scsi: core: don't start concurrent async scan on same host Ming Lei
  2020-10-10 17:31 ` Lee Duncan
  2020-10-17  6:23 ` Ming Lei
@ 2020-10-19 14:17 ` Bart Van Assche
  2020-10-19 14:38   ` Ming Lei
  2020-10-26 20:08 ` Martin K. Petersen
  3 siblings, 1 reply; 6+ messages in thread
From: Bart Van Assche @ 2020-10-19 14:17 UTC (permalink / raw)
  To: Ming Lei, linux-scsi, Martin K . Petersen
  Cc: Christoph Hellwig, Ewan D . Milne, Hannes Reinecke

On 10/9/20 8:25 PM, Ming Lei wrote:
> Current scsi host scan mechanism supposes to fallback into sync host
> scan if async scan is in-progress. However, this rule isn't strictly
> respected, because scsi_prep_async_scan() doesn't hold scan_mutex when
> checking shost->async_scan. When scsi_scan_host() is called
> concurrently, two async scan on same host may be started, and hang in
> do_scan_async() is observed.
> 
> Fixes this issue by checking & setting shost->async_scan atomically
> with shost->scan_mutex.

Did you perhaps mean "by serializing shost->async_scan accesses with
shost->scan_mutex"?

It is not clear to me why the shost->async_scan assignment is protected
with the host lock. Can spin_lock_irqsave(shost->host_lock, flags) and
spin_unlock_irqrestore(shost->host_lock, flags) be left out from this
function?

Anyway:

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] scsi: core: don't start concurrent async scan on same host
  2020-10-19 14:17 ` Bart Van Assche
@ 2020-10-19 14:38   ` Ming Lei
  0 siblings, 0 replies; 6+ messages in thread
From: Ming Lei @ 2020-10-19 14:38 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi, Martin K . Petersen, Christoph Hellwig,
	Ewan D . Milne, Hannes Reinecke

On Mon, Oct 19, 2020 at 07:17:52AM -0700, Bart Van Assche wrote:
> On 10/9/20 8:25 PM, Ming Lei wrote:
> > Current scsi host scan mechanism supposes to fallback into sync host
> > scan if async scan is in-progress. However, this rule isn't strictly
> > respected, because scsi_prep_async_scan() doesn't hold scan_mutex when
> > checking shost->async_scan. When scsi_scan_host() is called
> > concurrently, two async scan on same host may be started, and hang in
> > do_scan_async() is observed.
> > 
> > Fixes this issue by checking & setting shost->async_scan atomically
> > with shost->scan_mutex.
> 
> Did you perhaps mean "by serializing shost->async_scan accesses with
> shost->scan_mutex"?

Specifically, the following checking & setting has to be done atomically,
so shost->scan_mutex is required, just as what scsi_finish_async_scan()
does for clearing shost->async_scan:

scsi_prep_async_scan():

	if (!shost->async_scan)
		return NULL;
	...
	shost->async_scan = 1

> 
> It is not clear to me why the shost->async_scan assignment is protected
> with the host lock. Can spin_lock_irqsave(shost->host_lock, flags) and
> spin_unlock_irqrestore(shost->host_lock, flags) be left out from this
> function?

I think it is doable to remove the ->host_lock from both scsi_prep_async_scan()
and scsi_finish_async_scan(), which can be done as one follow-up cleanup.

With this patch, all reading/writing shost->async_scan are protected by
shost->scan_mutex.

> 
> Anyway:
> 
> Reviewed-by: Bart Van Assche <bvanassche@acm.org>

Thanks!


-- 
Ming


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] scsi: core: don't start concurrent async scan on same host
  2020-10-10  3:25 [PATCH] scsi: core: don't start concurrent async scan on same host Ming Lei
                   ` (2 preceding siblings ...)
  2020-10-19 14:17 ` Bart Van Assche
@ 2020-10-26 20:08 ` Martin K. Petersen
  3 siblings, 0 replies; 6+ messages in thread
From: Martin K. Petersen @ 2020-10-26 20:08 UTC (permalink / raw)
  To: Ming Lei
  Cc: linux-scsi, Martin K . Petersen, Christoph Hellwig,
	Ewan D . Milne, Hannes Reinecke, Bart Van Assche


Ming,

> Current scsi host scan mechanism supposes to fallback into sync host
> scan if async scan is in-progress. However, this rule isn't strictly
> respected, because scsi_prep_async_scan() doesn't hold scan_mutex when
> checking shost->async_scan. When scsi_scan_host() is called
> concurrently, two async scan on same host may be started, and hang in
> do_scan_async() is observed.
>
> Fixes this issue by checking & setting shost->async_scan atomically
> with shost->scan_mutex.

Applied to 5.10/scsi-fixes, thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-10-26 20:10 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-10  3:25 [PATCH] scsi: core: don't start concurrent async scan on same host Ming Lei
2020-10-10 17:31 ` Lee Duncan
2020-10-17  6:23 ` Ming Lei
2020-10-19 14:17 ` Bart Van Assche
2020-10-19 14:38   ` Ming Lei
2020-10-26 20:08 ` Martin K. Petersen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).