Stable Archive on lore.kernel.org
* [PATCH] nbd: fix shutdown and recv work deadlock
From: Mike Christie @ 2019-12-02 21:51 UTC
  To: sunke32, nbd, axboe, josef, linux-block; +Cc: Mike Christie, stable

This fixes a regression added with:

commit e9e006f5fcf2bab59149cb38a48a4817c1b538b4
Author: Mike Christie <mchristi@redhat.com>
Date:   Sun Aug 4 14:10:06 2019 -0500

    nbd: fix max number of supported devs

where we can deadlock during device shutdown. The problem occurs if
userspace has done a NBD_CLEAR_SOCK call and then does close() before the
recv_work work item has done its nbd_config_put() call. If recv_work does
the last put, it will end up calling destroy_workqueue(), which will then
be stuck waiting for the very work item we are running from.
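
For reference, a condensed sketch of the deadlocking path. The symbols
are the driver's real ones, but the bodies are heavily simplified:

	static void recv_work(struct work_struct *work)
	{
		struct recv_thread_args *args = container_of(work,
						struct recv_thread_args,
						work);
		struct nbd_device *nbd = args->nbd;

		/* ... receive loop runs until the socket is torn down ... */

		/*
		 * If userspace has already dropped its references, this
		 * put is the last one, so the config teardown runs here
		 * and calls destroy_workqueue(nbd->recv_workq).
		 * destroy_workqueue() waits for all queued work to finish,
		 * including this very work item, so it never returns.
		 */
		nbd_config_put(nbd);
	}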

This fixes the issue by having nbd_start_device_ioctl() flush the work
queue in both the failure and success cases, and by holding a reference
on the nbd config while the workqueue is being flushed, so recv_work
cannot drop the last reference from inside the workqueue.
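
With the patch, the ordering in nbd_start_device_ioctl() becomes
(condensed from the diff below, with comments added for explanation):

	refcount_inc(&nbd->config_refs);	/* pin the config */
	mutex_unlock(&nbd->config_lock);
	ret = wait_event_interruptible(config->recv_wq,
				atomic_read(&config->recv_threads) == 0);
	if (ret)
		sock_shutdown(nbd);
	/* Safe in all cases: recv_work cannot drop the last ref and
	 * destroy the workqueue while we still hold a reference. */
	flush_workqueue(nbd->recv_workq);
	...
	/* A final put can happen here, outside of recv_workq. */
	nbd_config_put(nbd);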

Cc: stable@vger.kernel.org
Signed-off-by: Mike Christie <mchristi@redhat.com>
---
 drivers/block/nbd.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 57532465fb83..f8597d2fb365 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1293,13 +1293,15 @@ static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *b
 
 	if (max_part)
 		bdev->bd_invalidated = 1;
+
+	refcount_inc(&nbd->config_refs);
 	mutex_unlock(&nbd->config_lock);
 	ret = wait_event_interruptible(config->recv_wq,
 					 atomic_read(&config->recv_threads) == 0);
-	if (ret) {
+	if (ret)
 		sock_shutdown(nbd);
-		flush_workqueue(nbd->recv_workq);
-	}
+	flush_workqueue(nbd->recv_workq);
+
 	mutex_lock(&nbd->config_lock);
 	nbd_bdev_reset(bdev);
 	/* user requested, ignore socket errors */
@@ -1307,6 +1309,7 @@ static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *b
 		ret = 0;
 	if (test_bit(NBD_RT_TIMEDOUT, &config->runtime_flags))
 		ret = -ETIMEDOUT;
+	nbd_config_put(nbd);
 	return ret;
 }
 
-- 
2.20.1



* Re: [PATCH] nbd: fix shutdown and recv work deadlock
From: Mike Christie @ 2019-12-03 15:45 UTC
  To: sunke32, nbd, axboe, josef, linux-block; +Cc: stable

Josef and Jens,

Ignore this patch. It can still deadlock, just in a different way, and it
looks like there are other possible issues with races and refcounts. I
will send some new patches.


On 12/02/2019 03:51 PM, Mike Christie wrote:
> This fixes a regression added with:
> 
> commit e9e006f5fcf2bab59149cb38a48a4817c1b538b4
> 
> [...]


