* [PATCH] nbd: add a flush_workqueue in nbd_start_device
@ 2020-01-21 12:48 Sun Ke
  2020-01-21 14:00 ` Josef Bacik
  0 siblings, 1 reply; 4+ messages in thread
From: Sun Ke @ 2020-01-21 12:48 UTC (permalink / raw)
  To: josef, axboe, sunke32, mchristi; +Cc: linux-block, nbd, linux-kernel

When kzalloc fails in nbd_start_device, we may end up trying to
destroy the recv workqueue from inside that same workqueue.

Suppose num_connections is m (2 < m), the first n (1 < n < m)
kzalloc calls succeed, and the (n + 1)-th fails. nbd_start_device
then returns -ENOMEM to nbd_start_device_ioctl, which returns
immediately without running flush_workqueue. However, n recv
threads are still running. If nbd_release runs first, a recv
thread may drop the last config_refs and try to destroy the
workqueue from inside the workqueue.

To fix it, add a flush_workqueue in nbd_start_device.

Fixes: e9e006f5fcf2 ("nbd: fix max number of supported devs")
Signed-off-by: Sun Ke <sunke32@huawei.com>
---
 drivers/block/nbd.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index b4607dd96185..dd1f8c2c6169 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1264,7 +1264,12 @@ static int nbd_start_device(struct nbd_device *nbd)
 
 		args = kzalloc(sizeof(*args), GFP_KERNEL);
 		if (!args) {
-			sock_shutdown(nbd);
+			if (i == 0)
+				sock_shutdown(nbd);
+			else {
+				sock_shutdown(nbd);
+				flush_workqueue(nbd->recv_workq);
+			}
 			return -ENOMEM;
 		}
 		sk_set_memalloc(config->socks[i]->sock->sk);
-- 
2.17.2
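
For context, the deadlock described above arises roughly as follows
(a simplified sketch of the relevant drivers/block/nbd.c paths from
around this kernel version, with locking and cleanup elided, so the
real driver differs in detail):

	/* Runs on nbd->recv_workq; one work item per connection. */
	static void recv_work(struct work_struct *work)
	{
		struct recv_thread_args *args = container_of(work,
				struct recv_thread_args, work);
		struct nbd_device *nbd = args->nbd;

		/* ... receive replies until the socket is shut down ... */

		nbd_config_put(nbd);	/* may drop the last config ref */
		kfree(args);
	}

	static void nbd_config_put(struct nbd_device *nbd)
	{
		if (refcount_dec_and_mutex_lock(&nbd->config_refs,
						&nbd->config_lock)) {
			/*
			 * If the last ref was dropped by recv_work(),
			 * this destroys the workqueue currently running
			 * us; that is the deadlock the patch prevents.
			 */
			destroy_workqueue(nbd->recv_workq);
			/* ... free the config, release the mutex ... */
		}
	}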



* Re: [PATCH] nbd: add a flush_workqueue in nbd_start_device
  2020-01-21 12:48 [PATCH] nbd: add a flush_workqueue in nbd_start_device Sun Ke
@ 2020-01-21 14:00 ` Josef Bacik
  2020-01-21 21:25   ` Jens Axboe
  0 siblings, 1 reply; 4+ messages in thread
From: Josef Bacik @ 2020-01-21 14:00 UTC (permalink / raw)
  To: Sun Ke, axboe, mchristi; +Cc: linux-block, nbd, linux-kernel

On 1/21/20 7:48 AM, Sun Ke wrote:
> [...]
> @@ -1264,7 +1264,12 @@ static int nbd_start_device(struct nbd_device *nbd)
>   
>   		args = kzalloc(sizeof(*args), GFP_KERNEL);
>   		if (!args) {
> -			sock_shutdown(nbd);
> +			if (i == 0)
> +				sock_shutdown(nbd);
> +			else {
> +				sock_shutdown(nbd);
> +				flush_workqueue(nbd->recv_workq);
> +			}

Just for readability sake why don't we just flush_workqueue() unconditionally, 
and add a comment so we know why in the future.  Thanks,

Josef
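
A sketch of the shape Josef is suggesting (illustrative only, not an
actual follow-up patch):

		args = kzalloc(sizeof(*args), GFP_KERNEL);
		if (!args) {
			sock_shutdown(nbd);
			/*
			 * Recv work may already be queued for earlier
			 * connections; flush it so a recv thread cannot
			 * drop the last config ref and destroy the
			 * workqueue from inside the workqueue.
			 */
			flush_workqueue(nbd->recv_workq);
			return -ENOMEM;
		}

When i == 0 no work has been queued yet, so the unconditional flush
is a harmless no-op; that is what makes this form read more simply.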


* Re: [PATCH] nbd: add a flush_workqueue in nbd_start_device
  2020-01-21 14:00 ` Josef Bacik
@ 2020-01-21 21:25   ` Jens Axboe
  2020-01-22  2:45     ` sunke (E)
  0 siblings, 1 reply; 4+ messages in thread
From: Jens Axboe @ 2020-01-21 21:25 UTC (permalink / raw)
  To: Josef Bacik, Sun Ke, mchristi; +Cc: linux-block, nbd, linux-kernel

On 1/21/20 7:00 AM, Josef Bacik wrote:
> On 1/21/20 7:48 AM, Sun Ke wrote:
>> [...]
>> @@ -1264,7 +1264,12 @@ static int nbd_start_device(struct nbd_device *nbd)
>>   
>>   		args = kzalloc(sizeof(*args), GFP_KERNEL);
>>   		if (!args) {
>> -			sock_shutdown(nbd);
>> +			if (i == 0)
>> +				sock_shutdown(nbd);
>> +			else {
>> +				sock_shutdown(nbd);
>> +				flush_workqueue(nbd->recv_workq);
>> +			}
> 
> Just for readability sake why don't we just flush_workqueue()
> unconditionally, and add a comment so we know why in the future.

Or maybe just make it:

	sock_shutdown(nbd);
	if (i)
		flush_workqueue(nbd->recv_workq);

which does the same thing, but is still readable. The current code with
the shutdown duplication is just a bit odd. Needs a comment either way.

-- 
Jens Axboe
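
Put back into context with the comment both reviewers ask for, Jens's
variant might look like this (a sketch; the actual v2 may word things
differently):

		args = kzalloc(sizeof(*args), GFP_KERNEL);
		if (!args) {
			sock_shutdown(nbd);
			/*
			 * If this is not the first connection, recv
			 * work has already been queued for the earlier
			 * sockets. Flush it so the recv threads cannot
			 * drop the last config_refs and try to destroy
			 * the workqueue from inside the workqueue.
			 */
			if (i)
				flush_workqueue(nbd->recv_workq);
			return -ENOMEM;
		}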



* Re: [PATCH] nbd: add a flush_workqueue in nbd_start_device
  2020-01-21 21:25   ` Jens Axboe
@ 2020-01-22  2:45     ` sunke (E)
  0 siblings, 0 replies; 4+ messages in thread
From: sunke (E) @ 2020-01-22  2:45 UTC (permalink / raw)
  To: Jens Axboe, Josef Bacik, mchristi; +Cc: linux-block, nbd, linux-kernel



On 2020/1/22 5:25, Jens Axboe wrote:
> On 1/21/20 7:00 AM, Josef Bacik wrote:
>> On 1/21/20 7:48 AM, Sun Ke wrote:
>>> [...]
>>> @@ -1264,7 +1264,12 @@ static int nbd_start_device(struct nbd_device *nbd)
>>>    
>>>    		args = kzalloc(sizeof(*args), GFP_KERNEL);
>>>    		if (!args) {
>>> -			sock_shutdown(nbd);
>>> +			if (i == 0)
>>> +				sock_shutdown(nbd);
>>> +			else {
>>> +				sock_shutdown(nbd);
>>> +				flush_workqueue(nbd->recv_workq);
>>> +			}
>>
>> Just for readability sake why don't we just flush_workqueue()
>> unconditionally, and add a comment so we know why in the future.
> 
> Or maybe just make it:
> 
> 	sock_shutdown(nbd);
> 	if (i)
> 		flush_workqueue(nbd->recv_workq);
> 
> which does the same thing, but is still readable. The current code with
> the shutdown duplication is just a bit odd. Needs a comment either way.
> 

OK, I will improve it in my v2 patch.

Thanks,

Sun Ke


