linux-kernel.vger.kernel.org archive mirror
* [PATCH] xen-netfront: Fix hang on device removal
@ 2018-02-28 12:23 Jason Andryuk
  2018-02-28 15:21 ` [Xen-devel] " Boris Ostrovsky
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Jason Andryuk @ 2018-02-28 12:23 UTC
  To: xen-devel, netdev
  Cc: Jason Andryuk, Eduardo Otubo, Boris Ostrovsky, Juergen Gross, open list

A toolstack may delete the vif frontend and backend xenstore entries
while xen-netfront is in the removal code path.  In that case, the
checks for xenbus_read_driver_state would return XenbusStateUnknown, and
xennet_remove would hang indefinitely.  This hang prevents system
shutdown.

xennet_remove must be able to handle XenbusStateUnknown, and
netback_changed must wake up the module_unload_q wait queue for that
state as well.

Fixes: 5b5971df3bc2 ("xen-netfront: remove warning when unloading module")

Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Cc: Eduardo Otubo <otubo@redhat.com>
---
 drivers/net/xen-netfront.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 8328d395e332..3127bc8633ca 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -2005,7 +2005,10 @@ static void netback_changed(struct xenbus_device *dev,
 	case XenbusStateInitialised:
 	case XenbusStateReconfiguring:
 	case XenbusStateReconfigured:
+		break;
+
 	case XenbusStateUnknown:
+		wake_up_all(&module_unload_q);
 		break;
 
 	case XenbusStateInitWait:
@@ -2136,7 +2139,9 @@ static int xennet_remove(struct xenbus_device *dev)
 		xenbus_switch_state(dev, XenbusStateClosing);
 		wait_event(module_unload_q,
 			   xenbus_read_driver_state(dev->otherend) ==
-			   XenbusStateClosing);
+			   XenbusStateClosing ||
+			   xenbus_read_driver_state(dev->otherend) ==
+			   XenbusStateUnknown);
 
 		xenbus_switch_state(dev, XenbusStateClosed);
 		wait_event(module_unload_q,
-- 
2.14.3
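
The effect of the second hunk can be sketched in userspace C.  This is
only an illustrative model, not kernel code: the enum values mirror
enum xenbus_state, but the helper names closing_wait_done_before(),
closing_wait_done_after() and model_self_check() are hypothetical and
exist only for this sketch.  Each helper stands in for the condition
that wait_event() evaluates in xennet_remove after the frontend
switches to XenbusStateClosing.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors enum xenbus_state; redefined here so the sketch builds
 * outside the kernel tree. */
enum xenbus_state {
	XenbusStateUnknown = 0,
	XenbusStateInitialising = 1,
	XenbusStateInitWait = 2,
	XenbusStateInitialised = 3,
	XenbusStateConnected = 4,
	XenbusStateClosing = 5,
	XenbusStateClosed = 6,
	XenbusStateReconfiguring = 7,
	XenbusStateReconfigured = 8,
};

/* Hypothetical helper: wait condition before the patch.  Only a
 * backend in XenbusStateClosing ends the wait. */
bool closing_wait_done_before(enum xenbus_state backend)
{
	return backend == XenbusStateClosing;
}

/* Hypothetical helper: wait condition after the patch.  A backend
 * whose xenstore entries are gone reads back as XenbusStateUnknown,
 * and that now ends the wait too. */
bool closing_wait_done_after(enum xenbus_state backend)
{
	return backend == XenbusStateClosing ||
	       backend == XenbusStateUnknown;
}

/* Hypothetical self-check covering the case from the commit message:
 * the toolstack has already deleted the backend entries. */
void model_self_check(void)
{
	/* Old condition never becomes true, so the wait would hang. */
	assert(!closing_wait_done_before(XenbusStateUnknown));
	/* New condition lets the removal path continue. */
	assert(closing_wait_done_after(XenbusStateUnknown));
	/* The normal handshake is unchanged by the patch. */
	assert(closing_wait_done_before(XenbusStateClosing));
	assert(closing_wait_done_after(XenbusStateClosing));
}
```

Calling model_self_check() from a small main() exercises the model.
The sketch covers only the predicate.  In the driver the waiter
re-evaluates the condition only when it is woken, which is why the
first hunk adds wake_up_all(&module_unload_q) for XenbusStateUnknown.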


* Re: [Xen-devel] [PATCH] xen-netfront: Fix hang on device removal
  2018-02-28 12:23 [PATCH] xen-netfront: Fix hang on device removal Jason Andryuk
@ 2018-02-28 15:21 ` Boris Ostrovsky
  2018-02-28 19:21 ` Juergen Gross
  2018-04-19 18:10 ` [Xen-devel] " Simon Gaiser
  2 siblings, 0 replies; 7+ messages in thread
From: Boris Ostrovsky @ 2018-02-28 15:21 UTC
  To: Jason Andryuk, xen-devel, netdev; +Cc: Eduardo Otubo, Juergen Gross, open list

On 02/28/2018 07:23 AM, Jason Andryuk wrote:
> A toolstack may delete the vif frontend and backend xenstore entries
> while xen-netfront is in the removal code path.  In that case, the
> checks for xenbus_read_driver_state would return XenbusStateUnknown, and
> xennet_remove would hang indefinitely.  This hang prevents system
> shutdown.
>
> xennet_remove must be able to handle XenbusStateUnknown, and
> netback_changed must also wake up the wake_queue for that state as well.
>
> Fixes: 5b5971df3bc2 ("xen-netfront: remove warning when unloading module")
>
> Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
> Cc: Eduardo Otubo <otubo@redhat.com>

Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>


* Re: [PATCH] xen-netfront: Fix hang on device removal
  2018-02-28 12:23 [PATCH] xen-netfront: Fix hang on device removal Jason Andryuk
  2018-02-28 15:21 ` [Xen-devel] " Boris Ostrovsky
@ 2018-02-28 19:21 ` Juergen Gross
  2018-04-19 18:10 ` [Xen-devel] " Simon Gaiser
  2 siblings, 0 replies; 7+ messages in thread
From: Juergen Gross @ 2018-02-28 19:21 UTC
  To: Jason Andryuk, xen-devel, netdev
  Cc: Eduardo Otubo, Boris Ostrovsky, open list

On 28/02/18 13:23, Jason Andryuk wrote:
> A toolstack may delete the vif frontend and backend xenstore entries
> while xen-netfront is in the removal code path.  In that case, the
> checks for xenbus_read_driver_state would return XenbusStateUnknown, and
> xennet_remove would hang indefinitely.  This hang prevents system
> shutdown.
> 
> xennet_remove must be able to handle XenbusStateUnknown, and
> netback_changed must also wake up the wake_queue for that state as well.
> 
> Fixes: 5b5971df3bc2 ("xen-netfront: remove warning when unloading module")
> 
> Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
> Cc: Eduardo Otubo <otubo@redhat.com>

Committed to xen/tip for-linus-4.16a


Juergen


* Re: [Xen-devel] [PATCH] xen-netfront: Fix hang on device removal
  2018-02-28 12:23 [PATCH] xen-netfront: Fix hang on device removal Jason Andryuk
  2018-02-28 15:21 ` [Xen-devel] " Boris Ostrovsky
  2018-02-28 19:21 ` Juergen Gross
@ 2018-04-19 18:10 ` Simon Gaiser
  2018-04-19 18:14   ` Jason Andryuk
  2 siblings, 1 reply; 7+ messages in thread
From: Simon Gaiser @ 2018-04-19 18:10 UTC
  To: netdev
  Cc: Jason Andryuk, xen-devel, Eduardo Otubo, Juergen Gross,
	Boris Ostrovsky, open list


Jason Andryuk:
> A toolstack may delete the vif frontend and backend xenstore entries
> while xen-netfront is in the removal code path.  In that case, the
> checks for xenbus_read_driver_state would return XenbusStateUnknown, and
> xennet_remove would hang indefinitely.  This hang prevents system
> shutdown.
> 
> xennet_remove must be able to handle XenbusStateUnknown, and
> netback_changed must also wake up the wake_queue for that state as well.
> 
> Fixes: 5b5971df3bc2 ("xen-netfront: remove warning when unloading module")

I think this should go into stable since AFAIK the hanging network
device can only be fixed by rebooting the guest. AFAICS this affects all
4.* branches since 5b5971df3bc2 got backported to them.

Upstream commit c2d2e6738a209f0f9dffa2dc8e7292fc45360d61.

Simon




* Re: [Xen-devel] [PATCH] xen-netfront: Fix hang on device removal
  2018-04-19 18:10 ` [Xen-devel] " Simon Gaiser
@ 2018-04-19 18:14   ` Jason Andryuk
  2018-04-19 20:09     ` Simon Gaiser
  0 siblings, 1 reply; 7+ messages in thread
From: Jason Andryuk @ 2018-04-19 18:14 UTC
  To: Simon Gaiser
  Cc: netdev, xen-devel, Eduardo Otubo, Juergen Gross, Boris Ostrovsky,
	open list

On Thu, Apr 19, 2018 at 2:10 PM, Simon Gaiser
<simon@invisiblethingslab.com> wrote:
> Jason Andryuk:
>> A toolstack may delete the vif frontend and backend xenstore entries
>> while xen-netfront is in the removal code path.  In that case, the
>> checks for xenbus_read_driver_state would return XenbusStateUnknown, and
>> xennet_remove would hang indefinitely.  This hang prevents system
>> shutdown.
>>
>> xennet_remove must be able to handle XenbusStateUnknown, and
>> netback_changed must also wake up the wake_queue for that state as well.
>>
>> Fixes: 5b5971df3bc2 ("xen-netfront: remove warning when unloading module")
>
> I think this should go into stable since AFAIK the hanging network
> device can only be fixed by rebooting the guest. AFAICS this affects all
> 4.* branches since 5b5971df3bc2 got backported to them.
>
> Upstream commit c2d2e6738a209f0f9dffa2dc8e7292fc45360d61.

Simon,

Yes, I agree.  I actually submitted the request to stable earlier
today, so hopefully it gets added soon.

Have you experienced this hang?

Regards,
Jason


* Re: [Xen-devel] [PATCH] xen-netfront: Fix hang on device removal
  2018-04-19 18:14   ` Jason Andryuk
@ 2018-04-19 20:09     ` Simon Gaiser
  2018-04-20 12:41       ` Jason Andryuk
  0 siblings, 1 reply; 7+ messages in thread
From: Simon Gaiser @ 2018-04-19 20:09 UTC
  To: Jason Andryuk
  Cc: netdev, xen-devel, Eduardo Otubo, Juergen Gross, Boris Ostrovsky,
	open list


Jason Andryuk:
> On Thu, Apr 19, 2018 at 2:10 PM, Simon Gaiser
> <simon@invisiblethingslab.com> wrote:
>> Jason Andryuk:
>>> A toolstack may delete the vif frontend and backend xenstore entries
>>> while xen-netfront is in the removal code path.  In that case, the
>>> checks for xenbus_read_driver_state would return XenbusStateUnknown, and
>>> xennet_remove would hang indefinitely.  This hang prevents system
>>> shutdown.
>>>
>>> xennet_remove must be able to handle XenbusStateUnknown, and
>>> netback_changed must also wake up the wake_queue for that state as well.
>>>
>>> Fixes: 5b5971df3bc2 ("xen-netfront: remove warning when unloading module")
>>
>> I think this should go into stable since AFAIK the hanging network
>> device can only be fixed by rebooting the guest. AFAICS this affects all
>> 4.* branches since 5b5971df3bc2 got backported to them.
>>
>> Upstream commit c2d2e6738a209f0f9dffa2dc8e7292fc45360d61.
> 
> Simon,
> 
> Yes, I agree.  I actually submitted the request to stable earlier
> today, so hopefully it gets added soon.

Ok, great. (I checked the stable patch queue, but didn't check the
mailing list archive).

> Have you experienced this hang?

Yes, it's affecting the kernel shipped by Qubes OS (see [1]).

Thanks, Simon.

[1]: https://github.com/QubesOS/qubes-issues/issues/3657




* Re: [Xen-devel] [PATCH] xen-netfront: Fix hang on device removal
  2018-04-19 20:09     ` Simon Gaiser
@ 2018-04-20 12:41       ` Jason Andryuk
  0 siblings, 0 replies; 7+ messages in thread
From: Jason Andryuk @ 2018-04-20 12:41 UTC
  To: Simon Gaiser
  Cc: netdev, xen-devel, Eduardo Otubo, Juergen Gross, Boris Ostrovsky,
	open list

On Thu, Apr 19, 2018 at 4:09 PM, Simon Gaiser
<simon@invisiblethingslab.com> wrote:
> Jason Andryuk:
>> On Thu, Apr 19, 2018 at 2:10 PM, Simon Gaiser
>> <simon@invisiblethingslab.com> wrote:
>>> Jason Andryuk:
>>>> A toolstack may delete the vif frontend and backend xenstore entries
>>>> while xen-netfront is in the removal code path.  In that case, the
>>>> checks for xenbus_read_driver_state would return XenbusStateUnknown, and
>>>> xennet_remove would hang indefinitely.  This hang prevents system
>>>> shutdown.
>>>>
>>>> xennet_remove must be able to handle XenbusStateUnknown, and
>>>> netback_changed must also wake up the wake_queue for that state as well.
>>>>
>>>> Fixes: 5b5971df3bc2 ("xen-netfront: remove warning when unloading module")
>>>
>>> I think this should go into stable since AFAIK the hanging network
>>> device can only be fixed by rebooting the guest. AFAICS this affects all
>>> 4.* branches since 5b5971df3bc2 got backported to them.
>>>
>>> Upstream commit c2d2e6738a209f0f9dffa2dc8e7292fc45360d61.
>>
>> Simon,
>>
>> Yes, I agree.  I actually submitted the request to stable earlier
>> today, so hopefully it gets added soon.
>
> Ok, great. (I checked the stable patch queue, but didn't check the
> mailing list archive).
>
>> Have you experienced this hang?
>
> Yes, it's affecting the kernel shipped by Qubes OS (see [1]).

Ok, interesting.  I tracked down this bug with older xenvm tools, and
I didn't know if libxl tools were also affected.

Greg KH added the patch to the stable queue, so it's in the process.

Regards,
Jason


Thread overview: 7+ messages
2018-02-28 12:23 [PATCH] xen-netfront: Fix hang on device removal Jason Andryuk
2018-02-28 15:21 ` [Xen-devel] " Boris Ostrovsky
2018-02-28 19:21 ` Juergen Gross
2018-04-19 18:10 ` [Xen-devel] " Simon Gaiser
2018-04-19 18:14   ` Jason Andryuk
2018-04-19 20:09     ` Simon Gaiser
2018-04-20 12:41       ` Jason Andryuk
