* [Qemu-devel] [Bug 1818880] [NEW] Deadlock when detaching network interface
@ 2019-03-06 17:19 Heitor R. Alves de Siqueira
  2019-03-06 17:48 ` [Qemu-devel] [Bug 1818880] " Heitor R. Alves de Siqueira
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Heitor R. Alves de Siqueira @ 2019-03-06 17:19 UTC (permalink / raw)
  To: qemu-devel

Public bug reported:

[Impact]
Qemu guests hang indefinitely

[Description]
When running a Qemu guest with VirtIO network interfaces, detaching an interface that's currently being used can result in a deadlock. The guest instance will hang and become unresponsive to commands, and the only option is to kill -9 the instance.
The reason for this is a deadlock between a monitor thread and an RCU thread, which will fight over the BQL (qemu_global_mutex) and the critical RCU section locks. The monitor thread will acquire the BQL for detaching the network interface, and fire up a helper thread to deal with detaching the network adapter. That new thread needs to wait on the RCU thread to complete the deletion, but the RCU thread wants the BQL to commit its transactions.
This bug is already fixed upstream (73c6e4013b4c rcu: completely disable pthread_atfork callbacks as soon as possible) and included for other series (see below), so we don't need to backport it to Bionic onwards.
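
As a rough illustration of the lock cycle described above, here is a toy Python model. It is not QEMU code: `simulate_detach`, `bql`, `deletion_done` and the two thread functions are names invented for this sketch, and the timeouts exist only so the script terminates instead of hanging the way a real guest would.

```python
import threading


def simulate_detach(rcu_timeout=0.2, helper_timeout=0.4):
    """Toy model of the hang: returns what each stand-in thread managed to do."""
    bql = threading.Lock()             # stand-in for qemu_global_mutex (hypothetical)
    deletion_done = threading.Event()  # set once the "RCU thread" finishes its work
    outcome = {}

    def rcu_thread():
        # The RCU thread needs the BQL before it can commit its transactions.
        got_bql = bql.acquire(timeout=rcu_timeout)
        outcome["rcu_got_bql"] = got_bql
        if got_bql:
            deletion_done.set()
            bql.release()

    def helper_thread():
        # The helper waits for the RCU thread to complete the deletion.
        outcome["helper_finished"] = deletion_done.wait(timeout=helper_timeout)

    # "Monitor thread": takes the BQL, then waits on the helper while holding it.
    with bql:
        rcu = threading.Thread(target=rcu_thread)
        helper = threading.Thread(target=helper_thread)
        rcu.start()
        helper.start()
        helper.join()
    rcu.join()
    return outcome
```

In this model neither stand-in thread makes progress: the RCU thread never gets the lock while the monitor holds it, so the helper's wait only ends because of its timeout. Without the timeouts, all three would block forever.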

Upstream commit:
https://git.qemu.org/?p=qemu.git;a=commit;h=73c6e4013b4c

$ git describe --contains 73c6e4013b4c
v2.10.0-rc2~1^2~8

$ rmadison qemu
===> qemu | 1:2.5+dfsg-5ubuntu10.34 | xenial-updates/universe   | amd64, ...
     qemu | 1:2.11+dfsg-1ubuntu7    | bionic/universe           | amd64, ...
     qemu | 1:2.12+dfsg-3ubuntu8    | cosmic/universe           | amd64, ...
     qemu | 1:3.1+dfsg-2ubuntu2     | disco/universe            | amd64, ...
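
The reasoning behind "Bionic onwards" can be sketched as a version comparison: the fix first appears in a v2.10.0 release candidate, so any package based on upstream 2.10 or later should already contain it. The helper names below are made up for this sketch, and the check is deliberately naive: it only looks at the upstream major.minor version and assumes no distro-specific patches were added or dropped.

```python
import re

# First upstream release line containing the fix, per `git describe --contains`.
FIX_FIRST_RELEASE = (2, 10)

# Package versions copied from the rmadison output above.
PACKAGES = {
    "xenial": "1:2.5+dfsg-5ubuntu10.34",
    "bionic": "1:2.11+dfsg-1ubuntu7",
    "cosmic": "1:2.12+dfsg-3ubuntu8",
    "disco": "1:3.1+dfsg-2ubuntu2",
}


def upstream_version(package_version):
    """Extract (major, minor) from a Debian-style version such as '1:2.5+dfsg-5'."""
    without_epoch = package_version.split(":", 1)[-1]
    match = re.match(r"(\d+)\.(\d+)", without_epoch)
    if match is None:
        raise ValueError(f"unrecognised version: {package_version!r}")
    return int(match.group(1)), int(match.group(2))


def needs_backport(package_version):
    """True if the upstream base predates the release that carries the fix."""
    return upstream_version(package_version) < FIX_FIRST_RELEASE


if __name__ == "__main__":
    for series, version in PACKAGES.items():
        print(series, "needs backport" if needs_backport(version) else "has fix")
```

Comparing tuples rather than strings matters here, since a plain string comparison would sort "2.5" after "2.10".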

[Test Case]
Being a race condition, this is a tricky bug to reproduce consistently. We've had reports of users running into this with OpenStack deployments and Windows Server guests, and the scenario is usually like this:
1) Deploy a 16vCPU Windows Server 2012 R2 guest with a virtio network interface
2) Stress the network interface with e.g. Windows HLK test suite or similar
3) Repeatedly attach/detach the network adapter that's in use
It usually takes more than ~4000 attach/detach cycles to trigger the bug.
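
Step 3 lends itself to scripting. The following is only a sketch of such a driver, assuming the caller supplies `attach` and `detach` callables that talk to whatever management layer is in use; those callables, and the names `finished_in_time` and `stress_hotplug`, are hypothetical and not part of any QEMU or OpenStack tooling. A step that does not return within the timeout is treated as a suspected hang.

```python
import threading


def finished_in_time(step, timeout):
    """Run step() in a daemon thread; report whether it returned within timeout."""
    worker = threading.Thread(target=step, daemon=True)
    worker.start()
    worker.join(timeout)
    return not worker.is_alive()


def stress_hotplug(attach, detach, cycles, timeout):
    """Repeat attach/detach; return the 1-based cycle that stalled, or None."""
    for cycle in range(1, cycles + 1):
        for step in (attach, detach):
            if not finished_in_time(step, timeout):
                return cycle
    return None
```

A run along the lines of `stress_hotplug(attach, detach, cycles=5000, timeout=60)` would match the cycle count mentioned above; the timeout value is a guess and would need tuning for the environment.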

[Regression Potential]
Regressions for this might arise from the fact that the fix changes RCU lock code. Since this patch has been upstream and in other series for a while, it's unlikely that it would introduce regressions in RCU code specifically. Other code that makes use of the RCU locks (MMIO and some monitor events) will be thoroughly tested for any regressions with autopkgtest and scripted Qemu runs.

** Affects: qemu
     Importance: Undecided
         Status: Fix Released

** Affects: qemu (Ubuntu)
     Importance: Undecided
     Assignee: Heitor R. Alves de Siqueira (halves)
         Status: Fix Released

** Affects: qemu (Ubuntu Xenial)
     Importance: Undecided
     Assignee: Heitor R. Alves de Siqueira (halves)
         Status: Confirmed

** Affects: qemu (Ubuntu Bionic)
     Importance: Undecided
     Assignee: Heitor R. Alves de Siqueira (halves)
         Status: Fix Released

** Affects: qemu (Ubuntu Cosmic)
     Importance: Undecided
     Assignee: Heitor R. Alves de Siqueira (halves)
         Status: Fix Released

** Affects: qemu (Ubuntu Disco)
     Importance: Undecided
     Assignee: Heitor R. Alves de Siqueira (halves)
         Status: Fix Released


** Tags: sts

** Changed in: qemu
       Status: New => Confirmed

** Changed in: qemu
       Status: Confirmed => Fix Released

** Also affects: qemu (Ubuntu)
   Importance: Undecided
       Status: New

** Changed in: qemu (Ubuntu)
     Assignee: (unassigned) => Heitor R. Alves de Siqueira (halves)

** Changed in: qemu (Ubuntu)
       Status: New => Confirmed

** Also affects: qemu (Ubuntu Cosmic)
   Importance: Undecided
       Status: New

** Also affects: qemu (Ubuntu Xenial)
   Importance: Undecided
       Status: New

** Also affects: qemu (Ubuntu Disco)
   Importance: Undecided
     Assignee: Heitor R. Alves de Siqueira (halves)
       Status: Confirmed

** Also affects: qemu (Ubuntu Bionic)
   Importance: Undecided
       Status: New

** Changed in: qemu (Ubuntu Cosmic)
     Assignee: (unassigned) => Heitor R. Alves de Siqueira (halves)

** Changed in: qemu (Ubuntu Bionic)
     Assignee: (unassigned) => Heitor R. Alves de Siqueira (halves)

** Changed in: qemu (Ubuntu Xenial)
     Assignee: (unassigned) => Heitor R. Alves de Siqueira (halves)

** Changed in: qemu (Ubuntu Disco)
       Status: Confirmed => Fix Released

** Changed in: qemu (Ubuntu Cosmic)
       Status: New => Fix Released

** Changed in: qemu (Ubuntu Bionic)
       Status: New => Fix Released

** Changed in: qemu (Ubuntu Xenial)
       Status: New => Confirmed

-- 
You received this bug notification because you are a member of
qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1818880

Title:
  Deadlock when detaching network interface

Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Xenial:
  Confirmed
Status in qemu source package in Bionic:
  Fix Released
Status in qemu source package in Cosmic:
  Fix Released
Status in qemu source package in Disco:
  Fix Released

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1818880/+subscriptions

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Qemu-devel] [Bug 1818880] Re: Deadlock when detaching network interface
  2019-03-06 17:19 [Qemu-devel] [Bug 1818880] [NEW] Deadlock when detaching network interface Heitor R. Alves de Siqueira
@ 2019-03-06 17:48 ` Heitor R. Alves de Siqueira
  2019-03-07 13:42 ` Heitor R. Alves de Siqueira
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Heitor R. Alves de Siqueira @ 2019-03-06 17:48 UTC (permalink / raw)
  To: qemu-devel

** Also affects: cloud-archive
   Importance: Undecided
       Status: New

** Changed in: cloud-archive
       Status: New => Confirmed

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Qemu-devel] [Bug 1818880] Re: Deadlock when detaching network interface
  2019-03-06 17:19 [Qemu-devel] [Bug 1818880] [NEW] Deadlock when detaching network interface Heitor R. Alves de Siqueira
  2019-03-06 17:48 ` [Qemu-devel] [Bug 1818880] " Heitor R. Alves de Siqueira
@ 2019-03-07 13:42 ` Heitor R. Alves de Siqueira
  2019-03-07 18:51 ` Heitor R. Alves de Siqueira
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Heitor R. Alves de Siqueira @ 2019-03-07 13:42 UTC (permalink / raw)
  To: qemu-devel

** Patch added: "Debdiff for xenial"
   https://bugs.launchpad.net/cloud-archive/+bug/1818880/+attachment/5244384/+files/xenial.debdiff

** Description changed:

  [Impact]
  Qemu guests hang indefinitely
  
  [Description]
  When running a Qemu guest with VirtIO network interfaces, detaching an interface that's currently being used can result in a deadlock. The guest instance will hang and become unresponsive to commands, and the only option is to kill -9 the instance.
  The reason for this is a deadlock between a monitor thread and an RCU thread, which will fight over the BQL (qemu_global_mutex) and the critical RCU section locks. The monitor thread will acquire the BQL for detaching the network interface, and fire up a helper thread to deal with detaching the network adapter. That new thread needs to wait on the RCU thread to complete the deletion, but the RCU thread wants the BQL to commit its transactions.
  This bug is already fixed upstream (73c6e4013b4c rcu: completely disable pthread_atfork callbacks as soon as possible) and included for other series (see below), so we don't need to backport it to Bionic onwards.
  
  Upstream commit:
  https://git.qemu.org/?p=qemu.git;a=commit;h=73c6e4013b4c
  
  $ git describe --contains 73c6e4013b4c
  v2.10.0-rc2~1^2~8
  
  $ rmadison qemu
  ===> qemu | 1:2.5+dfsg-5ubuntu10.34 | xenial-updates/universe   | amd64, ...
-      qemu | 1:2.11+dfsg-1ubuntu7    | bionic/universe           | amd64, ...
-      qemu | 1:2.12+dfsg-3ubuntu8    | cosmic/universe           | amd64, ...
-      qemu | 1:3.1+dfsg-2ubuntu2     | disco/universe            | amd64, ...
+      qemu | 1:2.11+dfsg-1ubuntu7    | bionic/universe           | amd64, ...
+      qemu | 1:2.12+dfsg-3ubuntu8    | cosmic/universe           | amd64, ...
+      qemu | 1:3.1+dfsg-2ubuntu2     | disco/universe            | amd64, ...
  
  [Test Case]
  Being a race condition, this is a tricky bug to reproduce consistently. We've had reports of users running into this with OpenStack deployments and Windows Server guests, and the scenario is usually like this:
  1) Deploy a 16vCPU Windows Server 2012 R2 guest with a virtio network interface
  2) Stress the network interface with e.g. Windows HLK test suite or similar
  3) Repeatedly attach/detach the network adapter that's in use
  It usually takes more than ~4000 attach/detach cycles to trigger the bug.
  
  [Regression Potential]
- Regressions for this might arise from the fact that the fix changes RCU lock code. Since this patch has been upstream and in other series for a while, it's unlikely that it would introduce regressions in RCU code specifically. Other code that makes use of the RCU locks (MMIO and some monitor events) will be thoroughly tested for any regressions with autopkgtest and scripted Qemu runs.
+ Regressions for this might arise from the fact that the fix changes RCU lock code. Since this patch has been upstream and in other series for a while, it's unlikely that it would introduce regressions in RCU code specifically. Other code that makes use of the RCU locks (MMIO and some monitor events) will be thoroughly tested for any regressions with use-case scenarios and scripted runs.

** Tags added: sts-sponsor

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Qemu-devel] [Bug 1818880] Re: Deadlock when detaching network interface
  2019-03-06 17:19 [Qemu-devel] [Bug 1818880] [NEW] Deadlock when detaching network interface Heitor R. Alves de Siqueira
  2019-03-06 17:48 ` [Qemu-devel] [Bug 1818880] " Heitor R. Alves de Siqueira
  2019-03-07 13:42 ` Heitor R. Alves de Siqueira
@ 2019-03-07 18:51 ` Heitor R. Alves de Siqueira
  2019-03-07 20:44 ` Dan Streetman
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Heitor R. Alves de Siqueira @ 2019-03-07 18:51 UTC (permalink / raw)
  To: qemu-devel

** Changed in: qemu (Ubuntu Disco)
     Assignee: Heitor R. Alves de Siqueira (halves) => (unassigned)

** Changed in: qemu (Ubuntu Cosmic)
     Assignee: Heitor R. Alves de Siqueira (halves) => (unassigned)

** Changed in: qemu (Ubuntu Bionic)
     Assignee: Heitor R. Alves de Siqueira (halves) => (unassigned)

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Qemu-devel] [Bug 1818880] Re: Deadlock when detaching network interface
  2019-03-06 17:19 [Qemu-devel] [Bug 1818880] [NEW] Deadlock when detaching network interface Heitor R. Alves de Siqueira
                   ` (2 preceding siblings ...)
  2019-03-07 18:51 ` Heitor R. Alves de Siqueira
@ 2019-03-07 20:44 ` Dan Streetman
  2019-03-07 21:23 ` Heitor R. Alves de Siqueira
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Dan Streetman @ 2019-03-07 20:44 UTC (permalink / raw)
  To: qemu-devel

** Tags added: sts-sponsor-ddstreet

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Qemu-devel] [Bug 1818880] Re: Deadlock when detaching network interface
  2019-03-06 17:19 [Qemu-devel] [Bug 1818880] [NEW] Deadlock when detaching network interface Heitor R. Alves de Siqueira
                   ` (3 preceding siblings ...)
  2019-03-07 20:44 ` Dan Streetman
@ 2019-03-07 21:23 ` Heitor R. Alves de Siqueira
  2019-03-07 21:43 ` Heitor R. Alves de Siqueira
  2019-03-08  7:12 ` Thomas Huth
  6 siblings, 0 replies; 8+ messages in thread
From: Heitor R. Alves de Siqueira @ 2019-03-07 21:23 UTC (permalink / raw)
  To: qemu-devel

Patch v2:
Added missing DEP3 info and corrected pkg version

** Patch added: "Debdiff for xenial v2"
   https://bugs.launchpad.net/cloud-archive/+bug/1818880/+attachment/5244567/+files/xenial_v2.debdiff

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Qemu-devel] [Bug 1818880] Re: Deadlock when detaching network interface
  2019-03-06 17:19 [Qemu-devel] [Bug 1818880] [NEW] Deadlock when detaching network interface Heitor R. Alves de Siqueira
                   ` (4 preceding siblings ...)
  2019-03-07 21:23 ` Heitor R. Alves de Siqueira
@ 2019-03-07 21:43 ` Heitor R. Alves de Siqueira
  2019-03-08  7:12 ` Thomas Huth
  6 siblings, 0 replies; 8+ messages in thread
From: Heitor R. Alves de Siqueira @ 2019-03-07 21:43 UTC (permalink / raw)
  To: qemu-devel

** Patch removed: "Debdiff for xenial v2"
   https://bugs.launchpad.net/cloud-archive/+bug/1818880/+attachment/5244567/+files/xenial_v2.debdiff

** Patch added: "Correct debdiff for xenial v2"
   https://bugs.launchpad.net/cloud-archive/+bug/1818880/+attachment/5244568/+files/lp1818880-v2.debdiff

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1818880

Title:
  Deadlock when detaching network interface

Status in Ubuntu Cloud Archive:
  Confirmed
Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Xenial:
  Confirmed
Status in qemu source package in Bionic:
  Fix Released
Status in qemu source package in Cosmic:
  Fix Released
Status in qemu source package in Disco:
  Fix Released

Bug description:
  [Impact]
  Qemu guests hang indefinitely

  [Description]
  When running a Qemu guest with VirtIO network interfaces, detaching an interface that's currently being used can result in a deadlock. The guest instance will hang and become unresponsive to commands, and the only option is to kill -9 the instance.
  The reason for this is a dealock between a monitor and an RCU thread, which will fight over the BQL (qemu_global_mutex) and the critical RCU section locks. The monitor thread will acquire the BQL for detaching the network interface, and fire up a helper thread to deal with detaching the network adapter. That new thread needs to wait on the RCU thread to complete the deletion, but the RCU thread wants the BQL to commit its transactions.
  This bug is already fixed upstream (73c6e4013b4c rcu: completely disable pthread_atfork callbacks as soon as possible) and included for other series (see below), so we don't need to backport it to Bionic onwards.
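
  The wait cycle described above can be sketched as a toy model. This is not QEMU code: the thread functions and the `simulate_detach` name are hypothetical, a plain lock stands in for the BQL, and the model assumes the monitor blocks on the helper while still holding the BQL, which the description implies but does not state outright. Timeouts are used so the sketch terminates instead of hanging.

```python
import threading


def simulate_detach(timeout=0.2):
    """Toy model of the reported wait cycle; returns who made progress."""
    bql = threading.Lock()            # stand-in for qemu_global_mutex (BQL)
    rcu_done = threading.Event()      # set once the "RCU thread" has committed
    helper_done = threading.Event()   # set once the "helper thread" has finished
    outcome = {}

    def rcu_thread():
        # The RCU thread needs the BQL to commit its transactions.
        got = bql.acquire(timeout=timeout)
        outcome["rcu_got_bql"] = got
        if got:
            rcu_done.set()
            bql.release()

    def helper_thread():
        # The helper waits for the RCU thread to complete the deletion.
        finished = rcu_done.wait(timeout=timeout)
        outcome["helper_saw_rcu"] = finished
        if finished:
            helper_done.set()

    # The monitor takes the BQL, starts the other threads, then waits for
    # the helper without releasing the lock (ASSUMPTION, see lead-in).
    with bql:
        workers = [threading.Thread(target=rcu_thread),
                   threading.Thread(target=helper_thread)]
        for worker in workers:
            worker.start()
        outcome["monitor_saw_helper"] = helper_done.wait(timeout=2 * timeout)
        for worker in workers:
            worker.join()
    return outcome
```

  In this model none of the three parties makes progress before its timeout expires; without the timeouts, each would wait forever on the next one in the cycle.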

  Upstream commit:
  https://git.qemu.org/?p=qemu.git;a=commit;h=73c6e4013b4c

  $ git describe --contains 73c6e4013b4c
  v2.10.0-rc2~1^2~8

  $ rmadison qemu
  ===> qemu | 1:2.5+dfsg-5ubuntu10.34 | xenial-updates/universe   | amd64, ...
       qemu | 1:2.11+dfsg-1ubuntu7    | bionic/universe           | amd64, ...
       qemu | 1:2.12+dfsg-3ubuntu8    | cosmic/universe           | amd64, ...
       qemu | 1:3.1+dfsg-2ubuntu2     | disco/universe            | amd64, ...

  [Test Case]
  Being a race condition, this is a tricky bug to reproduce consistently. We've had reports of users running into this with OpenStack deployments and Windows Server guests, and the scenario is usually like this:
  1) Deploy a 16vCPU Windows Server 2012 R2 guest with a virtio network interface
  2) Stress the network interface with e.g. Windows HLK test suite or similar
  3) Repeatedly attach/detach the network adapter that's in use
  It usually takes more than about 4000 attach/detach cycles to trigger the bug.
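
  Step 3 could be scripted along the following lines. This is only a sketch under assumptions not stated in the report: it supposes the guest is managed through libvirt and driven with `virsh attach-device` / `virsh detach-device`, and the function name, the domain name and the device XML path are all hypothetical. A real run would probably also need to wait for the guest to finish each detach before re-attaching.

```python
import subprocess


def cycle_interface(domain, device_xml, cycles, run=subprocess.run):
    """Attach then detach the device `cycles` times; return completed cycles.

    `run` is injectable so the loop can be exercised without a hypervisor.
    """
    completed = 0
    for _ in range(cycles):
        for action in ("attach-device", "detach-device"):
            # ASSUMPTION: the guest is a libvirt domain controlled via virsh.
            # The timeout is there so a stuck command raises instead of
            # blocking the script indefinitely.
            run(["virsh", action, domain, device_xml, "--live"],
                check=True, timeout=60)
        completed += 1
    return completed
```

  For example, `cycle_interface("win2012r2", "virtio-nic.xml", 5000)` would run 5000 cycles against a hypothetical domain; an unresponsive guest would likely surface as `subprocess.TimeoutExpired` or a non-zero exit from virsh.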

  [Regression Potential]
  Regressions for this might arise from the fact that the fix changes RCU lock code. Since this patch has been upstream and in other series for a while, it's unlikely that it would cause regressions in RCU code specifically. Other code that makes use of the RCU locks (MMIO and some monitor events) will be thoroughly tested for any regressions with use-case scenarios and scripted runs.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1818880/+subscriptions

* [Qemu-devel] [Bug 1818880] Re: Deadlock when detaching network interface
  2019-03-06 17:19 [Qemu-devel] [Bug 1818880] [NEW] Deadlock when detaching network interface Heitor R. Alves de Siqueira
                   ` (5 preceding siblings ...)
  2019-03-07 21:43 ` Heitor R. Alves de Siqueira
@ 2019-03-08  7:12 ` Thomas Huth
  6 siblings, 0 replies; 8+ messages in thread
From: Thomas Huth @ 2019-03-08  7:12 UTC (permalink / raw)
  To: qemu-devel

** No longer affects: qemu

Title:
  Deadlock when detaching network interface

Status in Ubuntu Cloud Archive:
  Confirmed
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Xenial:
  Confirmed
Status in qemu source package in Bionic:
  Fix Released
Status in qemu source package in Cosmic:
  Fix Released
Status in qemu source package in Disco:
  Fix Released

end of thread, other threads:[~2019-03-08  7:20 UTC | newest]

Thread overview: 8+ messages
2019-03-06 17:19 [Qemu-devel] [Bug 1818880] [NEW] Deadlock when detaching network interface Heitor R. Alves de Siqueira
2019-03-06 17:48 ` [Qemu-devel] [Bug 1818880] " Heitor R. Alves de Siqueira
2019-03-07 13:42 ` Heitor R. Alves de Siqueira
2019-03-07 18:51 ` Heitor R. Alves de Siqueira
2019-03-07 20:44 ` Dan Streetman
2019-03-07 21:23 ` Heitor R. Alves de Siqueira
2019-03-07 21:43 ` Heitor R. Alves de Siqueira
2019-03-08  7:12 ` Thomas Huth
