* [Qemu-devel] [RFC PATCH] virtio-blk: schedule virtio_notify_config to run on main context
@ 2019-09-12 18:19 Sergio Lopez
  2019-09-12 19:51 ` Michael S. Tsirkin
  0 siblings, 1 reply; 6+ messages in thread
From: Sergio Lopez @ 2019-09-12 18:19 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, Sergio Lopez, mst, qemu-devel, mreitz, stefanha

Another AioContext-related issue, and this is a tricky one.

Executing a QMP block_resize request for a virtio-blk device running
on an iothread may cause a deadlock involving the following mutexes:

 - main thread
  * Has acquired: qemu_mutex_global.
  * Is trying to acquire: iothread AioContext lock via
    AIO_WAIT_WHILE (after aio_poll).

 - iothread
  * Has acquired: AioContext lock.
  * Is trying to acquire: qemu_mutex_global (via
    virtio_notify_config->prepare_mmio_access).
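
Schematically (a simplified view of the two sides; exact call frames
vary):

    main thread (QMP block_resize)        iothread
    ------------------------------        ------------------------------
    holds: qemu_mutex_global (BQL)        holds: AioContext lock
    wants: AioContext lock                wants: qemu_mutex_global (BQL)
           via AIO_WAIT_WHILE                    via virtio_notify_config()
           (re-acquire after aio_poll)           -> prepare_mmio_access()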

With this change, virtio_blk_resize checks if it's being called from a
coroutine context running on a non-main thread, and if that's the
case, creates a new coroutine and schedules it to be run on the main
thread.

This works, but means the actual operation is done
asynchronously, perhaps opening a window in which a "device_del"
operation may fit and remove the VirtIODevice before
virtio_notify_config() is executed.

I *think* it shouldn't be possible, as BHs will be processed before
any new QMP/monitor command, but I'm open to a different approach.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1744955
Signed-off-by: Sergio Lopez <slp@redhat.com>
---
 hw/block/virtio-blk.c | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 18851601cb..c763d071f6 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -16,6 +16,7 @@
 #include "qemu/iov.h"
 #include "qemu/module.h"
 #include "qemu/error-report.h"
+#include "qemu/main-loop.h"
 #include "trace.h"
 #include "hw/block/block.h"
 #include "hw/qdev-properties.h"
@@ -1086,11 +1087,33 @@ static int virtio_blk_load_device(VirtIODevice *vdev, QEMUFile *f,
     return 0;
 }
 
+static void coroutine_fn virtio_resize_co_entry(void *opaque)
+{
+    VirtIODevice *vdev = opaque;
+
+    assert(qemu_get_current_aio_context() == qemu_get_aio_context());
+    virtio_notify_config(vdev);
+    aio_wait_kick();
+}
+
 static void virtio_blk_resize(void *opaque)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(opaque);
+    Coroutine *co;
 
-    virtio_notify_config(vdev);
+    if (qemu_in_coroutine() &&
+        qemu_get_current_aio_context() != qemu_get_aio_context()) {
+        /*
+         * virtio_notify_config() needs to acquire the global mutex,
+         * so calling it from a coroutine running on a non-main context
+         * may cause a deadlock. Instead, create a new coroutine and
+         * schedule it to be run on the main thread.
+         */
+        co = qemu_coroutine_create(virtio_resize_co_entry, vdev);
+        aio_co_schedule(qemu_get_aio_context(), co);
+    } else {
+        virtio_notify_config(vdev);
+    }
 }
 
 static const BlockDevOps virtio_block_ops = {
-- 
2.21.0




* Re: [Qemu-devel] [RFC PATCH] virtio-blk: schedule virtio_notify_config to run on main context
  2019-09-12 18:19 [Qemu-devel] [RFC PATCH] virtio-blk: schedule virtio_notify_config to run on main context Sergio Lopez
@ 2019-09-12 19:51 ` Michael S. Tsirkin
  2019-09-13  7:46   ` Sergio Lopez
  2019-09-13  9:04   ` Kevin Wolf
  0 siblings, 2 replies; 6+ messages in thread
From: Michael S. Tsirkin @ 2019-09-12 19:51 UTC (permalink / raw)
  To: Sergio Lopez; +Cc: kwolf, stefanha, qemu-devel, qemu-block, mreitz

On Thu, Sep 12, 2019 at 08:19:25PM +0200, Sergio Lopez wrote:
> Another AioContext-related issue, and this is a tricky one.
> 
> Executing a QMP block_resize request for a virtio-blk device running
> on an iothread may cause a deadlock involving the following mutexes:
> 
>  - main thread
>   * Has acquired: qemu_mutex_global.
>   * Is trying to acquire: iothread AioContext lock via
>     AIO_WAIT_WHILE (after aio_poll).
> 
>  - iothread
>   * Has acquired: AioContext lock.
>   * Is trying to acquire: qemu_mutex_global (via
>     virtio_notify_config->prepare_mmio_access).

Hmm is this really the only case iothread takes qemu mutex?
If any such access can deadlock, don't we need a generic
solution? Maybe main thread can drop qemu mutex
before taking io thread AioContext lock?

> With this change, virtio_blk_resize checks if it's being called from a
> coroutine context running on a non-main thread, and if that's the
> case, creates a new coroutine and schedules it to be run on the main
> thread.
> 
> This works, but means the actual operation is done
> asynchronously, perhaps opening a window in which a "device_del"
> operation may fit and remove the VirtIODevice before
> virtio_notify_config() is executed.
> 
> I *think* it shouldn't be possible, as BHs will be processed before
> any new QMP/monitor command, but I'm open to a different approach.
> 
> RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1744955
> Signed-off-by: Sergio Lopez <slp@redhat.com>
> ---
>  hw/block/virtio-blk.c | 25 ++++++++++++++++++++++++-
>  1 file changed, 24 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index 18851601cb..c763d071f6 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -16,6 +16,7 @@
>  #include "qemu/iov.h"
>  #include "qemu/module.h"
>  #include "qemu/error-report.h"
> +#include "qemu/main-loop.h"
>  #include "trace.h"
>  #include "hw/block/block.h"
>  #include "hw/qdev-properties.h"
> @@ -1086,11 +1087,33 @@ static int virtio_blk_load_device(VirtIODevice *vdev, QEMUFile *f,
>      return 0;
>  }
>  
> +static void coroutine_fn virtio_resize_co_entry(void *opaque)
> +{
> +    VirtIODevice *vdev = opaque;
> +
> +    assert(qemu_get_current_aio_context() == qemu_get_aio_context());
> +    virtio_notify_config(vdev);
> +    aio_wait_kick();
> +}
> +
>  static void virtio_blk_resize(void *opaque)
>  {
>      VirtIODevice *vdev = VIRTIO_DEVICE(opaque);
> +    Coroutine *co;
>  
> -    virtio_notify_config(vdev);
> +    if (qemu_in_coroutine() &&
> +        qemu_get_current_aio_context() != qemu_get_aio_context()) {
> +        /*
> +         * virtio_notify_config() needs to acquire the global mutex,
> +         * so calling it from a coroutine running on a non-main context
> +         * may cause a deadlock. Instead, create a new coroutine and
> +         * schedule it to be run on the main thread.
> +         */
> +        co = qemu_coroutine_create(virtio_resize_co_entry, vdev);
> +        aio_co_schedule(qemu_get_aio_context(), co);
> +    } else {
> +        virtio_notify_config(vdev);
> +    }
>  }
>  
>  static const BlockDevOps virtio_block_ops = {
> -- 
> 2.21.0



* Re: [Qemu-devel] [RFC PATCH] virtio-blk: schedule virtio_notify_config to run on main context
  2019-09-12 19:51 ` Michael S. Tsirkin
@ 2019-09-13  7:46   ` Sergio Lopez
  2019-09-13  9:04   ` Kevin Wolf
  1 sibling, 0 replies; 6+ messages in thread
From: Sergio Lopez @ 2019-09-13  7:46 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: kwolf, stefanha, qemu-devel, qemu-block, mreitz



Michael S. Tsirkin <mst@redhat.com> writes:

> On Thu, Sep 12, 2019 at 08:19:25PM +0200, Sergio Lopez wrote:
>> Another AioContext-related issue, and this is a tricky one.
>> 
>> Executing a QMP block_resize request for a virtio-blk device running
>> on an iothread may cause a deadlock involving the following mutexes:
>> 
>>  - main thread
>>   * Has acquired: qemu_mutex_global.
>>   * Is trying to acquire: iothread AioContext lock via
>>     AIO_WAIT_WHILE (after aio_poll).
>> 
>>  - iothread
>>   * Has acquired: AioContext lock.
>>   * Is trying to acquire: qemu_mutex_global (via
>>     virtio_notify_config->prepare_mmio_access).
>
> Hmm is this really the only case iothread takes qemu mutex?

Not the only one that takes the mutex, but the only one we've found so
far that does so at the request of a job running on the main thread (it
should be quite noticeable, due to the deadlock).

> If any such access can deadlock, don't we need a generic
> solution? Maybe main thread can drop qemu mutex
> before taking io thread AioContext lock?

The mutex is acquired very early, in os_host_main_loop_wait(), so I
suspect many code paths assume it is already held.

>> With this change, virtio_blk_resize checks if it's being called from a
>> coroutine context running on a non-main thread, and if that's the
>> case, creates a new coroutine and schedules it to be run on the main
>> thread.
>> 
>> This works, but means the actual operation is done
>> asynchronously, perhaps opening a window in which a "device_del"
>> operation may fit and remove the VirtIODevice before
>> virtio_notify_config() is executed.
>> 
>> I *think* it shouldn't be possible, as BHs will be processed before
>> any new QMP/monitor command, but I'm open to a different approach.
>> 
>> RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1744955
>> Signed-off-by: Sergio Lopez <slp@redhat.com>
>> ---
>>  hw/block/virtio-blk.c | 25 ++++++++++++++++++++++++-
>>  1 file changed, 24 insertions(+), 1 deletion(-)
>> 
>> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
>> index 18851601cb..c763d071f6 100644
>> --- a/hw/block/virtio-blk.c
>> +++ b/hw/block/virtio-blk.c
>> @@ -16,6 +16,7 @@
>>  #include "qemu/iov.h"
>>  #include "qemu/module.h"
>>  #include "qemu/error-report.h"
>> +#include "qemu/main-loop.h"
>>  #include "trace.h"
>>  #include "hw/block/block.h"
>>  #include "hw/qdev-properties.h"
>> @@ -1086,11 +1087,33 @@ static int virtio_blk_load_device(VirtIODevice *vdev, QEMUFile *f,
>>      return 0;
>>  }
>>  
>> +static void coroutine_fn virtio_resize_co_entry(void *opaque)
>> +{
>> +    VirtIODevice *vdev = opaque;
>> +
>> +    assert(qemu_get_current_aio_context() == qemu_get_aio_context());
>> +    virtio_notify_config(vdev);
>> +    aio_wait_kick();
>> +}
>> +
>>  static void virtio_blk_resize(void *opaque)
>>  {
>>      VirtIODevice *vdev = VIRTIO_DEVICE(opaque);
>> +    Coroutine *co;
>>  
>> -    virtio_notify_config(vdev);
>> +    if (qemu_in_coroutine() &&
>> +        qemu_get_current_aio_context() != qemu_get_aio_context()) {
>> +        /*
>> +         * virtio_notify_config() needs to acquire the global mutex,
>> +         * so calling it from a coroutine running on a non-main context
>> +         * may cause a deadlock. Instead, create a new coroutine and
>> +         * schedule it to be run on the main thread.
>> +         */
>> +        co = qemu_coroutine_create(virtio_resize_co_entry, vdev);
>> +        aio_co_schedule(qemu_get_aio_context(), co);
>> +    } else {
>> +        virtio_notify_config(vdev);
>> +    }
>>  }
>>  
>>  static const BlockDevOps virtio_block_ops = {
>> -- 
>> 2.21.0




* Re: [Qemu-devel] [RFC PATCH] virtio-blk: schedule virtio_notify_config to run on main context
  2019-09-12 19:51 ` Michael S. Tsirkin
  2019-09-13  7:46   ` Sergio Lopez
@ 2019-09-13  9:04   ` Kevin Wolf
  2019-09-13  9:28     ` Sergio Lopez
  1 sibling, 1 reply; 6+ messages in thread
From: Kevin Wolf @ 2019-09-13  9:04 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: mreitz, stefanha, qemu-block, Sergio Lopez, qemu-devel

On 12.09.2019 at 21:51, Michael S. Tsirkin wrote:
> On Thu, Sep 12, 2019 at 08:19:25PM +0200, Sergio Lopez wrote:
> > Another AioContext-related issue, and this is a tricky one.
> > 
> > Executing a QMP block_resize request for a virtio-blk device running
> > on an iothread may cause a deadlock involving the following mutexes:
> > 
> >  - main thread
> >   * Has acquired: qemu_mutex_global.
> >   * Is trying to acquire: iothread AioContext lock via
> >     AIO_WAIT_WHILE (after aio_poll).
> > 
> >  - iothread
> >   * Has acquired: AioContext lock.
> >   * Is trying to acquire: qemu_mutex_global (via
> >     virtio_notify_config->prepare_mmio_access).
> 
> Hmm is this really the only case iothread takes qemu mutex?
> If any such access can deadlock, don't we need a generic
> solution? Maybe main thread can drop qemu mutex
> before taking io thread AioContext lock?

The rule is that iothreads must not take the qemu mutex. If they do
(like in this case), it's a bug.

Maybe we could actually assert this in qemu_mutex_lock_iothread()?
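
Something along these lines, perhaps (just a sketch -- the thread-local
flag and where it would be set are made up, QEMU has no such marker
today):

    /* hypothetical: set to true at the start of the iothread's thread fn */
    static __thread bool is_iothread;

    void qemu_mutex_lock_iothread(void)
    {
        /* iothreads must never take the global mutex (BQL) */
        assert(!is_iothread);
        /* ... existing locking code unchanged ... */
    }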

> > With this change, virtio_blk_resize checks if it's being called from a
> > coroutine context running on a non-main thread, and if that's the
> > case, creates a new coroutine and schedules it to be run on the main
> > thread.
> > 
> > This works, but means the actual operation is done
> > asynchronously, perhaps opening a window in which a "device_del"
> > operation may fit and remove the VirtIODevice before
> > virtio_notify_config() is executed.
> > 
> > I *think* it shouldn't be possible, as BHs will be processed before
> > any new QMP/monitor command, but I'm open to a different approach.
> > 
> > RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1744955
> > Signed-off-by: Sergio Lopez <slp@redhat.com>
> > ---
> >  hw/block/virtio-blk.c | 25 ++++++++++++++++++++++++-
> >  1 file changed, 24 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> > index 18851601cb..c763d071f6 100644
> > --- a/hw/block/virtio-blk.c
> > +++ b/hw/block/virtio-blk.c
> > @@ -16,6 +16,7 @@
> >  #include "qemu/iov.h"
> >  #include "qemu/module.h"
> >  #include "qemu/error-report.h"
> > +#include "qemu/main-loop.h"
> >  #include "trace.h"
> >  #include "hw/block/block.h"
> >  #include "hw/qdev-properties.h"
> > @@ -1086,11 +1087,33 @@ static int virtio_blk_load_device(VirtIODevice *vdev, QEMUFile *f,
> >      return 0;
> >  }
> >  
> > +static void coroutine_fn virtio_resize_co_entry(void *opaque)
> > +{
> > +    VirtIODevice *vdev = opaque;
> > +
> > +    assert(qemu_get_current_aio_context() == qemu_get_aio_context());
> > +    virtio_notify_config(vdev);
> > +    aio_wait_kick();
> > +}
> > +
> >  static void virtio_blk_resize(void *opaque)
> >  {
> >      VirtIODevice *vdev = VIRTIO_DEVICE(opaque);
> > +    Coroutine *co;
> >  
> > -    virtio_notify_config(vdev);
> > +    if (qemu_in_coroutine() &&
> > +        qemu_get_current_aio_context() != qemu_get_aio_context()) {
> > +        /*
> > +         * virtio_notify_config() needs to acquire the global mutex,
> > +         * so calling it from a coroutine running on a non-main context
> > +         * may cause a deadlock. Instead, create a new coroutine and
> > +         * schedule it to be run on the main thread.
> > +         */
> > +        co = qemu_coroutine_create(virtio_resize_co_entry, vdev);
> > +        aio_co_schedule(qemu_get_aio_context(), co);
> > +    } else {
> > +        virtio_notify_config(vdev);
> > +    }
> >  }

Wouldn't a simple BH suffice (aio_bh_schedule_oneshot)? I don't see why
you need a coroutine when you never yield.
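
Roughly (untested, just to illustrate the idea; the helper name is made
up):

    static void virtio_blk_resize_bh(void *opaque)
    {
        VirtIODevice *vdev = opaque;

        virtio_notify_config(vdev);
    }

    static void virtio_blk_resize(void *opaque)
    {
        VirtIODevice *vdev = VIRTIO_DEVICE(opaque);

        /*
         * virtio_notify_config() needs the global mutex, so run it from
         * a BH in the main loop context instead of the iothread.
         */
        aio_bh_schedule_oneshot(qemu_get_aio_context(),
                                virtio_blk_resize_bh, vdev);
    }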

The reason why it deadlocks also has nothing to do with whether we are
called from a coroutine or not. The important part is that we're running
in an iothread.

Kevin



* Re: [Qemu-devel] [RFC PATCH] virtio-blk: schedule virtio_notify_config to run on main context
  2019-09-13  9:04   ` Kevin Wolf
@ 2019-09-13  9:28     ` Sergio Lopez
  2019-09-13  9:45       ` Kevin Wolf
  0 siblings, 1 reply; 6+ messages in thread
From: Sergio Lopez @ 2019-09-13  9:28 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: mreitz, stefanha, qemu-devel, qemu-block, Michael S. Tsirkin



Kevin Wolf <kwolf@redhat.com> writes:

> On 12.09.2019 at 21:51, Michael S. Tsirkin wrote:
>> On Thu, Sep 12, 2019 at 08:19:25PM +0200, Sergio Lopez wrote:
>> > Another AioContext-related issue, and this is a tricky one.
>> > 
>> > Executing a QMP block_resize request for a virtio-blk device running
>> > on an iothread may cause a deadlock involving the following mutexes:
>> > 
>> >  - main thread
>> >   * Has acquired: qemu_mutex_global.
>> >   * Is trying to acquire: iothread AioContext lock via
>> >     AIO_WAIT_WHILE (after aio_poll).
>> > 
>> >  - iothread
>> >   * Has acquired: AioContext lock.
>> >   * Is trying to acquire: qemu_mutex_global (via
>> >     virtio_notify_config->prepare_mmio_access).
>> 
>> Hmm is this really the only case iothread takes qemu mutex?
>> If any such access can deadlock, don't we need a generic
>> solution? Maybe main thread can drop qemu mutex
>> before taking io thread AioContext lock?
>
> The rule is that iothreads must not take the qemu mutex. If they do
> (like in this case), it's a bug.
>
> Maybe we could actually assert this in qemu_mutex_lock_iothread()?
>
>> > With this change, virtio_blk_resize checks if it's being called from a
>> > coroutine context running on a non-main thread, and if that's the
>> > case, creates a new coroutine and schedules it to be run on the main
>> > thread.
>> > 
>> > This works, but means the actual operation is done
>> > asynchronously, perhaps opening a window in which a "device_del"
>> > operation may fit and remove the VirtIODevice before
>> > virtio_notify_config() is executed.
>> > 
>> > I *think* it shouldn't be possible, as BHs will be processed before
>> > any new QMP/monitor command, but I'm open to a different approach.
>> > 
>> > RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1744955
>> > Signed-off-by: Sergio Lopez <slp@redhat.com>
>> > ---
>> >  hw/block/virtio-blk.c | 25 ++++++++++++++++++++++++-
>> >  1 file changed, 24 insertions(+), 1 deletion(-)
>> > 
>> > diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
>> > index 18851601cb..c763d071f6 100644
>> > --- a/hw/block/virtio-blk.c
>> > +++ b/hw/block/virtio-blk.c
>> > @@ -16,6 +16,7 @@
>> >  #include "qemu/iov.h"
>> >  #include "qemu/module.h"
>> >  #include "qemu/error-report.h"
>> > +#include "qemu/main-loop.h"
>> >  #include "trace.h"
>> >  #include "hw/block/block.h"
>> >  #include "hw/qdev-properties.h"
>> > @@ -1086,11 +1087,33 @@ static int virtio_blk_load_device(VirtIODevice *vdev, QEMUFile *f,
>> >      return 0;
>> >  }
>> >  
>> > +static void coroutine_fn virtio_resize_co_entry(void *opaque)
>> > +{
>> > +    VirtIODevice *vdev = opaque;
>> > +
>> > +    assert(qemu_get_current_aio_context() == qemu_get_aio_context());
>> > +    virtio_notify_config(vdev);
>> > +    aio_wait_kick();
>> > +}
>> > +
>> >  static void virtio_blk_resize(void *opaque)
>> >  {
>> >      VirtIODevice *vdev = VIRTIO_DEVICE(opaque);
>> > +    Coroutine *co;
>> >  
>> > -    virtio_notify_config(vdev);
>> > +    if (qemu_in_coroutine() &&
>> > +        qemu_get_current_aio_context() != qemu_get_aio_context()) {
>> > +        /*
>> > +         * virtio_notify_config() needs to acquire the global mutex,
>> > +         * so calling it from a coroutine running on a non-main context
>> > +         * may cause a deadlock. Instead, create a new coroutine and
>> > +         * schedule it to be run on the main thread.
>> > +         */
>> > +        co = qemu_coroutine_create(virtio_resize_co_entry, vdev);
>> > +        aio_co_schedule(qemu_get_aio_context(), co);
>> > +    } else {
>> > +        virtio_notify_config(vdev);
>> > +    }
>> >  }
>
> Wouldn't a simple BH suffice (aio_bh_schedule_oneshot)? I don't see why
> you need a coroutine when you never yield.

You're right, that's actually simpler; I hadn't thought of it.

Do you see any drawbacks or should I send a non-RFC fixed version of
this patch?

> The reason why it deadlocks also has nothing to do with whether we are
> called from a coroutine or not. The important part is that we're running
> in an iothread.
>
> Kevin




* Re: [Qemu-devel] [RFC PATCH] virtio-blk: schedule virtio_notify_config to run on main context
  2019-09-13  9:28     ` Sergio Lopez
@ 2019-09-13  9:45       ` Kevin Wolf
  0 siblings, 0 replies; 6+ messages in thread
From: Kevin Wolf @ 2019-09-13  9:45 UTC (permalink / raw)
  To: Sergio Lopez; +Cc: mreitz, stefanha, qemu-devel, qemu-block, Michael S. Tsirkin


On 13.09.2019 at 11:28, Sergio Lopez wrote:
> 
> Kevin Wolf <kwolf@redhat.com> writes:
> 
> > On 12.09.2019 at 21:51, Michael S. Tsirkin wrote:
> >> On Thu, Sep 12, 2019 at 08:19:25PM +0200, Sergio Lopez wrote:
> >> > Another AioContext-related issue, and this is a tricky one.
> >> > 
> >> > Executing a QMP block_resize request for a virtio-blk device running
> >> > on an iothread may cause a deadlock involving the following mutexes:
> >> > 
> >> >  - main thread
> >> >   * Has acquired: qemu_mutex_global.
> >> >   * Is trying to acquire: iothread AioContext lock via
> >> >     AIO_WAIT_WHILE (after aio_poll).
> >> > 
> >> >  - iothread
> >> >   * Has acquired: AioContext lock.
> >> >   * Is trying to acquire: qemu_mutex_global (via
> >> >     virtio_notify_config->prepare_mmio_access).
> >> 
> >> Hmm is this really the only case iothread takes qemu mutex?
> >> If any such access can deadlock, don't we need a generic
> >> solution? Maybe main thread can drop qemu mutex
> >> before taking io thread AioContext lock?
> >
> > The rule is that iothreads must not take the qemu mutex. If they do
> > (like in this case), it's a bug.
> >
> > Maybe we could actually assert this in qemu_mutex_lock_iothread()?
> >
> >> > With this change, virtio_blk_resize checks if it's being called from a
> >> > coroutine context running on a non-main thread, and if that's the
> >> > case, creates a new coroutine and schedules it to be run on the main
> >> > thread.
> >> > 
> >> > This works, but means the actual operation is done
> >> > asynchronously, perhaps opening a window in which a "device_del"
> >> > operation may fit and remove the VirtIODevice before
> >> > virtio_notify_config() is executed.
> >> > 
> >> > I *think* it shouldn't be possible, as BHs will be processed before
> >> > any new QMP/monitor command, but I'm open to a different approach.
> >> > 
> >> > RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1744955
> >> > Signed-off-by: Sergio Lopez <slp@redhat.com>
> >> > ---
> >> >  hw/block/virtio-blk.c | 25 ++++++++++++++++++++++++-
> >> >  1 file changed, 24 insertions(+), 1 deletion(-)
> >> > 
> >> > diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> >> > index 18851601cb..c763d071f6 100644
> >> > --- a/hw/block/virtio-blk.c
> >> > +++ b/hw/block/virtio-blk.c
> >> > @@ -16,6 +16,7 @@
> >> >  #include "qemu/iov.h"
> >> >  #include "qemu/module.h"
> >> >  #include "qemu/error-report.h"
> >> > +#include "qemu/main-loop.h"
> >> >  #include "trace.h"
> >> >  #include "hw/block/block.h"
> >> >  #include "hw/qdev-properties.h"
> >> > @@ -1086,11 +1087,33 @@ static int virtio_blk_load_device(VirtIODevice *vdev, QEMUFile *f,
> >> >      return 0;
> >> >  }
> >> >  
> >> > +static void coroutine_fn virtio_resize_co_entry(void *opaque)
> >> > +{
> >> > +    VirtIODevice *vdev = opaque;
> >> > +
> >> > +    assert(qemu_get_current_aio_context() == qemu_get_aio_context());
> >> > +    virtio_notify_config(vdev);
> >> > +    aio_wait_kick();
> >> > +}
> >> > +
> >> >  static void virtio_blk_resize(void *opaque)
> >> >  {
> >> >      VirtIODevice *vdev = VIRTIO_DEVICE(opaque);
> >> > +    Coroutine *co;
> >> >  
> >> > -    virtio_notify_config(vdev);
> >> > +    if (qemu_in_coroutine() &&
> >> > +        qemu_get_current_aio_context() != qemu_get_aio_context()) {
> >> > +        /*
> >> > +         * virtio_notify_config() needs to acquire the global mutex,
> >> > +         * so calling it from a coroutine running on a non-main context
> >> > +         * may cause a deadlock. Instead, create a new coroutine and
> >> > +         * schedule it to be run on the main thread.
> >> > +         */
> >> > +        co = qemu_coroutine_create(virtio_resize_co_entry, vdev);
> >> > +        aio_co_schedule(qemu_get_aio_context(), co);
> >> > +    } else {
> >> > +        virtio_notify_config(vdev);
> >> > +    }
> >> >  }
> >
> > Wouldn't a simple BH suffice (aio_bh_schedule_oneshot)? I don't see why
> > you need a coroutine when you never yield.
> 
> You're right, that's actually simpler; I hadn't thought of it.
> 
> Do you see any drawbacks or should I send a non-RFC fixed version of
> this patch?

Sending a fixed non-RFC version sounds good to me.

Kevin



