All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/3] block: fix 'savevm' hang with -object iothread
@ 2017-05-17 17:09 Stefan Hajnoczi
  2017-05-17 17:09 ` [Qemu-devel] [PATCH 1/3] block: count bdrv_co_rw_vmstate() requests Stefan Hajnoczi
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Stefan Hajnoczi @ 2017-05-17 17:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Paolo Bonzini, Kevin Wolf, Fam Zheng, Stefan Hajnoczi

The 'savevm' command hangs when -object iothread is used.  See patches for
details, but basically the vmstate read/write code didn't conform to the latest
block layer locking rules.

Stefan Hajnoczi (3):
  block: count bdrv_co_rw_vmstate() requests
  block: use BDRV_POLL_WHILE() in bdrv_rw_vmstate()
  migration: avoid recursive AioContext locking in save_vmstate()

 block/io.c         | 21 +++++++++++++--------
 migration/savevm.c | 12 +++++++++++-
 2 files changed, 24 insertions(+), 9 deletions(-)

-- 
2.9.3

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Qemu-devel] [PATCH 1/3] block: count bdrv_co_rw_vmstate() requests
  2017-05-17 17:09 [Qemu-devel] [PATCH 0/3] block: fix 'savevm' hang with -object iothread Stefan Hajnoczi
@ 2017-05-17 17:09 ` Stefan Hajnoczi
  2017-05-17 20:09   ` Eric Blake
  2017-05-17 17:09 ` [Qemu-devel] [PATCH 2/3] block: use BDRV_POLL_WHILE() in bdrv_rw_vmstate() Stefan Hajnoczi
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 15+ messages in thread
From: Stefan Hajnoczi @ 2017-05-17 17:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Paolo Bonzini, Kevin Wolf, Fam Zheng, Stefan Hajnoczi

Call bdrv_inc/dec_in_flight() for vmstate reads/writes.  This seems
unnecessary at first glance because vmstate reads/writes are done
synchronously while the guest is stopped.  But we need the bdrv_wakeup()
in bdrv_dec_in_flight() so the main loop sees request completion.
Besides, it's cleaner to count vmstate reads/writes like ordinary
read/write requests.

The bdrv_wakeup() partially fixes a 'savevm' hang with -object iothread.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/io.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/block/io.c b/block/io.c
index fdd7485..cc56e90 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1988,17 +1988,24 @@ bdrv_co_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
                    bool is_read)
 {
     BlockDriver *drv = bs->drv;
+    int ret = -ENOTSUP;
+
+    bdrv_inc_in_flight(bs);
 
     if (!drv) {
-        return -ENOMEDIUM;
+        ret = -ENOMEDIUM;
     } else if (drv->bdrv_load_vmstate) {
-        return is_read ? drv->bdrv_load_vmstate(bs, qiov, pos)
-                       : drv->bdrv_save_vmstate(bs, qiov, pos);
+        if (is_read) {
+            ret = drv->bdrv_load_vmstate(bs, qiov, pos);
+        } else {
+            ret = drv->bdrv_save_vmstate(bs, qiov, pos);
+        }
     } else if (bs->file) {
-        return bdrv_co_rw_vmstate(bs->file->bs, qiov, pos, is_read);
+        ret = bdrv_co_rw_vmstate(bs->file->bs, qiov, pos, is_read);
     }
 
-    return -ENOTSUP;
+    bdrv_dec_in_flight(bs);
+    return ret;
 }
 
 static void coroutine_fn bdrv_co_rw_vmstate_entry(void *opaque)
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [Qemu-devel] [PATCH 2/3] block: use BDRV_POLL_WHILE() in bdrv_rw_vmstate()
  2017-05-17 17:09 [Qemu-devel] [PATCH 0/3] block: fix 'savevm' hang with -object iothread Stefan Hajnoczi
  2017-05-17 17:09 ` [Qemu-devel] [PATCH 1/3] block: count bdrv_co_rw_vmstate() requests Stefan Hajnoczi
@ 2017-05-17 17:09 ` Stefan Hajnoczi
  2017-05-17 20:16   ` Eric Blake
  2017-05-17 17:09 ` [Qemu-devel] [PATCH 3/3] migration: avoid recursive AioContext locking in save_vmstate() Stefan Hajnoczi
  2017-05-17 20:49 ` [Qemu-devel] [PATCH 0/3] block: fix 'savevm' hang with -object iothread Paolo Bonzini
  3 siblings, 1 reply; 15+ messages in thread
From: Stefan Hajnoczi @ 2017-05-17 17:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Paolo Bonzini, Kevin Wolf, Fam Zheng, Stefan Hajnoczi

Calling aio_poll() directly may have been fine previously, but this is
the future, man!  The difference between an aio_poll() loop and
BDRV_POLL_WHILE() is that BDRV_POLL_WHILE() releases the AioContext
around aio_poll().

This allows the IOThread to run fd handlers or BHs to complete the
request.  Failure to release the AioContext causes deadlocks.

Using BDRV_POLL_WHILE() partially fixes a 'savevm' hang with -object
iothread.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/io.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/block/io.c b/block/io.c
index cc56e90..f0041cd 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2031,9 +2031,7 @@ bdrv_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
         Coroutine *co = qemu_coroutine_create(bdrv_co_rw_vmstate_entry, &data);
 
         bdrv_coroutine_enter(bs, co);
-        while (data.ret == -EINPROGRESS) {
-            aio_poll(bdrv_get_aio_context(bs), true);
-        }
+        BDRV_POLL_WHILE(bs, data.ret == -EINPROGRESS);
         return data.ret;
     }
 }
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [Qemu-devel] [PATCH 3/3] migration: avoid recursive AioContext locking in save_vmstate()
  2017-05-17 17:09 [Qemu-devel] [PATCH 0/3] block: fix 'savevm' hang with -object iothread Stefan Hajnoczi
  2017-05-17 17:09 ` [Qemu-devel] [PATCH 1/3] block: count bdrv_co_rw_vmstate() requests Stefan Hajnoczi
  2017-05-17 17:09 ` [Qemu-devel] [PATCH 2/3] block: use BDRV_POLL_WHILE() in bdrv_rw_vmstate() Stefan Hajnoczi
@ 2017-05-17 17:09 ` Stefan Hajnoczi
  2017-05-17 20:24   ` Eric Blake
  2017-05-18  8:18   ` Kevin Wolf
  2017-05-17 20:49 ` [Qemu-devel] [PATCH 0/3] block: fix 'savevm' hang with -object iothread Paolo Bonzini
  3 siblings, 2 replies; 15+ messages in thread
From: Stefan Hajnoczi @ 2017-05-17 17:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Paolo Bonzini, Kevin Wolf, Fam Zheng, Stefan Hajnoczi

AioContext was designed to allow nested acquire/release calls.  It uses
a recursive mutex so callers don't need to worry about nesting...or so
we thought.

BDRV_POLL_WHILE() is used to wait for block I/O requests.  It releases
the AioContext temporarily around aio_poll().  This gives IOThreads a
chance to acquire the AioContext to process I/O completions.

It turns out that recursive locking and BDRV_POLL_WHILE() don't mix.
BDRV_POLL_WHILE() only releases the AioContext once, so the IOThread
will not be able to acquire the AioContext if it was acquired
multiple times.

Instead of trying to release AioContext n times in BDRV_POLL_WHILE(),
this patch simply avoids nested locking in save_vmstate().  It's the
simplest fix and we should step back to consider the big picture with
all the recent changes to block layer threading.

This patch is the final fix to solve 'savevm' hanging with -object
iothread.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 migration/savevm.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 7f66d58..a70ba20 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2153,6 +2153,14 @@ int save_vmstate(const char *name)
         goto the_end;
     }
 
+    /* The bdrv_all_create_snapshot() call that follows acquires the AioContext
+     * for itself.  BDRV_POLL_WHILE() does not support nested locking because
+     * it only releases the lock once.  Therefore synchronous I/O will deadlock
+     * unless we release the AioContext before bdrv_all_create_snapshot().
+     */
+    aio_context_release(aio_context);
+    aio_context = NULL;
+
     ret = bdrv_all_create_snapshot(sn, bs, vm_state_size, &bs);
     if (ret < 0) {
         error_report("Error while creating snapshot on '%s'",
@@ -2163,7 +2171,9 @@ int save_vmstate(const char *name)
     ret = 0;
 
  the_end:
-    aio_context_release(aio_context);
+    if (aio_context) {
+        aio_context_release(aio_context);
+    }
     if (saved_vm_running) {
         vm_start();
     }
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] block: count bdrv_co_rw_vmstate() requests
  2017-05-17 17:09 ` [Qemu-devel] [PATCH 1/3] block: count bdrv_co_rw_vmstate() requests Stefan Hajnoczi
@ 2017-05-17 20:09   ` Eric Blake
  0 siblings, 0 replies; 15+ messages in thread
From: Eric Blake @ 2017-05-17 20:09 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel
  Cc: Kevin Wolf, Paolo Bonzini, Fam Zheng, qemu-block

[-- Attachment #1: Type: text/plain, Size: 843 bytes --]

On 05/17/2017 12:09 PM, Stefan Hajnoczi wrote:
> Call bdrv_inc/dec_in_flight() for vmstate reads/writes.  This seems
> unnecessary at first glance because vmstate reads/writes are done
> synchronously while the guest is stopped.  But we need the bdrv_wakeup()
> in bdrv_dec_in_flight() so the main loop sees request completion.
> Besides, it's cleaner to count vmstate reads/writes like ordinary
> read/write requests.
> 
> The bdrv_wakeup() partially fixes a 'savevm' hang with -object iothread.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  block/io.c | 17 ++++++++++++-----
>  1 file changed, 12 insertions(+), 5 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 2/3] block: use BDRV_POLL_WHILE() in bdrv_rw_vmstate()
  2017-05-17 17:09 ` [Qemu-devel] [PATCH 2/3] block: use BDRV_POLL_WHILE() in bdrv_rw_vmstate() Stefan Hajnoczi
@ 2017-05-17 20:16   ` Eric Blake
  2017-05-17 20:47     ` Paolo Bonzini
                       ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Eric Blake @ 2017-05-17 20:16 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel
  Cc: Kevin Wolf, Paolo Bonzini, Fam Zheng, qemu-block

[-- Attachment #1: Type: text/plain, Size: 1733 bytes --]

On 05/17/2017 12:09 PM, Stefan Hajnoczi wrote:
> Calling aio_poll() directly may have been fine previously, but this is
> the future, man!

lol

>  The difference between an aio_poll() loop and
> BDRV_POLL_WHILE() is that BDRV_POLL_WHILE() releases the AioContext
> around aio_poll().
> 
> This allows the IOThread to run fd handlers or BHs to complete the
> request.  Failure to release the AioContext causes deadlocks.
> 
> Using BDRV_POLL_WHILE() partially fixes a 'savevm' hang with -object
> iothread.

I'm surprised at how many separate hangs we actually had!

> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  block/io.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

> 
> diff --git a/block/io.c b/block/io.c
> index cc56e90..f0041cd 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -2031,9 +2031,7 @@ bdrv_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
>          Coroutine *co = qemu_coroutine_create(bdrv_co_rw_vmstate_entry, &data);
>  
>          bdrv_coroutine_enter(bs, co);
> -        while (data.ret == -EINPROGRESS) {
> -            aio_poll(bdrv_get_aio_context(bs), true);
> -        }
> +        BDRV_POLL_WHILE(bs, data.ret == -EINPROGRESS);
>          return data.ret;
>      }

Do we have other culprits (not necessarily for vmsave, but for other
situations), where we should be using BDRV_POLL_WHILE in separate
patches? For example, a quick grep show that at least hw/9pfs/9p.c makes
a direct call to aio_poll(,true) in a while loop.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] migration: avoid recursive AioContext locking in save_vmstate()
  2017-05-17 17:09 ` [Qemu-devel] [PATCH 3/3] migration: avoid recursive AioContext locking in save_vmstate() Stefan Hajnoczi
@ 2017-05-17 20:24   ` Eric Blake
  2017-05-18  8:18   ` Kevin Wolf
  1 sibling, 0 replies; 15+ messages in thread
From: Eric Blake @ 2017-05-17 20:24 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel
  Cc: Kevin Wolf, Paolo Bonzini, Fam Zheng, qemu-block

[-- Attachment #1: Type: text/plain, Size: 1363 bytes --]

On 05/17/2017 12:09 PM, Stefan Hajnoczi wrote:
> AioContext was designed to allow nested acquire/release calls.  It uses
> a recursive mutex so callers don't need to worry about nesting...or so
> we thought.
> 
> BDRV_POLL_WHILE() is used to wait for block I/O requests.  It releases
> the AioContext temporarily around aio_poll().  This gives IOThreads a
> chance to acquire the AioContext to process I/O completions.
> 
> It turns out that recursive locking and BDRV_POLL_WHILE() don't mix.
> BDRV_POLL_WHILE() only releases the AioContext once, so the IOThread
> will not be able to acquire the AioContext if it was acquired
> multiple times.
> 
> Instead of trying to release AioContext n times in BDRV_POLL_WHILE(),
> this patch simply avoids nested locking in save_vmstate().  It's the
> simplest fix and we should step back to consider the big picture with
> all the recent changes to block layer threading.
> 
> This patch is the final fix to solve 'savevm' hanging with -object
> iothread.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  migration/savevm.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 2/3] block: use BDRV_POLL_WHILE() in bdrv_rw_vmstate()
  2017-05-17 20:16   ` Eric Blake
@ 2017-05-17 20:47     ` Paolo Bonzini
  2017-05-18  7:57     ` Kevin Wolf
  2017-05-18  8:05     ` Stefan Hajnoczi
  2 siblings, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2017-05-17 20:47 UTC (permalink / raw)
  To: Eric Blake; +Cc: Stefan Hajnoczi, qemu-devel, Kevin Wolf, Fam Zheng, qemu-block


> I'm surprised at how many separate hangs we actually had!

Note that I have seen quite a few before, though I am not sure about
the details and the reproducibility.  The release/acquire was hidden
behind RFifoLock contention callbacks instead of BDRV_POLL_WHILE.

Paolo

> > 
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > ---
> >  block/io.c | 4 +---
> >  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> Reviewed-by: Eric Blake <eblake@redhat.com>
> 
> > 
> > diff --git a/block/io.c b/block/io.c
> > index cc56e90..f0041cd 100644
> > --- a/block/io.c
> > +++ b/block/io.c
> > @@ -2031,9 +2031,7 @@ bdrv_rw_vmstate(BlockDriverState *bs, QEMUIOVector
> > *qiov, int64_t pos,
> >          Coroutine *co = qemu_coroutine_create(bdrv_co_rw_vmstate_entry,
> >          &data);
> >  
> >          bdrv_coroutine_enter(bs, co);
> > -        while (data.ret == -EINPROGRESS) {
> > -            aio_poll(bdrv_get_aio_context(bs), true);
> > -        }
> > +        BDRV_POLL_WHILE(bs, data.ret == -EINPROGRESS);
> >          return data.ret;
> >      }
> 
> Do we have other culprits (not necessarily for vmsave, but for other
> situations), where we should be using BDRV_POLL_WHILE in separate
> patches? For example, a quick grep show that at least hw/9pfs/9p.c makes
> a direct call to aio_poll(,true) in a while loop.
> 
> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.           +1-919-301-3266
> Virtualization:  qemu.org | libvirt.org
> 
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 0/3] block: fix 'savevm' hang with -object iothread
  2017-05-17 17:09 [Qemu-devel] [PATCH 0/3] block: fix 'savevm' hang with -object iothread Stefan Hajnoczi
                   ` (2 preceding siblings ...)
  2017-05-17 17:09 ` [Qemu-devel] [PATCH 3/3] migration: avoid recursive AioContext locking in save_vmstate() Stefan Hajnoczi
@ 2017-05-17 20:49 ` Paolo Bonzini
  3 siblings, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2017-05-17 20:49 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, Kevin Wolf, Fam Zheng, qemu-block



----- Original Message -----
> From: "Stefan Hajnoczi" <stefanha@redhat.com>
> To: qemu-devel@nongnu.org
> Cc: "Kevin Wolf" <kwolf@redhat.com>, "Paolo Bonzini" <pbonzini@redhat.com>, "Fam Zheng" <famz@redhat.com>, "Stefan
> Hajnoczi" <stefanha@redhat.com>, qemu-block@nongnu.org
> Sent: Wednesday, May 17, 2017 7:09:38 PM
> Subject: [Qemu-devel] [PATCH 0/3] block: fix 'savevm' hang with -object iothread
> 
> The 'savevm' command hangs when -object iothread is used.  See patches for
> details, but basically the vmstate read/write code didn't conform to the
> latest block layer locking rules.

Thanks for the fixes.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

> Stefan Hajnoczi (3):
>   block: count bdrv_co_rw_vmstate() requests
>   block: use BDRV_POLL_WHILE() in bdrv_rw_vmstate()
>   migration: avoid recursive AioContext locking in save_vmstate()
> 
>  block/io.c         | 21 +++++++++++++--------
>  migration/savevm.c | 12 +++++++++++-
>  2 files changed, 24 insertions(+), 9 deletions(-)
> 
> --
> 2.9.3
> 
> 
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 2/3] block: use BDRV_POLL_WHILE() in bdrv_rw_vmstate()
  2017-05-17 20:16   ` Eric Blake
  2017-05-17 20:47     ` Paolo Bonzini
@ 2017-05-18  7:57     ` Kevin Wolf
  2017-05-18  8:06       ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
  2017-05-18  8:05     ` Stefan Hajnoczi
  2 siblings, 1 reply; 15+ messages in thread
From: Kevin Wolf @ 2017-05-18  7:57 UTC (permalink / raw)
  To: Eric Blake
  Cc: Stefan Hajnoczi, qemu-devel, Paolo Bonzini, Fam Zheng, qemu-block

[-- Attachment #1: Type: text/plain, Size: 789 bytes --]

Am 17.05.2017 um 22:16 hat Eric Blake geschrieben:
> On 05/17/2017 12:09 PM, Stefan Hajnoczi wrote:
> > Calling aio_poll() directly may have been fine previously, but this is
> > the future, man!
> 
> lol
> 
> >  The difference between an aio_poll() loop and
> > BDRV_POLL_WHILE() is that BDRV_POLL_WHILE() releases the AioContext
> > around aio_poll().
> > 
> > This allows the IOThread to run fd handlers or BHs to complete the
> > request.  Failure to release the AioContext causes deadlocks.
> > 
> > Using BDRV_POLL_WHILE() partially fixes a 'savevm' hang with -object
> > iothread.
> 
> I'm surprised at how many separate hangs we actually had!

How hard would it be to write some test cases for this? Dataplane has
a serious lack of automated testing.

Kevin

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH 2/3] block: use BDRV_POLL_WHILE() in bdrv_rw_vmstate()
  2017-05-17 20:16   ` Eric Blake
  2017-05-17 20:47     ` Paolo Bonzini
  2017-05-18  7:57     ` Kevin Wolf
@ 2017-05-18  8:05     ` Stefan Hajnoczi
  2 siblings, 0 replies; 15+ messages in thread
From: Stefan Hajnoczi @ 2017-05-18  8:05 UTC (permalink / raw)
  To: Eric Blake
  Cc: Stefan Hajnoczi, qemu-devel, Kevin Wolf, Paolo Bonzini,
	Fam Zheng, qemu block

On Wed, May 17, 2017 at 9:16 PM, Eric Blake <eblake@redhat.com> wrote:
> On 05/17/2017 12:09 PM, Stefan Hajnoczi wrote:
>> diff --git a/block/io.c b/block/io.c
>> index cc56e90..f0041cd 100644
>> --- a/block/io.c
>> +++ b/block/io.c
>> @@ -2031,9 +2031,7 @@ bdrv_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
>>          Coroutine *co = qemu_coroutine_create(bdrv_co_rw_vmstate_entry, &data);
>>
>>          bdrv_coroutine_enter(bs, co);
>> -        while (data.ret == -EINPROGRESS) {
>> -            aio_poll(bdrv_get_aio_context(bs), true);
>> -        }
>> +        BDRV_POLL_WHILE(bs, data.ret == -EINPROGRESS);
>>          return data.ret;
>>      }
>
> Do we have other culprits (not necessarily for vmsave, but for other
> situations), where we should be using BDRV_POLL_WHILE in separate
> patches? For example, a quick grep show that at least hw/9pfs/9p.c makes
> a direct call to aio_poll(,true) in a while loop.

virtio-9p doesn't use IOThread (dataplane) so it is not affected.  It
also doesn't use BlockDriverState so it cannot use BDRV_POLL_WHILE().

I did grep for other aio_poll() callers and didn't spot any obvious
cases that need to be fixed.  They tend to run in situations where
this deadlock cannot occur.

Stefan

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH 2/3] block: use BDRV_POLL_WHILE() in bdrv_rw_vmstate()
  2017-05-18  7:57     ` Kevin Wolf
@ 2017-05-18  8:06       ` Stefan Hajnoczi
  2017-05-18  8:22         ` Kevin Wolf
  0 siblings, 1 reply; 15+ messages in thread
From: Stefan Hajnoczi @ 2017-05-18  8:06 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Eric Blake, Paolo Bonzini, qemu block, Fam Zheng, qemu-devel,
	Stefan Hajnoczi

On Thu, May 18, 2017 at 8:57 AM, Kevin Wolf <kwolf@redhat.com> wrote:
> Am 17.05.2017 um 22:16 hat Eric Blake geschrieben:
>> On 05/17/2017 12:09 PM, Stefan Hajnoczi wrote:
>> > Calling aio_poll() directly may have been fine previously, but this is
>> > the future, man!
>>
>> lol
>>
>> >  The difference between an aio_poll() loop and
>> > BDRV_POLL_WHILE() is that BDRV_POLL_WHILE() releases the AioContext
>> > around aio_poll().
>> >
>> > This allows the IOThread to run fd handlers or BHs to complete the
>> > request.  Failure to release the AioContext causes deadlocks.
>> >
>> > Using BDRV_POLL_WHILE() partially fixes a 'savevm' hang with -object
>> > iothread.
>>
>> I'm surprised at how many separate hangs we actually had!
>
> How hard would it be to write some test cases for this? Dataplane has
> a serious lack of automated testing.

And this hang doesn't even require in-flight guest I/O, so it would be
easy to reproduce in qemu-iotests if we add an IOThread mode.

Stefan

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] migration: avoid recursive AioContext locking in save_vmstate()
  2017-05-17 17:09 ` [Qemu-devel] [PATCH 3/3] migration: avoid recursive AioContext locking in save_vmstate() Stefan Hajnoczi
  2017-05-17 20:24   ` Eric Blake
@ 2017-05-18  8:18   ` Kevin Wolf
  2017-05-19  9:07     ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
  1 sibling, 1 reply; 15+ messages in thread
From: Kevin Wolf @ 2017-05-18  8:18 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, qemu-block, Paolo Bonzini, Fam Zheng

Am 17.05.2017 um 19:09 hat Stefan Hajnoczi geschrieben:
> AioContext was designed to allow nested acquire/release calls.  It uses
> a recursive mutex so callers don't need to worry about nesting...or so
> we thought.
> 
> BDRV_POLL_WHILE() is used to wait for block I/O requests.  It releases
> the AioContext temporarily around aio_poll().  This gives IOThreads a
> chance to acquire the AioContext to process I/O completions.
> 
> It turns out that recursive locking and BDRV_POLL_WHILE() don't mix.
> BDRV_POLL_WHILE() only releases the AioContext once, so the IOThread
> will not be able to acquire the AioContext if it was acquired
> multiple times.
> 
> Instead of trying to release AioContext n times in BDRV_POLL_WHILE(),
> this patch simply avoids nested locking in save_vmstate().  It's the
> simplest fix and we should step back to consider the big picture with
> all the recent changes to block layer threading.
> 
> This patch is the final fix to solve 'savevm' hanging with -object
> iothread.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  migration/savevm.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 7f66d58..a70ba20 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -2153,6 +2153,14 @@ int save_vmstate(const char *name)
>          goto the_end;
>      }
>  
> +    /* The bdrv_all_create_snapshot() call that follows acquires the AioContext
> +     * for itself.  BDRV_POLL_WHILE() does not support nested locking because
> +     * it only releases the lock once.  Therefore synchronous I/O will deadlock
> +     * unless we release the AioContext before bdrv_all_create_snapshot().
> +     */
> +    aio_context_release(aio_context);
> +    aio_context = NULL;
> +
>      ret = bdrv_all_create_snapshot(sn, bs, vm_state_size, &bs);
>      if (ret < 0) {
>          error_report("Error while creating snapshot on '%s'",
> @@ -2163,7 +2171,9 @@ int save_vmstate(const char *name)
>      ret = 0;
>  
>   the_end:
> -    aio_context_release(aio_context);
> +    if (aio_context) {
> +        aio_context_release(aio_context);
> +    }
>      if (saved_vm_running) {
>          vm_start();
>      }

It might actually even be true before this patch because the lock is
already only taken for some parts of the function, but don't we need to
call bdrv_drain_all_begin/end() around the whole function now?

We're stopping the VM, so hopefully no device is continuing to process
requests, but can't we still have block jobs, NBD server requests etc.?

And the same is probably true for qemu_loadvm_state().

Kevin

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH 2/3] block: use BDRV_POLL_WHILE() in bdrv_rw_vmstate()
  2017-05-18  8:06       ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
@ 2017-05-18  8:22         ` Kevin Wolf
  0 siblings, 0 replies; 15+ messages in thread
From: Kevin Wolf @ 2017-05-18  8:22 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Eric Blake, Paolo Bonzini, qemu block, Fam Zheng, qemu-devel,
	Stefan Hajnoczi

Am 18.05.2017 um 10:06 hat Stefan Hajnoczi geschrieben:
> On Thu, May 18, 2017 at 8:57 AM, Kevin Wolf <kwolf@redhat.com> wrote:
> > Am 17.05.2017 um 22:16 hat Eric Blake geschrieben:
> >> On 05/17/2017 12:09 PM, Stefan Hajnoczi wrote:
> >> > Calling aio_poll() directly may have been fine previously, but this is
> >> > the future, man!
> >>
> >> lol
> >>
> >> >  The difference between an aio_poll() loop and
> >> > BDRV_POLL_WHILE() is that BDRV_POLL_WHILE() releases the AioContext
> >> > around aio_poll().
> >> >
> >> > This allows the IOThread to run fd handlers or BHs to complete the
> >> > request.  Failure to release the AioContext causes deadlocks.
> >> >
> >> > Using BDRV_POLL_WHILE() partially fixes a 'savevm' hang with -object
> >> > iothread.
> >>
> >> I'm surprised at how many separate hangs we actually had!
> >
> > How hard would it be to write some test cases for this? Dataplane has
> > a serious lack of automated testing.
> 
> And this hang doesn't even require in-flight guest I/O, so it would be
> easy to reproduce in qemu-iotests if we add an IOThread mode.

I don't think I would make it a separate mode (because people would run
only one or the other - when did you last run qcow2 v2 tests?), but just
add some test cases that make use of it during the normal -qcow2 or -raw
run.

Kevin

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH 3/3] migration: avoid recursive AioContext locking in save_vmstate()
  2017-05-18  8:18   ` Kevin Wolf
@ 2017-05-19  9:07     ` Stefan Hajnoczi
  0 siblings, 0 replies; 15+ messages in thread
From: Stefan Hajnoczi @ 2017-05-19  9:07 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Stefan Hajnoczi, Paolo Bonzini, Fam Zheng, qemu-devel, qemu-block

[-- Attachment #1: Type: text/plain, Size: 1852 bytes --]

On Thu, May 18, 2017 at 10:18:46AM +0200, Kevin Wolf wrote:
> Am 17.05.2017 um 19:09 hat Stefan Hajnoczi geschrieben:
> > diff --git a/migration/savevm.c b/migration/savevm.c
> > index 7f66d58..a70ba20 100644
> > --- a/migration/savevm.c
> > +++ b/migration/savevm.c
> > @@ -2153,6 +2153,14 @@ int save_vmstate(const char *name)
> >          goto the_end;
> >      }
> >  
> > +    /* The bdrv_all_create_snapshot() call that follows acquires the AioContext
> > +     * for itself.  BDRV_POLL_WHILE() does not support nested locking because
> > +     * it only releases the lock once.  Therefore synchronous I/O will deadlock
> > +     * unless we release the AioContext before bdrv_all_create_snapshot().
> > +     */
> > +    aio_context_release(aio_context);
> > +    aio_context = NULL;
> > +
> >      ret = bdrv_all_create_snapshot(sn, bs, vm_state_size, &bs);
> >      if (ret < 0) {
> >          error_report("Error while creating snapshot on '%s'",
> > @@ -2163,7 +2171,9 @@ int save_vmstate(const char *name)
> >      ret = 0;
> >  
> >   the_end:
> > -    aio_context_release(aio_context);
> > +    if (aio_context) {
> > +        aio_context_release(aio_context);
> > +    }
> >      if (saved_vm_running) {
> >          vm_start();
> >      }
> 
> It might actually even be true before this patch because the lock is
> already only taken for some parts of the function, but don't we need to
> call bdrv_drain_all_begin/end() around the whole function now?
> 
> We're stopping the VM, so hopefully no device is continuing to process
> requests, but can't we still have block jobs, NBD server requests etc.?
> 
> And the same is probably true for qemu_loadvm_state().

Yes, they currently rely on bdrv_drain_all() but that's not enough.
Thanks for the suggestion, will add a patch in v2.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2017-05-19  9:08 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-17 17:09 [Qemu-devel] [PATCH 0/3] block: fix 'savevm' hang with -object iothread Stefan Hajnoczi
2017-05-17 17:09 ` [Qemu-devel] [PATCH 1/3] block: count bdrv_co_rw_vmstate() requests Stefan Hajnoczi
2017-05-17 20:09   ` Eric Blake
2017-05-17 17:09 ` [Qemu-devel] [PATCH 2/3] block: use BDRV_POLL_WHILE() in bdrv_rw_vmstate() Stefan Hajnoczi
2017-05-17 20:16   ` Eric Blake
2017-05-17 20:47     ` Paolo Bonzini
2017-05-18  7:57     ` Kevin Wolf
2017-05-18  8:06       ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2017-05-18  8:22         ` Kevin Wolf
2017-05-18  8:05     ` Stefan Hajnoczi
2017-05-17 17:09 ` [Qemu-devel] [PATCH 3/3] migration: avoid recursive AioContext locking in save_vmstate() Stefan Hajnoczi
2017-05-17 20:24   ` Eric Blake
2017-05-18  8:18   ` Kevin Wolf
2017-05-19  9:07     ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2017-05-17 20:49 ` [Qemu-devel] [PATCH 0/3] block: fix 'savevm' hang with -object iothread Paolo Bonzini

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.