qemu-devel.nongnu.org archive mirror
* [PATCH] test-bdrv-drain: fix iothread_join() hang
@ 2019-10-03 10:01 Stefan Hajnoczi
  2019-10-03 10:13 ` Paolo Bonzini
  2019-10-09 14:24 ` Stefan Hajnoczi
  0 siblings, 2 replies; 3+ messages in thread
From: Stefan Hajnoczi @ 2019-10-03 10:01 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, Stefan Hajnoczi, qemu-block, Dr . David Alan Gilbert

tests/test-bdrv-drain can hang in tests/iothread.c:iothread_run():

  while (!atomic_read(&iothread->stopping)) {
      aio_poll(iothread->ctx, true);
  }

The iothread_join() function works as follows:

  void iothread_join(IOThread *iothread)
  {
      iothread->stopping = true;
      aio_notify(iothread->ctx);
      qemu_thread_join(&iothread->thread);

If iothread_run() checks iothread->stopping before the iothread_join()
thread sets stopping to true, then the wakeup in aio_notify() may be
skipped, because the iothread has not yet announced that it is about to
block in aio_poll(), and iothread_run() hangs forever in aio_poll().

The correct way to change iothread->stopping is from a BH that executes
within iothread_run().  This ensures that iothread->stopping is checked
after we set it to true.

This was already fixed for ./iothread.c (note this is a different source
file!) by commit 2362a28ea11c145e1a13ae79342d76dc118a72a6 ("iothread:
fix iothread_stop() race condition"), but not for tests/iothread.c.

Fixes: 0c330a734b51c177ab8488932ac3b0c4d63a718a
       ("aio: introduce aio_co_schedule and aio_co_wake")
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 tests/iothread.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/tests/iothread.c b/tests/iothread.c
index 777d9eea46..13c9fdcd8d 100644
--- a/tests/iothread.c
+++ b/tests/iothread.c
@@ -55,10 +55,16 @@ static void *iothread_run(void *opaque)
     return NULL;
 }
 
-void iothread_join(IOThread *iothread)
+static void iothread_stop_bh(void *opaque)
 {
+    IOThread *iothread = opaque;
+
     iothread->stopping = true;
-    aio_notify(iothread->ctx);
+}
+
+void iothread_join(IOThread *iothread)
+{
+    aio_bh_schedule_oneshot(iothread->ctx, iothread_stop_bh, iothread);
     qemu_thread_join(&iothread->thread);
     qemu_cond_destroy(&iothread->init_done_cond);
     qemu_mutex_destroy(&iothread->init_done_lock);
-- 
2.21.0




* Re: [PATCH] test-bdrv-drain: fix iothread_join() hang
  2019-10-03 10:01 [PATCH] test-bdrv-drain: fix iothread_join() hang Stefan Hajnoczi
@ 2019-10-03 10:13 ` Paolo Bonzini
  2019-10-09 14:24 ` Stefan Hajnoczi
  1 sibling, 0 replies; 3+ messages in thread
From: Paolo Bonzini @ 2019-10-03 10:13 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel; +Cc: Dr . David Alan Gilbert, qemu-block

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

Thanks!

Paolo



* Re: [PATCH] test-bdrv-drain: fix iothread_join() hang
  2019-10-03 10:01 [PATCH] test-bdrv-drain: fix iothread_join() hang Stefan Hajnoczi
  2019-10-03 10:13 ` Paolo Bonzini
@ 2019-10-09 14:24 ` Stefan Hajnoczi
  1 sibling, 0 replies; 3+ messages in thread
From: Stefan Hajnoczi @ 2019-10-09 14:24 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Paolo Bonzini, qemu-devel, qemu-block, Dr . David Alan Gilbert


Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan

