* [PATCH 0/1] block: Do not poll in bdrv_set_aio_context_ignore() when acquiring new_context
@ 2021-07-12  5:38 Zhiyong Ye
  2021-07-12  5:38 ` [PATCH 1/1] " Zhiyong Ye
From: Zhiyong Ye @ 2021-07-12  5:38 UTC (permalink / raw)
  To: qemu-block, qemu-devel; +Cc: kwolf, Zhiyong Ye, mreitz

When bdrv_set_aio_context_ignore() is called in the main loop to change
the AioContext of a block node to that of an IO thread,
bdrv_drain_invoke_entry() never gets to run and the IO thread hangs in
co_schedule_bh_cb().

This is because the main thread still occupies the AioContext after it
has been attached to the IO thread: the main thread polls in
bdrv_drained_end() waiting for in-flight IO requests to drain, but the
IO thread cannot acquire the AioContext, which leads to deadlock.

Zhiyong Ye (1):
  block: Do not poll in bdrv_set_aio_context_ignore() when acquiring new_context

 block.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

-- 
2.11.0




* [PATCH 1/1] block: Do not poll in bdrv_set_aio_context_ignore() when acquiring new_context
  2021-07-12  5:38 [PATCH 0/1] block: Do not poll in bdrv_set_aio_context_ignore() when acquiring new_context Zhiyong Ye
@ 2021-07-12  5:38 ` Zhiyong Ye
  2021-07-19 10:24   ` Kevin Wolf
From: Zhiyong Ye @ 2021-07-12  5:38 UTC (permalink / raw)
  To: qemu-block, qemu-devel; +Cc: kwolf, Zhiyong Ye, mreitz

When bdrv_set_aio_context_ignore() is called in the main loop to change
the AioContext of a block node to that of an IO thread,
bdrv_drain_invoke_entry() never gets to run and the IO thread hangs in
co_schedule_bh_cb().

This is because the main thread still occupies the AioContext after it
has been attached to the IO thread: the main thread polls in
bdrv_drained_end() waiting for in-flight IO requests to drain, but the
IO thread cannot acquire the AioContext, which leads to deadlock.

The backtraces look like this:

<------>
[Switching to thread 1 (Thread 0x7fd810bbef40 (LWP 533312))]
(gdb) bt
...
3  0x00005601f6ea93aa in fdmon_poll_wait at ../util/fdmon-poll.c:80
4  0x00005601f6e81a1c in aio_poll at ../util/aio-posix.c:607
5  0x00005601f6dcde87 in bdrv_drained_end at ../block/io.c:496
6  0x00005601f6d798cd in bdrv_set_aio_context_ignore at ../block.c:6502
7  0x00005601f6d7996c in bdrv_set_aio_context_ignore at ../block.c:6472
8  0x00005601f6d79cb8 in bdrv_child_try_set_aio_context at ../block.c:6587
9  0x00005601f6da86f2 in blk_do_set_aio_context at ../block/block-backend.c:2026
10 0x00005601f6daa96d in blk_set_aio_context at ../block/block-backend.c:2047
11 0x00005601f6c71883 in virtio_scsi_hotplug at ../hw/scsi/virtio-scsi.c:831
...

[Switching to thread 4 (Thread 0x7fd8092e7700 (LWP 533315))]
(gdb) bt
...
4  0x00005601f6eab6a8 in qemu_mutex_lock_impl at ../util/qemu-thread-posix.c:79
5  0x00005601f6e7ce88 in co_schedule_bh_cb at ../util/async.c:489
6  0x00005601f6e7c404 in aio_bh_poll at ../util/async.c:164
7  0x00005601f6e81a46 in aio_poll at ../util/aio-posix.c:659
8  0x00005601f6d5ccf3 in iothread_run at ../iothread.c:73
9  0x00005601f6eab512 in qemu_thread_start at ../util/qemu-thread-posix.c:521
10 0x00007fd80d7b84a4 in start_thread at pthread_create.c:456
11 0x00007fd80d4fad0f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
(gdb) f 4
4  0x00005601f6eab6a8 in qemu_mutex_lock_impl at ../util/qemu-thread-posix.c:79
(gdb) p *mutex
$2 = {lock = {__data = {__lock = 2, __count = 1, __owner = 533312, __nusers = 1,
      __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
      __size = "\002\000\000\000\001\000\000\000@#\b\000\001\000\000\000\001",
      '\000' <repeats 22 times>, __align = 4294967298}, initialized = true}
<------>
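
The hang reduces to the following self-contained sketch (purely
illustrative, not QEMU code; the recursive pthread mutex stands in for
the AioContext lock and the atomic flag for the drain counter):

    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>
    #include <unistd.h>

    static pthread_mutex_t ctx_lock;  /* stands in for the AioContext */
    static atomic_int drained;        /* stands in for the drain flag */

    /* Models the IO thread: it must take the context lock before it
     * can finish draining, just as co_schedule_bh_cb() must. */
    static void *io_thread_fn(void *arg)
    {
        (void)arg;
        pthread_mutex_lock(&ctx_lock);  /* blocks: main thread owns it */
        atomic_store(&drained, 1);
        pthread_mutex_unlock(&ctx_lock);
        return NULL;
    }

    int main(void)
    {
        pthread_t io_thread;
        pthread_mutexattr_t attr;

        /* The AioContext lock is recursive, so model it that way. */
        pthread_mutexattr_init(&attr);
        pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
        pthread_mutex_init(&ctx_lock, &attr);

        pthread_mutex_lock(&ctx_lock);  /* main thread owns the context */
        pthread_create(&io_thread, NULL, io_thread_fn, NULL);

        /* Models the poll in bdrv_drained_end(): waits for the IO
         * thread while still holding the lock the IO thread needs,
         * so this loop never terminates. */
        while (!atomic_load(&drained)) {
            usleep(1000);
        }

        pthread_mutex_unlock(&ctx_lock);
        pthread_join(io_thread, NULL);
        printf("drained\n");
        return 0;
    }

Built with gcc -pthread, the program hangs without printing anything,
mirroring the two stacks above.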

Therefore, we should never poll anywhere in
bdrv_set_aio_context_ignore() when acquiring the new context. In fact,
commit e037c09c already elaborates on why we cannot poll in
bdrv_do_drained_end().

Signed-off-by: Zhiyong Ye <yezhiyong@bytedance.com>
---
 block.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index be083f389e..ebbea72d64 100644
--- a/block.c
+++ b/block.c
@@ -6846,6 +6846,7 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
     GSList *parents_to_process = NULL;
     GSList *entry;
     BdrvChild *child, *parent;
+    int drained_end_counter = 0;
 
     g_assert(qemu_get_current_aio_context() == qemu_get_aio_context());
 
@@ -6907,7 +6908,7 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
         aio_context_release(old_context);
     }
 
-    bdrv_drained_end(bs);
+    bdrv_drained_end_no_poll(bs, &drained_end_counter);
 
     if (qemu_get_aio_context() != old_context) {
         aio_context_acquire(old_context);
@@ -6915,6 +6916,7 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
     if (qemu_get_aio_context() != new_context) {
         aio_context_release(new_context);
     }
+    BDRV_POLL_WHILE(bs, qatomic_read(&drained_end_counter) > 0);
 }
 
 static bool bdrv_parent_can_set_aio_context(BdrvChild *c, AioContext *ctx,
-- 
2.11.0




* Re: [PATCH 1/1] block: Do not poll in bdrv_set_aio_context_ignore() when acquiring new_context
  2021-07-12  5:38 ` [PATCH 1/1] " Zhiyong Ye
@ 2021-07-19 10:24   ` Kevin Wolf
  2021-07-20 13:07     ` Zhiyong Ye
From: Kevin Wolf @ 2021-07-19 10:24 UTC (permalink / raw)
  To: Zhiyong Ye; +Cc: qemu-devel, qemu-block, mreitz

Am 12.07.2021 um 07:38 hat Zhiyong Ye geschrieben:
> When bdrv_set_aio_context_ignore() is called in the main loop to change
> the AioContext of a block node to that of an IO thread,
> bdrv_drain_invoke_entry() never gets to run and the IO thread hangs in
> co_schedule_bh_cb().
> 
> This is because the main thread still occupies the AioContext after it
> has been attached to the IO thread: the main thread polls in
> bdrv_drained_end() waiting for in-flight IO requests to drain, but the
> IO thread cannot acquire the AioContext, which leads to deadlock.

This shouldn't be the case:

 * The caller must own the AioContext lock for the old AioContext of bs, but it
 * must not own the AioContext lock for new_context (unless new_context is the
 * same as the current context of bs).

Then bdrv_set_aio_context_ignore() switches the AioContext of bs, and
then calls bdrv_drained_end() while holding only the lock for the new
context. AIO_WAIT_WHILE() will temporarily drop that lock, so aio_poll()
should run without holding any AioContext locks.

If I'm not missing anything, the scenario you're seeing means that the
caller already held a lock for the new AioContext, so that it's locked
twice while AIO_WAIT_WHILE() drops the lock only once. This would be a
bug in the caller because the documentation I quoted explicitly forbids
holding the AioContext lock for the new context.
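
For reference, the cross-context branch of AIO_WAIT_WHILE() boils down
to something like this (condensed from include/block/aio-wait.h; the
waiter bookkeeping is omitted):

    /* Runs in the main thread when ctx_ belongs to an IO thread */
    while ((cond)) {
        aio_context_release(ctx_);    /* drops exactly ONE level */
        aio_poll(qemu_get_aio_context(), true);
        aio_context_acquire(ctx_);
    }

The AioContext lock is recursive, so if the caller enters
bdrv_set_aio_context_ignore() already holding new_context, the count
is 2 by the time this loop runs: the single aio_context_release()
leaves the context owned by the main thread, and the IO thread stays
blocked. That matches your mutex dump below (__kind = 1 is a recursive
mutex, __owner = 533312 is your thread 1).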

> The backtraces look like this:
> 
> <------>
> [Switching to thread 1 (Thread 0x7fd810bbef40 (LWP 533312))]
> (gdb) bt
> ...
> 3  0x00005601f6ea93aa in fdmon_poll_wait at ../util/fdmon-poll.c:80
> 4  0x00005601f6e81a1c in aio_poll at ../util/aio-posix.c:607
> 5  0x00005601f6dcde87 in bdrv_drained_end at ../block/io.c:496
> 6  0x00005601f6d798cd in bdrv_set_aio_context_ignore at ../block.c:6502
> 7  0x00005601f6d7996c in bdrv_set_aio_context_ignore at ../block.c:6472
> 8  0x00005601f6d79cb8 in bdrv_child_try_set_aio_context at ../block.c:6587
> 9  0x00005601f6da86f2 in blk_do_set_aio_context at ../block/block-backend.c:2026
> 10 0x00005601f6daa96d in blk_set_aio_context at ../block/block-backend.c:2047
> 11 0x00005601f6c71883 in virtio_scsi_hotplug at ../hw/scsi/virtio-scsi.c:831

Which version of QEMU are you using? In current git master, line 831 is
something entirely different.

Are you using something before commit c7040ff6? Because this is a commit
that fixed a virtio-scsi bug where it would hold the wrong lock before
calling blk_set_aio_context().
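
In other words, a correct caller does roughly this (a sketch of the
locking contract from the comment quoted above, not code from any
particular caller):

    AioContext *old_context = bdrv_get_aio_context(bs);
    Error *local_err = NULL;

    aio_context_acquire(old_context);   /* lock the OLD context only */
    bdrv_try_set_aio_context(bs, new_context, &local_err);
    aio_context_release(old_context);
    /* new_context must not be held here unless it is the same as
     * old_context */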

> 
> [Switching to thread 4 (Thread 0x7fd8092e7700 (LWP 533315))]
> (gdb) bt
> ...
> 4  0x00005601f6eab6a8 in qemu_mutex_lock_impl at ../util/qemu-thread-posix.c:79
> 5  0x00005601f6e7ce88 in co_schedule_bh_cb at ../util/async.c:489
> 6  0x00005601f6e7c404 in aio_bh_poll at ../util/async.c:164
> 7  0x00005601f6e81a46 in aio_poll at ../util/aio-posix.c:659
> 8  0x00005601f6d5ccf3 in iothread_run at ../iothread.c:73
> 9  0x00005601f6eab512 in qemu_thread_start at ../util/qemu-thread-posix.c:521
> 10 0x00007fd80d7b84a4 in start_thread at pthread_create.c:456
> 11 0x00007fd80d4fad0f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
> (gdb) f 4
> 4  0x00005601f6eab6a8 in qemu_mutex_lock_impl at ../util/qemu-thread-posix.c:79
> (gdb) p *mutex
> $2 = {lock = {__data = {__lock = 2, __count = 1, __owner = 533312, __nusers = 1,
>       __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
>       __size = "\002\000\000\000\001\000\000\000@#\b\000\001\000\000\000\001",
>       '\000' <repeats 22 times>, __align = 4294967298}, initialized = true}
> <------>
> 
> Therefore, we should never poll anywhere in
> bdrv_set_aio_context_ignore() when acquiring the new context. In fact,
> commit e037c09c already elaborates on why we cannot poll in
> bdrv_do_drained_end().
> 
> Signed-off-by: Zhiyong Ye <yezhiyong@bytedance.com>
> ---
>  block.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/block.c b/block.c
> index be083f389e..ebbea72d64 100644
> --- a/block.c
> +++ b/block.c
> @@ -6846,6 +6846,7 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
>      GSList *parents_to_process = NULL;
>      GSList *entry;
>      BdrvChild *child, *parent;
> +    int drained_end_counter = 0;
>  
>      g_assert(qemu_get_current_aio_context() == qemu_get_aio_context());
>  
> @@ -6907,7 +6908,7 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
>          aio_context_release(old_context);
>      }
>  
> -    bdrv_drained_end(bs);
> +    bdrv_drained_end_no_poll(bs, &drained_end_counter);
>  
>      if (qemu_get_aio_context() != old_context) {
>          aio_context_acquire(old_context);
> @@ -6915,6 +6916,7 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
>      if (qemu_get_aio_context() != new_context) {
>          aio_context_release(new_context);
>      }
> +    BDRV_POLL_WHILE(bs, qatomic_read(&drained_end_counter) > 0);

This would be wrong because bs is already in the new context, but you
wouldn't hold the lock for it. AIO_WAIT_WHILE() would try to drop the
lock for a context that isn't even locked, resulting in a crash.
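
For illustration, the sequence your patch creates looks like this (a
sketch of the resulting state, not tested code):

    if (qemu_get_aio_context() != new_context) {
        aio_context_release(new_context);   /* lock count now 0 */
    }
    BDRV_POLL_WHILE(bs, qatomic_read(&drained_end_counter) > 0);
    /* expands to AIO_WAIT_WHILE(bdrv_get_aio_context(bs), ...); its
     * cross-context branch starts with aio_context_release() on
     * new_context, which this thread no longer holds, so the unlock
     * fails and QEMU aborts in qemu_mutex_unlock_impl() */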

>  }
>  
>  static bool bdrv_parent_can_set_aio_context(BdrvChild *c, AioContext *ctx,

Kevin




* Re: Re: [PATCH 1/1] block: Do not poll in bdrv_set_aio_context_ignore() when acquiring new_context
  2021-07-19 10:24   ` Kevin Wolf
@ 2021-07-20 13:07     ` Zhiyong Ye
From: Zhiyong Ye @ 2021-07-20 13:07 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, qemu-block, mreitz

Hi Kevin,

Thanks for your reply and the detailed answer. It is true that
AIO_WAIT_WHILE() temporarily unlocks the new context, as you said;
that is a point I overlooked. I am using QEMU 5.2, and it works fine
after cherry-picking commit c7040ff6 into it.

Thanks again!

Zhiyong

On 7/19/21 6:24 PM, Kevin Wolf wrote:
> Am 12.07.2021 um 07:38 hat Zhiyong Ye geschrieben:
>> When bdrv_set_aio_context_ignore() is called in the main loop to change
>> the AioContext of a block node to that of an IO thread,
>> bdrv_drain_invoke_entry() never gets to run and the IO thread hangs in
>> co_schedule_bh_cb().
>>
>> This is because the main thread still occupies the AioContext after it
>> has been attached to the IO thread: the main thread polls in
>> bdrv_drained_end() waiting for in-flight IO requests to drain, but the
>> IO thread cannot acquire the AioContext, which leads to deadlock.
> 
> This shouldn't be the case:
> 
>   * The caller must own the AioContext lock for the old AioContext of bs, but it
>   * must not own the AioContext lock for new_context (unless new_context is the
>   * same as the current context of bs).
> 
> Then bdrv_set_aio_context_ignore() switches the AioContext of bs, and
> then calls bdrv_drained_end() while holding only the lock for the new
> context. AIO_WAIT_WHILE() will temporarily drop that lock, so aio_poll()
> should run without holding any AioContext locks.
> 
> If I'm not missing anything, the scenario you're seeing means that the
> caller already held a lock for the new AioContext, so that it's locked
> twice while AIO_WAIT_WHILE() drops the lock only once. This would be a
> bug in the caller because the documentation I quoted explicitly forbids
> holding the AioContext lock for the new context.
> 
>> The backtraces look like this:
>>
>> <------>
>> [Switching to thread 1 (Thread 0x7fd810bbef40 (LWP 533312))]
>> (gdb) bt
>> ...
>> 3  0x00005601f6ea93aa in fdmon_poll_wait at ../util/fdmon-poll.c:80
>> 4  0x00005601f6e81a1c in aio_poll at ../util/aio-posix.c:607
>> 5  0x00005601f6dcde87 in bdrv_drained_end at ../block/io.c:496
>> 6  0x00005601f6d798cd in bdrv_set_aio_context_ignore at ../block.c:6502
>> 7  0x00005601f6d7996c in bdrv_set_aio_context_ignore at ../block.c:6472
>> 8  0x00005601f6d79cb8 in bdrv_child_try_set_aio_context at ../block.c:6587
>> 9  0x00005601f6da86f2 in blk_do_set_aio_context at ../block/block-backend.c:2026
>> 10 0x00005601f6daa96d in blk_set_aio_context at ../block/block-backend.c:2047
>> 11 0x00005601f6c71883 in virtio_scsi_hotplug at ../hw/scsi/virtio-scsi.c:831
> 
> Which version of QEMU are you using? In current git master, line 831 is
> something entirely different.
> 
> Are you using something before commit c7040ff6? Because this is a commit
> that fixed a virtio-scsi bug where it would hold the wrong lock before
> calling blk_set_aio_context().
> 
>>
>> [Switching to thread 4 (Thread 0x7fd8092e7700 (LWP 533315))]
>> (gdb) bt
>> ...
>> 4  0x00005601f6eab6a8 in qemu_mutex_lock_impl at ../util/qemu-thread-posix.c:79
>> 5  0x00005601f6e7ce88 in co_schedule_bh_cb at ../util/async.c:489
>> 6  0x00005601f6e7c404 in aio_bh_poll at ../util/async.c:164
>> 7  0x00005601f6e81a46 in aio_poll at ../util/aio-posix.c:659
>> 8  0x00005601f6d5ccf3 in iothread_run at ../iothread.c:73
>> 9  0x00005601f6eab512 in qemu_thread_start at ../util/qemu-thread-posix.c:521
>> 10 0x00007fd80d7b84a4 in start_thread at pthread_create.c:456
>> 11 0x00007fd80d4fad0f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
>> (gdb) f 4
>> 4  0x00005601f6eab6a8 in qemu_mutex_lock_impl at ../util/qemu-thread-posix.c:79
>> (gdb) p *mutex
>> $2 = {lock = {__data = {__lock = 2, __count = 1, __owner = 533312, __nusers = 1,
>>        __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
>>        __size = "\002\000\000\000\001\000\000\000@#\b\000\001\000\000\000\001",
>>        '\000' <repeats 22 times>, __align = 4294967298}, initialized = true}
>> <------>
>>
>> Therefore, we should never poll anywhere in
>> bdrv_set_aio_context_ignore() when acquiring the new context. In fact,
>> commit e037c09c already elaborates on why we cannot poll in
>> bdrv_do_drained_end().
>>
>> Signed-off-by: Zhiyong Ye <yezhiyong@bytedance.com>
>> ---
>>   block.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/block.c b/block.c
>> index be083f389e..ebbea72d64 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -6846,6 +6846,7 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
>>       GSList *parents_to_process = NULL;
>>       GSList *entry;
>>       BdrvChild *child, *parent;
>> +    int drained_end_counter = 0;
>>   
>>       g_assert(qemu_get_current_aio_context() == qemu_get_aio_context());
>>   
>> @@ -6907,7 +6908,7 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
>>           aio_context_release(old_context);
>>       }
>>   
>> -    bdrv_drained_end(bs);
>> +    bdrv_drained_end_no_poll(bs, &drained_end_counter);
>>   
>>       if (qemu_get_aio_context() != old_context) {
>>           aio_context_acquire(old_context);
>> @@ -6915,6 +6916,7 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
>>       if (qemu_get_aio_context() != new_context) {
>>           aio_context_release(new_context);
>>       }
>> +    BDRV_POLL_WHILE(bs, qatomic_read(&drained_end_counter) > 0);
> 
> This would be wrong because bs is already in the new context, but you
> wouldn't hold the lock for it. AIO_WAIT_WHILE() would try to drop the
> lock for a context that isn't even locked, resulting in a crash.
> 
>>   }
>>   
>>   static bool bdrv_parent_can_set_aio_context(BdrvChild *c, AioContext *ctx,
> 
> Kevin
> 


