All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH for 2.9 1/1] block: add missed aio_context_acquire into release_drive
@ 2017-03-28 16:12 Denis V. Lunev
  2017-03-29  0:20 ` Fam Zheng
  2017-03-29 20:31 ` Max Reitz
  0 siblings, 2 replies; 3+ messages in thread
From: Denis V. Lunev @ 2017-03-28 16:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: Denis V. Lunev, Kevin Wolf, Max Reitz, Eric Blake, Markus Armbruster

Recently we expirience hang with iothreads enabled with the following
call trace:
Thread 1 (Thread 0x7fa95efebc80 (LWP 177117)):
0  ppoll () from /lib64/libc.so.6
2  qemu_poll_ns () at qemu-timer.c:313
3  aio_poll () at aio-posix.c:457
4  bdrv_flush () at block/io.c:2641
5  bdrv_close () at block.c:2143
6  bdrv_delete () at block.c:2352
7  bdrv_unref () at block.c:3429
8  blk_remove_bs () at block/block-backend.c:427
9  blk_delete () at block/block-backend.c:178
10 blk_unref () at block/block-backend.c:226
11 object_property_del_all () at qom/object.c:399
12 object_finalize () at qom/object.c:461
13 object_unref () at qom/object.c:898
14 object_property_del_child () at qom/object.c:422
15 qmp_marshal_device_del () at qmp-marshal.c:1145
16 handle_qmp_command () at /usr/src/debug/qemu-2.6.0/monitor.c:3929

Technically bdrv_flush() stucks in
    while (rwco.ret == NOT_DONE) {
        aio_poll(aio_context, true);
    }
but rwco.ret is equal to 0 thus we have missed wakeup. Code investigation
reveals that we do not have performed aio_context_acquire() on this call
stack.

This patch adds missed lock.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
---
 hw/core/qdev-properties-system.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
index c34be1c..e885e65 100644
--- a/hw/core/qdev-properties-system.c
+++ b/hw/core/qdev-properties-system.c
@@ -124,8 +124,12 @@ static void release_drive(Object *obj, const char *name, void *opaque)
     BlockBackend **ptr = qdev_get_prop_ptr(dev, prop);
 
     if (*ptr) {
+        AioContext *ctx = blk_get_aio_context(*ptr);
+
+        aio_context_acquire(ctx);
         blockdev_auto_del(*ptr);
         blk_detach_dev(*ptr, dev);
+        aio_context_release(ctx);
     }
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [Qemu-devel] [PATCH for 2.9 1/1] block: add missed aio_context_acquire into release_drive
  2017-03-28 16:12 [Qemu-devel] [PATCH for 2.9 1/1] block: add missed aio_context_acquire into release_drive Denis V. Lunev
@ 2017-03-29  0:20 ` Fam Zheng
  2017-03-29 20:31 ` Max Reitz
  1 sibling, 0 replies; 3+ messages in thread
From: Fam Zheng @ 2017-03-29  0:20 UTC (permalink / raw)
  To: Denis V. Lunev; +Cc: qemu-devel, Kevin Wolf, Markus Armbruster, Max Reitz

On Tue, 03/28 19:12, Denis V. Lunev wrote:
> Recently we expirience hang with iothreads enabled with the following
> call trace:
> Thread 1 (Thread 0x7fa95efebc80 (LWP 177117)):
> 0  ppoll () from /lib64/libc.so.6
> 2  qemu_poll_ns () at qemu-timer.c:313
> 3  aio_poll () at aio-posix.c:457
> 4  bdrv_flush () at block/io.c:2641
> 5  bdrv_close () at block.c:2143
> 6  bdrv_delete () at block.c:2352
> 7  bdrv_unref () at block.c:3429
> 8  blk_remove_bs () at block/block-backend.c:427
> 9  blk_delete () at block/block-backend.c:178
> 10 blk_unref () at block/block-backend.c:226
> 11 object_property_del_all () at qom/object.c:399
> 12 object_finalize () at qom/object.c:461
> 13 object_unref () at qom/object.c:898
> 14 object_property_del_child () at qom/object.c:422
> 15 qmp_marshal_device_del () at qmp-marshal.c:1145
> 16 handle_qmp_command () at /usr/src/debug/qemu-2.6.0/monitor.c:3929
> 
> Technically bdrv_flush() stucks in
>     while (rwco.ret == NOT_DONE) {
>         aio_poll(aio_context, true);
>     }
> but rwco.ret is equal to 0 thus we have missed wakeup. Code investigation
> reveals that we do not have performed aio_context_acquire() on this call
> stack.
> 
> This patch adds missed lock.
> 
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Kevin Wolf <kwolf@redhat.com>
> CC: Max Reitz <mreitz@redhat.com>
> CC: Eric Blake <eblake@redhat.com>
> CC: Markus Armbruster <armbru@redhat.com>

Nit: reading the subject I thought it's an unbalanced acquire/release, but it is
actually a missing pair.

In bdrv_unref we should have asserted we have acquired the AioContext, that way
you wouldn't have been bit by this bug.

Reviewed-by: Fam Zheng <famz@redhat.com>

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [Qemu-devel] [PATCH for 2.9 1/1] block: add missed aio_context_acquire into release_drive
  2017-03-28 16:12 [Qemu-devel] [PATCH for 2.9 1/1] block: add missed aio_context_acquire into release_drive Denis V. Lunev
  2017-03-29  0:20 ` Fam Zheng
@ 2017-03-29 20:31 ` Max Reitz
  1 sibling, 0 replies; 3+ messages in thread
From: Max Reitz @ 2017-03-29 20:31 UTC (permalink / raw)
  To: Denis V. Lunev, qemu-devel; +Cc: Kevin Wolf, Eric Blake, Markus Armbruster

[-- Attachment #1: Type: text/plain, Size: 1771 bytes --]

On 28.03.2017 18:12, Denis V. Lunev wrote:
> Recently we expirience hang with iothreads enabled with the following
> call trace:
> Thread 1 (Thread 0x7fa95efebc80 (LWP 177117)):
> 0  ppoll () from /lib64/libc.so.6
> 2  qemu_poll_ns () at qemu-timer.c:313
> 3  aio_poll () at aio-posix.c:457
> 4  bdrv_flush () at block/io.c:2641
> 5  bdrv_close () at block.c:2143
> 6  bdrv_delete () at block.c:2352
> 7  bdrv_unref () at block.c:3429
> 8  blk_remove_bs () at block/block-backend.c:427
> 9  blk_delete () at block/block-backend.c:178
> 10 blk_unref () at block/block-backend.c:226
> 11 object_property_del_all () at qom/object.c:399
> 12 object_finalize () at qom/object.c:461
> 13 object_unref () at qom/object.c:898
> 14 object_property_del_child () at qom/object.c:422
> 15 qmp_marshal_device_del () at qmp-marshal.c:1145
> 16 handle_qmp_command () at /usr/src/debug/qemu-2.6.0/monitor.c:3929
> 
> Technically bdrv_flush() stucks in
>     while (rwco.ret == NOT_DONE) {
>         aio_poll(aio_context, true);
>     }
> but rwco.ret is equal to 0 thus we have missed wakeup. Code investigation
> reveals that we do not have performed aio_context_acquire() on this call
> stack.
> 
> This patch adds missed lock.
> 
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Kevin Wolf <kwolf@redhat.com>
> CC: Max Reitz <mreitz@redhat.com>
> CC: Eric Blake <eblake@redhat.com>
> CC: Markus Armbruster <armbru@redhat.com>
> ---
>  hw/core/qdev-properties-system.c | 4 ++++
>  1 file changed, 4 insertions(+)

Since this file is unmaintained but this patch is mostly a matter of the
block layer (and the subject starts with "block" :-)):

Thanks, applied to my block tree:

https://github.com/xanClic/qemu/commits/block

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 512 bytes --]

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2017-03-29 20:31 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-28 16:12 [Qemu-devel] [PATCH for 2.9 1/1] block: add missed aio_context_acquire into release_drive Denis V. Lunev
2017-03-29  0:20 ` Fam Zheng
2017-03-29 20:31 ` Max Reitz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.