All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH] block: skip initializer BDS on recursive aio context attachment/detachment
@ 2019-01-24  7:48 Denis Plotnikov
  2019-01-24  9:34 ` Vladimir Sementsov-Ogievskiy
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Denis Plotnikov @ 2019-01-24  7:48 UTC (permalink / raw)
  To: mreitz, kwolf; +Cc: qemu-block, qemu-devel, den, vsementsov

When there is a Backup Block Job running and shutdown command is sent to
a guest, the guest crushes due to assert(!bs->walking_aio_notifiers).

Call stack:

0  __GI_raise
1  __GI_abort
2  __assert_fail_base
3  __GI___assert_fail
4  bdrv_detach_aio_context (bs=0x55f54d65c000)      <<<
5  bdrv_detach_aio_context (bs=0x55f54fc8a800)
6  bdrv_set_aio_context (bs=0x55f54fc8a800, ...)
7  block_job_attached_aio_context
8  bdrv_attach_aio_context (bs=0x55f54d65c000, ...) <<<
9  bdrv_set_aio_context (bs=0x55f54d65c000)
10 blk_set_aio_context
11 virtio_blk_data_plane_stop
12 virtio_bus_stop_ioeventfd
13 virtio_vmstate_change
14 vm_state_notify (running=0, state=RUN_STATE_SHUTDOWN)
15 do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=true)
16 vm_stop (state=RUN_STATE_SHUTDOWN)
17 main_loop_should_exit
18 main_loop
19 main

This happens because of "new" context attachment to VM disk bds.
When attaching a new context the corresponding aio context handler is
called for each of aio_notifiers registered on the VM disk bds context.
Among those handlers there is the block_job_attached_aio_context handler
which sets a new aio context for the block job bds. When doing so,
the old context is detached from all the block job bds children and one of
them is the VM disk bds, serving as backing store for the blockjob bds,
although the VM disk bds is actually the initializer of that process.
Since the VM disk bds is protected with walking_aio_notifiers flag
from double processing in recursive calls, the assert fires.

The patch fixes the problem by skipping the bds-es in recursive calls
which have started attachment/detachment already.

Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
---
 block.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index 9b1c5c5eaf..35a19fe1df 100644
--- a/block.c
+++ b/block.c
@@ -5064,7 +5064,9 @@ void bdrv_detach_aio_context(BlockDriverState *bs)
         bs->drv->bdrv_detach_aio_context(bs);
     }
     QLIST_FOREACH(child, &bs->children, next) {
-        bdrv_detach_aio_context(child->bs);
+        if (!child->bs->walking_aio_notifiers) {
+            bdrv_detach_aio_context(child->bs);
+        }
     }
 
     bs->aio_context = NULL;
@@ -5083,7 +5085,9 @@ void bdrv_attach_aio_context(BlockDriverState *bs,
     bs->aio_context = new_context;
 
     QLIST_FOREACH(child, &bs->children, next) {
-        bdrv_attach_aio_context(child->bs, new_context);
+        if (!child->bs->walking_aio_notifiers) {
+            bdrv_attach_aio_context(child->bs, new_context);
+        }
     }
     if (bs->drv->bdrv_attach_aio_context) {
         bs->drv->bdrv_attach_aio_context(bs, new_context);
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [Qemu-devel] [PATCH] block: skip initializer BDS on recursive aio context attachment/detachment
  2019-01-24  7:48 [Qemu-devel] [PATCH] block: skip initializer BDS on recursive aio context attachment/detachment Denis Plotnikov
@ 2019-01-24  9:34 ` Vladimir Sementsov-Ogievskiy
  2019-02-08  8:10 ` Denis Plotnikov
  2019-02-08 11:03 ` Kevin Wolf
  2 siblings, 0 replies; 5+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-01-24  9:34 UTC (permalink / raw)
  To: Denis Plotnikov, mreitz, kwolf; +Cc: qemu-block, qemu-devel, Denis Lunev

24.01.2019 10:48, Denis Plotnikov wrote:
> When there is a Backup Block Job running and shutdown command is sent to
> a guest, the guest crushes due to assert(!bs->walking_aio_notifiers).

Clarification: not ordinary backup, but fleecing scheme: backup with sync=none,
when source is a backing for target.


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Qemu-devel] [PATCH] block: skip initializer BDS on recursive aio context attachment/detachment
  2019-01-24  7:48 [Qemu-devel] [PATCH] block: skip initializer BDS on recursive aio context attachment/detachment Denis Plotnikov
  2019-01-24  9:34 ` Vladimir Sementsov-Ogievskiy
@ 2019-02-08  8:10 ` Denis Plotnikov
  2019-02-08 11:03 ` Kevin Wolf
  2 siblings, 0 replies; 5+ messages in thread
From: Denis Plotnikov @ 2019-02-08  8:10 UTC (permalink / raw)
  To: mreitz, kwolf
  Cc: Vladimir Sementsov-Ogievskiy, qemu-devel, qemu-block, Denis Lunev

ping ping!

On 24.01.2019 10:48, Denis Plotnikov wrote:
> When there is a Backup Block Job running and shutdown command is sent to
> a guest, the guest crushes due to assert(!bs->walking_aio_notifiers).
> 
> Call stack:
> 
> 0  __GI_raise
> 1  __GI_abort
> 2  __assert_fail_base
> 3  __GI___assert_fail
> 4  bdrv_detach_aio_context (bs=0x55f54d65c000)      <<<
> 5  bdrv_detach_aio_context (bs=0x55f54fc8a800)
> 6  bdrv_set_aio_context (bs=0x55f54fc8a800, ...)
> 7  block_job_attached_aio_context
> 8  bdrv_attach_aio_context (bs=0x55f54d65c000, ...) <<<
> 9  bdrv_set_aio_context (bs=0x55f54d65c000)
> 10 blk_set_aio_context
> 11 virtio_blk_data_plane_stop
> 12 virtio_bus_stop_ioeventfd
> 13 virtio_vmstate_change
> 14 vm_state_notify (running=0, state=RUN_STATE_SHUTDOWN)
> 15 do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=true)
> 16 vm_stop (state=RUN_STATE_SHUTDOWN)
> 17 main_loop_should_exit
> 18 main_loop
> 19 main
> 
> This happens because of "new" context attachment to VM disk bds.
> When attaching a new context the corresponding aio context handler is
> called for each of aio_notifiers registered on the VM disk bds context.
> Among those handlers there is the block_job_attached_aio_context handler
> which sets a new aio context for the block job bds. When doing so,
> the old context is detached from all the block job bds children and one of
> them is the VM disk bds, serving as backing store for the blockjob bds,
> although the VM disk bds is actually the initializer of that process.
> Since the VM disk bds is protected with walking_aio_notifiers flag
> from double processing in recursive calls, the assert fires.
> 
> The patch fixes the problem by skipping the bds-es in recursive calls
> which have started attachment/detachment already.
> 
> Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
> ---
>   block.c | 8 ++++++--
>   1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 9b1c5c5eaf..35a19fe1df 100644
> --- a/block.c
> +++ b/block.c
> @@ -5064,7 +5064,9 @@ void bdrv_detach_aio_context(BlockDriverState *bs)
>           bs->drv->bdrv_detach_aio_context(bs);
>       }
>       QLIST_FOREACH(child, &bs->children, next) {
> -        bdrv_detach_aio_context(child->bs);
> +        if (!child->bs->walking_aio_notifiers) {
> +            bdrv_detach_aio_context(child->bs);
> +        }
>       }
>   
>       bs->aio_context = NULL;
> @@ -5083,7 +5085,9 @@ void bdrv_attach_aio_context(BlockDriverState *bs,
>       bs->aio_context = new_context;
>   
>       QLIST_FOREACH(child, &bs->children, next) {
> -        bdrv_attach_aio_context(child->bs, new_context);
> +        if (!child->bs->walking_aio_notifiers) {
> +            bdrv_attach_aio_context(child->bs, new_context);
> +        }
>       }
>       if (bs->drv->bdrv_attach_aio_context) {
>           bs->drv->bdrv_attach_aio_context(bs, new_context);
> 

-- 
Best,
Denis

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Qemu-devel] [PATCH] block: skip initializer BDS on recursive aio context attachment/detachment
  2019-01-24  7:48 [Qemu-devel] [PATCH] block: skip initializer BDS on recursive aio context attachment/detachment Denis Plotnikov
  2019-01-24  9:34 ` Vladimir Sementsov-Ogievskiy
  2019-02-08  8:10 ` Denis Plotnikov
@ 2019-02-08 11:03 ` Kevin Wolf
  2019-02-15  8:37   ` Denis Plotnikov
  2 siblings, 1 reply; 5+ messages in thread
From: Kevin Wolf @ 2019-02-08 11:03 UTC (permalink / raw)
  To: Denis Plotnikov; +Cc: mreitz, qemu-block, qemu-devel, den, vsementsov, berto

Am 24.01.2019 um 08:48 hat Denis Plotnikov geschrieben:
> When there is a Backup Block Job running and shutdown command is sent to
> a guest, the guest crushes due to assert(!bs->walking_aio_notifiers).
> 
> Call stack:
> 
> 0  __GI_raise
> 1  __GI_abort
> 2  __assert_fail_base
> 3  __GI___assert_fail
> 4  bdrv_detach_aio_context (bs=0x55f54d65c000)      <<<
> 5  bdrv_detach_aio_context (bs=0x55f54fc8a800)
> 6  bdrv_set_aio_context (bs=0x55f54fc8a800, ...)
> 7  block_job_attached_aio_context
> 8  bdrv_attach_aio_context (bs=0x55f54d65c000, ...) <<<
> 9  bdrv_set_aio_context (bs=0x55f54d65c000)
> 10 blk_set_aio_context
> 11 virtio_blk_data_plane_stop
> 12 virtio_bus_stop_ioeventfd
> 13 virtio_vmstate_change
> 14 vm_state_notify (running=0, state=RUN_STATE_SHUTDOWN)
> 15 do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=true)
> 16 vm_stop (state=RUN_STATE_SHUTDOWN)
> 17 main_loop_should_exit
> 18 main_loop
> 19 main
> 
> This happens because of "new" context attachment to VM disk bds.
> When attaching a new context the corresponding aio context handler is
> called for each of aio_notifiers registered on the VM disk bds context.
> Among those handlers there is the block_job_attached_aio_context handler
> which sets a new aio context for the block job bds. When doing so,
> the old context is detached from all the block job bds children and one of
> them is the VM disk bds, serving as backing store for the blockjob bds,
> although the VM disk bds is actually the initializer of that process.
> Since the VM disk bds is protected with walking_aio_notifiers flag
> from double processing in recursive calls, the assert fires.
> 
> The patch fixes the problem by skipping the bds-es in recursive calls
> which have started attachment/detachment already.
> 
> Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>

I'm not sure if this fix is actually correct in more than just your
special case. It assumes without checking that the nested attach is for
the same context as the outer attach. (Of course, I'm not sure what you
would do if they were for different contexts! It's a case to avoid, and
it should fail very visibly instead of moving nodes to unexpected
contexts.)

bdrv_set_aio_context() is pretty messy, so it may still turn out that
leaving the greater problem unadressed and just fixing what breaks in
practice is what we need to do.

But did you consider just forbidding nested bdrv_set_aio_context()
instead? Essentially, instead of your checks you would assert
!walking_aio_notifiers. This would simplify the interface instead of
adding more complexity.

If I understand the code correctly, at the time of the nested call, the
respective nodes have already been moved to the new context, so avoiding
the nested calls might be as easy as adding a shortcut to
bdrv_set_aio_context() for this case:

    if (ctx == new_context) {
        return;
    }

Most importantly, though, please write a test case for this. We will
probably need to rework bdrv_set_aio_context() fundamentally (as Berto
recently noticed), and we will want to avoid regressing on this bug.

Kevin

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Qemu-devel] [PATCH] block: skip initializer BDS on recursive aio context attachment/detachment
  2019-02-08 11:03 ` Kevin Wolf
@ 2019-02-15  8:37   ` Denis Plotnikov
  0 siblings, 0 replies; 5+ messages in thread
From: Denis Plotnikov @ 2019-02-15  8:37 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: mreitz, qemu-block, qemu-devel, Denis Lunev,
	Vladimir Sementsov-Ogievskiy, berto



On 08.02.2019 14:03, Kevin Wolf wrote:
> Am 24.01.2019 um 08:48 hat Denis Plotnikov geschrieben:
>> When there is a Backup Block Job running and shutdown command is sent to
>> a guest, the guest crushes due to assert(!bs->walking_aio_notifiers).
>>
>> Call stack:
>>
>> 0  __GI_raise
>> 1  __GI_abort
>> 2  __assert_fail_base
>> 3  __GI___assert_fail
>> 4  bdrv_detach_aio_context (bs=0x55f54d65c000)      <<<
>> 5  bdrv_detach_aio_context (bs=0x55f54fc8a800)
>> 6  bdrv_set_aio_context (bs=0x55f54fc8a800, ...)
>> 7  block_job_attached_aio_context
>> 8  bdrv_attach_aio_context (bs=0x55f54d65c000, ...) <<<
>> 9  bdrv_set_aio_context (bs=0x55f54d65c000)
>> 10 blk_set_aio_context
>> 11 virtio_blk_data_plane_stop
>> 12 virtio_bus_stop_ioeventfd
>> 13 virtio_vmstate_change
>> 14 vm_state_notify (running=0, state=RUN_STATE_SHUTDOWN)
>> 15 do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=true)
>> 16 vm_stop (state=RUN_STATE_SHUTDOWN)
>> 17 main_loop_should_exit
>> 18 main_loop
>> 19 main
>>
>> This happens because of "new" context attachment to VM disk bds.
>> When attaching a new context the corresponding aio context handler is
>> called for each of aio_notifiers registered on the VM disk bds context.
>> Among those handlers there is the block_job_attached_aio_context handler
>> which sets a new aio context for the block job bds. When doing so,
>> the old context is detached from all the block job bds children and one of
>> them is the VM disk bds, serving as backing store for the blockjob bds,
>> although the VM disk bds is actually the initializer of that process.
>> Since the VM disk bds is protected with walking_aio_notifiers flag
>> from double processing in recursive calls, the assert fires.
>>
>> The patch fixes the problem by skipping the bds-es in recursive calls
>> which have started attachment/detachment already.
>>
>> Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
> 
> I'm not sure if this fix is actually correct in more than just your
> special case. It assumes without checking that the nested attach is for
> the same context as the outer attach. (Of course, I'm not sure what you
> would do if they were for different contexts! It's a case to avoid, and
> it should fail very visibly instead of moving nodes to unexpected
> contexts.)
> 
> bdrv_set_aio_context() is pretty messy, so it may still turn out that
> leaving the greater problem unadressed and just fixing what breaks in
> practice is what we need to do.
> 
> But did you consider just forbidding nested bdrv_set_aio_context()
> instead? Essentially, instead of your checks you would assert
> !walking_aio_notifiers. This would simplify the interface instead of
> adding more complexity.
> 
> If I understand the code correctly, at the time of the nested call, the
> respective nodes have already been moved to the new context, so avoiding
> the nested calls might be as easy as adding a shortcut to
> bdrv_set_aio_context() for this case:
> 
>      if (ctx == new_context) {
>          return;
>      }
> 
Yes, this is probably the best solution. I'll send the patch shortly.
> Most importantly, though, please write a test case for this. We will
> probably need to rework bdrv_set_aio_context() fundamentally (as Berto
> recently noticed), and we will want to avoid regressing on this bug.
> 
Will do
> Kevin
> 

Denis

-- 
Best,
Denis

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-02-15  8:40 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-24  7:48 [Qemu-devel] [PATCH] block: skip initializer BDS on recursive aio context attachment/detachment Denis Plotnikov
2019-01-24  9:34 ` Vladimir Sementsov-Ogievskiy
2019-02-08  8:10 ` Denis Plotnikov
2019-02-08 11:03 ` Kevin Wolf
2019-02-15  8:37   ` Denis Plotnikov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.