All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] block-backend: prevent dangling BDS pointers across aio_poll()
@ 2021-12-14 14:35 Stefan Hajnoczi
  2021-12-14 14:59 ` Kevin Wolf
  2022-01-10 18:57 ` Hanna Reitz
  0 siblings, 2 replies; 8+ messages in thread
From: Stefan Hajnoczi @ 2021-12-14 14:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-block,
	qemu-stable, Hanna Reitz, Stefan Hajnoczi

The BlockBackend root child can change when aio_poll() is invoked. This
happens when a temporary filter node is removed upon blockjob
completion, for example.

Functions in block/block-backend.c must be aware of this when using a
blk_bs() pointer across aio_poll() because the BlockDriverState refcnt
may reach 0, resulting in a stale pointer.

One example is scsi_device_purge_requests(), which calls blk_drain() to
wait for in-flight requests to cancel. If the backup blockjob is active,
then the BlockBackend root child is a temporary filter BDS owned by the
blockjob. The blockjob can complete during bdrv_drained_begin() and the
last reference to the BDS is released when the temporary filter node is
removed. This results in a use-after-free when blk_drain() calls
bdrv_drained_end(bs) on the dangling pointer.

Explicitly hold a reference to bs across block APIs that invoke
aio_poll().

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
v2:
- Audit block/block-backend.c and fix additional cases
---
 block/block-backend.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/block/block-backend.c b/block/block-backend.c
index 12ef80ea17..a40ad7fa92 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -828,10 +828,12 @@ void blk_remove_bs(BlockBackend *blk)
     notifier_list_notify(&blk->remove_bs_notifiers, blk);
     if (tgm->throttle_state) {
         bs = blk_bs(blk);
+        bdrv_ref(bs);
         bdrv_drained_begin(bs);
         throttle_group_detach_aio_context(tgm);
         throttle_group_attach_aio_context(tgm, qemu_get_aio_context());
         bdrv_drained_end(bs);
+        bdrv_unref(bs);
     }
 
     blk_update_root_state(blk);
@@ -1705,6 +1707,7 @@ void blk_drain(BlockBackend *blk)
     BlockDriverState *bs = blk_bs(blk);
 
     if (bs) {
+        bdrv_ref(bs);
         bdrv_drained_begin(bs);
     }
 
@@ -1714,6 +1717,7 @@ void blk_drain(BlockBackend *blk)
 
     if (bs) {
         bdrv_drained_end(bs);
+        bdrv_unref(bs);
     }
 }
 
@@ -2044,10 +2048,13 @@ static int blk_do_set_aio_context(BlockBackend *blk, AioContext *new_context,
     int ret;
 
     if (bs) {
+        bdrv_ref(bs);
+
         if (update_root_node) {
             ret = bdrv_child_try_set_aio_context(bs, new_context, blk->root,
                                                  errp);
             if (ret < 0) {
+                bdrv_unref(bs);
                 return ret;
             }
         }
@@ -2057,6 +2064,8 @@ static int blk_do_set_aio_context(BlockBackend *blk, AioContext *new_context,
             throttle_group_attach_aio_context(tgm, new_context);
             bdrv_drained_end(bs);
         }
+
+        bdrv_unref(bs);
     }
 
     blk->ctx = new_context;
@@ -2326,11 +2335,13 @@ void blk_io_limits_disable(BlockBackend *blk)
     ThrottleGroupMember *tgm = &blk->public.throttle_group_member;
     assert(tgm->throttle_state);
     if (bs) {
+        bdrv_ref(bs);
         bdrv_drained_begin(bs);
     }
     throttle_group_unregister_tgm(tgm);
     if (bs) {
         bdrv_drained_end(bs);
+        bdrv_unref(bs);
     }
 }
 
-- 
2.33.1



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] block-backend: prevent dangling BDS pointers across aio_poll()
  2021-12-14 14:35 [PATCH v2] block-backend: prevent dangling BDS pointers across aio_poll() Stefan Hajnoczi
@ 2021-12-14 14:59 ` Kevin Wolf
  2021-12-15 11:28   ` Stefan Hajnoczi
  2022-01-10 18:57 ` Hanna Reitz
  1 sibling, 1 reply; 8+ messages in thread
From: Kevin Wolf @ 2021-12-14 14:59 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Hanna Reitz, Vladimir Sementsov-Ogievskiy, qemu-devel,
	qemu-block, qemu-stable

Am 14.12.2021 um 15:35 hat Stefan Hajnoczi geschrieben:
> The BlockBackend root child can change when aio_poll() is invoked. This
> happens when a temporary filter node is removed upon blockjob
> completion, for example.
> 
> Functions in block/block-backend.c must be aware of this when using a
> blk_bs() pointer across aio_poll() because the BlockDriverState refcnt
> may reach 0, resulting in a stale pointer.
> 
> One example is scsi_device_purge_requests(), which calls blk_drain() to
> wait for in-flight requests to cancel. If the backup blockjob is active,
> then the BlockBackend root child is a temporary filter BDS owned by the
> blockjob. The blockjob can complete during bdrv_drained_begin() and the
> last reference to the BDS is released when the temporary filter node is
> removed. This results in a use-after-free when blk_drain() calls
> bdrv_drained_end(bs) on the dangling pointer.
> 
> Explicitly hold a reference to bs across block APIs that invoke
> aio_poll().
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
> v2:
> - Audit block/block-backend.c and fix additional cases
> ---
>  block/block-backend.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/block/block-backend.c b/block/block-backend.c
> index 12ef80ea17..a40ad7fa92 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -828,10 +828,12 @@ void blk_remove_bs(BlockBackend *blk)
>      notifier_list_notify(&blk->remove_bs_notifiers, blk);
>      if (tgm->throttle_state) {
>          bs = blk_bs(blk);
> +        bdrv_ref(bs);
>          bdrv_drained_begin(bs);
>          throttle_group_detach_aio_context(tgm);
>          throttle_group_attach_aio_context(tgm, qemu_get_aio_context());
>          bdrv_drained_end(bs);
> +        bdrv_unref(bs);
>      }
>  
>      blk_update_root_state(blk);

This hunk is unnecessary, we still hold a reference that is only given
up a few lines down with bdrv_root_unref_child(root).

The rest looks good to me, so without this hunk:

Reviewed-by: Kevin Wolf <kwolf@redhat.com>



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] block-backend: prevent dangling BDS pointers across aio_poll()
  2021-12-14 14:59 ` Kevin Wolf
@ 2021-12-15 11:28   ` Stefan Hajnoczi
  2021-12-15 15:31     ` Kevin Wolf
  0 siblings, 1 reply; 8+ messages in thread
From: Stefan Hajnoczi @ 2021-12-15 11:28 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Hanna Reitz, Vladimir Sementsov-Ogievskiy, qemu-devel,
	qemu-block, qemu-stable

[-- Attachment #1: Type: text/plain, Size: 2723 bytes --]

On Tue, Dec 14, 2021 at 03:59:49PM +0100, Kevin Wolf wrote:
> Am 14.12.2021 um 15:35 hat Stefan Hajnoczi geschrieben:
> > The BlockBackend root child can change when aio_poll() is invoked. This
> > happens when a temporary filter node is removed upon blockjob
> > completion, for example.
> > 
> > Functions in block/block-backend.c must be aware of this when using a
> > blk_bs() pointer across aio_poll() because the BlockDriverState refcnt
> > may reach 0, resulting in a stale pointer.
> > 
> > One example is scsi_device_purge_requests(), which calls blk_drain() to
> > wait for in-flight requests to cancel. If the backup blockjob is active,
> > then the BlockBackend root child is a temporary filter BDS owned by the
> > blockjob. The blockjob can complete during bdrv_drained_begin() and the
> > last reference to the BDS is released when the temporary filter node is
> > removed. This results in a use-after-free when blk_drain() calls
> > bdrv_drained_end(bs) on the dangling pointer.
> > 
> > Explicitly hold a reference to bs across block APIs that invoke
> > aio_poll().
> > 
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > ---
> > v2:
> > - Audit block/block-backend.c and fix additional cases
> > ---
> >  block/block-backend.c | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> > 
> > diff --git a/block/block-backend.c b/block/block-backend.c
> > index 12ef80ea17..a40ad7fa92 100644
> > --- a/block/block-backend.c
> > +++ b/block/block-backend.c
> > @@ -828,10 +828,12 @@ void blk_remove_bs(BlockBackend *blk)
> >      notifier_list_notify(&blk->remove_bs_notifiers, blk);
> >      if (tgm->throttle_state) {
> >          bs = blk_bs(blk);
> > +        bdrv_ref(bs);
> >          bdrv_drained_begin(bs);
> >          throttle_group_detach_aio_context(tgm);
> >          throttle_group_attach_aio_context(tgm, qemu_get_aio_context());
> >          bdrv_drained_end(bs);
> > +        bdrv_unref(bs);
> >      }
> >  
> >      blk_update_root_state(blk);
> 
> This hunk is unnecessary, we still hold a reference that is only given
> up a few lines down with bdrv_root_unref_child(root).

That's not the only place where the reference can be dropped:
bdrv_drop_filter() removes the filter node from the graph.

Here is a case where it happens: block/backup.c:backup_clean() ->
bdrv_cbw_drop() -> bdrv_drop_filter() -> bdrv_replace_node_common() ->
bdrv_replace_child_commit(). After we reach this bdrv_unref() is called
a few times and all references are dropped because the node is no longer
in the graph.

This happens during blk_remove_bs() -> bdrv_drained_begin(), so the bs
pointer in the above hunk can be stale.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] block-backend: prevent dangling BDS pointers across aio_poll()
  2021-12-15 11:28   ` Stefan Hajnoczi
@ 2021-12-15 15:31     ` Kevin Wolf
  2021-12-15 16:46       ` Stefan Hajnoczi
  2022-01-11 11:09       ` Stefan Hajnoczi
  0 siblings, 2 replies; 8+ messages in thread
From: Kevin Wolf @ 2021-12-15 15:31 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Hanna Reitz, Vladimir Sementsov-Ogievskiy, qemu-devel,
	qemu-block, qemu-stable

[-- Attachment #1: Type: text/plain, Size: 3447 bytes --]

Am 15.12.2021 um 12:28 hat Stefan Hajnoczi geschrieben:
> On Tue, Dec 14, 2021 at 03:59:49PM +0100, Kevin Wolf wrote:
> > Am 14.12.2021 um 15:35 hat Stefan Hajnoczi geschrieben:
> > > The BlockBackend root child can change when aio_poll() is invoked. This
> > > happens when a temporary filter node is removed upon blockjob
> > > completion, for example.
> > > 
> > > Functions in block/block-backend.c must be aware of this when using a
> > > blk_bs() pointer across aio_poll() because the BlockDriverState refcnt
> > > may reach 0, resulting in a stale pointer.
> > > 
> > > One example is scsi_device_purge_requests(), which calls blk_drain() to
> > > wait for in-flight requests to cancel. If the backup blockjob is active,
> > > then the BlockBackend root child is a temporary filter BDS owned by the
> > > blockjob. The blockjob can complete during bdrv_drained_begin() and the
> > > last reference to the BDS is released when the temporary filter node is
> > > removed. This results in a use-after-free when blk_drain() calls
> > > bdrv_drained_end(bs) on the dangling pointer.
> > > 
> > > Explicitly hold a reference to bs across block APIs that invoke
> > > aio_poll().
> > > 
> > > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > > ---
> > > v2:
> > > - Audit block/block-backend.c and fix additional cases
> > > ---
> > >  block/block-backend.c | 11 +++++++++++
> > >  1 file changed, 11 insertions(+)
> > > 
> > > diff --git a/block/block-backend.c b/block/block-backend.c
> > > index 12ef80ea17..a40ad7fa92 100644
> > > --- a/block/block-backend.c
> > > +++ b/block/block-backend.c
> > > @@ -828,10 +828,12 @@ void blk_remove_bs(BlockBackend *blk)
> > >      notifier_list_notify(&blk->remove_bs_notifiers, blk);
> > >      if (tgm->throttle_state) {
> > >          bs = blk_bs(blk);
> > > +        bdrv_ref(bs);
> > >          bdrv_drained_begin(bs);
> > >          throttle_group_detach_aio_context(tgm);
> > >          throttle_group_attach_aio_context(tgm, qemu_get_aio_context());
> > >          bdrv_drained_end(bs);
> > > +        bdrv_unref(bs);
> > >      }
> > >  
> > >      blk_update_root_state(blk);
> > 
> > This hunk is unnecessary, we still hold a reference that is only given
> > up a few lines down with bdrv_root_unref_child(root).
> 
> That's not the only place where the reference can be dropped:
> bdrv_drop_filter() removes the filter node from the graph.
> 
> Here is a case where it happens: block/backup.c:backup_clean() ->
> bdrv_cbw_drop() -> bdrv_drop_filter() -> bdrv_replace_node_common() ->
> bdrv_replace_child_commit(). After we reach this bdrv_unref() is called
> a few times and all references are dropped because the node is no longer
> in the graph.
> 
> This happens during blk_remove_bs() -> bdrv_drained_begin(), so the bs
> pointer in the above hunk can be stale.

Is the scenario that blk->root doesn't go away, but the node it
references changes? So the unref below will be for a different node than
we're draining here?

If so, let's add a comment that blk_bs(blk) can change after the drain,
and maybe move the BlockDriverState *bs declaration inside the if block
because the variable is invalid after it anyway.

Can it also happen that instead of removing a node from the chain, a new
one is inserted and we actually end up having drained the wrong node
before switching the context for tgm?

Kevin

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] block-backend: prevent dangling BDS pointers across aio_poll()
  2021-12-15 15:31     ` Kevin Wolf
@ 2021-12-15 16:46       ` Stefan Hajnoczi
  2022-01-11 11:09       ` Stefan Hajnoczi
  1 sibling, 0 replies; 8+ messages in thread
From: Stefan Hajnoczi @ 2021-12-15 16:46 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Hanna Reitz, Vladimir Sementsov-Ogievskiy, qemu-devel,
	qemu-block, qemu-stable

[-- Attachment #1: Type: text/plain, Size: 3830 bytes --]

On Wed, Dec 15, 2021 at 04:31:26PM +0100, Kevin Wolf wrote:
> Am 15.12.2021 um 12:28 hat Stefan Hajnoczi geschrieben:
> > On Tue, Dec 14, 2021 at 03:59:49PM +0100, Kevin Wolf wrote:
> > > Am 14.12.2021 um 15:35 hat Stefan Hajnoczi geschrieben:
> > > > The BlockBackend root child can change when aio_poll() is invoked. This
> > > > happens when a temporary filter node is removed upon blockjob
> > > > completion, for example.
> > > > 
> > > > Functions in block/block-backend.c must be aware of this when using a
> > > > blk_bs() pointer across aio_poll() because the BlockDriverState refcnt
> > > > may reach 0, resulting in a stale pointer.
> > > > 
> > > > One example is scsi_device_purge_requests(), which calls blk_drain() to
> > > > wait for in-flight requests to cancel. If the backup blockjob is active,
> > > > then the BlockBackend root child is a temporary filter BDS owned by the
> > > > blockjob. The blockjob can complete during bdrv_drained_begin() and the
> > > > last reference to the BDS is released when the temporary filter node is
> > > > removed. This results in a use-after-free when blk_drain() calls
> > > > bdrv_drained_end(bs) on the dangling pointer.
> > > > 
> > > > Explicitly hold a reference to bs across block APIs that invoke
> > > > aio_poll().
> > > > 
> > > > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > > > ---
> > > > v2:
> > > > - Audit block/block-backend.c and fix additional cases
> > > > ---
> > > >  block/block-backend.c | 11 +++++++++++
> > > >  1 file changed, 11 insertions(+)
> > > > 
> > > > diff --git a/block/block-backend.c b/block/block-backend.c
> > > > index 12ef80ea17..a40ad7fa92 100644
> > > > --- a/block/block-backend.c
> > > > +++ b/block/block-backend.c
> > > > @@ -828,10 +828,12 @@ void blk_remove_bs(BlockBackend *blk)
> > > >      notifier_list_notify(&blk->remove_bs_notifiers, blk);
> > > >      if (tgm->throttle_state) {
> > > >          bs = blk_bs(blk);
> > > > +        bdrv_ref(bs);
> > > >          bdrv_drained_begin(bs);
> > > >          throttle_group_detach_aio_context(tgm);
> > > >          throttle_group_attach_aio_context(tgm, qemu_get_aio_context());
> > > >          bdrv_drained_end(bs);
> > > > +        bdrv_unref(bs);
> > > >      }
> > > >  
> > > >      blk_update_root_state(blk);
> > > 
> > > This hunk is unnecessary, we still hold a reference that is only given
> > > up a few lines down with bdrv_root_unref_child(root).
> > 
> > That's not the only place where the reference can be dropped:
> > bdrv_drop_filter() removes the filter node from the graph.
> > 
> > Here is a case where it happens: block/backup.c:backup_clean() ->
> > bdrv_cbw_drop() -> bdrv_drop_filter() -> bdrv_replace_node_common() ->
> > bdrv_replace_child_commit(). After we reach this bdrv_unref() is called
> > a few times and all references are dropped because the node is no longer
> > in the graph.
> > 
> > This happens during blk_remove_bs() -> bdrv_drained_begin(), so the bs
> > pointer in the above hunk can be stale.
> 
> Is the scenario that blk->root doesn't go away, but the node it
> references changes? So the unref below will be for a different node than
> we're draining here?

Yes, exactly.

> If so, let's add a comment that blk_bs(blk) can change after the drain,
> and maybe move the BlockDriverState *bs declaration inside the if block
> because the variable is invalid after it anyway.

Will fix.

> Can it also happen that instead of removing a node from the chain, a new
> one is inserted and we actually end up having drained the wrong node
> before switching the context for tgm?

I'll investigate that. There is already some level of support for
draining new nodes but I'm not sure it covers all insert and replace
cases.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] block-backend: prevent dangling BDS pointers across aio_poll()
  2021-12-14 14:35 [PATCH v2] block-backend: prevent dangling BDS pointers across aio_poll() Stefan Hajnoczi
  2021-12-14 14:59 ` Kevin Wolf
@ 2022-01-10 18:57 ` Hanna Reitz
  2022-01-11 15:36   ` Stefan Hajnoczi
  1 sibling, 1 reply; 8+ messages in thread
From: Hanna Reitz @ 2022-01-10 18:57 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-stable, qemu-block

On 14.12.21 15:35, Stefan Hajnoczi wrote:
> The BlockBackend root child can change when aio_poll() is invoked. This
> happens when a temporary filter node is removed upon blockjob
> completion, for example.
>
> Functions in block/block-backend.c must be aware of this when using a
> blk_bs() pointer across aio_poll() because the BlockDriverState refcnt
> may reach 0, resulting in a stale pointer.
>
> One example is scsi_device_purge_requests(), which calls blk_drain() to
> wait for in-flight requests to cancel. If the backup blockjob is active,
> then the BlockBackend root child is a temporary filter BDS owned by the
> blockjob. The blockjob can complete during bdrv_drained_begin() and the
> last reference to the BDS is released when the temporary filter node is
> removed. This results in a use-after-free when blk_drain() calls
> bdrv_drained_end(bs) on the dangling pointer.

By the way, I have a BZ for this, though it’s about block-stream instead 
of backup (https://bugzilla.redhat.com/show_bug.cgi?id=2036178).  But 
I’m happy to report your patch seems* to fix that problem, too!  (Thanks 
for fixing my BZs! :))

*I’ve written a reproducer in iotest form 
(https://gitlab.com/hreitz/qemu/-/blob/stefans-fix-and-a-test/tests/qemu-iotests/tests/stream-error-on-reset), 
and so far I can only assume it indeed reproduces the report, but I 
found that iotest to indeed be fixed by this patch.  (Which made me very 
happy.)

Hanna

> Explicitly hold a reference to bs across block APIs that invoke
> aio_poll().
>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
> v2:
> - Audit block/block-backend.c and fix additional cases
> ---
>   block/block-backend.c | 11 +++++++++++
>   1 file changed, 11 insertions(+)
>
> diff --git a/block/block-backend.c b/block/block-backend.c
> index 12ef80ea17..a40ad7fa92 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -828,10 +828,12 @@ void blk_remove_bs(BlockBackend *blk)
>       notifier_list_notify(&blk->remove_bs_notifiers, blk);
>       if (tgm->throttle_state) {
>           bs = blk_bs(blk);
> +        bdrv_ref(bs);
>           bdrv_drained_begin(bs);
>           throttle_group_detach_aio_context(tgm);
>           throttle_group_attach_aio_context(tgm, qemu_get_aio_context());
>           bdrv_drained_end(bs);
> +        bdrv_unref(bs);
>       }
>   
>       blk_update_root_state(blk);
> @@ -1705,6 +1707,7 @@ void blk_drain(BlockBackend *blk)
>       BlockDriverState *bs = blk_bs(blk);
>   
>       if (bs) {
> +        bdrv_ref(bs);
>           bdrv_drained_begin(bs);
>       }
>   
> @@ -1714,6 +1717,7 @@ void blk_drain(BlockBackend *blk)
>   
>       if (bs) {
>           bdrv_drained_end(bs);
> +        bdrv_unref(bs);
>       }
>   }
>   
> @@ -2044,10 +2048,13 @@ static int blk_do_set_aio_context(BlockBackend *blk, AioContext *new_context,
>       int ret;
>   
>       if (bs) {
> +        bdrv_ref(bs);
> +
>           if (update_root_node) {
>               ret = bdrv_child_try_set_aio_context(bs, new_context, blk->root,
>                                                    errp);
>               if (ret < 0) {
> +                bdrv_unref(bs);
>                   return ret;
>               }
>           }
> @@ -2057,6 +2064,8 @@ static int blk_do_set_aio_context(BlockBackend *blk, AioContext *new_context,
>               throttle_group_attach_aio_context(tgm, new_context);
>               bdrv_drained_end(bs);
>           }
> +
> +        bdrv_unref(bs);
>       }
>   
>       blk->ctx = new_context;
> @@ -2326,11 +2335,13 @@ void blk_io_limits_disable(BlockBackend *blk)
>       ThrottleGroupMember *tgm = &blk->public.throttle_group_member;
>       assert(tgm->throttle_state);
>       if (bs) {
> +        bdrv_ref(bs);
>           bdrv_drained_begin(bs);
>       }
>       throttle_group_unregister_tgm(tgm);
>       if (bs) {
>           bdrv_drained_end(bs);
> +        bdrv_unref(bs);
>       }
>   }
>   



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] block-backend: prevent dangling BDS pointers across aio_poll()
  2021-12-15 15:31     ` Kevin Wolf
  2021-12-15 16:46       ` Stefan Hajnoczi
@ 2022-01-11 11:09       ` Stefan Hajnoczi
  1 sibling, 0 replies; 8+ messages in thread
From: Stefan Hajnoczi @ 2022-01-11 11:09 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Hanna Reitz, Vladimir Sementsov-Ogievskiy, qemu-devel,
	qemu-block, qemu-stable

[-- Attachment #1: Type: text/plain, Size: 3969 bytes --]

On Wed, Dec 15, 2021 at 04:31:26PM +0100, Kevin Wolf wrote:
> Am 15.12.2021 um 12:28 hat Stefan Hajnoczi geschrieben:
> > On Tue, Dec 14, 2021 at 03:59:49PM +0100, Kevin Wolf wrote:
> > > Am 14.12.2021 um 15:35 hat Stefan Hajnoczi geschrieben:
> > > > The BlockBackend root child can change when aio_poll() is invoked. This
> > > > happens when a temporary filter node is removed upon blockjob
> > > > completion, for example.
> > > > 
> > > > Functions in block/block-backend.c must be aware of this when using a
> > > > blk_bs() pointer across aio_poll() because the BlockDriverState refcnt
> > > > may reach 0, resulting in a stale pointer.
> > > > 
> > > > One example is scsi_device_purge_requests(), which calls blk_drain() to
> > > > wait for in-flight requests to cancel. If the backup blockjob is active,
> > > > then the BlockBackend root child is a temporary filter BDS owned by the
> > > > blockjob. The blockjob can complete during bdrv_drained_begin() and the
> > > > last reference to the BDS is released when the temporary filter node is
> > > > removed. This results in a use-after-free when blk_drain() calls
> > > > bdrv_drained_end(bs) on the dangling pointer.
> > > > 
> > > > Explicitly hold a reference to bs across block APIs that invoke
> > > > aio_poll().
> > > > 
> > > > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > > > ---
> > > > v2:
> > > > - Audit block/block-backend.c and fix additional cases
> > > > ---
> > > >  block/block-backend.c | 11 +++++++++++
> > > >  1 file changed, 11 insertions(+)
> > > > 
> > > > diff --git a/block/block-backend.c b/block/block-backend.c
> > > > index 12ef80ea17..a40ad7fa92 100644
> > > > --- a/block/block-backend.c
> > > > +++ b/block/block-backend.c
> > > > @@ -828,10 +828,12 @@ void blk_remove_bs(BlockBackend *blk)
> > > >      notifier_list_notify(&blk->remove_bs_notifiers, blk);
> > > >      if (tgm->throttle_state) {
> > > >          bs = blk_bs(blk);
> > > > +        bdrv_ref(bs);
> > > >          bdrv_drained_begin(bs);
> > > >          throttle_group_detach_aio_context(tgm);
> > > >          throttle_group_attach_aio_context(tgm, qemu_get_aio_context());
> > > >          bdrv_drained_end(bs);
> > > > +        bdrv_unref(bs);
> > > >      }
> > > >  
> > > >      blk_update_root_state(blk);
> > > 
> > > This hunk is unnecessary, we still hold a reference that is only given
> > > up a few lines down with bdrv_root_unref_child(root).
> > 
> > That's not the only place where the reference can be dropped:
> > bdrv_drop_filter() removes the filter node from the graph.
> > 
> > Here is a case where it happens: block/backup.c:backup_clean() ->
> > bdrv_cbw_drop() -> bdrv_drop_filter() -> bdrv_replace_node_common() ->
> > bdrv_replace_child_commit(). After we reach this bdrv_unref() is called
> > a few times and all references are dropped because the node is no longer
> > in the graph.
> > 
> > This happens during blk_remove_bs() -> bdrv_drained_begin(), so the bs
> > pointer in the above hunk can be stale.
> 
> Is the scenario that blk->root doesn't go away, but the node it
> references changes?

Yes, blk->root remains non-NULL but its value changes across aio_poll().

> So the unref below will be for a different node than
> we're draining here?

Yes.

> If so, let's add a comment that blk_bs(blk) can change after the drain,
> and maybe move the BlockDriverState *bs declaration inside the if block
> because the variable is invalid after it anyway.

Sounds good.

> Can it also happen that instead of removing a node from the chain, a new
> one is inserted and we actually end up having drained the wrong node
> before switching the context for tgm?

I think a new node can theoretically be inserted. I don't know if that
can happen in practice since insertions typically happen due to monitor
commands and monitor commands don't execute during aio_poll().

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] block-backend: prevent dangling BDS pointers across aio_poll()
  2022-01-10 18:57 ` Hanna Reitz
@ 2022-01-11 15:36   ` Stefan Hajnoczi
  0 siblings, 0 replies; 8+ messages in thread
From: Stefan Hajnoczi @ 2022-01-11 15:36 UTC (permalink / raw)
  To: Hanna Reitz
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, qemu-block,
	qemu-stable

[-- Attachment #1: Type: text/plain, Size: 1651 bytes --]

On Mon, Jan 10, 2022 at 07:57:05PM +0100, Hanna Reitz wrote:
> On 14.12.21 15:35, Stefan Hajnoczi wrote:
> > The BlockBackend root child can change when aio_poll() is invoked. This
> > happens when a temporary filter node is removed upon blockjob
> > completion, for example.
> > 
> > Functions in block/block-backend.c must be aware of this when using a
> > blk_bs() pointer across aio_poll() because the BlockDriverState refcnt
> > may reach 0, resulting in a stale pointer.
> > 
> > One example is scsi_device_purge_requests(), which calls blk_drain() to
> > wait for in-flight requests to cancel. If the backup blockjob is active,
> > then the BlockBackend root child is a temporary filter BDS owned by the
> > blockjob. The blockjob can complete during bdrv_drained_begin() and the
> > last reference to the BDS is released when the temporary filter node is
> > removed. This results in a use-after-free when blk_drain() calls
> > bdrv_drained_end(bs) on the dangling pointer.
> 
> By the way, I have a BZ for this, though it’s about block-stream instead of
> backup (https://bugzilla.redhat.com/show_bug.cgi?id=2036178).  But I’m happy
> to report your patch seems* to fix that problem, too!  (Thanks for fixing my
> BZs! :))
> 
> *I’ve written a reproducer in iotest form (https://gitlab.com/hreitz/qemu/-/blob/stefans-fix-and-a-test/tests/qemu-iotests/tests/stream-error-on-reset),
> and so far I can only assume it indeed reproduces the report, but I found
> that iotest to indeed be fixed by this patch.  (Which made me very happy.)

Great, I have merged your test case and sent a v3.

Thanks,
Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-01-11 15:43 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-14 14:35 [PATCH v2] block-backend: prevent dangling BDS pointers across aio_poll() Stefan Hajnoczi
2021-12-14 14:59 ` Kevin Wolf
2021-12-15 11:28   ` Stefan Hajnoczi
2021-12-15 15:31     ` Kevin Wolf
2021-12-15 16:46       ` Stefan Hajnoczi
2022-01-11 11:09       ` Stefan Hajnoczi
2022-01-10 18:57 ` Hanna Reitz
2022-01-11 15:36   ` Stefan Hajnoczi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.