All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] backup-top: Don't crash on post-finalize accesses
@ 2021-02-19 15:33 Max Reitz
  2021-02-19 15:33 ` [PATCH 1/3] backup: Remove nodes from job in .clean() Max Reitz
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Max Reitz @ 2021-02-19 15:33 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, John Snow, qemu-devel,
	Max Reitz

Hi,

After job-finalize, the backup-top node generally stays around.  That’s
quite a problem, because its BlockCopyState is already freed then, and
it has no filtered child.  We really want the node to be gone.

The only reference that realistically can keep it alive is that of the
backup job (though block_job_add_bdrv() called by block_job_create()).
Dropping that reference before bdrv_backup_top_drop() should[1] ensure
bdrv_backup_top_drop() will delete the node.

[1]: bdrv_backup_top_drop() replaces the backup-top node by its filtered
     child, which detaches all parents from backup-top but the ones with
     .stay_at_node set.  The only parent that does this is a block job.
     I don’t think nodes can be in use by multiple block jobs at once,
     so the only parent with .stay_at_node set can be backup-top’s own
     backup job.


Patch 2 is there kind of as a failsafe, and kind of because it just made
sense to me, even if it won’t do anything.


Max Reitz (3):
  backup: Remove nodes from job in .clean()
  backup-top: Refuse I/O in inactive state
  iotests/283: Check that finalize drops backup-top

 block/backup-top.c         | 10 +++++++
 block/backup.c             |  1 +
 tests/qemu-iotests/283     | 55 ++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/283.out | 15 +++++++++++
 4 files changed, 81 insertions(+)

-- 
2.29.2



^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/3] backup: Remove nodes from job in .clean()
  2021-02-19 15:33 [PATCH 0/3] backup-top: Don't crash on post-finalize accesses Max Reitz
@ 2021-02-19 15:33 ` Max Reitz
  2021-02-24 15:33   ` Kevin Wolf
  2021-02-19 15:33 ` [PATCH 2/3] backup-top: Refuse I/O in inactive state Max Reitz
  2021-02-19 15:33 ` [PATCH 3/3] iotests/283: Check that finalize drops backup-top Max Reitz
  2 siblings, 1 reply; 7+ messages in thread
From: Max Reitz @ 2021-02-19 15:33 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, John Snow, qemu-devel,
	Max Reitz

The block job holds a reference to the backup-top node (because it is
passed as the main job BDS to block_job_create()).  Therefore,
bdrv_backup_top_drop() cannot delete the backup-top node (replacing it
by its child does not affect the job parent, because that has
.stay_at_node set).  That is a problem, because all of its I/O functions
assume the BlockCopyState (s->bcs) to be valid and that it has a
filtered child; but after bdrv_backup_top_drop(), neither of those
things are true.

It does not make sense to add new parents to backup-top after
backup_clean(), so we should detach it from the job before
bdrv_backup_top_drop().  Because there is no function to do that for a
single node, just detach all of the job's nodes -- the job does not do
anything past backup_clean() anyway.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/backup.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/backup.c b/block/backup.c
index 94e6dcd72e..6cf2f974aa 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -103,6 +103,7 @@ static void backup_abort(Job *job)
 static void backup_clean(Job *job)
 {
     BackupBlockJob *s = container_of(job, BackupBlockJob, common.job);
+    block_job_remove_all_bdrv(&s->common);
     bdrv_backup_top_drop(s->backup_top);
 }
 
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/3] backup-top: Refuse I/O in inactive state
  2021-02-19 15:33 [PATCH 0/3] backup-top: Don't crash on post-finalize accesses Max Reitz
  2021-02-19 15:33 ` [PATCH 1/3] backup: Remove nodes from job in .clean() Max Reitz
@ 2021-02-19 15:33 ` Max Reitz
  2021-02-19 15:33 ` [PATCH 3/3] iotests/283: Check that finalize drops backup-top Max Reitz
  2 siblings, 0 replies; 7+ messages in thread
From: Max Reitz @ 2021-02-19 15:33 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, John Snow, qemu-devel,
	Max Reitz

When the backup-top node transitions from active to inactive in
bdrv_backup_top_drop(), the BlockCopyState is freed and the filtered
child is removed, so the node effectively becomes unusable.

However, noone told its I/O functions this, so they will happily
continue accessing bs->backing and s->bcs.  Prevent that by aborting
early when s->active is false.

(After the preceding patch, the node should be gone after
bdrv_backup_top_drop(), so this should largely be a theoretical problem.
But still, better to be safe than sorry, and also I think it just makes
sense to check s->active in the I/O functions.)

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/backup-top.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/block/backup-top.c b/block/backup-top.c
index d1253e1aa6..589e8b651d 100644
--- a/block/backup-top.c
+++ b/block/backup-top.c
@@ -45,6 +45,12 @@ static coroutine_fn int backup_top_co_preadv(
         BlockDriverState *bs, uint64_t offset, uint64_t bytes,
         QEMUIOVector *qiov, int flags)
 {
+    BDRVBackupTopState *s = bs->opaque;
+
+    if (!s->active) {
+        return -EIO;
+    }
+
     return bdrv_co_preadv(bs->backing, offset, bytes, qiov, flags);
 }
 
@@ -54,6 +60,10 @@ static coroutine_fn int backup_top_cbw(BlockDriverState *bs, uint64_t offset,
     BDRVBackupTopState *s = bs->opaque;
     uint64_t off, end;
 
+    if (!s->active) {
+        return -EIO;
+    }
+
     if (flags & BDRV_REQ_WRITE_UNCHANGED) {
         return 0;
     }
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 3/3] iotests/283: Check that finalize drops backup-top
  2021-02-19 15:33 [PATCH 0/3] backup-top: Don't crash on post-finalize accesses Max Reitz
  2021-02-19 15:33 ` [PATCH 1/3] backup: Remove nodes from job in .clean() Max Reitz
  2021-02-19 15:33 ` [PATCH 2/3] backup-top: Refuse I/O in inactive state Max Reitz
@ 2021-02-19 15:33 ` Max Reitz
  2021-02-19 15:59   ` Max Reitz
  2 siblings, 1 reply; 7+ messages in thread
From: Max Reitz @ 2021-02-19 15:33 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, John Snow, qemu-devel,
	Max Reitz

Without any of HEAD^ or HEAD^^ applied, qemu will most likely crash on
the qemu-io invocation, for a variety of immediate reasons.  The
underlying problem is generally a use-after-free access into
backup-top's BlockCopyState.

With only HEAD^ applied, qemu-io will run into an EIO (which is not
capture by the output, but you can see that the qemu-io invocation will
be accepted (i.e., qemu-io will run) in contrast to the reference
output, where the node name cannot be found), and qemu will then crash
in query-named-block-nodes: bdrv_get_allocated_file_size() detects
backup-top to be a filter and passes the request through to its child.
However, after bdrv_backup_top_drop(), that child is NULL, so the
recursive call crashes.

With HEAD^^ applied, this test should pass.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/283     | 55 ++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/283.out | 15 +++++++++++
 2 files changed, 70 insertions(+)

diff --git a/tests/qemu-iotests/283 b/tests/qemu-iotests/283
index 79643e375b..509dcbbcf4 100755
--- a/tests/qemu-iotests/283
+++ b/tests/qemu-iotests/283
@@ -97,3 +97,58 @@ vm.qmp_log('blockdev-add', **{
 vm.qmp_log('blockdev-backup', sync='full', device='source', target='target')
 
 vm.shutdown()
+
+
+"""
+Check that the backup-top node is gone after job-finalize.
+
+During finalization, the node becomes inactive and can no longer
+function.  If it is still present, new parents might be attached, and
+there would be no meaningful way to handle their I/O requests.
+"""
+
+print('\n=== backup-top should be gone after job-finalize ===\n')
+
+vm = iotests.VM()
+vm.launch()
+
+vm.qmp_log('blockdev-add', **{
+    'node-name': 'source',
+    'driver': 'null-co',
+})
+
+vm.qmp_log('blockdev-add', **{
+    'node-name': 'target',
+    'driver': 'null-co',
+})
+
+vm.qmp_log('blockdev-backup',
+           job_id='backup',
+           device='source',
+           target='target',
+           sync='full',
+           filter_node_name='backup-filter',
+           auto_finalize=False,
+           auto_dismiss=False)
+
+vm.event_wait('BLOCK_JOB_PENDING', 5.0)
+
+# The backup-top filter should still be present prior to finalization
+assert vm.node_info('backup-filter') is not None
+
+vm.qmp_log('job-finalize', id='backup')
+vm.event_wait('BLOCK_JOB_COMPLETED', 5.0)
+
+# The filter should be gone now.  Check that by trying to access it
+# with qemu-io (which will most likely crash qemu if it is still
+# there.).
+vm.qmp_log('human-monitor-command',
+           command_line='qemu-io backup-filter "write 0 1M"')
+
+# (Also, do an explicit check.)
+assert vm.node_info('backup-filter') is None
+
+vm.qmp_log('job-dismiss', id='backup')
+vm.event_wait('JOB_STATUS_CHANGE', 5.0, {'data': {'status': 'null'}})
+
+vm.shutdown()
diff --git a/tests/qemu-iotests/283.out b/tests/qemu-iotests/283.out
index d8cff22cc1..7e9cd9a7d4 100644
--- a/tests/qemu-iotests/283.out
+++ b/tests/qemu-iotests/283.out
@@ -6,3 +6,18 @@
 {"return": {}}
 {"execute": "blockdev-backup", "arguments": {"device": "source", "sync": "full", "target": "target"}}
 {"error": {"class": "GenericError", "desc": "Cannot set permissions for backup-top filter: Conflicts with use by other as 'image', which uses 'write' on base"}}
+
+=== backup-top should be gone after job-finalize ===
+
+{"execute": "blockdev-add", "arguments": {"driver": "null-co", "node-name": "source"}}
+{"return": {}}
+{"execute": "blockdev-add", "arguments": {"driver": "null-co", "node-name": "target"}}
+{"return": {}}
+{"execute": "blockdev-backup", "arguments": {"auto-dismiss": false, "auto-finalize": false, "device": "source", "filter-node-name": "backup-filter", "job-id": "backup", "sync": "full", "target": "target"}}
+{"return": {}}
+{"execute": "job-finalize", "arguments": {"id": "backup"}}
+{"return": {}}
+{"execute": "human-monitor-command", "arguments": {"command-line": "qemu-io backup-filter \"write 0 1M\""}}
+{"return": "Error: Cannot find device= nor node_name=backup-filter\r\n"}
+{"execute": "job-dismiss", "arguments": {"id": "backup"}}
+{"return": {}}
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 3/3] iotests/283: Check that finalize drops backup-top
  2021-02-19 15:33 ` [PATCH 3/3] iotests/283: Check that finalize drops backup-top Max Reitz
@ 2021-02-19 15:59   ` Max Reitz
  2021-02-24 15:50     ` Kevin Wolf
  0 siblings, 1 reply; 7+ messages in thread
From: Max Reitz @ 2021-02-19 15:59 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, John Snow, qemu-devel

On 19.02.21 16:33, Max Reitz wrote:
> Without any of HEAD^ or HEAD^^ applied, qemu will most likely crash on
> the qemu-io invocation, for a variety of immediate reasons.  The
> underlying problem is generally a use-after-free access into
> backup-top's BlockCopyState.
> 
> With only HEAD^ applied, qemu-io will run into an EIO (which is not
> capture by the output, but you can see that the qemu-io invocation will
> be accepted (i.e., qemu-io will run) in contrast to the reference
> output, where the node name cannot be found), and qemu will then crash
> in query-named-block-nodes: bdrv_get_allocated_file_size() detects
> backup-top to be a filter and passes the request through to its child.
> However, after bdrv_backup_top_drop(), that child is NULL, so the
> recursive call crashes.
> 
> With HEAD^^ applied, this test should pass.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>   tests/qemu-iotests/283     | 55 ++++++++++++++++++++++++++++++++++++++
>   tests/qemu-iotests/283.out | 15 +++++++++++
>   2 files changed, 70 insertions(+)
> 
> diff --git a/tests/qemu-iotests/283 b/tests/qemu-iotests/283
> index 79643e375b..509dcbbcf4 100755
> --- a/tests/qemu-iotests/283
> +++ b/tests/qemu-iotests/283
> @@ -97,3 +97,58 @@ vm.qmp_log('blockdev-add', **{
>   vm.qmp_log('blockdev-backup', sync='full', device='source', target='target')
>   
>   vm.shutdown()
> +
> +
> +"""
> +Check that the backup-top node is gone after job-finalize.
> +
> +During finalization, the node becomes inactive and can no longer
> +function.  If it is still present, new parents might be attached, and
> +there would be no meaningful way to handle their I/O requests.
> +"""

Oh no, 297/pylint complains that this “string statement has no effect”. 
  Guess it should be a normal comment under the following print() then...

Max

> +print('\n=== backup-top should be gone after job-finalize ===\n')



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/3] backup: Remove nodes from job in .clean()
  2021-02-19 15:33 ` [PATCH 1/3] backup: Remove nodes from job in .clean() Max Reitz
@ 2021-02-24 15:33   ` Kevin Wolf
  0 siblings, 0 replies; 7+ messages in thread
From: Kevin Wolf @ 2021-02-24 15:33 UTC (permalink / raw)
  To: Max Reitz; +Cc: Vladimir Sementsov-Ogievskiy, John Snow, qemu-devel, qemu-block

Am 19.02.2021 um 16:33 hat Max Reitz geschrieben:
> The block job holds a reference to the backup-top node (because it is
> passed as the main job BDS to block_job_create()).  Therefore,
> bdrv_backup_top_drop() cannot delete the backup-top node (replacing it
> by its child does not affect the job parent, because that has
> .stay_at_node set).  That is a problem, because all of its I/O functions
> assume the BlockCopyState (s->bcs) to be valid and that it has a
> filtered child; but after bdrv_backup_top_drop(), neither of those
> things are true.

This kind of suggests that block_copy_state_free() doesn't really belong
in bdrv_backup_top_drop(), but in a .bdrv_close callback.

Doesn't make this patch less correct, of course. We still want to have
all references dropped at the end of bdrv_backup_top_drop().

> It does not make sense to add new parents to backup-top after
> backup_clean(), so we should detach it from the job before
> bdrv_backup_top_drop().  Because there is no function to do that for a
> single node, just detach all of the job's nodes -- the job does not do
> anything past backup_clean() anyway.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>

Reviewed-by: Kevin Wolf <kwolf@redhat.com>



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 3/3] iotests/283: Check that finalize drops backup-top
  2021-02-19 15:59   ` Max Reitz
@ 2021-02-24 15:50     ` Kevin Wolf
  0 siblings, 0 replies; 7+ messages in thread
From: Kevin Wolf @ 2021-02-24 15:50 UTC (permalink / raw)
  To: Max Reitz; +Cc: Vladimir Sementsov-Ogievskiy, John Snow, qemu-devel, qemu-block

Am 19.02.2021 um 16:59 hat Max Reitz geschrieben:
> On 19.02.21 16:33, Max Reitz wrote:
> > Without any of HEAD^ or HEAD^^ applied, qemu will most likely crash on
> > the qemu-io invocation, for a variety of immediate reasons.  The
> > underlying problem is generally a use-after-free access into
> > backup-top's BlockCopyState.
> > 
> > With only HEAD^ applied, qemu-io will run into an EIO (which is not
> > capture by the output, but you can see that the qemu-io invocation will
> > be accepted (i.e., qemu-io will run) in contrast to the reference
> > output, where the node name cannot be found), and qemu will then crash
> > in query-named-block-nodes: bdrv_get_allocated_file_size() detects
> > backup-top to be a filter and passes the request through to its child.
> > However, after bdrv_backup_top_drop(), that child is NULL, so the
> > recursive call crashes.
> > 
> > With HEAD^^ applied, this test should pass.
> > 
> > Signed-off-by: Max Reitz <mreitz@redhat.com>
> > ---
> >   tests/qemu-iotests/283     | 55 ++++++++++++++++++++++++++++++++++++++
> >   tests/qemu-iotests/283.out | 15 +++++++++++
> >   2 files changed, 70 insertions(+)
> > 
> > diff --git a/tests/qemu-iotests/283 b/tests/qemu-iotests/283
> > index 79643e375b..509dcbbcf4 100755
> > --- a/tests/qemu-iotests/283
> > +++ b/tests/qemu-iotests/283
> > @@ -97,3 +97,58 @@ vm.qmp_log('blockdev-add', **{
> >   vm.qmp_log('blockdev-backup', sync='full', device='source', target='target')
> >   vm.shutdown()
> > +
> > +
> > +"""
> > +Check that the backup-top node is gone after job-finalize.
> > +
> > +During finalization, the node becomes inactive and can no longer
> > +function.  If it is still present, new parents might be attached, and
> > +there would be no meaningful way to handle their I/O requests.
> > +"""
> 
> Oh no, 297/pylint complains that this “string statement has no effect”.
> Guess it should be a normal comment under the following print() then...

Thanks, fixed up the comment as you suggest and applied to the block
branch.

Kevin



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-02-24 15:57 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-19 15:33 [PATCH 0/3] backup-top: Don't crash on post-finalize accesses Max Reitz
2021-02-19 15:33 ` [PATCH 1/3] backup: Remove nodes from job in .clean() Max Reitz
2021-02-24 15:33   ` Kevin Wolf
2021-02-19 15:33 ` [PATCH 2/3] backup-top: Refuse I/O in inactive state Max Reitz
2021-02-19 15:33 ` [PATCH 3/3] iotests/283: Check that finalize drops backup-top Max Reitz
2021-02-19 15:59   ` Max Reitz
2021-02-24 15:50     ` Kevin Wolf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.