* [PATCH v3 0/2] block-backend: prevent dangling BDS pointers across aio_poll()
@ 2022-01-11 15:36 Stefan Hajnoczi
2022-01-11 15:36 ` [PATCH v3 1/2] " Stefan Hajnoczi
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Stefan Hajnoczi @ 2022-01-11 15:36 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-block,
qemu-stable, Hanna Reitz, Stefan Hajnoczi
This series fixes use-after-free bugs when blk->root changes across aio_poll().
For example, a temporary filter node can be removed by a blockjob when a
drained section begins. If the caller doesn't hold a ref on the BDS then it
will have been freed.
Hanna Reitz (1):
iotests/stream-error-on-reset: New test
Stefan Hajnoczi (1):
block-backend: prevent dangling BDS pointers across aio_poll()
block/block-backend.c | 19 ++-
.../qemu-iotests/tests/stream-error-on-reset | 140 ++++++++++++++++++
.../tests/stream-error-on-reset.out | 5 +
3 files changed, 162 insertions(+), 2 deletions(-)
create mode 100755 tests/qemu-iotests/tests/stream-error-on-reset
create mode 100644 tests/qemu-iotests/tests/stream-error-on-reset.out
--
2.33.1
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH v3 1/2] block-backend: prevent dangling BDS pointers across aio_poll()
2022-01-11 15:36 [PATCH v3 0/2] block-backend: prevent dangling BDS pointers across aio_poll() Stefan Hajnoczi
@ 2022-01-11 15:36 ` Stefan Hajnoczi
2022-01-11 15:36 ` [PATCH v3 2/2] iotests/stream-error-on-reset: New test Stefan Hajnoczi
2022-01-12 10:23 ` [PATCH v3 0/2] block-backend: prevent dangling BDS pointers across aio_poll() Kevin Wolf
2 siblings, 0 replies; 4+ messages in thread
From: Stefan Hajnoczi @ 2022-01-11 15:36 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-block,
qemu-stable, Hanna Reitz, Stefan Hajnoczi
The BlockBackend root child can change when aio_poll() is invoked. This
happens when a temporary filter node is removed upon blockjob
completion, for example.
Functions in block/block-backend.c must be aware of this when using a
blk_bs() pointer across aio_poll() because the BlockDriverState refcnt
may reach 0, resulting in a stale pointer.
One example is scsi_device_purge_requests(), which calls blk_drain() to
wait for in-flight requests to cancel. If the backup blockjob is active,
then the BlockBackend root child is a temporary filter BDS owned by the
blockjob. The blockjob can complete during bdrv_drained_begin() and the
last reference to the BDS is released when the temporary filter node is
removed. This results in a use-after-free when blk_drain() calls
bdrv_drained_end(bs) on the dangling pointer.
Explicitly hold a reference to bs across block APIs that invoke
aio_poll().
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2021778
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2036178
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
v3:
- Add comment in blk_remove_bs() and reduce scope of bs local variable [Kevin]
v2:
- Audit block/block-backend.c and fix additional cases
---
block/block-backend.c | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/block/block-backend.c b/block/block-backend.c
index 12ef80ea17..23e727199b 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -822,16 +822,22 @@ BlockBackend *blk_by_public(BlockBackendPublic *public)
void blk_remove_bs(BlockBackend *blk)
{
ThrottleGroupMember *tgm = &blk->public.throttle_group_member;
- BlockDriverState *bs;
BdrvChild *root;
notifier_list_notify(&blk->remove_bs_notifiers, blk);
if (tgm->throttle_state) {
- bs = blk_bs(blk);
+ BlockDriverState *bs = blk_bs(blk);
+
+ /*
+ * Take a ref in case blk_bs() changes across bdrv_drained_begin(), for
+ * example, if a temporary filter node is removed by a blockjob.
+ */
+ bdrv_ref(bs);
bdrv_drained_begin(bs);
throttle_group_detach_aio_context(tgm);
throttle_group_attach_aio_context(tgm, qemu_get_aio_context());
bdrv_drained_end(bs);
+ bdrv_unref(bs);
}
blk_update_root_state(blk);
@@ -1705,6 +1711,7 @@ void blk_drain(BlockBackend *blk)
BlockDriverState *bs = blk_bs(blk);
if (bs) {
+ bdrv_ref(bs);
bdrv_drained_begin(bs);
}
@@ -1714,6 +1721,7 @@ void blk_drain(BlockBackend *blk)
if (bs) {
bdrv_drained_end(bs);
+ bdrv_unref(bs);
}
}
@@ -2044,10 +2052,13 @@ static int blk_do_set_aio_context(BlockBackend *blk, AioContext *new_context,
int ret;
if (bs) {
+ bdrv_ref(bs);
+
if (update_root_node) {
ret = bdrv_child_try_set_aio_context(bs, new_context, blk->root,
errp);
if (ret < 0) {
+ bdrv_unref(bs);
return ret;
}
}
@@ -2057,6 +2068,8 @@ static int blk_do_set_aio_context(BlockBackend *blk, AioContext *new_context,
throttle_group_attach_aio_context(tgm, new_context);
bdrv_drained_end(bs);
}
+
+ bdrv_unref(bs);
}
blk->ctx = new_context;
@@ -2326,11 +2339,13 @@ void blk_io_limits_disable(BlockBackend *blk)
ThrottleGroupMember *tgm = &blk->public.throttle_group_member;
assert(tgm->throttle_state);
if (bs) {
+ bdrv_ref(bs);
bdrv_drained_begin(bs);
}
throttle_group_unregister_tgm(tgm);
if (bs) {
bdrv_drained_end(bs);
+ bdrv_unref(bs);
}
}
--
2.33.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH v3 2/2] iotests/stream-error-on-reset: New test
2022-01-11 15:36 [PATCH v3 0/2] block-backend: prevent dangling BDS pointers across aio_poll() Stefan Hajnoczi
2022-01-11 15:36 ` [PATCH v3 1/2] " Stefan Hajnoczi
@ 2022-01-11 15:36 ` Stefan Hajnoczi
2022-01-12 10:23 ` [PATCH v3 0/2] block-backend: prevent dangling BDS pointers across aio_poll() Kevin Wolf
2 siblings, 0 replies; 4+ messages in thread
From: Stefan Hajnoczi @ 2022-01-11 15:36 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-block,
qemu-stable, Hanna Reitz, Stefan Hajnoczi
From: Hanna Reitz <hreitz@redhat.com>
Test the following scenario:
- Simple stream block in two-layer backing chain (base and top)
- The job is drained via blk_drain(), then an error occurs while the job
settles the ongoing request
- And so the job completes while in blk_drain()
This was reported as a segfault, but is fixed by "block-backend: prevent
dangling BDS pointers across aio_poll()".
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2036178
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
.../qemu-iotests/tests/stream-error-on-reset | 140 ++++++++++++++++++
.../tests/stream-error-on-reset.out | 5 +
2 files changed, 145 insertions(+)
create mode 100755 tests/qemu-iotests/tests/stream-error-on-reset
create mode 100644 tests/qemu-iotests/tests/stream-error-on-reset.out
diff --git a/tests/qemu-iotests/tests/stream-error-on-reset b/tests/qemu-iotests/tests/stream-error-on-reset
new file mode 100755
index 0000000000..7eaedb24d7
--- /dev/null
+++ b/tests/qemu-iotests/tests/stream-error-on-reset
@@ -0,0 +1,140 @@
+#!/usr/bin/env python3
+# group: rw quick
+#
+# Test what happens when a stream job completes in a blk_drain().
+#
+# Copyright (C) 2022 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+#
+
+import os
+import iotests
+from iotests import imgfmt, qemu_img_create, qemu_io_silent, QMPTestCase
+
+
+image_size = 1 * 1024 * 1024
+data_size = 64 * 1024
+base = os.path.join(iotests.test_dir, 'base.img')
+top = os.path.join(iotests.test_dir, 'top.img')
+
+
+# We want to test completing a stream job in a blk_drain().
+#
+# The blk_drain() we are going to use is a virtio-scsi device resetting,
+# which we can trigger by resetting the system.
+#
+# In order to have the block job complete on drain, we (1) throttle its
+# base image so we can start the drain after it has begun, but before it
+# completes, and (2) make it encounter an I/O error on the ensuing write.
+# (If it completes regularly, the completion happens after the drain for
+# some reason.)
+
+class TestStreamErrorOnReset(QMPTestCase):
+ def setUp(self) -> None:
+ """
+ Create two images:
+ - base image {base} with {data_size} bytes allocated
+ - top image {top} without any data allocated
+
+ And the following VM configuration:
+ - base image throttled to {data_size}
+ - top image with a blkdebug configuration so the first write access
+ to it will result in an error
+ - top image is attached to a virtio-scsi device
+ """
+ assert qemu_img_create('-f', imgfmt, base, str(image_size)) == 0
+ assert qemu_io_silent('-c', f'write 0 {data_size}', base) == 0
+ assert qemu_img_create('-f', imgfmt, top, str(image_size)) == 0
+
+ self.vm = iotests.VM()
+ self.vm.add_args('-accel', 'tcg') # Make throttling work properly
+ self.vm.add_object(self.vm.qmp_to_opts({
+ 'qom-type': 'throttle-group',
+ 'id': 'thrgr',
+ 'x-bps-total': str(data_size)
+ }))
+ self.vm.add_blockdev(self.vm.qmp_to_opts({
+ 'driver': imgfmt,
+ 'node-name': 'base',
+ 'file': {
+ 'driver': 'throttle',
+ 'throttle-group': 'thrgr',
+ 'file': {
+ 'driver': 'file',
+ 'filename': base
+ }
+ }
+ }))
+ self.vm.add_blockdev(self.vm.qmp_to_opts({
+ 'driver': imgfmt,
+ 'node-name': 'top',
+ 'file': {
+ 'driver': 'blkdebug',
+ 'node-name': 'top-blkdebug',
+ 'inject-error': [{
+ 'event': 'pwritev',
+ 'immediately': 'true',
+ 'once': 'true'
+ }],
+ 'image': {
+ 'driver': 'file',
+ 'filename': top
+ }
+ },
+ 'backing': 'base'
+ }))
+ self.vm.add_device(self.vm.qmp_to_opts({
+ 'driver': 'virtio-scsi',
+ 'id': 'vscsi'
+ }))
+ self.vm.add_device(self.vm.qmp_to_opts({
+ 'driver': 'scsi-hd',
+ 'bus': 'vscsi.0',
+ 'drive': 'top'
+ }))
+ self.vm.launch()
+
+ def tearDown(self) -> None:
+ self.vm.shutdown()
+ os.remove(top)
+ os.remove(base)
+
+ def test_stream_error_on_reset(self) -> None:
+ # Launch a stream job, which will take at least a second to
+ # complete, because the base image is throttled (so we can
+ # get in between it having started and it having completed)
+ res = self.vm.qmp('block-stream', job_id='stream', device='top')
+ self.assert_qmp(res, 'return', {})
+
+ while True:
+ ev = self.vm.event_wait('JOB_STATUS_CHANGE')
+ if ev['data']['status'] == 'running':
+ # Once the stream job is running, reset the system, which
+ # forces the virtio-scsi device to be reset, thus draining
+ # the stream job, and making it complete. Completing
+ # inside of that drain should not result in a segfault.
+ res = self.vm.qmp('system_reset')
+ self.assert_qmp(res, 'return', {})
+ elif ev['data']['status'] == 'null':
+ # The test is done once the job is gone
+ break
+
+
+if __name__ == '__main__':
+ # Passes with any format with backing file support, but qed and
+ # qcow1 do not seem to exercise the used-to-be problematic code
+ # path, so there is no point in having them in this list
+ iotests.main(supported_fmts=['qcow2', 'vmdk'],
+ supported_protocols=['file'])
diff --git a/tests/qemu-iotests/tests/stream-error-on-reset.out b/tests/qemu-iotests/tests/stream-error-on-reset.out
new file mode 100644
index 0000000000..ae1213e6f8
--- /dev/null
+++ b/tests/qemu-iotests/tests/stream-error-on-reset.out
@@ -0,0 +1,5 @@
+.
+----------------------------------------------------------------------
+Ran 1 tests
+
+OK
--
2.33.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH v3 0/2] block-backend: prevent dangling BDS pointers across aio_poll()
2022-01-11 15:36 [PATCH v3 0/2] block-backend: prevent dangling BDS pointers across aio_poll() Stefan Hajnoczi
2022-01-11 15:36 ` [PATCH v3 1/2] " Stefan Hajnoczi
2022-01-11 15:36 ` [PATCH v3 2/2] iotests/stream-error-on-reset: New test Stefan Hajnoczi
@ 2022-01-12 10:23 ` Kevin Wolf
2 siblings, 0 replies; 4+ messages in thread
From: Kevin Wolf @ 2022-01-12 10:23 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: Hanna Reitz, Vladimir Sementsov-Ogievskiy, qemu-devel,
qemu-block, qemu-stable
Am 11.01.2022 um 16:36 hat Stefan Hajnoczi geschrieben:
> This series fixes use-after-free bugs when blk->root changes across aio_poll().
> For example, a temporary filter node can be removed by a blockjob when a
> drained section begins. If the caller doesn't hold a ref on the BDS then it
> will have been freed.
>
> Hanna Reitz (1):
> iotests/stream-error-on-reset: New test
>
> Stefan Hajnoczi (1):
> block-backend: prevent dangling BDS pointers across aio_poll()
Thanks, applied to the block branch.
Kevin
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2022-01-12 10:26 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-11 15:36 [PATCH v3 0/2] block-backend: prevent dangling BDS pointers across aio_poll() Stefan Hajnoczi
2022-01-11 15:36 ` [PATCH v3 1/2] " Stefan Hajnoczi
2022-01-11 15:36 ` [PATCH v3 2/2] iotests/stream-error-on-reset: New test Stefan Hajnoczi
2022-01-12 10:23 ` [PATCH v3 0/2] block-backend: prevent dangling BDS pointers across aio_poll() Kevin Wolf
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.