* [PATCH 0/3] block: Make bdrv_refresh_limits() non-recursive
From: Hanna Reitz @ 2022-02-15 13:57 UTC (permalink / raw)
  To: qemu-block; +Cc: Kevin Wolf, Hanna Reitz, qemu-devel, Stefan Hajnoczi

Hi,

Most bdrv_refresh_limits() callers do not drain the subtree of the node
whose limits are refreshed, so concurrent I/O requests to child nodes
can occur (if the node is in an I/O thread).  bdrv_refresh_limits() is
recursive, so such requests can happen to a node whose limits are being
refreshed.

bdrv_refresh_limits() is not atomic, and so the I/O requests can
encounter invalid limits, like a 0 request_alignment.  This will crash
qemu (e.g. because of a division by 0, or a failed assertion).
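
To illustrate the window (a toy Python model, not QEMU code; all names
here are made up for illustration):

    import threading
    import time

    bl = {'request_alignment': 512}

    def refresh_limits():
        bl['request_alignment'] = 0    # limits are zeroed first...
        time.sleep(0.01)               # ...window where limits are invalid...
        bl['request_alignment'] = 512  # ...then valid values are set again

    def io_request(offset=4096):
        # A request that hits the window divides by zero
        return offset // bl['request_alignment']

    t = threading.Thread(target=refresh_limits)
    t.start()
    try:
        io_request()  # may raise ZeroDivisionError, mirroring the qemu crash
    finally:
        t.join()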

On inspection, bdrv_refresh_limits() doesn’t look like it really needs
to be recursive.  It just has always been.  Dropping the recursion fixes
those crashes, because all callers of bdrv_refresh_limits() make sure
one way or another that concurrent requests to the node whose limits are
to be refreshed are at least paused (by draining, and/or by acquiring
the AioContext).

I see two other ways to fix it:
(A) Have all bdrv_refresh_limits() callers drain the entire subtree,
(B) Protect BDS.bl with RCU, which would make concurrent I/O safe.

(A) is kind of ugly; I started down that path twice, and both times
decided I didn’t want to follow through with it.  It was always
an AioContext-juggling mess.  (E.g. bdrv_set_backing_hd() would need to
drain the subtree; but that means having to acquire the `backing_hd`
context, too, because `bs` might be moved into that context, and so when
`backing_hd` is attached to `bs`, `backing_hd` would be drained in the
new context.  But we can’t acquire a context twice, so we can only
acquire `backing_hd`’s context if the caller hasn’t done so already.
But the worst is that we can’t actually acquire that context: If `bs` is
moved into `backing_hd`’s context, then `bdrv_set_aio_context_ignore()`
requires us not to hold that context.  It’s just kind of a mess.)

I tried (B), and it worked, and I liked it very much; but it requires
quite a bit of refactoring (every BDS.bl reader must then use
qatomic_rcu_read() and take the RCU read lock), so it feels really
difficult to justify when the fix this series proposes just removes four
lines of code.
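
For the record, a toy sketch of why (B) works (plain Python, not QEMU
code): readers snapshot an immutable limits object, and the writer
publishes a fully built replacement in a single reference swap, so no
reader can ever observe a half-initialized state:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class BlockLimits:
        request_alignment: int = 512

    class Node:
        def __init__(self):
            self.bl = BlockLimits()

        def refresh_limits(self, alignment):
            # Build the new limits completely, then publish in one swap;
            # this is roughly what qatomic_rcu_read() plus the RCU read
            # lock would guarantee on the C side
            self.bl = BlockLimits(request_alignment=alignment)

        def io_request(self, offset):
            bl = self.bl  # snapshot: either old or new, never partial
            return offset // bl.request_alignment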


Hanna Reitz (3):
  block: Make bdrv_refresh_limits() non-recursive
  iotests: Allow using QMP with the QSD
  iotests/graph-changes-while-io: New test

 block/io.c                                    |  4 -
 tests/qemu-iotests/iotests.py                 | 29 +++++-
 .../qemu-iotests/tests/graph-changes-while-io | 91 +++++++++++++++++++
 .../tests/graph-changes-while-io.out          |  5 +
 4 files changed, 124 insertions(+), 5 deletions(-)
 create mode 100755 tests/qemu-iotests/tests/graph-changes-while-io
 create mode 100644 tests/qemu-iotests/tests/graph-changes-while-io.out

-- 
2.34.1




* [PATCH 1/3] block: Make bdrv_refresh_limits() non-recursive
From: Hanna Reitz @ 2022-02-15 13:57 UTC (permalink / raw)
  To: qemu-block; +Cc: Kevin Wolf, Hanna Reitz, qemu-devel, Stefan Hajnoczi

bdrv_refresh_limits() recurses down to the node's children.  That does
not seem necessary: If refreshing a node's limits and recursing down
were to change one of its children's BlockLimits, that would mean we
noticed the changed limits by pure chance.  The fact that we refresh the
parent's limits has nothing to do with it, so the reason for the change
probably happened before this point in time, and we should have
refreshed the limits then.

On the other hand, we do not have infrastructure for noticing that block
limits change after they have been initialized for the first time (this
would require propagating the change upwards to the respective node's
parents), and so evidently we consider this case impossible.

If this case is impossible, then we do not need to recurse down in
bdrv_refresh_limits().  Every node's limits are initialized in
bdrv_open_driver(), and are refreshed whenever its children change.
We want to use the children's limits to get some initial default, but we
can just take them; we do not need to refresh them.

The problem with recursing is that bdrv_refresh_limits() is not atomic.
It begins with zeroing BDS.bl, and only then sets proper, valid limits.
If we do not drain all nodes whose limits are refreshed, then concurrent
I/O requests can encounter invalid request_alignment values and crash
qemu.  Therefore, a recursing bdrv_refresh_limits() requires the whole
subtree to be drained, which is currently not ensured by most callers.

A non-recursive bdrv_refresh_limits() only requires the node in question
to not receive I/O requests, and this is done by most callers in some
way or another:
- bdrv_open_driver() deals with a new node with no parents yet
- bdrv_set_file_or_backing_noperm() acts on a drained node
- bdrv_reopen_commit() acts only on drained nodes
- bdrv_append() should in theory require the node to be drained; in
  practice most callers just lock the AioContext, which should at least
  be enough to prevent concurrent I/O requests from accessing invalid
  limits

So we can resolve the bug by making bdrv_refresh_limits() non-recursive.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1879437
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
---
 block/io.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/block/io.c b/block/io.c
index 4e4cb556c5..c3e7301613 100644
--- a/block/io.c
+++ b/block/io.c
@@ -189,10 +189,6 @@ void bdrv_refresh_limits(BlockDriverState *bs, Transaction *tran, Error **errp)
     QLIST_FOREACH(c, &bs->children, next) {
         if (c->role & (BDRV_CHILD_DATA | BDRV_CHILD_FILTERED | BDRV_CHILD_COW))
         {
-            bdrv_refresh_limits(c->bs, tran, errp);
-            if (*errp) {
-                return;
-            }
             bdrv_merge_limits(&bs->bl, &c->bs->bl);
             have_limits = true;
         }
-- 
2.34.1




* [PATCH 2/3] iotests: Allow using QMP with the QSD
From: Hanna Reitz @ 2022-02-15 13:57 UTC (permalink / raw)
  To: qemu-block; +Cc: Kevin Wolf, Hanna Reitz, qemu-devel, Stefan Hajnoczi

Add a parameter to optionally open a QMP connection when creating a
QemuStorageDaemon instance.

Signed-off-by: Hanna Reitz <hreitz@redhat.com>
---
 tests/qemu-iotests/iotests.py | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 6ba65eb1ff..47e3808ab9 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -39,6 +39,7 @@
 
 from qemu.machine import qtest
 from qemu.qmp import QMPMessage
+from qemu.aqmp.legacy import QEMUMonitorProtocol
 
 # Use this logger for logging messages directly from the iotests module
 logger = logging.getLogger('qemu.iotests')
@@ -348,14 +349,30 @@ def cmd(self, cmd):
 
 
 class QemuStorageDaemon:
-    def __init__(self, *args: str, instance_id: str = 'a'):
+    _qmp: Optional[QEMUMonitorProtocol] = None
+    _qmpsock: Optional[str] = None
+    # Python < 3.8 would complain if this type were not a string literal
+    # (importing `annotations` from `__future__` would work; but not on <= 3.6)
+    _p: 'Optional[subprocess.Popen[bytes]]' = None
+
+    def __init__(self, *args: str, instance_id: str = 'a', qmp: bool = False):
         assert '--pidfile' not in args
         self.pidfile = os.path.join(test_dir, f'qsd-{instance_id}-pid')
         all_args = [qsd_prog] + list(args) + ['--pidfile', self.pidfile]
 
+        if qmp:
+            self._qmpsock = os.path.join(sock_dir, f'qsd-{instance_id}.sock')
+            all_args += ['--chardev',
+                         f'socket,id=qmp-sock,path={self._qmpsock}',
+                         '--monitor', 'qmp-sock']
+
+            self._qmp = QEMUMonitorProtocol(self._qmpsock, server=True)
+
         # Cannot use with here, we want the subprocess to stay around
         # pylint: disable=consider-using-with
         self._p = subprocess.Popen(all_args)
+        if self._qmp is not None:
+            self._qmp.accept()
         while not os.path.exists(self.pidfile):
             if self._p.poll() is not None:
                 cmd = ' '.join(all_args)
@@ -370,12 +387,22 @@ def __init__(self, *args: str, instance_id: str = 'a'):
 
         assert self._pid == self._p.pid
 
+    def qmp(self, cmd: str, args: Optional[Dict[str, object]] = None) \
+            -> QMPMessage:
+        assert self._qmp is not None
+        return self._qmp.cmd(cmd, args)
+
     def stop(self, kill_signal=15):
         self._p.send_signal(kill_signal)
         self._p.wait()
         self._p = None
 
+        if self._qmp:
+            self._qmp.close()
+
         try:
+            if self._qmpsock is not None:
+                os.remove(self._qmpsock)
             os.remove(self.pidfile)
         except OSError:
             pass
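
For illustration, a test could then drive the daemon like this (just a
sketch; the --blockdev flags mirror those used in patch 3, and
query-version stands in for any QMP command):

    qsd = QemuStorageDaemon(
        '--blockdev', 'null-co,node-name=node0,read-zeroes=true',
        qmp=True)
    result = qsd.qmp('query-version')  # returns a QMPMessage dict
    assert 'return' in result
    qsd.stop()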
-- 
2.34.1




* [PATCH 3/3] iotests/graph-changes-while-io: New test
From: Hanna Reitz @ 2022-02-15 13:57 UTC (permalink / raw)
  To: qemu-block; +Cc: Kevin Wolf, Hanna Reitz, qemu-devel, Stefan Hajnoczi

Test the following scenario:
1. Some block node (null-co) attached to a user (here: NBD server) that
   performs I/O and keeps the node in an I/O thread
2. Repeatedly run blockdev-add/blockdev-del to add/remove an overlay
   to/from that node

Each blockdev-add triggers bdrv_refresh_limits(), and because
blockdev-add runs in the main thread, it does not stop the I/O requests.
I/O can thus happen while the limits are refreshed, and when such a
request sees a temporarily invalid block limit (e.g. alignment is 0),
this may easily crash qemu (or the storage daemon in this case).

The block layer needs to ensure that I/O requests to a node are paused
while that node's BlockLimits are refreshed.

Signed-off-by: Hanna Reitz <hreitz@redhat.com>
---
 .../qemu-iotests/tests/graph-changes-while-io | 91 +++++++++++++++++++
 .../tests/graph-changes-while-io.out          |  5 +
 2 files changed, 96 insertions(+)
 create mode 100755 tests/qemu-iotests/tests/graph-changes-while-io
 create mode 100644 tests/qemu-iotests/tests/graph-changes-while-io.out

diff --git a/tests/qemu-iotests/tests/graph-changes-while-io b/tests/qemu-iotests/tests/graph-changes-while-io
new file mode 100755
index 0000000000..567e8cf21e
--- /dev/null
+++ b/tests/qemu-iotests/tests/graph-changes-while-io
@@ -0,0 +1,91 @@
+#!/usr/bin/env python3
+# group: rw
+#
+# Test graph changes while I/O is happening
+#
+# Copyright (C) 2022 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+import os
+from threading import Thread
+import iotests
+from iotests import imgfmt, qemu_img, qemu_img_create, QMPTestCase, \
+        QemuStorageDaemon
+
+
+top = os.path.join(iotests.test_dir, 'top.img')
+nbd_sock = os.path.join(iotests.sock_dir, 'nbd.sock')
+
+
+def do_qemu_img_bench() -> None:
+    """
+    Do some I/O requests on `nbd_sock`.
+    """
+    assert qemu_img('bench', '-f', 'raw', '-c', '2000000',
+                    f'nbd+unix:///node0?socket={nbd_sock}') == 0
+
+
+class TestGraphChangesWhileIO(QMPTestCase):
+    def setUp(self) -> None:
+        # Create an overlay that can be added at runtime on top of the
+        # null-co block node that will receive I/O
+        assert qemu_img_create('-f', imgfmt, '-F', 'raw', '-b', 'null-co://',
+                               top) == 0
+
+        # QSD instance with a null-co block node in an I/O thread,
+        # exported over NBD (on `nbd_sock`, export name "node0")
+        self.qsd = QemuStorageDaemon(
+            '--object', 'iothread,id=iothread0',
+            '--blockdev', 'null-co,node-name=node0,read-zeroes=true',
+            '--nbd-server', f'addr.type=unix,addr.path={nbd_sock}',
+            '--export', 'nbd,id=exp0,node-name=node0,iothread=iothread0,' +
+                        'fixed-iothread=true,writable=true',
+            qmp=True
+        )
+
+    def tearDown(self) -> None:
+        self.qsd.stop()
+
+    def test_blockdev_add_while_io(self) -> None:
+        # Run qemu-img bench in the background
+        bench_thr = Thread(target=do_qemu_img_bench)
+        bench_thr.start()
+
+        # While qemu-img bench is running, repeatedly add and remove an
+        # overlay to/from node0
+        while bench_thr.is_alive():
+            result = self.qsd.qmp('blockdev-add', {
+                'driver': imgfmt,
+                'node-name': 'overlay',
+                'backing': 'node0',
+                'file': {
+                    'driver': 'file',
+                    'filename': top
+                }
+            })
+            self.assert_qmp(result, 'return', {})
+
+            result = self.qsd.qmp('blockdev-del', {
+                'node-name': 'overlay'
+            })
+            self.assert_qmp(result, 'return', {})
+
+        bench_thr.join()
+
+if __name__ == '__main__':
+    # Format must support raw backing files
+    iotests.main(supported_fmts=['qcow', 'qcow2', 'qed'],
+                 supported_protocols=['file'])
diff --git a/tests/qemu-iotests/tests/graph-changes-while-io.out b/tests/qemu-iotests/tests/graph-changes-while-io.out
new file mode 100644
index 0000000000..ae1213e6f8
--- /dev/null
+++ b/tests/qemu-iotests/tests/graph-changes-while-io.out
@@ -0,0 +1,5 @@
+.
+----------------------------------------------------------------------
+Ran 1 tests
+
+OK
-- 
2.34.1




* Re: [PATCH 1/3] block: Make bdrv_refresh_limits() non-recursive
From: Eric Blake @ 2022-02-15 22:16 UTC (permalink / raw)
  To: Hanna Reitz; +Cc: Kevin Wolf, Stefan Hajnoczi, qemu-devel, qemu-block

On Tue, Feb 15, 2022 at 02:57:25PM +0100, Hanna Reitz wrote:
> bdrv_refresh_limits() recurses down to the node's children.  That does
> not seem necessary: If refreshing a node's limits and recursing down
> were to change one of its children's BlockLimits, that would mean we
> noticed the changed limits by pure chance.  The fact that we refresh the
> parent's limits has nothing to do with it, so the reason for the change
> probably happened before this point in time, and we should have
> refreshed the limits then.
> 
> On the other hand, we do not have infrastructure for noticing that block
> limits change after they have been initialized for the first time (this
> would require propagating the change upwards to the respective node's
> parents), and so evidently we consider this case impossible.
> 
> If this case is impossible, then we do not need to recurse down in
> bdrv_refresh_limits().  Every node's limits are initialized in
> bdrv_open_driver(), and are refreshed whenever its children change.
> We want to use the children's limits to get some initial default, but we
> can just take them; we do not need to refresh them.
> 
> The problem with recursing is that bdrv_refresh_limits() is not atomic.
> It begins with zeroing BDS.bl, and only then sets proper, valid limits.
> If we do not drain all nodes whose limits are refreshed, then concurrent
> I/O requests can encounter invalid request_alignment values and crash
> qemu.  Therefore, a recursing bdrv_refresh_limits() requires the whole
> subtree to be drained, which is currently not ensured by most callers.
> 
> A non-recursive bdrv_refresh_limits() only requires the node in question
> to not receive I/O requests, and this is done by most callers in some
> way or another:
> - bdrv_open_driver() deals with a new node with no parents yet
> - bdrv_set_file_or_backing_noperm() acts on a drained node
> - bdrv_reopen_commit() acts only on drained nodes
> - bdrv_append() should in theory require the node to be drained; in
>   practice most callers just lock the AioContext, which should at least
>   be enough to prevent concurrent I/O requests from accessing invalid
>   limits
> 
> So we can resolve the bug by making bdrv_refresh_limits() non-recursive.

Long explanation, but very helpful.

> 
> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1879437
> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
> ---
>  block/io.c | 4 ----
>  1 file changed, 4 deletions(-)

And a deceptively simple fix!

Reviewed-by: Eric Blake <eblake@redhat.com>

> 
> diff --git a/block/io.c b/block/io.c
> index 4e4cb556c5..c3e7301613 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -189,10 +189,6 @@ void bdrv_refresh_limits(BlockDriverState *bs, Transaction *tran, Error **errp)
>      QLIST_FOREACH(c, &bs->children, next) {
>          if (c->role & (BDRV_CHILD_DATA | BDRV_CHILD_FILTERED | BDRV_CHILD_COW))
>          {
> -            bdrv_refresh_limits(c->bs, tran, errp);
> -            if (*errp) {
> -                return;
> -            }
>              bdrv_merge_limits(&bs->bl, &c->bs->bl);
>              have_limits = true;
>          }
> -- 
> 2.34.1
> 
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




* Re: [PATCH 2/3] iotests: Allow using QMP with the QSD
From: Eric Blake @ 2022-02-15 22:19 UTC (permalink / raw)
  To: Hanna Reitz; +Cc: Kevin Wolf, Stefan Hajnoczi, qemu-devel, qemu-block

On Tue, Feb 15, 2022 at 02:57:26PM +0100, Hanna Reitz wrote:
> Add a parameter to optionally open a QMP connection when creating a
> QemuStorageDaemon instance.
> 
> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
> ---
>  tests/qemu-iotests/iotests.py | 29 ++++++++++++++++++++++++++++-
>  1 file changed, 28 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
> index 6ba65eb1ff..47e3808ab9 100644
> --- a/tests/qemu-iotests/iotests.py
> +++ b/tests/qemu-iotests/iotests.py
> @@ -39,6 +39,7 @@
>  
>  from qemu.machine import qtest
>  from qemu.qmp import QMPMessage
> +from qemu.aqmp.legacy import QEMUMonitorProtocol

I thought we were trying to get rid of aqmp.legacy usage, so this
feels like a temporary regression.  Oh well, not the end of the
testing world.

>      def stop(self, kill_signal=15):
>          self._p.send_signal(kill_signal)
>          self._p.wait()
>          self._p = None
>  
> +        if self._qmp:
> +            self._qmp.close()
> +
>          try:
> +            if self._qmpsock is not None:
> +                os.remove(self._qmpsock)
>              os.remove(self.pidfile)
>          except OSError:
>              pass

Do we need two try: blocks here, to remove self.pidfile even if
os.remove(self._qmpsock) failed?

Otherwise, makes sense to me.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




* Re: [PATCH 3/3] iotests/graph-changes-while-io: New test
From: Eric Blake @ 2022-02-15 22:22 UTC (permalink / raw)
  To: Hanna Reitz; +Cc: Kevin Wolf, Stefan Hajnoczi, qemu-devel, qemu-block

On Tue, Feb 15, 2022 at 02:57:27PM +0100, Hanna Reitz wrote:
> Test the following scenario:
> 1. Some block node (null-co) attached to a user (here: NBD server) that
>    performs I/O and keeps the node in an I/O thread
> 2. Repeatedly run blockdev-add/blockdev-del to add/remove an overlay
>    to/from that node
> 
> Each blockdev-add triggers bdrv_refresh_limits(), and because
> blockdev-add runs in the main thread, it does not stop the I/O requests.
> I/O can thus happen while the limits are refreshed, and when such a
> request sees a temporarily invalid block limit (e.g. alignment is 0),
> this may easily crash qemu (or the storage daemon in this case).
> 
> The block layer needs to ensure that I/O requests to a node are paused
> while that node's BlockLimits are refreshed.
> 
> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
> ---
>  .../qemu-iotests/tests/graph-changes-while-io | 91 +++++++++++++++++++
>  .../tests/graph-changes-while-io.out          |  5 +
>  2 files changed, 96 insertions(+)
>  create mode 100755 tests/qemu-iotests/tests/graph-changes-while-io
>  create mode 100644 tests/qemu-iotests/tests/graph-changes-while-io.out

Reviewed-by: Eric Blake <eblake@redhat.com>

Since we found this with the help of NBD, should I be considering this
series for my NBD queue, or is there a better block-related maintainer
queue that it should go through?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




* Re: [PATCH 2/3] iotests: Allow using QMP with the QSD
From: Hanna Reitz @ 2022-02-16  9:43 UTC (permalink / raw)
  To: Eric Blake; +Cc: Kevin Wolf, Stefan Hajnoczi, qemu-devel, qemu-block

On 15.02.22 23:19, Eric Blake wrote:
> On Tue, Feb 15, 2022 at 02:57:26PM +0100, Hanna Reitz wrote:
>> Add a parameter to optionally open a QMP connection when creating a
>> QemuStorageDaemon instance.
>>
>> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
>> ---
>>   tests/qemu-iotests/iotests.py | 29 ++++++++++++++++++++++++++++-
>>   1 file changed, 28 insertions(+), 1 deletion(-)
>>
>> diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
>> index 6ba65eb1ff..47e3808ab9 100644
>> --- a/tests/qemu-iotests/iotests.py
>> +++ b/tests/qemu-iotests/iotests.py
>> @@ -39,6 +39,7 @@
>>   
>>   from qemu.machine import qtest
>>   from qemu.qmp import QMPMessage
>> +from qemu.aqmp.legacy import QEMUMonitorProtocol
> I thought we were trying to get rid of aqmp.legacy usage, so this
> feels like a temporary regression.  Oh well, not the end of the
> testing world.

I fiddled around with the non-legacy interface and wasn’t very
successful...  Since machine.py still uses qemu.aqmp.legacy for
QEMUMachine, I thought that once it is reworked to get rid of the legacy
interface (if that ever becomes necessary), we can just do the same
here, too.

>
>>       def stop(self, kill_signal=15):
>>           self._p.send_signal(kill_signal)
>>           self._p.wait()
>>           self._p = None
>>   
>> +        if self._qmp:
>> +            self._qmp.close()
>> +
>>           try:
>> +            if self._qmpsock is not None:
>> +                os.remove(self._qmpsock)
>>               os.remove(self.pidfile)
>>           except OSError:
>>               pass
> Do we need two try: blocks here, to remove self.pidfile even if
> os.remove(self._qmpsock) failed?

Honestly, there’s no reason not to use two blocks except that it’s
longer.  You’re right, I should’ve just done that.
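
Something like this, I suppose (a sketch of the two try blocks in
stop(), as v2 will have them):

        try:
            if self._qmpsock is not None:
                os.remove(self._qmpsock)
        except OSError:
            pass

        try:
            os.remove(self.pidfile)
        except OSError:
            pass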

> Otherwise, makes sense to me.

Thanks for reviewing!

Hanna




* Re: [PATCH 3/3] iotests/graph-changes-while-io: New test
From: Hanna Reitz @ 2022-02-16  9:53 UTC (permalink / raw)
  To: Eric Blake; +Cc: Kevin Wolf, Stefan Hajnoczi, qemu-devel, qemu-block

On 15.02.22 23:22, Eric Blake wrote:
> On Tue, Feb 15, 2022 at 02:57:27PM +0100, Hanna Reitz wrote:
>> Test the following scenario:
>> 1. Some block node (null-co) attached to a user (here: NBD server) that
>>     performs I/O and keeps the node in an I/O thread
>> 2. Repeatedly run blockdev-add/blockdev-del to add/remove an overlay
>>     to/from that node
>>
>> Each blockdev-add triggers bdrv_refresh_limits(), and because
>> blockdev-add runs in the main thread, it does not stop the I/O requests.
>> I/O can thus happen while the limits are refreshed, and when such a
>> request sees a temporarily invalid block limit (e.g. alignment is 0),
>> this may easily crash qemu (or the storage daemon in this case).
>>
>> The block layer needs to ensure that I/O requests to a node are paused
>> while that node's BlockLimits are refreshed.
>>
>> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
>> ---
>>   .../qemu-iotests/tests/graph-changes-while-io | 91 +++++++++++++++++++
>>   .../tests/graph-changes-while-io.out          |  5 +
>>   2 files changed, 96 insertions(+)
>>   create mode 100755 tests/qemu-iotests/tests/graph-changes-while-io
>>   create mode 100644 tests/qemu-iotests/tests/graph-changes-while-io.out
> Reviewed-by: Eric Blake <eblake@redhat.com>
>
> Since we found this with the help of NBD, should I be considering this
> series for my NBD queue, or is there a better block-related maintainer
> queue that it should go through?

Well, we found it by using a guest; it’s just that using a guest in the
iotests is not quite that great, so we need some other way to induce I/O
(concurrently with monitor commands).  I could’ve used FUSE, too, but
NBD is always compiled in, so. :)

In any case, of course I don’t mind who takes this series.  If you want 
to take it, go ahead (and thanks!) – I’ll be sending a v2 to split the 
`try` block in patch 2, though.

Hanna



