All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] Fix crash due to NBD export leak
@ 2020-06-25 14:25 Vladimir Sementsov-Ogievskiy
  2020-06-25 14:25 ` [PATCH 1/4] iotests: QemuIoInteractive: use qemu_io_args_no_fmt Vladimir Sementsov-Ogievskiy
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-06-25 14:25 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, vsementsov, qemu-devel, mreitz, den

Hi all!

We've faced crash bug, which is reproducing on master branch as well.
The case is described in 03, where fix is suggested.
New iotest in 04 crashes without that fix.

Vladimir Sementsov-Ogievskiy (4):
  iotests: QemuIoInteractive: use qemu_io_args_no_fmt
  iotests.py: QemuIoInteractive: print output on failure
  nbd: make client_close synchronous
  iotests: test shutdown when bitmap is exported through NBD

 nbd/server.c                  |  3 ++
 tests/qemu-iotests/299        | 65 +++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/299.out    | 10 ++++++
 tests/qemu-iotests/group      |  1 +
 tests/qemu-iotests/iotests.py | 15 +++++---
 5 files changed, 90 insertions(+), 4 deletions(-)
 create mode 100644 tests/qemu-iotests/299
 create mode 100644 tests/qemu-iotests/299.out

-- 
2.21.0



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/4] iotests: QemuIoInteractive: use qemu_io_args_no_fmt
  2020-06-25 14:25 [PATCH 0/4] Fix crash due to NBD export leak Vladimir Sementsov-Ogievskiy
@ 2020-06-25 14:25 ` Vladimir Sementsov-Ogievskiy
  2020-06-25 19:32   ` Eric Blake
  2020-06-25 14:25 ` [PATCH 2/4] iotests.py: QemuIoInteractive: print output on failure Vladimir Sementsov-Ogievskiy
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-06-25 14:25 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, vsementsov, qemu-devel, mreitz, den

All users of QemuIoInteractive provides -f argument, so it's incorrect
to use qemu_io_args, which contains -f too. Let's use
qemu_io_args_no_fmt, which also makes possible to use --image-opts with
QemuIoInteractive in the following patch.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 tests/qemu-iotests/iotests.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 5ea4c4df8b..efe9958f5e 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -211,7 +211,7 @@ def get_virtio_scsi_device():
 
 class QemuIoInteractive:
     def __init__(self, *args):
-        self.args = qemu_io_args + list(args)
+        self.args = qemu_io_args_no_fmt + list(args)
         self._p = subprocess.Popen(self.args, stdin=subprocess.PIPE,
                                    stdout=subprocess.PIPE,
                                    stderr=subprocess.STDOUT,
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/4] iotests.py: QemuIoInteractive: print output on failure
  2020-06-25 14:25 [PATCH 0/4] Fix crash due to NBD export leak Vladimir Sementsov-Ogievskiy
  2020-06-25 14:25 ` [PATCH 1/4] iotests: QemuIoInteractive: use qemu_io_args_no_fmt Vladimir Sementsov-Ogievskiy
@ 2020-06-25 14:25 ` Vladimir Sementsov-Ogievskiy
  2020-06-25 14:46   ` Eric Blake
  2020-06-25 14:25 ` [PATCH 3/4] nbd: make client_close synchronous Vladimir Sementsov-Ogievskiy
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-06-25 14:25 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, vsementsov, qemu-devel, mreitz, den

Make it simpler to debug when qemu-io failes due to wrong arguments or
environment.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 tests/qemu-iotests/iotests.py | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index efe9958f5e..ac9d199a1e 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -216,7 +216,13 @@ class QemuIoInteractive:
                                    stdout=subprocess.PIPE,
                                    stderr=subprocess.STDOUT,
                                    universal_newlines=True)
-        assert self._p.stdout.read(9) == 'qemu-io> '
+        out = self._p.stdout.read(9)
+        if out != 'qemu-io> ':
+            # Most probably qemu-io just failed to start.
+            # Let's collect the whole output and exit.
+            out += self._p.stdout.read()
+            self._p.wait(timeout=1)
+            raise ValueError(out)
 
     def close(self):
         self._p.communicate('q\n')
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/4] nbd: make client_close synchronous
  2020-06-25 14:25 [PATCH 0/4] Fix crash due to NBD export leak Vladimir Sementsov-Ogievskiy
  2020-06-25 14:25 ` [PATCH 1/4] iotests: QemuIoInteractive: use qemu_io_args_no_fmt Vladimir Sementsov-Ogievskiy
  2020-06-25 14:25 ` [PATCH 2/4] iotests.py: QemuIoInteractive: print output on failure Vladimir Sementsov-Ogievskiy
@ 2020-06-25 14:25 ` Vladimir Sementsov-Ogievskiy
  2020-06-25 14:48   ` Eric Blake
  2020-06-29  7:55   ` Vladimir Sementsov-Ogievskiy
  2020-06-25 14:25 ` [PATCH 4/4] iotests: test shutdown when bitmap is exported through NBD Vladimir Sementsov-Ogievskiy
  2020-06-25 15:15 ` [PATCH 0/4] Fix crash due to NBD export leak no-reply
  4 siblings, 2 replies; 12+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-06-25 14:25 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, vsementsov, qemu-devel, mreitz, den

client_close doesn't guarantee that client is closed: nbd_trip() keeps
reference to it. Let's wait for nbd_trip to finish.

Without this fix, the following crash is possible:

- export bitmap through unternal Qemu NBD server
- connect a client
- shutdown Qemu

On shutdown nbd_export_close_all is called, but it actually don't wait
for nbd_trip() to finish and to release its references. So, export is
not release, and exported bitmap remains busy, and on try to remove the
bitmap (which is part of bdrv_close()) the assertion fairs:

bdrv_release_dirty_bitmap_locked: Assertion `!bdrv_dirty_bitmap_busy(bitmap)' failed

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 nbd/server.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/nbd/server.c b/nbd/server.c
index 20754e9ebc..5e27a8d31a 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1419,6 +1419,8 @@ static void client_close(NBDClient *client, bool negotiated)
     qio_channel_shutdown(client->ioc, QIO_CHANNEL_SHUTDOWN_BOTH,
                          NULL);
 
+    AIO_WAIT_WHILE(client->exp->ctx, client->recv_coroutine);
+
     /* Also tell the client, so that they release their reference.  */
     if (client->close_fn) {
         client->close_fn(client, negotiated);
@@ -2450,6 +2452,7 @@ static coroutine_fn void nbd_trip(void *opaque)
 
     trace_nbd_trip();
     if (client->closing) {
+        client->recv_coroutine = NULL;
         nbd_client_put(client);
         return;
     }
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 4/4] iotests: test shutdown when bitmap is exported through NBD
  2020-06-25 14:25 [PATCH 0/4] Fix crash due to NBD export leak Vladimir Sementsov-Ogievskiy
                   ` (2 preceding siblings ...)
  2020-06-25 14:25 ` [PATCH 3/4] nbd: make client_close synchronous Vladimir Sementsov-Ogievskiy
@ 2020-06-25 14:25 ` Vladimir Sementsov-Ogievskiy
  2020-06-25 15:15 ` [PATCH 0/4] Fix crash due to NBD export leak no-reply
  4 siblings, 0 replies; 12+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-06-25 14:25 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, vsementsov, qemu-devel, mreitz, den

Test shutdown when bitmap is exported through NBD and active client
exists. The previous patch fixes a crash, provoked by this scenario.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 tests/qemu-iotests/299        | 65 +++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/299.out    | 10 ++++++
 tests/qemu-iotests/group      |  1 +
 tests/qemu-iotests/iotests.py |  5 +--
 4 files changed, 79 insertions(+), 2 deletions(-)
 create mode 100644 tests/qemu-iotests/299
 create mode 100644 tests/qemu-iotests/299.out

diff --git a/tests/qemu-iotests/299 b/tests/qemu-iotests/299
new file mode 100644
index 0000000000..e129c7f7cb
--- /dev/null
+++ b/tests/qemu-iotests/299
@@ -0,0 +1,65 @@
+#!/usr/bin/env python3
+#
+# Test shutdown when bitmap is exported through NBD server
+#
+# Copyright (c) 2020 Virtuozzo International GmbH.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+import iotests
+
+# The test is unrelated to formats, restrict it to qcow2 to avoid extra runs
+iotests.script_initialize(
+    supported_fmts=['qcow2'],
+)
+
+nbd_sock = iotests.file_path('nbd.sock', base_dir=iotests.sock_dir)
+nbd_uri = 'nbd+unix:///disk?socket=' + nbd_sock
+size = 1024 * 1024
+
+vm = iotests.VM()
+vm.launch()
+
+vm.qmp_log('blockdev-add', **{
+    'node-name': 'disk',
+    'driver': 'null-co',
+    'size': 1024 * 1024,
+})
+
+vm.qmp_log('block-dirty-bitmap-add', **{
+    'node': 'disk',
+    'name': 'bitmap0'
+})
+
+vm.qmp_log('nbd-server-start', **{
+    'addr': {
+        'type': 'unix',
+        'data': {'path': nbd_sock}
+    }
+}, filters=[iotests.filter_qmp_testfiles])
+
+vm.qmp_log('nbd-server-add', **{
+    'device': 'disk',
+    'writable': True,
+    'bitmap': 'bitmap0'
+})
+
+p = iotests.QemuIoInteractive('-f', 'raw', nbd_uri)
+# wait for connection and check it:
+iotests.log(p.cmd('read 0 512').rstrip(), filters=[iotests.filter_qemu_io])
+
+vm.shutdown()
+
+p.close()
diff --git a/tests/qemu-iotests/299.out b/tests/qemu-iotests/299.out
new file mode 100644
index 0000000000..bba4252923
--- /dev/null
+++ b/tests/qemu-iotests/299.out
@@ -0,0 +1,10 @@
+{"execute": "blockdev-add", "arguments": {"driver": "null-co", "node-name": "disk", "size": 1048576}}
+{"return": {}}
+{"execute": "block-dirty-bitmap-add", "arguments": {"name": "bitmap0", "node": "disk"}}
+{"return": {}}
+{"execute": "nbd-server-start", "arguments": {"addr": {"data": {"path": "SOCK_DIR/PID-nbd.sock"}, "type": "unix"}}}
+{"return": {}}
+{"execute": "nbd-server-add", "arguments": {"bitmap": "bitmap0", "device": "disk", "writable": true}}
+{"return": {}}
+read 512/512 bytes at offset 0
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index d886fa0cb3..250192352c 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -302,3 +302,4 @@
 291 rw quick
 292 rw auto quick
 297 meta
+299 auto quick
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index ac9d199a1e..31d4b105ca 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -345,8 +345,9 @@ def filter_qmp(qmsg, filter_fn):
     return qmsg
 
 def filter_testfiles(msg):
-    prefix = os.path.join(test_dir, "%s-" % (os.getpid()))
-    return msg.replace(prefix, 'TEST_DIR/PID-')
+    pref1 = os.path.join(test_dir, "%s-" % (os.getpid()))
+    pref2 = os.path.join(sock_dir, "%s-" % (os.getpid()))
+    return msg.replace(pref1, 'TEST_DIR/PID-').replace(pref2, 'SOCK_DIR/PID-')
 
 def filter_qmp_testfiles(qmsg):
     def _filter(_key, value):
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/4] iotests.py: QemuIoInteractive: print output on failure
  2020-06-25 14:25 ` [PATCH 2/4] iotests.py: QemuIoInteractive: print output on failure Vladimir Sementsov-Ogievskiy
@ 2020-06-25 14:46   ` Eric Blake
  0 siblings, 0 replies; 12+ messages in thread
From: Eric Blake @ 2020-06-25 14:46 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: kwolf, den, qemu-devel, mreitz

On 6/25/20 9:25 AM, Vladimir Sementsov-Ogievskiy wrote:
> Make it simpler to debug when qemu-io failes due to wrong arguments or

fails

> environment.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   tests/qemu-iotests/iotests.py | 8 +++++++-
>   1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
> index efe9958f5e..ac9d199a1e 100644
> --- a/tests/qemu-iotests/iotests.py
> +++ b/tests/qemu-iotests/iotests.py
> @@ -216,7 +216,13 @@ class QemuIoInteractive:
>                                      stdout=subprocess.PIPE,
>                                      stderr=subprocess.STDOUT,
>                                      universal_newlines=True)
> -        assert self._p.stdout.read(9) == 'qemu-io> '
> +        out = self._p.stdout.read(9)
> +        if out != 'qemu-io> ':
> +            # Most probably qemu-io just failed to start.
> +            # Let's collect the whole output and exit.
> +            out += self._p.stdout.read()
> +            self._p.wait(timeout=1)
> +            raise ValueError(out)
>   

Looks sensible.
Reviewed-by: Eric Blake <eblake@redhat.com>

>       def close(self):
>           self._p.communicate('q\n')
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/4] nbd: make client_close synchronous
  2020-06-25 14:25 ` [PATCH 3/4] nbd: make client_close synchronous Vladimir Sementsov-Ogievskiy
@ 2020-06-25 14:48   ` Eric Blake
  2020-06-29  7:55   ` Vladimir Sementsov-Ogievskiy
  1 sibling, 0 replies; 12+ messages in thread
From: Eric Blake @ 2020-06-25 14:48 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: kwolf, den, qemu-devel, mreitz

On 6/25/20 9:25 AM, Vladimir Sementsov-Ogievskiy wrote:
> client_close doesn't guarantee that client is closed: nbd_trip() keeps
> reference to it. Let's wait for nbd_trip to finish.
> 
> Without this fix, the following crash is possible:
> 
> - export bitmap through unternal Qemu NBD server

internal

> - connect a client
> - shutdown Qemu
> 
> On shutdown nbd_export_close_all is called, but it actually don't wait
> for nbd_trip() to finish and to release its references. So, export is
> not release, and exported bitmap remains busy, and on try to remove the
> bitmap (which is part of bdrv_close()) the assertion fairs:

fails

> 
> bdrv_release_dirty_bitmap_locked: Assertion `!bdrv_dirty_bitmap_busy(bitmap)' failed
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   nbd/server.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/nbd/server.c b/nbd/server.c
> index 20754e9ebc..5e27a8d31a 100644
> --- a/nbd/server.c
> +++ b/nbd/server.c
> @@ -1419,6 +1419,8 @@ static void client_close(NBDClient *client, bool negotiated)
>       qio_channel_shutdown(client->ioc, QIO_CHANNEL_SHUTDOWN_BOTH,
>                            NULL);
>   
> +    AIO_WAIT_WHILE(client->exp->ctx, client->recv_coroutine);
> +
>       /* Also tell the client, so that they release their reference.  */
>       if (client->close_fn) {
>           client->close_fn(client, negotiated);
> @@ -2450,6 +2452,7 @@ static coroutine_fn void nbd_trip(void *opaque)
>   
>       trace_nbd_trip();
>       if (client->closing) {
> +        client->recv_coroutine = NULL;
>           nbd_client_put(client);
>           return;
>       }
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/4] Fix crash due to NBD export leak
  2020-06-25 14:25 [PATCH 0/4] Fix crash due to NBD export leak Vladimir Sementsov-Ogievskiy
                   ` (3 preceding siblings ...)
  2020-06-25 14:25 ` [PATCH 4/4] iotests: test shutdown when bitmap is exported through NBD Vladimir Sementsov-Ogievskiy
@ 2020-06-25 15:15 ` no-reply
  4 siblings, 0 replies; 12+ messages in thread
From: no-reply @ 2020-06-25 15:15 UTC (permalink / raw)
  To: vsementsov; +Cc: kwolf, vsementsov, qemu-block, qemu-devel, mreitz, den

Patchew URL: https://patchew.org/QEMU/20200625142540.24589-1-vsementsov@virtuozzo.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

Not run: 259
Failures: 140 143
Failed 2 of 120 iotests
make: *** [check-tests/check-block.sh] Error 1
make: *** Waiting for unfinished jobs....
  TEST    check-qtest-aarch64: tests/qtest/qos-test
Traceback (most recent call last):
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=0c91d8fcbb86459aa0e4fb7fbaad61d0', '-u', '1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-xdxld493/src/docker-src.2020-06-25-11.01.00.15808:/var/tmp/qemu:z,ro', 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=0c91d8fcbb86459aa0e4fb7fbaad61d0
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-xdxld493/src'
make: *** [docker-run-test-quick@centos7] Error 2

real    14m24.488s
user    0m9.266s


The full log is available at
http://patchew.org/logs/20200625142540.24589-1-vsementsov@virtuozzo.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/4] iotests: QemuIoInteractive: use qemu_io_args_no_fmt
  2020-06-25 14:25 ` [PATCH 1/4] iotests: QemuIoInteractive: use qemu_io_args_no_fmt Vladimir Sementsov-Ogievskiy
@ 2020-06-25 19:32   ` Eric Blake
  0 siblings, 0 replies; 12+ messages in thread
From: Eric Blake @ 2020-06-25 19:32 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: kwolf, den, qemu-devel, mreitz

On 6/25/20 9:25 AM, Vladimir Sementsov-Ogievskiy wrote:
> All users of QemuIoInteractive provides -f argument, so it's incorrect

So far, all users is just test 205; although your series adds test 209 
that does likewise.  "incorrect" is a bit harsh since you can specify -f 
more than once (last one wins); maybe "inefficient" is better wording.

> to use qemu_io_args, which contains -f too. Let's use
> qemu_io_args_no_fmt, which also makes possible to use --image-opts with
> QemuIoInteractive in the following patch.

Reviewed-by: Eric Blake <eblake@redhat.com>

> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   tests/qemu-iotests/iotests.py | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
> index 5ea4c4df8b..efe9958f5e 100644
> --- a/tests/qemu-iotests/iotests.py
> +++ b/tests/qemu-iotests/iotests.py
> @@ -211,7 +211,7 @@ def get_virtio_scsi_device():
>   
>   class QemuIoInteractive:
>       def __init__(self, *args):
> -        self.args = qemu_io_args + list(args)
> +        self.args = qemu_io_args_no_fmt + list(args)
>           self._p = subprocess.Popen(self.args, stdin=subprocess.PIPE,
>                                      stdout=subprocess.PIPE,
>                                      stderr=subprocess.STDOUT,
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/4] nbd: make client_close synchronous
  2020-06-25 14:25 ` [PATCH 3/4] nbd: make client_close synchronous Vladimir Sementsov-Ogievskiy
  2020-06-25 14:48   ` Eric Blake
@ 2020-06-29  7:55   ` Vladimir Sementsov-Ogievskiy
  2020-06-29 13:56     ` Stefan Hajnoczi
  1 sibling, 1 reply; 12+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-06-29  7:55 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, qemu-devel, mreitz, Stefan Hajnoczi, den

25.06.2020 17:25, Vladimir Sementsov-Ogievskiy wrote:
> client_close doesn't guarantee that client is closed: nbd_trip() keeps
> reference to it. Let's wait for nbd_trip to finish.
> 
> Without this fix, the following crash is possible:
> 
> - export bitmap through unternal Qemu NBD server
> - connect a client
> - shutdown Qemu
> 
> On shutdown nbd_export_close_all is called, but it actually don't wait
> for nbd_trip() to finish and to release its references. So, export is
> not release, and exported bitmap remains busy, and on try to remove the
> bitmap (which is part of bdrv_close()) the assertion fairs:
> 
> bdrv_release_dirty_bitmap_locked: Assertion `!bdrv_dirty_bitmap_busy(bitmap)' failed
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   nbd/server.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/nbd/server.c b/nbd/server.c
> index 20754e9ebc..5e27a8d31a 100644
> --- a/nbd/server.c
> +++ b/nbd/server.c
> @@ -1419,6 +1419,8 @@ static void client_close(NBDClient *client, bool negotiated)
>       qio_channel_shutdown(client->ioc, QIO_CHANNEL_SHUTDOWN_BOTH,
>                            NULL);
>   
> +    AIO_WAIT_WHILE(client->exp->ctx, client->recv_coroutine);
> +
>       /* Also tell the client, so that they release their reference.  */
>       if (client->close_fn) {
>           client->close_fn(client, negotiated);
> @@ -2450,6 +2452,7 @@ static coroutine_fn void nbd_trip(void *opaque)
>   
>       trace_nbd_trip();
>       if (client->closing) {
> +        client->recv_coroutine = NULL;
>           nbd_client_put(client);
>           return;
>       }
> 

I have a doubt, isn't it possible to hang in AIO_WAIT_WHILE() after this patch?

Do we need aio_wait_kick() in the nbd_trip()? Or may be, something like
aio_wait_kick(), but to wake exactly client->exp->ctx aio context?

Also, why in block/io.c we kick the main context, but not bs->aio_context?


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/4] nbd: make client_close synchronous
  2020-06-29  7:55   ` Vladimir Sementsov-Ogievskiy
@ 2020-06-29 13:56     ` Stefan Hajnoczi
  2020-07-01 10:19       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 12+ messages in thread
From: Stefan Hajnoczi @ 2020-06-29 13:56 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy; +Cc: kwolf, qemu-block, qemu-devel, mreitz, den

[-- Attachment #1: Type: text/plain, Size: 1261 bytes --]

On Mon, Jun 29, 2020 at 10:55:06AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Also, why in block/io.c we kick the main context, but not bs->aio_context?

From AIO_WAIT_WHILE():

 * The caller's thread must be the IOThread that owns @ctx or the main loop
 * thread (with @ctx acquired exactly once).  This function cannot be used to
 * wait on conditions between two IOThreads since that could lead to deadlock,
 * go via the main loop instead.

Case 1: IOThread

  while ((cond)) {                                           \
      aio_poll(ctx_, true);                                  \
      waited_ = true;                                        \
  }                                                          \

In this case aio_poll() returns after every event loop iteration and we
will re-evaluate cond. Therefore we don't need to be kicked.

Case 2: Main loop thread

In this case we need the kick since we're waiting on the main loop
AioContext, not the IOThread AioContext that is doing the actual work.
aio_wait_kick() schedules a dummy BH to wake up the main loop thread.

There is no harm in scheduling the dummy BH in the main loop thread when
AIO_WAIT_WHILE() is called from an IOThread.

Hope this helps,
Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/4] nbd: make client_close synchronous
  2020-06-29 13:56     ` Stefan Hajnoczi
@ 2020-07-01 10:19       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 12+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-01 10:19 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: kwolf, qemu-block, qemu-devel, mreitz, den

29.06.2020 16:56, Stefan Hajnoczi wrote:
> On Mon, Jun 29, 2020 at 10:55:06AM +0300, Vladimir Sementsov-Ogievskiy wrote:
>> Also, why in block/io.c we kick the main context, but not bs->aio_context?
> 
>  From AIO_WAIT_WHILE():
> 
>   * The caller's thread must be the IOThread that owns @ctx or the main loop
>   * thread (with @ctx acquired exactly once).  This function cannot be used to
>   * wait on conditions between two IOThreads since that could lead to deadlock,
>   * go via the main loop instead.
> 
> Case 1: IOThread
> 
>    while ((cond)) {                                           \
>        aio_poll(ctx_, true);                                  \
>        waited_ = true;                                        \
>    }                                                          \
> 
> In this case aio_poll() returns after every event loop iteration and we
> will re-evaluate cond. Therefore we don't need to be kicked.
> 
> Case 2: Main loop thread
> 
> In this case we need the kick since we're waiting on the main loop
> AioContext, not the IOThread AioContext that is doing the actual work.
> aio_wait_kick() schedules a dummy BH to wake up the main loop thread.
> 
> There is no harm in scheduling the dummy BH in the main loop thread when
> AIO_WAIT_WHILE() is called from an IOThread.
> 
> Hope this helps,

Thanks!

Looking at this all again, I think that client->recv_coroutine == NULL is a bad marker of finish, as it's not directly bound to _put. I'll try another approach, to make nbd_export_close_all() be synchronous instead by waiting for all export to be actually freed.


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-07-01 10:20 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-25 14:25 [PATCH 0/4] Fix crash due to NBD export leak Vladimir Sementsov-Ogievskiy
2020-06-25 14:25 ` [PATCH 1/4] iotests: QemuIoInteractive: use qemu_io_args_no_fmt Vladimir Sementsov-Ogievskiy
2020-06-25 19:32   ` Eric Blake
2020-06-25 14:25 ` [PATCH 2/4] iotests.py: QemuIoInteractive: print output on failure Vladimir Sementsov-Ogievskiy
2020-06-25 14:46   ` Eric Blake
2020-06-25 14:25 ` [PATCH 3/4] nbd: make client_close synchronous Vladimir Sementsov-Ogievskiy
2020-06-25 14:48   ` Eric Blake
2020-06-29  7:55   ` Vladimir Sementsov-Ogievskiy
2020-06-29 13:56     ` Stefan Hajnoczi
2020-07-01 10:19       ` Vladimir Sementsov-Ogievskiy
2020-06-25 14:25 ` [PATCH 4/4] iotests: test shutdown when bitmap is exported through NBD Vladimir Sementsov-Ogievskiy
2020-06-25 15:15 ` [PATCH 0/4] Fix crash due to NBD export leak no-reply

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.