* [Qemu-devel] [PATCH 0/3] nbd-client: enter read_reply_co during init to avoid crash
@ 2017-08-24 15:33 Stefan Hajnoczi
  2017-08-24 15:33 ` [Qemu-devel] [PATCH 1/3] " Stefan Hajnoczi
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Stefan Hajnoczi @ 2017-08-24 15:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Eric Blake, qemu-block, Paolo Bonzini, Stefan Hajnoczi

See Patch 1 for the segfault fix.  Patches 2 & 3 add qemu-iotests coverage.

This is a rare crash that we'll probably only see in testing.  It only seems to
happen with UNIX domain sockets.

Stefan Hajnoczi (3):
  nbd-client: enter read_reply_co during init to avoid crash
  qemu-iotests: improve nbd-fault-injector.py startup protocol
  qemu-iotests: test NBD over UNIX domain sockets in 083

 block/nbd-client.c                       |   2 +-
 tests/qemu-iotests/083                   | 138 ++++++++++++++++++-----------
 tests/qemu-iotests/083.out               | 145 +++++++++++++++++++++++++++----
 tests/qemu-iotests/common.filter         |   4 +-
 tests/qemu-iotests/nbd-fault-injector.py |   4 +
 5 files changed, 220 insertions(+), 73 deletions(-)

-- 
2.13.5

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Qemu-devel] [PATCH 1/3] nbd-client: enter read_reply_co during init to avoid crash
  2017-08-24 15:33 [Qemu-devel] [PATCH 0/3] nbd-client: enter read_reply_co during init to avoid crash Stefan Hajnoczi
@ 2017-08-24 15:33 ` Stefan Hajnoczi
  2017-08-24 16:21   ` Paolo Bonzini
  2017-08-24 15:33 ` [Qemu-devel] [PATCH 2/3] qemu-iotests: improve nbd-fault-injector.py startup protocol Stefan Hajnoczi
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 15+ messages in thread
From: Stefan Hajnoczi @ 2017-08-24 15:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Eric Blake, qemu-block, Paolo Bonzini, Stefan Hajnoczi

The following segfault is encountered if the NBD server closes the UNIX
domain socket immediately after negotiation:

  Program terminated with signal SIGSEGV, Segmentation fault.
  #0  aio_co_schedule (ctx=0x0, co=0xd3c0ff2ef0) at util/async.c:441
  441	    QSLIST_INSERT_HEAD_ATOMIC(&ctx->scheduled_coroutines,
  (gdb) bt
  #0  0x000000d3c01a50f8 in aio_co_schedule (ctx=0x0, co=0xd3c0ff2ef0) at util/async.c:441
  #1  0x000000d3c012fa90 in nbd_coroutine_end (bs=bs@entry=0xd3c0fec650, request=<optimized out>) at block/nbd-client.c:207
  #2  0x000000d3c012fb58 in nbd_client_co_preadv (bs=0xd3c0fec650, offset=0, bytes=<optimized out>, qiov=0x7ffc10a91b20, flags=0) at block/nbd-client.c:237
  #3  0x000000d3c0128e63 in bdrv_driver_preadv (bs=bs@entry=0xd3c0fec650, offset=offset@entry=0, bytes=bytes@entry=512, qiov=qiov@entry=0x7ffc10a91b20, flags=0) at block/io.c:836
  #4  0x000000d3c012c3e0 in bdrv_aligned_preadv (child=child@entry=0xd3c0ff51d0, req=req@entry=0x7f31885d6e90, offset=offset@entry=0, bytes=bytes@entry=512, align=align@entry=1, qiov=qiov@entry=0x7ffc10a91b20, flags=0) at block/io.c:1086
  #5  0x000000d3c012c6b8 in bdrv_co_preadv (child=0xd3c0ff51d0, offset=offset@entry=0, bytes=bytes@entry=512, qiov=qiov@entry=0x7ffc10a91b20, flags=flags@entry=0) at block/io.c:1182
  #6  0x000000d3c011cc17 in blk_co_preadv (blk=0xd3c0ff4f80, offset=0, bytes=512, qiov=0x7ffc10a91b20, flags=0) at block/block-backend.c:1032
  #7  0x000000d3c011ccec in blk_read_entry (opaque=0x7ffc10a91b40) at block/block-backend.c:1079
  #8  0x000000d3c01bbb96 in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:79
  #9  0x00007f3196cb8600 in __start_context () at /lib64/libc.so.6

The problem is that nbd_client_init() uses
nbd_client_attach_aio_context() -> aio_co_schedule(new_context,
client->read_reply_co).  Execution of read_reply_co is deferred to a BH
which doesn't run until later.

In the meantime blk_co_preadv() can be called and nbd_coroutine_end()
calls aio_co_wake() on read_reply_co.  At this point in time
read_reply_co's ctx isn't set because the coroutine has never been
entered yet.

This patch enters read_reply_co directly in
nbd_client_attach_aio_context().  This is safe because new_context is
acquired by the caller.  This ensures that read_reply_co reaches its
first yield point and its ctx is set up.

Note this only happens with UNIX domain sockets on Linux.  It doesn't
seem possible to reproduce this with TCP sockets.
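The ordering hazard can be sketched as a loose Python analogy (an
illustration only, not QEMU code; enter()/wake() merely mirror
qemu_aio_coroutine_enter()/aio_co_wake()): the coroutine's context is
recorded at its first entry, so waking it before it has ever been
entered hits an unset context.

```python
# Analogy: 'ctx' is only set once the coroutine is first entered,
# mirroring how read_reply_co's ctx becomes valid at its first yield.
class Coroutine:
    def __init__(self, fn):
        self.ctx = None
        self.gen = fn(self)

    def enter(self, ctx):
        # Like qemu_aio_coroutine_enter(): record ctx, run to first yield.
        self.ctx = ctx
        next(self.gen)

    def wake(self):
        # Like aio_co_wake(): needs a valid ctx; if the coroutine was
        # merely scheduled and never entered, ctx is still unset.
        assert self.ctx is not None, "ctx is NULL -> segfault"

def read_reply(co):
    yield  # first yield point; ctx is valid from here on

co = Coroutine(read_reply)
co.enter("new_context")  # the fix: enter directly during attach
co.wake()                # now safe
```

With aio_co_schedule() the first entry is deferred to a BH, leaving a
window in which wake() can run first; entering directly closes that
window.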

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/nbd-client.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/nbd-client.c b/block/nbd-client.c
index 25bcaa2346..0a7f32779e 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -371,7 +371,7 @@ void nbd_client_attach_aio_context(BlockDriverState *bs,
 {
     NBDClientSession *client = nbd_get_client_session(bs);
     qio_channel_attach_aio_context(QIO_CHANNEL(client->ioc), new_context);
-    aio_co_schedule(new_context, client->read_reply_co);
+    qemu_aio_coroutine_enter(new_context, client->read_reply_co);
 }
 
 void nbd_client_close(BlockDriverState *bs)
-- 
2.13.5


* [Qemu-devel] [PATCH 2/3] qemu-iotests: improve nbd-fault-injector.py startup protocol
  2017-08-24 15:33 [Qemu-devel] [PATCH 0/3] nbd-client: enter read_reply_co during init to avoid crash Stefan Hajnoczi
  2017-08-24 15:33 ` [Qemu-devel] [PATCH 1/3] " Stefan Hajnoczi
@ 2017-08-24 15:33 ` Stefan Hajnoczi
  2017-08-24 17:39   ` Eric Blake
  2017-08-24 15:33 ` [Qemu-devel] [PATCH 3/3] qemu-iotests: test NBD over UNIX domain sockets in 083 Stefan Hajnoczi
  2017-08-24 15:52 ` [Qemu-devel] [PATCH 0/3] nbd-client: enter read_reply_co during init to avoid crash Eric Blake
  3 siblings, 1 reply; 15+ messages in thread
From: Stefan Hajnoczi @ 2017-08-24 15:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Eric Blake, qemu-block, Paolo Bonzini, Stefan Hajnoczi

Currently 083 waits for the nbd-fault-injector.py server to start up by
looping until netstat shows the TCP listen socket.

The startup protocol can be simplified by passing a 0 port number to
nbd-fault-injector.py.  The kernel will allocate a port in bind(2) and
the final port number can be printed by nbd-fault-injector.py.

This makes waiting for server startup slightly nicer and less
TCP-specific.  This patch changes nbd-fault-injector.py; the next one
rewrites server startup in 083.
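The port-0 trick is plain BSD-socket behaviour and can be demonstrated
in isolation (a standalone sketch, not part of the patch):

```python
import socket

# Bind to port 0: the kernel allocates a free ephemeral port in bind(2).
sock = socket.socket()
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(('127.0.0.1', 0))
host, port = sock.getsockname()  # the final port number is now known
sock.listen(0)
print('Listening on %s:%d' % (host, port))
sock.close()
```

A waiting process only has to parse the printed address instead of
polling netstat for a predicted port.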

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 tests/qemu-iotests/nbd-fault-injector.py | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tests/qemu-iotests/nbd-fault-injector.py b/tests/qemu-iotests/nbd-fault-injector.py
index 6c07191a5a..1c10dcb51c 100755
--- a/tests/qemu-iotests/nbd-fault-injector.py
+++ b/tests/qemu-iotests/nbd-fault-injector.py
@@ -235,11 +235,15 @@ def open_socket(path):
         sock = socket.socket()
         sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
         sock.bind((host, int(port)))
+
+        # If the given port was 0, the final port number is now available
+        path = '%s:%d' % sock.getsockname()
     else:
         sock = socket.socket(socket.AF_UNIX)
         sock.bind(path)
     sock.listen(0)
     print 'Listening on %s' % path
+    sys.stdout.flush() # another process may be waiting, show message now
     return sock
 
 def usage(args):
-- 
2.13.5


* [Qemu-devel] [PATCH 3/3] qemu-iotests: test NBD over UNIX domain sockets in 083
  2017-08-24 15:33 [Qemu-devel] [PATCH 0/3] nbd-client: enter read_reply_co during init to avoid crash Stefan Hajnoczi
  2017-08-24 15:33 ` [Qemu-devel] [PATCH 1/3] " Stefan Hajnoczi
  2017-08-24 15:33 ` [Qemu-devel] [PATCH 2/3] qemu-iotests: improve nbd-fault-injector.py startup protocol Stefan Hajnoczi
@ 2017-08-24 15:33 ` Stefan Hajnoczi
  2017-08-24 17:45   ` Eric Blake
  2017-08-24 15:52 ` [Qemu-devel] [PATCH 0/3] nbd-client: enter read_reply_co during init to avoid crash Eric Blake
  3 siblings, 1 reply; 15+ messages in thread
From: Stefan Hajnoczi @ 2017-08-24 15:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Eric Blake, qemu-block, Paolo Bonzini, Stefan Hajnoczi

083 only tests TCP.  Some failures might be specific to UNIX domain
sockets.

A few adjustments are necessary:

1. Generating a port number and waiting for server startup is
   TCP-specific.  Use the new nbd-fault-injector.py startup protocol to
   fetch the address.  This is a little more elegant because we don't
   need netstat anymore.

2. The NBD filter does not work for the UNIX domain sockets URIs we
   generate and must be extended.

3. Run all tests twice: once for TCP and once for UNIX domain sockets.
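For reference, the two URL shapes the updated test generates can be
sketched as follows (an illustration of the construction used in the
diff; the export name is empty for classic negotiation):

```python
def nbd_url(proto, addr, export_name=''):
    """Build the NBD URL passed to qemu-io (sketch).

    TCP:  nbd+tcp://HOST:PORT/EXPORT
    UNIX: nbd+unix:///EXPORT?socket=PATH
    """
    if proto == 'tcp':
        return 'nbd+tcp://%s/%s' % (addr, export_name)
    return 'nbd+unix:///%s?socket=%s' % (export_name, addr)

print(nbd_url('tcp', '127.0.0.1:10809', 'foo'))
# -> nbd+tcp://127.0.0.1:10809/foo
```

The UNIX variant carries the socket path in a query parameter, which is
why the _filter_nbd changes below must also scrub $TEST_DIR paths, not
just TCP port numbers.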

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 tests/qemu-iotests/083           | 138 +++++++++++++++++++++++--------------
 tests/qemu-iotests/083.out       | 145 ++++++++++++++++++++++++++++++++++-----
 tests/qemu-iotests/common.filter |   4 +-
 3 files changed, 215 insertions(+), 72 deletions(-)

diff --git a/tests/qemu-iotests/083 b/tests/qemu-iotests/083
index bff9360048..0306f112da 100755
--- a/tests/qemu-iotests/083
+++ b/tests/qemu-iotests/083
@@ -27,6 +27,14 @@ echo "QA output created by $seq"
 here=`pwd`
 status=1	# failure is the default!
 
+_cleanup()
+{
+	rm -f nbd.sock
+	rm -f nbd-fault-injector.out
+	rm -f nbd-fault-injector.conf
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
 # get standard environment, filters and checks
 . ./common.rc
 . ./common.filter
@@ -35,81 +43,105 @@ _supported_fmt generic
 _supported_proto nbd
 _supported_os Linux
 
-# Pick a TCP port based on our pid.  This way multiple instances of this test
-# can run in parallel without conflicting.
-choose_tcp_port() {
-	echo $((($$ % 31744) + 1024)) # 1024 <= port < 32768
-}
-
-wait_for_tcp_port() {
-	while ! (netstat --tcp --listening --numeric | \
-		 grep "$1.*0\\.0\\.0\\.0:\\*.*LISTEN") >/dev/null 2>&1; do
-		sleep 0.1
-	done
-}
-
 check_disconnect() {
+	local event export_name=foo extra_args nbd_addr nbd_url proto when
+
+	while true; do
+		case $1 in
+		--classic-negotiation)
+			shift
+			extra_args=--classic-negotiation
+			export_name=
+			;;
+		--tcp)
+			shift
+			proto=tcp
+			;;
+		--unix)
+			shift
+			proto=unix
+			;;
+		*)
+			break
+			;;
+		esac
+	done
+
 	event=$1
 	when=$2
-	negotiation=$3
 	echo "=== Check disconnect $when $event ==="
 	echo
 
-	port=$(choose_tcp_port)
-
 	cat > "$TEST_DIR/nbd-fault-injector.conf" <<EOF
 [inject-error]
 event=$event
 when=$when
 EOF
 
-	if [ "$negotiation" = "--classic-negotiation" ]; then
-		extra_args=--classic-negotiation
-		nbd_url="nbd:127.0.0.1:$port"
+	if [ "$proto" = "tcp" ]; then
+		nbd_addr="127.0.0.1:0"
 	else
-		nbd_url="nbd:127.0.0.1:$port:exportname=foo"
+		nbd_addr="$TEST_DIR/nbd.sock"
+	fi
+
+	rm -f "$TEST_DIR/nbd.sock"
+
+	$PYTHON nbd-fault-injector.py $extra_args "$nbd_addr" "$TEST_DIR/nbd-fault-injector.conf" >"$TEST_DIR/nbd-fault-injector.out" 2>&1 &
+
+	# Wait for server to be ready
+	while ! grep -q 'Listening on ' "$TEST_DIR/nbd-fault-injector.out"; do
+		sleep 0.1
+	done
+
+	# Extract the final address (port number has now been assigned in tcp case)
+	nbd_addr=$(sed 's/Listening on \(.*\)$/\1/' "$TEST_DIR/nbd-fault-injector.out")
+
+	if [ "$proto" = "tcp" ]; then
+		nbd_url="nbd+tcp://$nbd_addr/$export_name"
+	else
+		nbd_url="nbd+unix:///$export_name?socket=$nbd_addr"
 	fi
 
-	$PYTHON nbd-fault-injector.py $extra_args "127.0.0.1:$port" "$TEST_DIR/nbd-fault-injector.conf" >/dev/null 2>&1 &
-	wait_for_tcp_port "127\\.0\\.0\\.1:$port"
 	$QEMU_IO -c "read 0 512" "$nbd_url" 2>&1 | _filter_qemu_io | _filter_nbd
 
 	echo
 }
 
-for event in neg1 "export" neg2 request reply data; do
-	for when in before after; do
-		check_disconnect "$event" "$when"
-	done
-
-	# Also inject short replies from the NBD server
-	case "$event" in
-	neg1)
-		for when in 8 16; do
-			check_disconnect "$event" "$when"
-		done
-		;;
-	"export")
-		for when in 4 12 16; do
-			check_disconnect "$event" "$when"
-		done
-		;;
-	neg2)
-		for when in 8 10; do
-			check_disconnect "$event" "$when"
+for proto in tcp unix; do
+	for event in neg1 "export" neg2 request reply data; do
+		for when in before after; do
+			check_disconnect "--$proto" "$event" "$when"
 		done
-		;;
-	reply)
-		for when in 4 8; do
-			check_disconnect "$event" "$when"
-		done
-		;;
-	esac
-done
 
-# Also check classic negotiation without export information
-for when in before 8 16 24 28 after; do
-	check_disconnect "neg-classic" "$when" --classic-negotiation
+		# Also inject short replies from the NBD server
+		case "$event" in
+		neg1)
+			for when in 8 16; do
+				check_disconnect "--$proto" "$event" "$when"
+			done
+			;;
+		"export")
+			for when in 4 12 16; do
+				check_disconnect "--$proto" "$event" "$when"
+			done
+			;;
+		neg2)
+			for when in 8 10; do
+				check_disconnect "--$proto" "$event" "$when"
+			done
+			;;
+		reply)
+			for when in 4 8; do
+				check_disconnect "--$proto" "$event" "$when"
+			done
+			;;
+		esac
+	done
+
+	# Also check classic negotiation without export information
+	for when in before 8 16 24 28 after; do
+		check_disconnect "--$proto" --classic-negotiation "neg-classic" "$when"
+	done
 done
 
 # success, all done
diff --git a/tests/qemu-iotests/083.out b/tests/qemu-iotests/083.out
index a24c6bfece..a7fb081889 100644
--- a/tests/qemu-iotests/083.out
+++ b/tests/qemu-iotests/083.out
@@ -1,43 +1,43 @@
 QA output created by 083
 === Check disconnect before neg1 ===
 
-can't open device nbd:127.0.0.1:PORT:exportname=foo
+can't open device nbd+tcp://127.0.0.1:PORT/foo
 
 === Check disconnect after neg1 ===
 
-can't open device nbd:127.0.0.1:PORT:exportname=foo
+can't open device nbd+tcp://127.0.0.1:PORT/foo
 
 === Check disconnect 8 neg1 ===
 
-can't open device nbd:127.0.0.1:PORT:exportname=foo
+can't open device nbd+tcp://127.0.0.1:PORT/foo
 
 === Check disconnect 16 neg1 ===
 
-can't open device nbd:127.0.0.1:PORT:exportname=foo
+can't open device nbd+tcp://127.0.0.1:PORT/foo
 
 === Check disconnect before export ===
 
-can't open device nbd:127.0.0.1:PORT:exportname=foo
+can't open device nbd+tcp://127.0.0.1:PORT/foo
 
 === Check disconnect after export ===
 
-can't open device nbd:127.0.0.1:PORT:exportname=foo
+can't open device nbd+tcp://127.0.0.1:PORT/foo
 
 === Check disconnect 4 export ===
 
-can't open device nbd:127.0.0.1:PORT:exportname=foo
+can't open device nbd+tcp://127.0.0.1:PORT/foo
 
 === Check disconnect 12 export ===
 
-can't open device nbd:127.0.0.1:PORT:exportname=foo
+can't open device nbd+tcp://127.0.0.1:PORT/foo
 
 === Check disconnect 16 export ===
 
-can't open device nbd:127.0.0.1:PORT:exportname=foo
+can't open device nbd+tcp://127.0.0.1:PORT/foo
 
 === Check disconnect before neg2 ===
 
-can't open device nbd:127.0.0.1:PORT:exportname=foo
+can't open device nbd+tcp://127.0.0.1:PORT/foo
 
 === Check disconnect after neg2 ===
 
@@ -45,11 +45,11 @@ read failed: Input/output error
 
 === Check disconnect 8 neg2 ===
 
-can't open device nbd:127.0.0.1:PORT:exportname=foo
+can't open device nbd+tcp://127.0.0.1:PORT/foo
 
 === Check disconnect 10 neg2 ===
 
-can't open device nbd:127.0.0.1:PORT:exportname=foo
+can't open device nbd+tcp://127.0.0.1:PORT/foo
 
 === Check disconnect before request ===
 
@@ -88,23 +88,134 @@ read 512/512 bytes at offset 0
 
 === Check disconnect before neg-classic ===
 
-can't open device nbd:127.0.0.1:PORT
+can't open device nbd+tcp://127.0.0.1:PORT/
 
 === Check disconnect 8 neg-classic ===
 
-can't open device nbd:127.0.0.1:PORT
+can't open device nbd+tcp://127.0.0.1:PORT/
 
 === Check disconnect 16 neg-classic ===
 
-can't open device nbd:127.0.0.1:PORT
+can't open device nbd+tcp://127.0.0.1:PORT/
 
 === Check disconnect 24 neg-classic ===
 
-can't open device nbd:127.0.0.1:PORT
+can't open device nbd+tcp://127.0.0.1:PORT/
 
 === Check disconnect 28 neg-classic ===
 
-can't open device nbd:127.0.0.1:PORT
+can't open device nbd+tcp://127.0.0.1:PORT/
+
+=== Check disconnect after neg-classic ===
+
+read failed: Input/output error
+
+=== Check disconnect before neg1 ===
+
+can't open device nbd+unix:///foo?socket=TEST_DIR/nbd.sock
+
+=== Check disconnect after neg1 ===
+
+can't open device nbd+unix:///foo?socket=TEST_DIR/nbd.sock
+
+=== Check disconnect 8 neg1 ===
+
+can't open device nbd+unix:///foo?socket=TEST_DIR/nbd.sock
+
+=== Check disconnect 16 neg1 ===
+
+can't open device nbd+unix:///foo?socket=TEST_DIR/nbd.sock
+
+=== Check disconnect before export ===
+
+can't open device nbd+unix:///foo?socket=TEST_DIR/nbd.sock
+
+=== Check disconnect after export ===
+
+can't open device nbd+unix:///foo?socket=TEST_DIR/nbd.sock
+
+=== Check disconnect 4 export ===
+
+can't open device nbd+unix:///foo?socket=TEST_DIR/nbd.sock
+
+=== Check disconnect 12 export ===
+
+can't open device nbd+unix:///foo?socket=TEST_DIR/nbd.sock
+
+=== Check disconnect 16 export ===
+
+can't open device nbd+unix:///foo?socket=TEST_DIR/nbd.sock
+
+=== Check disconnect before neg2 ===
+
+can't open device nbd+unix:///foo?socket=TEST_DIR/nbd.sock
+
+=== Check disconnect after neg2 ===
+
+read failed: Input/output error
+
+=== Check disconnect 8 neg2 ===
+
+can't open device nbd+unix:///foo?socket=TEST_DIR/nbd.sock
+
+=== Check disconnect 10 neg2 ===
+
+can't open device nbd+unix:///foo?socket=TEST_DIR/nbd.sock
+
+=== Check disconnect before request ===
+
+read failed: Input/output error
+
+=== Check disconnect after request ===
+
+read failed: Input/output error
+
+=== Check disconnect before reply ===
+
+read failed: Input/output error
+
+=== Check disconnect after reply ===
+
+read failed: Input/output error
+
+=== Check disconnect 4 reply ===
+
+read failed
+read failed: Input/output error
+
+=== Check disconnect 8 reply ===
+
+read failed
+read failed: Input/output error
+
+=== Check disconnect before data ===
+
+read failed: Input/output error
+
+=== Check disconnect after data ===
+
+read 512/512 bytes at offset 0
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+=== Check disconnect before neg-classic ===
+
+can't open device nbd+unix:///?socket=TEST_DIR/nbd.sock
+
+=== Check disconnect 8 neg-classic ===
+
+can't open device nbd+unix:///?socket=TEST_DIR/nbd.sock
+
+=== Check disconnect 16 neg-classic ===
+
+can't open device nbd+unix:///?socket=TEST_DIR/nbd.sock
+
+=== Check disconnect 24 neg-classic ===
+
+can't open device nbd+unix:///?socket=TEST_DIR/nbd.sock
+
+=== Check disconnect 28 neg-classic ===
+
+can't open device nbd+unix:///?socket=TEST_DIR/nbd.sock
 
 === Check disconnect after neg-classic ===
 
diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter
index 7a58e57317..9d5442ecd9 100644
--- a/tests/qemu-iotests/common.filter
+++ b/tests/qemu-iotests/common.filter
@@ -170,9 +170,9 @@ _filter_nbd()
     #
     # Filter out the TCP port number since this changes between runs.
     sed -e '/nbd\/.*\.c:/d' \
-        -e 's#nbd:\(//\)\?127\.0\.0\.1:[0-9]*#nbd:\1127.0.0.1:PORT#g' \
+        -e 's#127\.0\.0\.1:[0-9]*#127.0.0.1:PORT#g' \
         -e "s#?socket=$TEST_DIR#?socket=TEST_DIR#g" \
-        -e 's#\(exportname=foo\|PORT\): Failed to .*$#\1#'
+        -e 's#\(foo\|PORT/\?\|.sock\): Failed to .*$#\1#'
 }
 
 # make sure this script returns success
-- 
2.13.5


* Re: [Qemu-devel] [PATCH 0/3] nbd-client: enter read_reply_co during init to avoid crash
  2017-08-24 15:33 [Qemu-devel] [PATCH 0/3] nbd-client: enter read_reply_co during init to avoid crash Stefan Hajnoczi
                   ` (2 preceding siblings ...)
  2017-08-24 15:33 ` [Qemu-devel] [PATCH 3/3] qemu-iotests: test NBD over UNIX domain sockets in 083 Stefan Hajnoczi
@ 2017-08-24 15:52 ` Eric Blake
  2017-08-24 16:05   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
  3 siblings, 1 reply; 15+ messages in thread
From: Eric Blake @ 2017-08-24 15:52 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel; +Cc: Kevin Wolf, qemu-block, Paolo Bonzini


On 08/24/2017 10:33 AM, Stefan Hajnoczi wrote:
> See Patch 1 for the segfault fix.  Patches 2 & 3 add qemu-iotests coverage.
> 
> This is a rare crash that we'll probably only see in testing.  It only seems to
> happen with UNIX domain sockets.

Rare enough that I don't think it is 2.10-rc4 material.  I'll review,
and if I don't find anything needing a respin, I'll add it to my NBD
queue for 2.11.

> 
> Stefan Hajnoczi (3):
>   nbd-client: enter read_reply_co during init to avoid crash
>   qemu-iotests: improve nbd-fault-injector.py startup protocol
>   qemu-iotests: test NBD over UNIX domain sockets in 083
> 
>  block/nbd-client.c                       |   2 +-
>  tests/qemu-iotests/083                   | 138 ++++++++++++++++++-----------
>  tests/qemu-iotests/083.out               | 145 +++++++++++++++++++++++++++----
>  tests/qemu-iotests/common.filter         |   4 +-
>  tests/qemu-iotests/nbd-fault-injector.py |   4 +
>  5 files changed, 220 insertions(+), 73 deletions(-)
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




* Re: [Qemu-devel] [Qemu-block] [PATCH 0/3] nbd-client: enter read_reply_co during init to avoid crash
  2017-08-24 15:52 ` [Qemu-devel] [PATCH 0/3] nbd-client: enter read_reply_co during init to avoid crash Eric Blake
@ 2017-08-24 16:05   ` Stefan Hajnoczi
  2017-08-25 15:08     ` Eric Blake
  0 siblings, 1 reply; 15+ messages in thread
From: Stefan Hajnoczi @ 2017-08-24 16:05 UTC (permalink / raw)
  To: Eric Blake
  Cc: Stefan Hajnoczi, qemu-devel, Kevin Wolf, Paolo Bonzini, qemu-block

On Thu, Aug 24, 2017 at 4:52 PM, Eric Blake <eblake@redhat.com> wrote:
> On 08/24/2017 10:33 AM, Stefan Hajnoczi wrote:
>> See Patch 1 for the segfault fix.  Patches 2 & 3 add qemu-iotests coverage.
>>
>> This is a rare crash that we'll probably only see in testing.  It only seems to
>> happen with UNIX domain sockets.
>
> Rare enough that I don't think it is 2.10-rc4 material.  I'll review,
> and if I don't find anything needing a respin, I'll add it to my NBD
> queue for 2.11.

Great.  This is 2.11 material.

Stefan


* Re: [Qemu-devel] [PATCH 1/3] nbd-client: enter read_reply_co during init to avoid crash
  2017-08-24 15:33 ` [Qemu-devel] [PATCH 1/3] " Stefan Hajnoczi
@ 2017-08-24 16:21   ` Paolo Bonzini
  2017-08-24 17:37     ` Eric Blake
  2017-08-25 10:40     ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
  0 siblings, 2 replies; 15+ messages in thread
From: Paolo Bonzini @ 2017-08-24 16:21 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel; +Cc: Kevin Wolf, Eric Blake, qemu-block

On 24/08/2017 17:33, Stefan Hajnoczi wrote:
> This patch enters read_reply_co directly in
> nbd_client_attach_aio_context().  This is safe because new_context is
> acquired by the caller.  This ensures that read_reply_co reaches its
> first yield point and its ctx is set up.

I'm not very confident about this patch.  aio_context_acquire/release is
going to go away, and then this becomes possible:

	main context			new_context
	qemu_aio_coroutine_enter
					send request
					wait for reply
	read first reply
	wake coroutine

where the "wake coroutine" part thinks it's running in new_context, and
thus simply enters the coroutine instead of using the bottom half.

But blk_co_preadv() should need the read_reply_co itself, in order to be
woken up after reading the reply header.  The core issue here is that
nbd_co_receive_reply was never called, I suspect.  And if it was never
called, read_reply_co should not be woken up by nbd_coroutine_end.

So the fix is:

1) assign NULL to s->recv_coroutine[i] when nbd_co_send_request fails

2) move this to nbd_co_receive_reply:

    s->recv_coroutine[i] = NULL;

    /* Kick the read_reply_co to get the next reply.  */
    if (s->read_reply_co) {
        aio_co_wake(s->read_reply_co);
    }

Does this make sense?  (Note that the read_reply_co idea actually came
from you, or from my recollections of your proposed design :)).

Paolo

> Note this only happens with UNIX domain sockets on Linux.  It doesn't
> seem possible to reproduce this with TCP sockets.
> 
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  block/nbd-client.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/nbd-client.c b/block/nbd-client.c
> index 25bcaa2346..0a7f32779e 100644
> --- a/block/nbd-client.c
> +++ b/block/nbd-client.c
> @@ -371,7 +371,7 @@ void nbd_client_attach_aio_context(BlockDriverState *bs,
>  {
>      NBDClientSession *client = nbd_get_client_session(bs);
>      qio_channel_attach_aio_context(QIO_CHANNEL(client->ioc), new_context);
> -    aio_co_schedule(new_context, client->read_reply_co);
> +    qemu_aio_coroutine_enter(new_context, client->read_reply_co);
>  }


* Re: [Qemu-devel] [PATCH 1/3] nbd-client: enter read_reply_co during init to avoid crash
  2017-08-24 16:21   ` Paolo Bonzini
@ 2017-08-24 17:37     ` Eric Blake
  2017-08-24 17:42       ` Paolo Bonzini
  2017-08-25 10:40     ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
  1 sibling, 1 reply; 15+ messages in thread
From: Eric Blake @ 2017-08-24 17:37 UTC (permalink / raw)
  To: Paolo Bonzini, Stefan Hajnoczi, qemu-devel
  Cc: Kevin Wolf, qemu-block, Vladimir Sementsov-Ogievskiy


On 08/24/2017 11:21 AM, Paolo Bonzini wrote:
> On 24/08/2017 17:33, Stefan Hajnoczi wrote:
>> This patch enters read_reply_co directly in
>> nbd_client_attach_aio_context().  This is safe because new_context is
>> acquired by the caller.  This ensures that read_reply_co reaches its
>> first yield point and its ctx is set up.
> 
> I'm not very confident with this patch.  aio_context_acquire/release is
> going to go away, and this then becomes possible
> 
> 	main context			new_context
> 	qemu_aio_coroutine_enter
> 					send request
> 					wait for reply
> 	read first reply
> 	wake coroutine
> 
> where the "wake coroutine" part thinks it's running in new_context, and
> thus simply enters the coroutine instead of using the bottom half.
> 
> But blk_co_preadv() should need the read_reply_co itself, in order to be
> woken up after reading the reply header.  The core issue here is that
> nbd_co_receive_reply was never called, I suspect.  And if it was never
> called, read_reply_co should not be woken up by nbd_coroutine_end.
> 
> So the fix is:
> 
> 1) assign NULL to s->recv_coroutine[i] when nbd_co_send_request fails
> 
> 2) move this to nbd_co_receive_reply:
> 
>     s->recv_coroutine[i] = NULL;
> 
>     /* Kick the read_reply_co to get the next reply.  */
>     if (s->read_reply_co) {
>         aio_co_wake(s->read_reply_co);
>     }
> 
> Does this make sense?  (Note that the read_reply_co idea actually came
> from you, or from my recollections of your proposed design :)).

How much of this overlaps with Vladimir's proposal?
https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg00846.html

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




* Re: [Qemu-devel] [PATCH 2/3] qemu-iotests: improve nbd-fault-injector.py startup protocol
  2017-08-24 15:33 ` [Qemu-devel] [PATCH 2/3] qemu-iotests: improve nbd-fault-injector.py startup protocol Stefan Hajnoczi
@ 2017-08-24 17:39   ` Eric Blake
  0 siblings, 0 replies; 15+ messages in thread
From: Eric Blake @ 2017-08-24 17:39 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel; +Cc: Kevin Wolf, qemu-block, Paolo Bonzini


On 08/24/2017 10:33 AM, Stefan Hajnoczi wrote:
> Currently 083 waits for the nbd-fault-injector.py server to start up by
> looping until netstat shows the TCP listen socket.
> 
> The startup protocol can be simplified by passing a 0 port number to
> nbd-fault-injector.py.  The kernel will allocate a port in bind(2) and
> the final port number can be printed by nbd-fault-injector.py.
> 
> This should make it slightly nicer and less TCP-specific to wait for
> server startup.  This patch changes nbd-fault-injector.py, the next one
> will rewrite server startup in 083.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  tests/qemu-iotests/nbd-fault-injector.py | 4 ++++
>  1 file changed, 4 insertions(+)

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




* Re: [Qemu-devel] [PATCH 1/3] nbd-client: enter read_reply_co during init to avoid crash
  2017-08-24 17:37     ` Eric Blake
@ 2017-08-24 17:42       ` Paolo Bonzini
  2017-08-25 15:57         ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2017-08-24 17:42 UTC (permalink / raw)
  To: Eric Blake, Stefan Hajnoczi, qemu-devel
  Cc: Kevin Wolf, qemu-block, Vladimir Sementsov-Ogievskiy


On 24/08/2017 19:37, Eric Blake wrote:
> On 08/24/2017 11:21 AM, Paolo Bonzini wrote:
>> On 24/08/2017 17:33, Stefan Hajnoczi wrote:
>>> This patch enters read_reply_co directly in
>>> nbd_client_attach_aio_context().  This is safe because new_context is
>>> acquired by the caller.  This ensures that read_reply_co reaches its
>>> first yield point and its ctx is set up.
>>
>> I'm not very confident with this patch.  aio_context_acquire/release is
>> going to go away, and this then becomes possible
>>
>> 	main context			new_context
>> 	qemu_aio_coroutine_enter
>> 					send request
>> 					wait for reply
>> 	read first reply
>> 	wake coroutine
>>
>> where the "wake coroutine" part thinks it's running in new_context, and
>> thus simply enters the coroutine instead of using the bottom half.
>>
>> But blk_co_preadv() should need the read_reply_co itself, in order to be
>> woken up after reading the reply header.  The core issue here is that
>> nbd_co_receive_reply was never called, I suspect.  And if it was never
>> called, read_reply_co should not be woken up by nbd_coroutine_end.
>>
>> So the fix is:
>>
>> 1) assign NULL to s->recv_coroutine[i] when nbd_co_send_request fails
>>
>> 2) move this to nbd_co_receive_reply:
>>
>>     s->recv_coroutine[i] = NULL;
>>
>>     /* Kick the read_reply_co to get the next reply.  */
>>     if (s->read_reply_co) {
>>         aio_co_wake(s->read_reply_co);
>>     }
>>
>> Does this make sense?  (Note that the read_reply_co idea actually came
>> from you, or from my recollections of your proposed design :)).
> 
> How much of this overlaps with Vladimir's proposal?
> https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg00846.html

The above should be about 15 lines added, 10 removed. :)

Paolo





* Re: [Qemu-devel] [PATCH 3/3] qemu-iotests: test NBD over UNIX domain sockets in 083
  2017-08-24 15:33 ` [Qemu-devel] [PATCH 3/3] qemu-iotests: test NBD over UNIX domain sockets in 083 Stefan Hajnoczi
@ 2017-08-24 17:45   ` Eric Blake
  0 siblings, 0 replies; 15+ messages in thread
From: Eric Blake @ 2017-08-24 17:45 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel; +Cc: Kevin Wolf, qemu-block, Paolo Bonzini

[-- Attachment #1: Type: text/plain, Size: 1118 bytes --]

On 08/24/2017 10:33 AM, Stefan Hajnoczi wrote:
> 083 only tests TCP.  Some failures might be specific to UNIX domain
> sockets.
> 
> A few adjustments are necessary:
> 
> 1. Generating a port number and waiting for server startup is
>    TCP-specific.  Use the new nbd-fault-injector.py startup protocol to
>    fetch the address.  This is a little more elegant because we don't
>    need netstat anymore.
> 
> 2. The NBD filter does not work for the UNIX domain sockets URIs we
>    generate and must be extended.
> 
> 3. Run all tests twice: once for TCP and once for UNIX domain sockets.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  tests/qemu-iotests/083           | 138 +++++++++++++++++++++++--------------
>  tests/qemu-iotests/083.out       | 145 ++++++++++++++++++++++++++++++++++-----
>  tests/qemu-iotests/common.filter |   4 +-
>  3 files changed, 215 insertions(+), 72 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 619 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH 1/3] nbd-client: enter read_reply_co during init to avoid crash
  2017-08-24 16:21   ` Paolo Bonzini
  2017-08-24 17:37     ` Eric Blake
@ 2017-08-25 10:40     ` Stefan Hajnoczi
  1 sibling, 0 replies; 15+ messages in thread
From: Stefan Hajnoczi @ 2017-08-25 10:40 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Stefan Hajnoczi, qemu-devel, Kevin Wolf, qemu-block

On Thu, Aug 24, 2017 at 5:21 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 24/08/2017 17:33, Stefan Hajnoczi wrote:
>> This patch enters read_reply_co directly in
>> nbd_client_attach_aio_context().  This is safe because new_context is
>> acquired by the caller.  This ensures that read_reply_co reaches its
>> first yield point and its ctx is set up.
>
> I'm not very confident with this patch.  aio_context_acquire/release is
> going to go away, and this then becomes possible
>
>         main context                    new_context
>         qemu_aio_coroutine_enter
>                                         send request
>                                         wait for reply
>         read first reply
>         wake coroutine
>
> where the "wake coroutine" part thinks it's running in new_context, and
> thus simply enters the coroutine instead of using the bottom half.
>
> But blk_co_preadv() should need the read_reply_co itself, in order to be
> woken up after reading the reply header.  The core issue here is that
> nbd_co_receive_reply was never called, I suspect.  And if it was never
> called, read_reply_co should not be woken up by nbd_coroutine_end.
>
> So the fix is:
>
> 1) assign NULL to s->recv_coroutine[i] when nbd_co_send_request fails
>
> 2) move this to nbd_co_receive_reply:
>
>     s->recv_coroutine[i] = NULL;
>
>     /* Kick the read_reply_co to get the next reply.  */
>     if (s->read_reply_co) {
>         aio_co_wake(s->read_reply_co);
>     }
>
> Does this make sense?  (Note that the read_reply_co idea actually came
> from you, or from my recollections of your proposed design :)).

Yes, I think that would work.

Stefan

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH 0/3] nbd-client: enter read_reply_co during init to avoid crash
  2017-08-24 16:05   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
@ 2017-08-25 15:08     ` Eric Blake
  2017-08-25 20:33       ` Paolo Bonzini
  0 siblings, 1 reply; 15+ messages in thread
From: Eric Blake @ 2017-08-25 15:08 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Stefan Hajnoczi, qemu-devel, Kevin Wolf, Paolo Bonzini, qemu-block

[-- Attachment #1: Type: text/plain, Size: 992 bytes --]

On 08/24/2017 11:05 AM, Stefan Hajnoczi wrote:
> On Thu, Aug 24, 2017 at 4:52 PM, Eric Blake <eblake@redhat.com> wrote:
>> On 08/24/2017 10:33 AM, Stefan Hajnoczi wrote:
>>> See Patch 1 for the segfault fix.  Patches 2 & 3 add qemu-iotests coverage.
>>>
>>> This is a rare crash that we'll probably only see in testing.  It only seems to
>>> happen with UNIX domain sockets.
>>
>> Rare enough that I don't think it is 2.10-rc4 material.  I'll review,
>> and if I don't find anything needing a respin, I'll add it to my NBD
>> queue for 2.11.
> 
> Great.  This is 2.11 material.

Hmm - I wonder if https://bugs.launchpad.net/qemu/+bug/1712818 is
related to this.  I really don't want to delay 2.10 further (downstream
distros can backport fixes) - but comment 3 stating that crashes are
less frequent but still possible with -rc4 is not good.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 619 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] nbd-client: enter read_reply_co during init to avoid crash
  2017-08-24 17:42       ` Paolo Bonzini
@ 2017-08-25 15:57         ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 15+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-08-25 15:57 UTC (permalink / raw)
  To: Paolo Bonzini, Eric Blake, Stefan Hajnoczi, qemu-devel
  Cc: Kevin Wolf, qemu-block

24.08.2017 20:42, Paolo Bonzini wrote:
> On 24/08/2017 19:37, Eric Blake wrote:
>> On 08/24/2017 11:21 AM, Paolo Bonzini wrote:
>>> On 24/08/2017 17:33, Stefan Hajnoczi wrote:
>>>> This patch enters read_reply_co directly in
>>>> nbd_client_attach_aio_context().  This is safe because new_context is
>>>> acquired by the caller.  This ensures that read_reply_co reaches its
>>>> first yield point and its ctx is set up.
>>> I'm not very confident with this patch.  aio_context_acquire/release is
>>> going to go away, and this then becomes possible
>>>
>>> 	main context			new_context
>>> 	qemu_aio_coroutine_enter
>>> 					send request
>>> 					wait for reply
>>> 	read first reply
>>> 	wake coroutine
>>>
>>> where the "wake coroutine" part thinks it's running in new_context, and
>>> thus simply enters the coroutine instead of using the bottom half.
>>>
>>> But blk_co_preadv() should need the read_reply_co itself, in order to be
>>> woken up after reading the reply header.  The core issue here is that
>>> nbd_co_receive_reply was never called, I suspect.  And if it was never
>>> called, read_reply_co should not be woken up by nbd_coroutine_end.
>>>
>>> So the fix is:
>>>
>>> 1) assign NULL to s->recv_coroutine[i] when nbd_co_send_request fails
>>>
>>> 2) move this to nbd_co_receive_reply:
>>>
>>>      s->recv_coroutine[i] = NULL;
>>>
>>>      /* Kick the read_reply_co to get the next reply.  */
>>>      if (s->read_reply_co) {
>>>          aio_co_wake(s->read_reply_co);
>>>      }
>>>
>>> Does this make sense?  (Note that the read_reply_co idea actually came
>>> from you, or from my recollections of your proposed design :)).
>> How much of this overlaps with Vladimir's proposal?
>> https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg00846.html
> The above should be about 15 lines added, 10 removed. :)
>
> Paolo
>
>

I'll be on vacation for the next two weeks and will continue this work 
after it.

I think we already have a lot of conflicts between my series and the 
various fixes, so I'll rebase on top of them all, no problem.


PS: I think the recent problems and bugs in NBD are related to the poor 
separation between block/nbd-client.c and nbd/client.c.  Ideally, 
block/nbd-client should not access the ioc directly; it should just call 
one or two high-level functions of nbd/client.  The big problem with 
this separation is the CMD_READ reply - only the real client knows 
whether we should read a payload or not.  I have two ideas on it:

1. We can add this information to the request handle; then nbd/client 
   will know by the handle that it should read the payload.  The length 
   of the payload should be in the handle too.  It looks possible, as 
   the handle is 64-bit while the request length is 32-bit.

2. Move part of NBDClientSession, including NBDClientRequest 
   requests[MAX_NBD_REQUESTS], to nbd/client, and implement one public 
   function nbd_request(state, *request, *reply), which will do all the 
   work for block/nbd-client.


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH 0/3] nbd-client: enter read_reply_co during init to avoid crash
  2017-08-25 15:08     ` Eric Blake
@ 2017-08-25 20:33       ` Paolo Bonzini
  0 siblings, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2017-08-25 20:33 UTC (permalink / raw)
  To: Eric Blake, Stefan Hajnoczi
  Cc: Stefan Hajnoczi, qemu-devel, Kevin Wolf, qemu-block

[-- Attachment #1: Type: text/plain, Size: 1022 bytes --]

On 25/08/2017 17:08, Eric Blake wrote:
> On 08/24/2017 11:05 AM, Stefan Hajnoczi wrote:
>> On Thu, Aug 24, 2017 at 4:52 PM, Eric Blake <eblake@redhat.com> wrote:
>>> On 08/24/2017 10:33 AM, Stefan Hajnoczi wrote:
>>>> See Patch 1 for the segfault fix.  Patches 2 & 3 add qemu-iotests coverage.
>>>>
>>>> This is a rare crash that we'll probably only see in testing.  It only seems to
>>>> happen with UNIX domain sockets.
>>>
>>> Rare enough that I don't think it is 2.10-rc4 material.  I'll review,
>>> and if I don't find anything needing a respin, I'll add it to my NBD
>>> queue for 2.11.
>>
>> Great.  This is 2.11 material.
> 
> Hmm - I wonder if https://bugs.launchpad.net/qemu/+bug/1712818 is
> related to this.  I really don't want to delay 2.10 further (downstream
> distros can backport fixes) - but comment 3 stating that crashes are
> less frequent but still possible with -rc4 is not good.

Well, I won't be able to review or write the patch, so I think it's
better to wait.

Paolo


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2017-08-25 20:33 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-24 15:33 [Qemu-devel] [PATCH 0/3] nbd-client: enter read_reply_co during init to avoid crash Stefan Hajnoczi
2017-08-24 15:33 ` [Qemu-devel] [PATCH 1/3] " Stefan Hajnoczi
2017-08-24 16:21   ` Paolo Bonzini
2017-08-24 17:37     ` Eric Blake
2017-08-24 17:42       ` Paolo Bonzini
2017-08-25 15:57         ` Vladimir Sementsov-Ogievskiy
2017-08-25 10:40     ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2017-08-24 15:33 ` [Qemu-devel] [PATCH 2/3] qemu-iotests: improve nbd-fault-injector.py startup protocol Stefan Hajnoczi
2017-08-24 17:39   ` Eric Blake
2017-08-24 15:33 ` [Qemu-devel] [PATCH 3/3] qemu-iotests: test NBD over UNIX domain sockets in 083 Stefan Hajnoczi
2017-08-24 17:45   ` Eric Blake
2017-08-24 15:52 ` [Qemu-devel] [PATCH 0/3] nbd-client: enter read_reply_co during init to avoid crash Eric Blake
2017-08-24 16:05   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2017-08-25 15:08     ` Eric Blake
2017-08-25 20:33       ` Paolo Bonzini

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.