* [Qemu-devel] [PATCH v4 00/11] block: Rework bdrv_close_all()
@ 2015-02-27 16:43 Max Reitz
  2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 01/11] iotests: Move _filter_nbd into common.filter Max Reitz
                   ` (11 more replies)
  0 siblings, 12 replies; 20+ messages in thread
From: Max Reitz @ 2015-02-27 16:43 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Fam Zheng, qemu-devel, Markus Armbruster,
	Stefan Hajnoczi, Paolo Bonzini, Max Reitz

Currently, bdrv_close_all() force-closes all BDSs with a BlockBackend,
which can lead to data corruption (see the iotest added in the final
patch of this series) and is most certainly very ugly.

This series reworks bdrv_close_all() to instead eject the BDS trees from
all BlockBackends and then close the monitor-owned BDS trees, which are
the only BDSs without a BB. In effect, all BDSs are closed
automatically once their reference counts drop to 0.

The benefit over the approach taken in v1 and v2 is that in device
models, we often cannot simply drop the reference to a BB because there
may be some user we have forgotten about. By ejecting the BDS trees from
the BB, the BB itself becomes unusable, but in a clean way (it will
return errors when accessed, but nothing will crash). Also, it is much
simpler (no reference tracking necessary).

The only disadvantage I can see is that the BBs are leaked; but this
does not matter because the qemu process is about to exit anyway.


This series depends on v2 of my series
"blockdev: BlockBackend and media" (or any later version), and on my
patch "virtio-scsi: Allocate op blocker reason before blocking".


v4:
- Patch 1: Simpler regex for removing "nbd:" lines [Fam]
- Patch 2: Made the non-redirection optional instead of mandatory; still
  ugly, but at least much less invasive
- Patch 3: Additional line due to the mechanism introduced in patch 2
  being optional
- Patch 5:
  - Embedded the Notifier objects in VirtIOBlockDataPlane and NBDExport
    [Fam]
  - Put the notifiers for the block devices on a virtio-scsi bus into a
    list, because there can be more than one device [Fam]
    To demonstrate the difference, try:
    $ x86_64-softmmu/qemu-system-x86_64 \
        -object iothread,id=thr -device virtio-scsi-pci,iothread=thr \
        -drive if=none,file=foo.qcow2,format=qcow2,id=foo \
        -device scsi-hd,drive=foo \
        -drive if=none,file=bar.qcow2,format=qcow2,id=bar \
        -device scsi-hd,drive=bar
    With v3, the !s->blocker assertion in
    virtio_scsi_set_up_op_blockers() will fail. With v4, everything
    works fine.


git-backport-diff against v3:

Key:
[----] : patches are identical
[####] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/11:[0002] [FC] 'iotests: Move _filter_nbd into common.filter'
002/11:[down] 'iotests: Make redirecting qemu's stderr optional'
003/11:[0001] [FC] 'iotests: Add test for eject under NBD server'
004/11:[----] [--] 'quorum: Fix close path'
005/11:[0159] [FC] 'block: Move BDS close notifiers into BB'
006/11:[----] [--] 'block: Use blk_remove_bs() in blk_delete()'
007/11:[----] [--] 'blockdev: Use blk_remove_bs() in do_drive_del()'
008/11:[----] [--] 'block: Make bdrv_close() static'
009/11:[----] [--] 'blockdev: Keep track of monitor-owned BDS'
010/11:[----] [--] 'block: Eject BDS tree from BB at bdrv_close_all()'
011/11:[----] [--] 'iotests: Add test for multiple BB on BDS tree'


Max Reitz (11):
  iotests: Move _filter_nbd into common.filter
  iotests: Make redirecting qemu's stderr optional
  iotests: Add test for eject under NBD server
  quorum: Fix close path
  block: Move BDS close notifiers into BB
  block: Use blk_remove_bs() in blk_delete()
  blockdev: Use blk_remove_bs() in do_drive_del()
  block: Make bdrv_close() static
  blockdev: Keep track of monitor-owned BDS
  block: Eject BDS tree from BB at bdrv_close_all()
  iotests: Add test for multiple BB on BDS tree

 block.c                                | 25 +++-------
 block/block-backend.c                  | 41 ++++++++++++----
 block/quorum.c                         |  3 +-
 blockdev-nbd.c                         | 36 +-------------
 blockdev.c                             | 24 +++++++--
 hw/block/dataplane/virtio-blk.c        | 77 ++++++++++++++++++++++-------
 hw/scsi/virtio-scsi.c                  | 87 ++++++++++++++++++++++++++++++--
 include/block/block.h                  |  2 -
 include/block/block_int.h              |  6 ++-
 include/hw/virtio/virtio-scsi.h        | 10 ++++
 include/sysemu/block-backend.h         |  4 +-
 nbd.c                                  | 13 +++++
 stubs/Makefile.objs                    |  1 +
 stubs/blockdev-close-all-bdrv-states.c |  5 ++
 tests/qemu-iotests/083                 | 13 +----
 tests/qemu-iotests/083.out             | 10 ----
 tests/qemu-iotests/096                 | 90 ++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/096.out             | 16 ++++++
 tests/qemu-iotests/117                 | 86 ++++++++++++++++++++++++++++++++
 tests/qemu-iotests/117.out             | 14 ++++++
 tests/qemu-iotests/common.filter       | 12 +++++
 tests/qemu-iotests/common.qemu         | 12 ++++-
 tests/qemu-iotests/group               |  2 +
 23 files changed, 470 insertions(+), 119 deletions(-)
 create mode 100644 stubs/blockdev-close-all-bdrv-states.c
 create mode 100755 tests/qemu-iotests/096
 create mode 100644 tests/qemu-iotests/096.out
 create mode 100755 tests/qemu-iotests/117
 create mode 100644 tests/qemu-iotests/117.out

-- 
2.1.0

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v4 01/11] iotests: Move _filter_nbd into common.filter
  2015-02-27 16:43 [Qemu-devel] [PATCH v4 00/11] block: Rework bdrv_close_all() Max Reitz
@ 2015-02-27 16:43 ` Max Reitz
  2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 02/11] iotests: Make redirecting qemu's stderr optional Max Reitz
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: Max Reitz @ 2015-02-27 16:43 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Fam Zheng, qemu-devel, Markus Armbruster,
	Stefan Hajnoczi, Paolo Bonzini, Max Reitz

_filter_nbd can be useful for other NBD tests, too; therefore, it should
reside in common.filter. It should also support URLs in the "nbd://"
format and export names.

The NBD log lines ("/your/source/dir/nbd.c:function():line: error")
should not be converted to empty lines but removed altogether.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/083           | 13 +------------
 tests/qemu-iotests/083.out       | 10 ----------
 tests/qemu-iotests/common.filter | 12 ++++++++++++
 3 files changed, 13 insertions(+), 22 deletions(-)

diff --git a/tests/qemu-iotests/083 b/tests/qemu-iotests/083
index 1b2d3f1..aa99278 100755
--- a/tests/qemu-iotests/083
+++ b/tests/qemu-iotests/083
@@ -49,17 +49,6 @@ wait_for_tcp_port() {
 	done
 }
 
-filter_nbd() {
-	# nbd.c error messages contain function names and line numbers that are prone
-	# to change.  Message ordering depends on timing between send and receive
-	# callbacks sometimes, making them unreliable.
-	#
-	# Filter out the TCP port number since this changes between runs.
-	sed -e 's#^.*nbd\.c:.*##g' \
-	    -e 's#nbd:127\.0\.0\.1:[^:]*:#nbd:127\.0\.0\.1:PORT:#g' \
-            -e 's#\(exportname=foo\|PORT\): Failed to .*$#\1#'
-}
-
 check_disconnect() {
 	event=$1
 	when=$2
@@ -84,7 +73,7 @@ EOF
 
 	$PYTHON nbd-fault-injector.py $extra_args "127.0.0.1:$port" "$TEST_DIR/nbd-fault-injector.conf" 2>&1 >/dev/null &
 	wait_for_tcp_port "127\\.0\\.0\\.1:$port"
-	$QEMU_IO -c "read 0 512" "$nbd_url" 2>&1 | _filter_qemu_io | filter_nbd
+	$QEMU_IO -c "read 0 512" "$nbd_url" 2>&1 | _filter_qemu_io | _filter_nbd
 
 	echo
 }
diff --git a/tests/qemu-iotests/083.out b/tests/qemu-iotests/083.out
index 8c1441b..5c9141b 100644
--- a/tests/qemu-iotests/083.out
+++ b/tests/qemu-iotests/083.out
@@ -51,7 +51,6 @@ no file open, try 'help open'
 
 === Check disconnect after neg2 ===
 
-
 read failed: Input/output error
 
 === Check disconnect 8 neg2 ===
@@ -66,42 +65,34 @@ no file open, try 'help open'
 
 === Check disconnect before request ===
 
-
 read failed: Input/output error
 
 === Check disconnect after request ===
 
-
 read failed: Input/output error
 
 === Check disconnect before reply ===
 
-
 read failed: Input/output error
 
 === Check disconnect after reply ===
 
-
 read failed: Input/output error
 
 === Check disconnect 4 reply ===
 
-
 read failed: Input/output error
 
 === Check disconnect 8 reply ===
 
-
 read failed: Input/output error
 
 === Check disconnect before data ===
 
-
 read failed: Input/output error
 
 === Check disconnect after data ===
 
-
 read 512/512 bytes at offset 0
 512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 
@@ -132,7 +123,6 @@ no file open, try 'help open'
 
 === Check disconnect after neg-classic ===
 
-
 read failed: Input/output error
 
 *** done
diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter
index 012a812..99a1fc2 100644
--- a/tests/qemu-iotests/common.filter
+++ b/tests/qemu-iotests/common.filter
@@ -225,5 +225,17 @@ _filter_qemu_img_map()
         -e 's/Mapped to *//' | _filter_testdir | _filter_imgfmt
 }
 
+_filter_nbd()
+{
+    # nbd.c error messages contain function names and line numbers that are
+    # prone to change.  Message ordering depends on timing between send and
+    # receive callbacks sometimes, making them unreliable.
+    #
+    # Filter out the TCP port number since this changes between runs.
+    sed -e '/nbd\.c:/d' \
+        -e 's#nbd:\(//\)\?127\.0\.0\.1:[0-9]*#nbd:\1127.0.0.1:PORT#g' \
+        -e 's#\(exportname=foo\|PORT\): Failed to .*$#\1#'
+}
+
 # make sure this script returns success
 true
-- 
2.1.0


* [Qemu-devel] [PATCH v4 02/11] iotests: Make redirecting qemu's stderr optional
  2015-02-27 16:43 [Qemu-devel] [PATCH v4 00/11] block: Rework bdrv_close_all() Max Reitz
  2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 01/11] iotests: Move _filter_nbd into common.filter Max Reitz
@ 2015-02-27 16:43 ` Max Reitz
  2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 03/11] iotests: Add test for eject under NBD server Max Reitz
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: Max Reitz @ 2015-02-27 16:43 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Fam Zheng, qemu-devel, Markus Armbruster,
	Stefan Hajnoczi, Paolo Bonzini, Max Reitz

Redirecting qemu's stderr to stdout makes working with the stderr output
difficult due to the other file descriptor magic performed in
_launch_qemu ("ambiguous redirect").

Add an option which specifies whether stderr should be redirected to
stdout or not (allowing for other modes to be added in the future).

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/common.qemu | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/common.qemu b/tests/qemu-iotests/common.qemu
index 4e1996c..1b5a554 100644
--- a/tests/qemu-iotests/common.qemu
+++ b/tests/qemu-iotests/common.qemu
@@ -131,6 +131,8 @@ function _send_qemu_cmd()
 # $qemu_comm_method: set this variable to 'monitor' (case insensitive)
 #                    to use the QEMU HMP monitor for communication.
 #                    Otherwise, the default of QMP is used.
+# $keep_stderr: Set this variable to 'y' to keep QEMU's stderr output on stderr.
+#               If this variable is empty, stderr will be redirected to stdout.
 # Returns:
 # $QEMU_HANDLE: set to a handle value to communicate with this QEMU instance.
 #
@@ -153,10 +155,18 @@ function _launch_qemu()
     mkfifo "${fifo_out}"
     mkfifo "${fifo_in}"
 
-    "${QEMU}" -nographic -serial none ${comm} -machine accel=qtest "${@}" \
+    if [ -z "$keep_stderr" ]; then
+        "${QEMU}" -nographic -serial none ${comm} -machine accel=qtest "${@}" \
                                                                 >"${fifo_out}" \
                                                                 2>&1 \
                                                                 <"${fifo_in}" &
+    elif [ "$keep_stderr" = "y" ]; then
+        "${QEMU}" -nographic -serial none ${comm} -machine accel=qtest "${@}" \
+                                                                >"${fifo_out}" \
+                                                                <"${fifo_in}" &
+    else
+        exit 1
+    fi
     QEMU_PID[${_QEMU_HANDLE}]=$!
 
     if [[ "${BASH_VERSINFO[0]}" -ge "5" ||
-- 
2.1.0


* [Qemu-devel] [PATCH v4 03/11] iotests: Add test for eject under NBD server
  2015-02-27 16:43 [Qemu-devel] [PATCH v4 00/11] block: Rework bdrv_close_all() Max Reitz
  2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 01/11] iotests: Move _filter_nbd into common.filter Max Reitz
  2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 02/11] iotests: Make redirecting qemu's stderr optional Max Reitz
@ 2015-02-27 16:43 ` Max Reitz
  2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 04/11] quorum: Fix close path Max Reitz
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: Max Reitz @ 2015-02-27 16:43 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Fam Zheng, qemu-devel, Markus Armbruster,
	Stefan Hajnoczi, Paolo Bonzini, Max Reitz

This patch adds a test for ejecting the BlockBackend an NBD server is
connected to (the NBD server is supposed to stop).

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/096     | 90 ++++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/096.out | 16 +++++++++
 tests/qemu-iotests/group   |  1 +
 3 files changed, 107 insertions(+)
 create mode 100755 tests/qemu-iotests/096
 create mode 100644 tests/qemu-iotests/096.out

diff --git a/tests/qemu-iotests/096 b/tests/qemu-iotests/096
new file mode 100755
index 0000000..bcba0ec
--- /dev/null
+++ b/tests/qemu-iotests/096
@@ -0,0 +1,90 @@
+#!/bin/bash
+#
+# Test case for ejecting a BB with an NBD server attached to it
+#
+# Copyright (C) 2015 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+# creator
+owner=mreitz@redhat.com
+
+seq="$(basename $0)"
+echo "QA output created by $seq"
+
+here="$PWD"
+tmp=/tmp/$$
+status=1	# failure is the default!
+
+_cleanup()
+{
+	_cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+. ./common.qemu
+
+_supported_fmt generic
+_supported_proto file
+_supported_os Linux
+
+_make_test_img 64k
+
+$QEMU_IO -c 'write -P 42 0 64k' "$TEST_IMG" | _filter_qemu_io
+
+keep_stderr=y \
+_launch_qemu -drive if=ide,media=cdrom,id=drv,file="$TEST_IMG",format=$IMGFMT \
+    2> >(_filter_nbd)
+
+_send_qemu_cmd $QEMU_HANDLE \
+    "{ 'execute': 'qmp_capabilities' }" \
+    'return'
+
+_send_qemu_cmd $QEMU_HANDLE \
+    "{ 'execute': 'nbd-server-start',
+       'arguments': { 'addr': { 'type': 'inet',
+                                'data': { 'host': '127.0.0.1',
+                                          'port': '10809' }}}}" \
+    'return'
+
+_send_qemu_cmd $QEMU_HANDLE \
+    "{ 'execute': 'nbd-server-add',
+       'arguments': { 'device': 'drv' }}" \
+    'return'
+
+$QEMU_IO_PROG -f raw -c 'read -P 42 0 64k' 'nbd://127.0.0.1:10809/drv' 2>&1 \
+    | _filter_qemu_io | _filter_nbd
+
+_send_qemu_cmd $QEMU_HANDLE \
+    "{ 'execute': 'eject',
+       'arguments': { 'device': 'drv' }}" \
+    'return'
+
+$QEMU_IO_PROG -f raw -c close 'nbd://127.0.0.1:10809/drv' 2>&1 \
+    | _filter_qemu_io | _filter_nbd
+
+_send_qemu_cmd $QEMU_HANDLE \
+    "{ 'execute': 'quit' }" \
+    'return'
+
+wait=1 _cleanup_qemu
+
+# success, all done
+echo '*** done'
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/096.out b/tests/qemu-iotests/096.out
new file mode 100644
index 0000000..cc10e51
--- /dev/null
+++ b/tests/qemu-iotests/096.out
@@ -0,0 +1,16 @@
+QA output created by 096
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=65536
+wrote 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+{"return": {}}
+{"return": {}}
+{"return": {}}
+read 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "DEVICE_TRAY_MOVED", "data": {"device": "drv", "tray-open": true}}
+{"return": {}}
+qemu-io: can't open device nbd://127.0.0.1:PORT/drv: Failed to read export length
+no file open, try 'help open'
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN"}
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index ea43ebb..06d0485 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -102,6 +102,7 @@
 093 auto
 094 rw auto quick
 095 rw auto quick
+096 rw auto quick
 097 rw auto backing
 098 rw auto backing quick
 099 rw auto quick
-- 
2.1.0


* [Qemu-devel] [PATCH v4 04/11] quorum: Fix close path
  2015-02-27 16:43 [Qemu-devel] [PATCH v4 00/11] block: Rework bdrv_close_all() Max Reitz
                   ` (2 preceding siblings ...)
  2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 03/11] iotests: Add test for eject under NBD server Max Reitz
@ 2015-02-27 16:43 ` Max Reitz
  2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 05/11] block: Move BDS close notifiers into BB Max Reitz
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: Max Reitz @ 2015-02-27 16:43 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Fam Zheng, qemu-devel, Markus Armbruster,
	Stefan Hajnoczi, Paolo Bonzini, Max Reitz

bdrv_unref() can lead to bdrv_close(), which in turn will result in
bdrv_drain_all(). This function will later be renamed to blk_drain_all()
and will iterate only over the BlockBackends for which blk_is_inserted()
holds true; therefore, bdrv_is_inserted() and thus quorum_is_inserted()
will probably be called.

This patch makes quorum_is_inserted() aware of the fact that some
children may have been closed already.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/quorum.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/quorum.c b/block/quorum.c
index 7a75cea..5ae2398 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -1005,6 +1005,7 @@ static void quorum_close(BlockDriverState *bs)
 
     for (i = 0; i < s->num_children; i++) {
         bdrv_unref(s->bs[i]);
+        s->bs[i] = NULL;
     }
 
     g_free(s->bs);
@@ -1070,7 +1071,7 @@ static bool quorum_is_inserted(BlockDriverState *bs)
     int i;
 
     for (i = 0; i < s->num_children; i++) {
-        if (!bdrv_is_inserted(s->bs[i])) {
+        if (s->bs[i] && !bdrv_is_inserted(s->bs[i])) {
             return false;
         }
     }
-- 
2.1.0


* [Qemu-devel] [PATCH v4 05/11] block: Move BDS close notifiers into BB
  2015-02-27 16:43 [Qemu-devel] [PATCH v4 00/11] block: Rework bdrv_close_all() Max Reitz
                   ` (3 preceding siblings ...)
  2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 04/11] quorum: Fix close path Max Reitz
@ 2015-02-27 16:43 ` Max Reitz
  2015-02-27 20:08   ` Max Reitz
  2015-02-28  2:55   ` Fam Zheng
  2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 06/11] block: Use blk_remove_bs() in blk_delete() Max Reitz
                   ` (6 subsequent siblings)
  11 siblings, 2 replies; 20+ messages in thread
From: Max Reitz @ 2015-02-27 16:43 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Fam Zheng, qemu-devel, Markus Armbruster,
	Stefan Hajnoczi, Paolo Bonzini, Max Reitz

The only remaining user of the BDS close notifiers is NBD which uses
them to determine when a BDS tree is being ejected. This patch removes
the BDS-level close notifiers and adds a notifier list to the
BlockBackend structure that is invoked whenever a BDS is removed.

Symmetrically to that, another notifier list is added that is invoked
whenever a BDS is inserted. The dataplane implementations for virtio-blk
and virtio-scsi use both notifier types for setting up and removing op
blockers. This is not only important for setting up the op blockers on
insertion, but also for removing them on ejection since bdrv_delete()
asserts that there are no op blockers set up.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block.c                         |  7 ----
 block/block-backend.c           | 19 +++++++--
 blockdev-nbd.c                  | 36 +----------------
 hw/block/dataplane/virtio-blk.c | 77 +++++++++++++++++++++++++++---------
 hw/scsi/virtio-scsi.c           | 87 ++++++++++++++++++++++++++++++++++++++---
 include/block/block.h           |  1 -
 include/block/block_int.h       |  2 -
 include/hw/virtio/virtio-scsi.h | 10 +++++
 include/sysemu/block-backend.h  |  3 +-
 nbd.c                           | 13 ++++++
 10 files changed, 182 insertions(+), 73 deletions(-)

diff --git a/block.c b/block.c
index 7b912c6..41a9d24 100644
--- a/block.c
+++ b/block.c
@@ -371,7 +371,6 @@ BlockDriverState *bdrv_new(void)
     for (i = 0; i < BLOCK_OP_TYPE_MAX; i++) {
         QLIST_INIT(&bs->op_blockers[i]);
     }
-    notifier_list_init(&bs->close_notifiers);
     notifier_with_return_list_init(&bs->before_write_notifiers);
     qemu_co_queue_init(&bs->throttled_reqs[0]);
     qemu_co_queue_init(&bs->throttled_reqs[1]);
@@ -381,11 +380,6 @@ BlockDriverState *bdrv_new(void)
     return bs;
 }
 
-void bdrv_add_close_notifier(BlockDriverState *bs, Notifier *notify)
-{
-    notifier_list_add(&bs->close_notifiers, notify);
-}
-
 BlockDriver *bdrv_find_format(const char *format_name)
 {
     BlockDriver *drv1;
@@ -1898,7 +1892,6 @@ void bdrv_close(BlockDriverState *bs)
     bdrv_drain_all(); /* complete I/O */
     bdrv_flush(bs);
     bdrv_drain_all(); /* in case flush left pending I/O */
-    notifier_list_notify(&bs->close_notifiers, bs);
 
     if (bs->drv) {
         if (bs->backing_hd) {
diff --git a/block/block-backend.c b/block/block-backend.c
index 82ced04..254fde4 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -47,6 +47,8 @@ struct BlockBackend {
     BlockdevOnError on_read_error, on_write_error;
     bool iostatus_enabled;
     BlockDeviceIoStatus iostatus;
+
+    NotifierList remove_bs_notifiers, insert_bs_notifiers;
 };
 
 typedef struct BlockBackendAIOCB {
@@ -97,6 +99,8 @@ BlockBackend *blk_new(const char *name, Error **errp)
     blk = g_new0(BlockBackend, 1);
     blk->name = g_strdup(name);
     blk->refcnt = 1;
+    notifier_list_init(&blk->remove_bs_notifiers);
+    notifier_list_init(&blk->insert_bs_notifiers);
     QTAILQ_INSERT_TAIL(&blk_backends, blk, link);
     return blk;
 }
@@ -320,6 +324,8 @@ void blk_remove_bs(BlockBackend *blk)
         return;
     }
 
+    notifier_list_notify(&blk->remove_bs_notifiers, blk);
+
     blk_update_root_state(blk);
 
     bdrv_unref(blk->bs);
@@ -345,6 +351,8 @@ void blk_insert_bs(BlockBackend *blk, BlockDriverState *bs)
     }
 
     bs->blk = blk;
+
+    notifier_list_notify(&blk->insert_bs_notifiers, blk);
 }
 
 /*
@@ -1067,11 +1075,14 @@ void blk_remove_aio_context_notifier(BlockBackend *blk,
     }
 }
 
-void blk_add_close_notifier(BlockBackend *blk, Notifier *notify)
+void blk_add_remove_bs_notifier(BlockBackend *blk, Notifier *notify)
 {
-    if (blk->bs) {
-        bdrv_add_close_notifier(blk->bs, notify);
-    }
+    notifier_list_add(&blk->remove_bs_notifiers, notify);
+}
+
+void blk_add_insert_bs_notifier(BlockBackend *blk, Notifier *notify)
+{
+    notifier_list_add(&blk->insert_bs_notifiers, notify);
 }
 
 void blk_io_plug(BlockBackend *blk)
diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index 22e95d1..eb5f9a0 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -47,36 +47,11 @@ void qmp_nbd_server_start(SocketAddress *addr, Error **errp)
     }
 }
 
-/* Hook into the BlockDriverState notifiers to close the export when
- * the file is closed.
- */
-typedef struct NBDCloseNotifier {
-    Notifier n;
-    NBDExport *exp;
-    QTAILQ_ENTRY(NBDCloseNotifier) next;
-} NBDCloseNotifier;
-
-static QTAILQ_HEAD(, NBDCloseNotifier) close_notifiers =
-    QTAILQ_HEAD_INITIALIZER(close_notifiers);
-
-static void nbd_close_notifier(Notifier *n, void *data)
-{
-    NBDCloseNotifier *cn = DO_UPCAST(NBDCloseNotifier, n, n);
-
-    notifier_remove(&cn->n);
-    QTAILQ_REMOVE(&close_notifiers, cn, next);
-
-    nbd_export_close(cn->exp);
-    nbd_export_put(cn->exp);
-    g_free(cn);
-}
-
 void qmp_nbd_server_add(const char *device, bool has_writable, bool writable,
                         Error **errp)
 {
     BlockBackend *blk;
     NBDExport *exp;
-    NBDCloseNotifier *n;
 
     if (server_fd == -1) {
         error_setg(errp, "NBD server not running");
@@ -108,20 +83,11 @@ void qmp_nbd_server_add(const char *device, bool has_writable, bool writable,
     exp = nbd_export_new(blk, 0, -1, writable ? 0 : NBD_FLAG_READ_ONLY, NULL);
 
     nbd_export_set_name(exp, device);
-
-    n = g_new0(NBDCloseNotifier, 1);
-    n->n.notify = nbd_close_notifier;
-    n->exp = exp;
-    blk_add_close_notifier(blk, &n->n);
-    QTAILQ_INSERT_TAIL(&close_notifiers, n, next);
 }
 
 void qmp_nbd_server_stop(Error **errp)
 {
-    while (!QTAILQ_EMPTY(&close_notifiers)) {
-        NBDCloseNotifier *cn = QTAILQ_FIRST(&close_notifiers);
-        nbd_close_notifier(&cn->n, nbd_export_get_blockdev(cn->exp));
-    }
+    nbd_export_close_all();
 
     if (server_fd != -1) {
         qemu_set_fd_handler2(server_fd, NULL, NULL, NULL, NULL);
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index cd41478..057498e 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -39,6 +39,8 @@ struct VirtIOBlockDataPlane {
     EventNotifier *guest_notifier;  /* irq */
     QEMUBH *bh;                     /* bh for guest notification */
 
+    Notifier insert_notifier, remove_notifier;
+
     /* Note that these EventNotifiers are assigned by value.  This is
      * fine as long as you do not call event_notifier_cleanup on them
      * (because you don't own the file descriptor or handle; you just
@@ -138,6 +140,54 @@ static void handle_notify(EventNotifier *e)
     blk_io_unplug(s->conf->conf.blk);
 }
 
+static void data_plane_set_up_op_blockers(VirtIOBlockDataPlane *s)
+{
+    assert(!s->blocker);
+    error_setg(&s->blocker, "block device is in use by data plane");
+    blk_op_block_all(s->conf->conf.blk, s->blocker);
+    blk_op_unblock(s->conf->conf.blk, BLOCK_OP_TYPE_RESIZE, s->blocker);
+    blk_op_unblock(s->conf->conf.blk, BLOCK_OP_TYPE_DRIVE_DEL, s->blocker);
+    blk_op_unblock(s->conf->conf.blk, BLOCK_OP_TYPE_BACKUP_SOURCE, s->blocker);
+    blk_op_unblock(s->conf->conf.blk, BLOCK_OP_TYPE_CHANGE, s->blocker);
+    blk_op_unblock(s->conf->conf.blk, BLOCK_OP_TYPE_COMMIT_SOURCE, s->blocker);
+    blk_op_unblock(s->conf->conf.blk, BLOCK_OP_TYPE_COMMIT_TARGET, s->blocker);
+    blk_op_unblock(s->conf->conf.blk, BLOCK_OP_TYPE_EJECT, s->blocker);
+    blk_op_unblock(s->conf->conf.blk, BLOCK_OP_TYPE_EXTERNAL_SNAPSHOT,
+                   s->blocker);
+    blk_op_unblock(s->conf->conf.blk, BLOCK_OP_TYPE_INTERNAL_SNAPSHOT,
+                   s->blocker);
+    blk_op_unblock(s->conf->conf.blk, BLOCK_OP_TYPE_INTERNAL_SNAPSHOT_DELETE,
+                   s->blocker);
+    blk_op_unblock(s->conf->conf.blk, BLOCK_OP_TYPE_MIRROR, s->blocker);
+    blk_op_unblock(s->conf->conf.blk, BLOCK_OP_TYPE_STREAM, s->blocker);
+    blk_op_unblock(s->conf->conf.blk, BLOCK_OP_TYPE_REPLACE, s->blocker);
+}
+
+static void data_plane_remove_op_blockers(VirtIOBlockDataPlane *s)
+{
+    if (s->blocker) {
+        blk_op_unblock_all(s->conf->conf.blk, s->blocker);
+        error_free(s->blocker);
+        s->blocker = NULL;
+    }
+}
+
+static void data_plane_blk_insert_notifier(Notifier *n, void *data)
+{
+    VirtIOBlockDataPlane *s = container_of(n, VirtIOBlockDataPlane,
+                                           insert_notifier);
+    assert(s->conf->conf.blk == data);
+    data_plane_set_up_op_blockers(s);
+}
+
+static void data_plane_blk_remove_notifier(Notifier *n, void *data)
+{
+    VirtIOBlockDataPlane *s = container_of(n, VirtIOBlockDataPlane,
+                                           remove_notifier);
+    assert(s->conf->conf.blk == data);
+    data_plane_remove_op_blockers(s);
+}
+
 /* Context: QEMU global mutex held */
 void virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *conf,
                                   VirtIOBlockDataPlane **dataplane,
@@ -194,22 +244,12 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *conf,
     s->ctx = iothread_get_aio_context(s->iothread);
     s->bh = aio_bh_new(s->ctx, notify_guest_bh, s);
 
-    error_setg(&s->blocker, "block device is in use by data plane");
-    blk_op_block_all(conf->conf.blk, s->blocker);
-    blk_op_unblock(conf->conf.blk, BLOCK_OP_TYPE_RESIZE, s->blocker);
-    blk_op_unblock(conf->conf.blk, BLOCK_OP_TYPE_DRIVE_DEL, s->blocker);
-    blk_op_unblock(conf->conf.blk, BLOCK_OP_TYPE_BACKUP_SOURCE, s->blocker);
-    blk_op_unblock(conf->conf.blk, BLOCK_OP_TYPE_CHANGE, s->blocker);
-    blk_op_unblock(conf->conf.blk, BLOCK_OP_TYPE_COMMIT_SOURCE, s->blocker);
-    blk_op_unblock(conf->conf.blk, BLOCK_OP_TYPE_COMMIT_TARGET, s->blocker);
-    blk_op_unblock(conf->conf.blk, BLOCK_OP_TYPE_EJECT, s->blocker);
-    blk_op_unblock(conf->conf.blk, BLOCK_OP_TYPE_EXTERNAL_SNAPSHOT, s->blocker);
-    blk_op_unblock(conf->conf.blk, BLOCK_OP_TYPE_INTERNAL_SNAPSHOT, s->blocker);
-    blk_op_unblock(conf->conf.blk, BLOCK_OP_TYPE_INTERNAL_SNAPSHOT_DELETE,
-                   s->blocker);
-    blk_op_unblock(conf->conf.blk, BLOCK_OP_TYPE_MIRROR, s->blocker);
-    blk_op_unblock(conf->conf.blk, BLOCK_OP_TYPE_STREAM, s->blocker);
-    blk_op_unblock(conf->conf.blk, BLOCK_OP_TYPE_REPLACE, s->blocker);
+    s->insert_notifier.notify = data_plane_blk_insert_notifier;
+    s->remove_notifier.notify = data_plane_blk_remove_notifier;
+    blk_add_insert_bs_notifier(conf->conf.blk, &s->insert_notifier);
+    blk_add_remove_bs_notifier(conf->conf.blk, &s->remove_notifier);
+
+    data_plane_set_up_op_blockers(s);
 
     *dataplane = s;
 }
@@ -222,8 +262,9 @@ void virtio_blk_data_plane_destroy(VirtIOBlockDataPlane *s)
     }
 
     virtio_blk_data_plane_stop(s);
-    blk_op_unblock_all(s->conf->conf.blk, s->blocker);
-    error_free(s->blocker);
+    data_plane_remove_op_blockers(s);
+    notifier_remove(&s->insert_notifier);
+    notifier_remove(&s->remove_notifier);
     object_unref(OBJECT(s->iothread));
     qemu_bh_delete(s->bh);
     g_free(s);
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index 5469bad..570f2fd 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -755,6 +755,37 @@ static void virtio_scsi_change(SCSIBus *bus, SCSIDevice *dev, SCSISense sense)
     }
 }
 
+static void virtio_scsi_set_up_op_blockers(VirtIOSCSI *s, SCSIDevice *sd)
+{
+    if (!s->blocker) {
+        error_setg(&s->blocker, "block device is in use by data plane");
+    }
+    blk_op_block_all(sd->conf.blk, s->blocker);
+}
+
+static void virtio_scsi_remove_op_blockers(VirtIOSCSI *s, SCSIDevice *sd)
+{
+    if (s->blocker) {
+        blk_op_unblock_all(sd->conf.blk, s->blocker);
+    }
+}
+
+static void virtio_scsi_blk_insert_notifier(Notifier *n, void *data)
+{
+    VirtIOSCSIBlkChangeNotifier *cn = DO_UPCAST(VirtIOSCSIBlkChangeNotifier,
+                                                n, n);
+    assert(cn->sd->conf.blk == data);
+    virtio_scsi_set_up_op_blockers(cn->s, cn->sd);
+}
+
+static void virtio_scsi_blk_remove_notifier(Notifier *n, void *data)
+{
+    VirtIOSCSIBlkChangeNotifier *cn = DO_UPCAST(VirtIOSCSIBlkChangeNotifier,
+                                                n, n);
+    assert(cn->sd->conf.blk == data);
+    virtio_scsi_remove_op_blockers(cn->s, cn->sd);
+}
+
 static void virtio_scsi_hotplug(HotplugHandler *hotplug_dev, DeviceState *dev,
                                 Error **errp)
 {
@@ -763,12 +794,26 @@ static void virtio_scsi_hotplug(HotplugHandler *hotplug_dev, DeviceState *dev,
     SCSIDevice *sd = SCSI_DEVICE(dev);
 
     if (s->ctx && !s->dataplane_disabled) {
+        VirtIOSCSIBlkChangeNotifier *insert_notifier, *remove_notifier;
+
+        insert_notifier = g_new0(VirtIOSCSIBlkChangeNotifier, 1);
+        insert_notifier->n.notify = virtio_scsi_blk_insert_notifier;
+        insert_notifier->s = s;
+        insert_notifier->sd = sd;
+        blk_add_insert_bs_notifier(sd->conf.blk, &insert_notifier->n);
+        QTAILQ_INSERT_TAIL(&s->insert_notifiers, insert_notifier, next);
+
+        remove_notifier = g_new0(VirtIOSCSIBlkChangeNotifier, 1);
+        remove_notifier->n.notify = virtio_scsi_blk_remove_notifier;
+        remove_notifier->s = s;
+        remove_notifier->sd = sd;
+        blk_add_remove_bs_notifier(sd->conf.blk, &remove_notifier->n);
+        QTAILQ_INSERT_TAIL(&s->remove_notifiers, remove_notifier, next);
+
         if (blk_op_is_blocked(sd->conf.blk, BLOCK_OP_TYPE_DATAPLANE, errp)) {
             return;
         }
-        assert(!s->blocker);
-        error_setg(&s->blocker, "block device is in use by data plane");
-        blk_op_block_all(sd->conf.blk, s->blocker);
+        virtio_scsi_set_up_op_blockers(s, sd);
     }
 
     if ((vdev->guest_features >> VIRTIO_SCSI_F_HOTPLUG) & 1) {
@@ -784,6 +829,7 @@ static void virtio_scsi_hotunplug(HotplugHandler *hotplug_dev, DeviceState *dev,
     VirtIODevice *vdev = VIRTIO_DEVICE(hotplug_dev);
     VirtIOSCSI *s = VIRTIO_SCSI(vdev);
     SCSIDevice *sd = SCSI_DEVICE(dev);
+    VirtIOSCSIBlkChangeNotifier *insert_notifier, *remove_notifier;
 
     if ((vdev->guest_features >> VIRTIO_SCSI_F_HOTPLUG) & 1) {
         virtio_scsi_push_event(s, sd,
@@ -791,11 +837,39 @@ static void virtio_scsi_hotunplug(HotplugHandler *hotplug_dev, DeviceState *dev,
                                VIRTIO_SCSI_EVT_RESET_REMOVED);
     }
 
-    if (s->ctx && s->blocker) {
-        blk_op_unblock_all(sd->conf.blk, s->blocker);
+    if (s->ctx) {
+        virtio_scsi_remove_op_blockers(s, sd);
+    }
+
+    QTAILQ_FOREACH(insert_notifier, &s->insert_notifiers, next) {
+        if (insert_notifier->sd == sd) {
+            break;
+        }
+    }
+    if (insert_notifier) {
+        notifier_remove(&insert_notifier->n);
+        QTAILQ_REMOVE(&s->insert_notifiers, insert_notifier, next);
+        g_free(insert_notifier);
+    }
+
+    QTAILQ_FOREACH(remove_notifier, &s->remove_notifiers, next) {
+        if (remove_notifier->sd == sd) {
+            break;
+        }
+    }
+    if (remove_notifier) {
+        notifier_remove(&remove_notifier->n);
+        QTAILQ_REMOVE(&s->remove_notifiers, remove_notifier, next);
+        g_free(remove_notifier);
+    }
+
+    if (QTAILQ_EMPTY(&s->insert_notifiers) &&
+        QTAILQ_EMPTY(&s->remove_notifiers))
+    {
         error_free(s->blocker);
         s->blocker = NULL;
     }
+
     qdev_simple_device_unplug_cb(hotplug_dev, dev, errp);
 }
 
@@ -908,6 +982,9 @@ static void virtio_scsi_device_realize(DeviceState *dev, Error **errp)
                     virtio_scsi_save, virtio_scsi_load, s);
     s->migration_state_notifier.notify = virtio_scsi_migration_state_changed;
     add_migration_state_change_notifier(&s->migration_state_notifier);
+
+    QTAILQ_INIT(&s->insert_notifiers);
+    QTAILQ_INIT(&s->remove_notifiers);
 }
 
 static void virtio_scsi_instance_init(Object *obj)
diff --git a/include/block/block.h b/include/block/block.h
index 1422f01..dc94084 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -207,7 +207,6 @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state,
 void bdrv_reopen_commit(BDRVReopenState *reopen_state);
 void bdrv_reopen_abort(BDRVReopenState *reopen_state);
 void bdrv_close(BlockDriverState *bs);
-void bdrv_add_close_notifier(BlockDriverState *bs, Notifier *notify);
 int bdrv_read(BlockDriverState *bs, int64_t sector_num,
               uint8_t *buf, int nb_sectors);
 int bdrv_read_unthrottled(BlockDriverState *bs, int64_t sector_num,
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 8fcfc29..b2c1d87 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -367,8 +367,6 @@ struct BlockDriverState {
     BlockDriverState *backing_hd;
     BlockDriverState *file;
 
-    NotifierList close_notifiers;
-
     /* Callback before write request is processed */
     NotifierWithReturnList before_write_notifiers;
 
diff --git a/include/hw/virtio/virtio-scsi.h b/include/hw/virtio/virtio-scsi.h
index bf17cc9..3f90fc8 100644
--- a/include/hw/virtio/virtio-scsi.h
+++ b/include/hw/virtio/virtio-scsi.h
@@ -176,6 +176,13 @@ typedef struct VirtIOSCSICommon {
     VirtQueue **cmd_vqs;
 } VirtIOSCSICommon;
 
+typedef struct VirtIOSCSIBlkChangeNotifier {
+    Notifier n;
+    struct VirtIOSCSI *s;
+    SCSIDevice *sd;
+    QTAILQ_ENTRY(VirtIOSCSIBlkChangeNotifier) next;
+} VirtIOSCSIBlkChangeNotifier;
+
 typedef struct VirtIOSCSI {
     VirtIOSCSICommon parent_obj;
 
@@ -186,6 +193,9 @@ typedef struct VirtIOSCSI {
     /* Fields for dataplane below */
     AioContext *ctx; /* one iothread per virtio-scsi-pci for now */
 
+    QTAILQ_HEAD(, VirtIOSCSIBlkChangeNotifier) insert_notifiers;
+    QTAILQ_HEAD(, VirtIOSCSIBlkChangeNotifier) remove_notifiers;
+
     /* Vring is used instead of vq in dataplane code, because of the underlying
      * memory layer thread safety */
     VirtIOSCSIVring *ctrl_vring;
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index 3aad010..e0a2749 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -158,7 +158,8 @@ void blk_remove_aio_context_notifier(BlockBackend *blk,
                                                                   void *),
                                      void (*detach_aio_context)(void *),
                                      void *opaque);
-void blk_add_close_notifier(BlockBackend *blk, Notifier *notify);
+void blk_add_remove_bs_notifier(BlockBackend *blk, Notifier *notify);
+void blk_add_insert_bs_notifier(BlockBackend *blk, Notifier *notify);
 void blk_io_plug(BlockBackend *blk);
 void blk_io_unplug(BlockBackend *blk);
 BlockAcctStats *blk_get_stats(BlockBackend *blk);
diff --git a/nbd.c b/nbd.c
index 71159af..e1796c0 100644
--- a/nbd.c
+++ b/nbd.c
@@ -109,6 +109,8 @@ struct NBDExport {
     QTAILQ_ENTRY(NBDExport) next;
 
     AioContext *ctx;
+
+    Notifier eject_notifier;
 };
 
 static QTAILQ_HEAD(, NBDExport) exports = QTAILQ_HEAD_INITIALIZER(exports);
@@ -964,6 +966,12 @@ static void blk_aio_detach(void *opaque)
     exp->ctx = NULL;
 }
 
+static void nbd_eject_notifier(Notifier *n, void *data)
+{
+    NBDExport *exp = container_of(n, NBDExport, eject_notifier);
+    nbd_export_close(exp);
+}
+
 NBDExport *nbd_export_new(BlockBackend *blk, off_t dev_offset, off_t size,
                           uint32_t nbdflags, void (*close)(NBDExport *))
 {
@@ -978,6 +986,10 @@ NBDExport *nbd_export_new(BlockBackend *blk, off_t dev_offset, off_t size,
     exp->ctx = blk_get_aio_context(blk);
     blk_ref(blk);
     blk_add_aio_context_notifier(blk, blk_aio_attached, blk_aio_detach, exp);
+
+    exp->eject_notifier.notify = nbd_eject_notifier;
+    blk_add_remove_bs_notifier(blk, &exp->eject_notifier);
+
     /*
      * NBD exports are used for non-shared storage migration.  Make sure
      * that BDRV_O_INCOMING is cleared and the image is ready for write
@@ -1031,6 +1043,7 @@ void nbd_export_close(NBDExport *exp)
     nbd_export_set_name(exp, NULL);
     nbd_export_put(exp);
     if (exp->blk) {
+        notifier_remove(&exp->eject_notifier);
         blk_remove_aio_context_notifier(exp->blk, blk_aio_attached,
                                         blk_aio_detach, exp);
         blk_unref(exp->blk);
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v4 06/11] block: Use blk_remove_bs() in blk_delete()
  2015-02-27 16:43 [Qemu-devel] [PATCH v4 00/11] block: Rework bdrv_close_all() Max Reitz
                   ` (4 preceding siblings ...)
  2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 05/11] block: Move BDS close notifiers into BB Max Reitz
@ 2015-02-27 16:43 ` Max Reitz
  2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 07/11] blockdev: Use blk_remove_bs() in do_drive_del() Max Reitz
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: Max Reitz @ 2015-02-27 16:43 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Fam Zheng, qemu-devel, Markus Armbruster,
	Stefan Hajnoczi, Paolo Bonzini, Max Reitz

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/block-backend.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index 254fde4..7e9d53a 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -163,12 +163,7 @@ static void blk_delete(BlockBackend *blk)
 {
     assert(!blk->refcnt);
     assert(!blk->dev);
-    if (blk->bs) {
-        assert(blk->bs->blk == blk);
-        blk->bs->blk = NULL;
-        bdrv_unref(blk->bs);
-        blk->bs = NULL;
-    }
+    blk_remove_bs(blk);
     /* Avoid double-remove after blk_hide_on_behalf_of_do_drive_del() */
     if (blk->name[0]) {
         QTAILQ_REMOVE(&blk_backends, blk, link);
@@ -324,6 +319,8 @@ void blk_remove_bs(BlockBackend *blk)
         return;
     }
 
+    assert(blk->bs->blk == blk);
+
     notifier_list_notify(&blk->remove_bs_notifiers, blk);
 
     blk_update_root_state(blk);
-- 
2.1.0


* [Qemu-devel] [PATCH v4 07/11] blockdev: Use blk_remove_bs() in do_drive_del()
  2015-02-27 16:43 [Qemu-devel] [PATCH v4 00/11] block: Rework bdrv_close_all() Max Reitz
                   ` (5 preceding siblings ...)
  2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 06/11] block: Use blk_remove_bs() in blk_delete() Max Reitz
@ 2015-02-27 16:43 ` Max Reitz
  2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 08/11] block: Make bdrv_close() static Max Reitz
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: Max Reitz @ 2015-02-27 16:43 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Fam Zheng, qemu-devel, Markus Armbruster,
	Stefan Hajnoczi, Paolo Bonzini, Max Reitz

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 blockdev.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index f198be6..e5f5ebc 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2229,11 +2229,7 @@ int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data)
     }
 
     /* quiesce block driver; prevent further io */
-    bdrv_drain_all();
-    if (bs) {
-        bdrv_flush(bs);
-        bdrv_close(bs);
-    }
+    blk_remove_bs(blk);
 
     /* if we have a device attached to this BlockDriverState
      * then we need to make the drive anonymous until the device
-- 
2.1.0


* [Qemu-devel] [PATCH v4 08/11] block: Make bdrv_close() static
  2015-02-27 16:43 [Qemu-devel] [PATCH v4 00/11] block: Rework bdrv_close_all() Max Reitz
                   ` (6 preceding siblings ...)
  2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 07/11] blockdev: Use blk_remove_bs() in do_drive_del() Max Reitz
@ 2015-02-27 16:43 ` Max Reitz
  2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 09/11] blockdev: Keep track of monitor-owned BDS Max Reitz
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: Max Reitz @ 2015-02-27 16:43 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Fam Zheng, qemu-devel, Markus Armbruster,
	Stefan Hajnoczi, Paolo Bonzini, Max Reitz

There are no users of bdrv_close() left, except for one of bdrv_open()'s
failure paths, bdrv_close_all(), and bdrv_delete(), and that is good.
Make bdrv_close() static so nobody can mistakenly call bdrv_close()
directly again.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block.c               | 4 +++-
 include/block/block.h | 1 -
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index 41a9d24..819c54d 100644
--- a/block.c
+++ b/block.c
@@ -104,6 +104,8 @@ static void bdrv_reset_dirty(BlockDriverState *bs, int64_t cur_sector,
 /* If non-zero, use only whitelisted block drivers */
 static int use_bdrv_whitelist;
 
+static void bdrv_close(BlockDriverState *bs);
+
 #ifdef _WIN32
 static int is_windows_drive_prefix(const char *filename)
 {
@@ -1882,7 +1884,7 @@ void bdrv_reopen_abort(BDRVReopenState *reopen_state)
 }
 
 
-void bdrv_close(BlockDriverState *bs)
+static void bdrv_close(BlockDriverState *bs)
 {
     BdrvAioNotifier *ban, *ban_next;
 
diff --git a/include/block/block.h b/include/block/block.h
index dc94084..eadb25f 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -206,7 +206,6 @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state,
                         BlockReopenQueue *queue, Error **errp);
 void bdrv_reopen_commit(BDRVReopenState *reopen_state);
 void bdrv_reopen_abort(BDRVReopenState *reopen_state);
-void bdrv_close(BlockDriverState *bs);
 int bdrv_read(BlockDriverState *bs, int64_t sector_num,
               uint8_t *buf, int nb_sectors);
 int bdrv_read_unthrottled(BlockDriverState *bs, int64_t sector_num,
-- 
2.1.0


* [Qemu-devel] [PATCH v4 09/11] blockdev: Keep track of monitor-owned BDS
  2015-02-27 16:43 [Qemu-devel] [PATCH v4 00/11] block: Rework bdrv_close_all() Max Reitz
                   ` (7 preceding siblings ...)
  2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 08/11] block: Make bdrv_close() static Max Reitz
@ 2015-02-27 16:43 ` Max Reitz
  2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 10/11] block: Eject BDS tree from BB at bdrv_close_all() Max Reitz
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: Max Reitz @ 2015-02-27 16:43 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Fam Zheng, qemu-devel, Markus Armbruster,
	Stefan Hajnoczi, Paolo Bonzini, Max Reitz

...and release the references in bdrv_close_all().

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block.c                                |  5 +++++
 blockdev.c                             | 18 ++++++++++++++++++
 include/block/block_int.h              |  4 ++++
 stubs/Makefile.objs                    |  1 +
 stubs/blockdev-close-all-bdrv-states.c |  5 +++++
 5 files changed, 33 insertions(+)
 create mode 100644 stubs/blockdev-close-all-bdrv-states.c

diff --git a/block.c b/block.c
index 819c54d..0b0792c 100644
--- a/block.c
+++ b/block.c
@@ -1943,6 +1943,8 @@ void bdrv_close_all(void)
 {
     BlockDriverState *bs;
 
+    blockdev_close_all_bdrv_states();
+
     QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
         AioContext *aio_context = bdrv_get_aio_context(bs);
 
@@ -2090,6 +2092,9 @@ static void bdrv_move_feature_fields(BlockDriverState *bs_dest,
 
     /* keep the same entry in bdrv_states */
     bs_dest->device_list = bs_src->device_list;
+    /* keep the same entry in the list of monitor-owned BDS */
+    bs_dest->monitor_list = bs_src->monitor_list;
+
     bs_dest->blk = bs_src->blk;
 
     memcpy(bs_dest->op_blockers, bs_src->op_blockers,
diff --git a/blockdev.c b/blockdev.c
index e5f5ebc..535e83f 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -47,6 +47,9 @@
 #include "trace.h"
 #include "sysemu/arch_init.h"
 
+static QTAILQ_HEAD(, BlockDriverState) monitor_bdrv_states =
+    QTAILQ_HEAD_INITIALIZER(monitor_bdrv_states);
+
 static const char *const if_name[IF_COUNT] = {
     [IF_NONE] = "none",
     [IF_IDE] = "ide",
@@ -648,6 +651,19 @@ fail:
     return NULL;
 }
 
+void blockdev_close_all_bdrv_states(void)
+{
+    BlockDriverState *bs, *next_bs;
+
+    QTAILQ_FOREACH_SAFE(bs, &monitor_bdrv_states, monitor_list, next_bs) {
+        AioContext *ctx = bdrv_get_aio_context(bs);
+
+        aio_context_acquire(ctx);
+        bdrv_unref(bs);
+        aio_context_release(ctx);
+    }
+}
+
 static void qemu_opt_rename(QemuOpts *opts, const char *from, const char *to,
                             Error **errp)
 {
@@ -3166,6 +3182,8 @@ void qmp_blockdev_add(BlockdevOptions *options, Error **errp)
         if (!bs) {
             goto fail;
         }
+
+        QTAILQ_INSERT_TAIL(&monitor_bdrv_states, bs, monitor_list);
     }
 
     if (bs && bdrv_key_required(bs)) {
diff --git a/include/block/block_int.h b/include/block/block_int.h
index b2c1d87..1b01f45 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -399,6 +399,8 @@ struct BlockDriverState {
     QTAILQ_ENTRY(BlockDriverState) node_list;
     /* element of the list of "drives" the guest sees */
     QTAILQ_ENTRY(BlockDriverState) device_list;
+    /* element of the list of monitor-owned BDS */
+    QTAILQ_ENTRY(BlockDriverState) monitor_list;
     QLIST_HEAD(, BdrvDirtyBitmap) dirty_bitmaps;
     int refcnt;
 
@@ -622,4 +624,6 @@ bool blk_dev_is_tray_open(BlockBackend *blk);
 bool blk_dev_is_medium_locked(BlockBackend *blk);
 void blk_dev_resize_cb(BlockBackend *blk);
 
+void blockdev_close_all_bdrv_states(void);
+
 #endif /* BLOCK_INT_H */
diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
index 5e347d0..9169a09 100644
--- a/stubs/Makefile.objs
+++ b/stubs/Makefile.objs
@@ -1,5 +1,6 @@
 stub-obj-y += arch-query-cpu-def.o
 stub-obj-y += bdrv-commit-all.o
+stub-obj-y += blockdev-close-all-bdrv-states.o
 stub-obj-y += chr-baum-init.o
 stub-obj-y += chr-msmouse.o
 stub-obj-y += chr-testdev.o
diff --git a/stubs/blockdev-close-all-bdrv-states.c b/stubs/blockdev-close-all-bdrv-states.c
new file mode 100644
index 0000000..12d2442
--- /dev/null
+++ b/stubs/blockdev-close-all-bdrv-states.c
@@ -0,0 +1,5 @@
+#include "block/block_int.h"
+
+void blockdev_close_all_bdrv_states(void)
+{
+}
-- 
2.1.0


* [Qemu-devel] [PATCH v4 10/11] block: Eject BDS tree from BB at bdrv_close_all()
  2015-02-27 16:43 [Qemu-devel] [PATCH v4 00/11] block: Rework bdrv_close_all() Max Reitz
                   ` (8 preceding siblings ...)
  2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 09/11] blockdev: Keep track of monitor-owned BDS Max Reitz
@ 2015-02-27 16:43 ` Max Reitz
  2015-02-27 16:44 ` [Qemu-devel] [PATCH v4 11/11] iotests: Add test for multiple BB on BDS tree Max Reitz
  2015-03-02  9:25 ` [Qemu-devel] [PATCH v4 00/11] block: Rework bdrv_close_all() Kevin Wolf
  11 siblings, 0 replies; 20+ messages in thread
From: Max Reitz @ 2015-02-27 16:43 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Fam Zheng, qemu-devel, Markus Armbruster,
	Stefan Hajnoczi, Paolo Bonzini, Max Reitz

When bdrv_close_all() is called, instead of force-closing all root
BlockDriverStates, it is better to simply drop the reference held by
each BlockBackend and let the BDSs be closed automatically once their
reference count reaches zero. This prevents a BDS that is still
referenced by another BDS from being closed prematurely, which could
result in loss of cached data.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block.c                        | 11 +----------
 block/block-backend.c          | 13 +++++++++++++
 include/sysemu/block-backend.h |  1 +
 3 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/block.c b/block.c
index 0b0792c..a2637b6 100644
--- a/block.c
+++ b/block.c
@@ -1941,17 +1941,8 @@ static void bdrv_close(BlockDriverState *bs)
 
 void bdrv_close_all(void)
 {
-    BlockDriverState *bs;
-
     blockdev_close_all_bdrv_states();
-
-    QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
-        AioContext *aio_context = bdrv_get_aio_context(bs);
-
-        aio_context_acquire(aio_context);
-        bdrv_close(bs);
-        aio_context_release(aio_context);
-    }
+    blk_remove_all_bs();
 }
 
 /* Check if any requests are in-flight (including throttled requests) */
diff --git a/block/block-backend.c b/block/block-backend.c
index 7e9d53a..2e820fe 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -207,6 +207,19 @@ void blk_unref(BlockBackend *blk)
     }
 }
 
+void blk_remove_all_bs(void)
+{
+    BlockBackend *blk;
+
+    QTAILQ_FOREACH(blk, &blk_backends, link) {
+        AioContext *ctx = blk_get_aio_context(blk);
+
+        aio_context_acquire(ctx);
+        blk_remove_bs(blk);
+        aio_context_release(ctx);
+    }
+}
+
 /*
  * Return the BlockBackend after @blk.
  * If @blk is null, return the first one.
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index e0a2749..ab765a7 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -67,6 +67,7 @@ BlockBackend *blk_new_open(const char *name, const char *filename,
                            Error **errp);
 void blk_ref(BlockBackend *blk);
 void blk_unref(BlockBackend *blk);
+void blk_remove_all_bs(void);
 const char *blk_name(BlockBackend *blk);
 BlockBackend *blk_by_name(const char *name);
 BlockBackend *blk_next(BlockBackend *blk);
-- 
2.1.0


* [Qemu-devel] [PATCH v4 11/11] iotests: Add test for multiple BB on BDS tree
  2015-02-27 16:43 [Qemu-devel] [PATCH v4 00/11] block: Rework bdrv_close_all() Max Reitz
                   ` (9 preceding siblings ...)
  2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 10/11] block: Eject BDS tree from BB at bdrv_close_all() Max Reitz
@ 2015-02-27 16:44 ` Max Reitz
  2015-03-02  9:25 ` [Qemu-devel] [PATCH v4 00/11] block: Rework bdrv_close_all() Kevin Wolf
  11 siblings, 0 replies; 20+ messages in thread
From: Max Reitz @ 2015-02-27 16:44 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Fam Zheng, qemu-devel, Markus Armbruster,
	Stefan Hajnoczi, Paolo Bonzini, Max Reitz

This adds a test for having multiple BlockBackends in one BDS tree. In
this case, there is one BB for the protocol BDS and one BB for the
format BDS in a simple two-BDS tree (with the protocol BDS and BB added
first).

When bdrv_close_all() is executed, no cached data from any BDS should be
lost; in particular, the protocol BDS must not be closed before the
format BDS is closed, since otherwise metadata updates may be lost.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/117     | 86 ++++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/117.out | 14 ++++++++
 tests/qemu-iotests/group   |  1 +
 3 files changed, 101 insertions(+)
 create mode 100755 tests/qemu-iotests/117
 create mode 100644 tests/qemu-iotests/117.out

diff --git a/tests/qemu-iotests/117 b/tests/qemu-iotests/117
new file mode 100755
index 0000000..7881fc7
--- /dev/null
+++ b/tests/qemu-iotests/117
@@ -0,0 +1,86 @@
+#!/bin/bash
+#
+# Test case for shared BDS between backend trees
+#
+# Copyright (C) 2015 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+# creator
+owner=mreitz@redhat.com
+
+seq="$(basename $0)"
+echo "QA output created by $seq"
+
+here="$PWD"
+tmp=/tmp/$$
+status=1	# failure is the default!
+
+_cleanup()
+{
+	_cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+. ./common.qemu
+
+_supported_fmt qcow2
+_supported_proto file
+_supported_os Linux
+
+_make_test_img 64k
+
+_launch_qemu
+
+_send_qemu_cmd $QEMU_HANDLE \
+    "{ 'execute': 'qmp_capabilities' }" \
+    'return'
+
+_send_qemu_cmd $QEMU_HANDLE \
+    "{ 'execute': 'blockdev-add',
+       'arguments': { 'options': { 'id': 'protocol',
+                                   'driver': 'file',
+                                   'filename': '$TEST_IMG' } } }" \
+    'return'
+
+_send_qemu_cmd $QEMU_HANDLE \
+    "{ 'execute': 'blockdev-add',
+       'arguments': { 'options': { 'id': 'format',
+                                   'driver': '$IMGFMT',
+                                   'file': 'protocol' } } }" \
+    'return'
+
+_send_qemu_cmd $QEMU_HANDLE \
+    "{ 'execute': 'human-monitor-command',
+       'arguments': { 'command-line': 'qemu-io format \"write -P 42 0 64k\"' } }" \
+    'return'
+
+_send_qemu_cmd $QEMU_HANDLE \
+    "{ 'execute': 'quit' }" \
+    'return'
+
+wait=1 _cleanup_qemu
+
+_check_test_img
+
+$QEMU_IO -c 'read -P 42 0 64k' "$TEST_IMG" | _filter_qemu_io
+
+# success, all done
+echo '*** done'
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/117.out b/tests/qemu-iotests/117.out
new file mode 100644
index 0000000..f52dc1a
--- /dev/null
+++ b/tests/qemu-iotests/117.out
@@ -0,0 +1,14 @@
+QA output created by 117
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=65536
+{"return": {}}
+{"return": {}}
+{"return": {}}
+wrote 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+{"return": ""}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN"}
+No errors were found on the image.
+read 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 06d0485..7436300 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -121,5 +121,6 @@
 113 rw auto quick
 114 rw auto quick
 116 rw auto quick
+117 rw auto
 118 rw auto
 123 rw auto quick
-- 
2.1.0


* Re: [Qemu-devel] [PATCH v4 05/11] block: Move BDS close notifiers into BB
  2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 05/11] block: Move BDS close notifiers into BB Max Reitz
@ 2015-02-27 20:08   ` Max Reitz
  2015-02-28  2:55   ` Fam Zheng
  1 sibling, 0 replies; 20+ messages in thread
From: Max Reitz @ 2015-02-27 20:08 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Fam Zheng, qemu-devel, Markus Armbruster,
	Stefan Hajnoczi, Paolo Bonzini

On 2015-02-27 at 11:43, Max Reitz wrote:
> The only remaining user of the BDS close notifiers is NBD which uses
> them to determine when a BDS tree is being ejected. This patch removes
> the BDS-level close notifiers and adds a notifier list to the
> BlockBackend structure that is invoked whenever a BDS is removed.
>
> Symmetrically to that, another notifier list is added that is invoked
> whenever a BDS is inserted. The dataplane implementations for virtio-blk
> and virtio-scsi use both notifier types for setting up and removing op
> blockers. This is not only important for setting up the op blockers on
> insertion, but also for removing them on ejection since bdrv_delete()
> asserts that there are no op blockers set up.
>
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>   block.c                         |  7 ----
>   block/block-backend.c           | 19 +++++++--
>   blockdev-nbd.c                  | 36 +----------------
>   hw/block/dataplane/virtio-blk.c | 77 +++++++++++++++++++++++++++---------
>   hw/scsi/virtio-scsi.c           | 87 ++++++++++++++++++++++++++++++++++++++---
>   include/block/block.h           |  1 -
>   include/block/block_int.h       |  2 -
>   include/hw/virtio/virtio-scsi.h | 10 +++++
>   include/sysemu/block-backend.h  |  3 +-
>   nbd.c                           | 13 ++++++
>   10 files changed, 182 insertions(+), 73 deletions(-)

[snip]

> diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
> index 5469bad..570f2fd 100644
> --- a/hw/scsi/virtio-scsi.c
> +++ b/hw/scsi/virtio-scsi.c
> @@ -755,6 +755,37 @@ static void virtio_scsi_change(SCSIBus *bus, SCSIDevice *dev, SCSISense sense)
>       }
>   }
>   
> +static void virtio_scsi_set_up_op_blockers(VirtIOSCSI *s, SCSIDevice *sd)
> +{
> +    if (!s->blocker) {
> +        error_setg(&s->blocker, "block device is in use by data plane");

This will no longer be necessary with v2 of my virtio-scsi op blocker patch.

> +    }
> +    blk_op_block_all(sd->conf.blk, s->blocker);
> +}
> +
> +static void virtio_scsi_remove_op_blockers(VirtIOSCSI *s, SCSIDevice *sd)
> +{
> +    if (s->blocker) {
> +        blk_op_unblock_all(sd->conf.blk, s->blocker);
> +    }
> +}
> +
> +static void virtio_scsi_blk_insert_notifier(Notifier *n, void *data)
> +{
> +    VirtIOSCSIBlkChangeNotifier *cn = DO_UPCAST(VirtIOSCSIBlkChangeNotifier,
> +                                                n, n);
> +    assert(cn->sd->conf.blk == data);
> +    virtio_scsi_set_up_op_blockers(cn->s, cn->sd);
> +}
> +
> +static void virtio_scsi_blk_remove_notifier(Notifier *n, void *data)
> +{
> +    VirtIOSCSIBlkChangeNotifier *cn = DO_UPCAST(VirtIOSCSIBlkChangeNotifier,
> +                                                n, n);
> +    assert(cn->sd->conf.blk == data);
> +    virtio_scsi_remove_op_blockers(cn->s, cn->sd);
> +}
> +
>   static void virtio_scsi_hotplug(HotplugHandler *hotplug_dev, DeviceState *dev,
>                                   Error **errp)
>   {
> @@ -763,12 +794,26 @@ static void virtio_scsi_hotplug(HotplugHandler *hotplug_dev, DeviceState *dev,
>       SCSIDevice *sd = SCSI_DEVICE(dev);
>   
>       if (s->ctx && !s->dataplane_disabled) {
> +        VirtIOSCSIBlkChangeNotifier *insert_notifier, *remove_notifier;
> +
> +        insert_notifier = g_new0(VirtIOSCSIBlkChangeNotifier, 1);
> +        insert_notifier->n.notify = virtio_scsi_blk_insert_notifier;
> +        insert_notifier->s = s;
> +        insert_notifier->sd = sd;
> +        blk_add_insert_bs_notifier(sd->conf.blk, &insert_notifier->n);
> +        QTAILQ_INSERT_TAIL(&s->insert_notifiers, insert_notifier, next);
> +
> +        remove_notifier = g_new0(VirtIOSCSIBlkChangeNotifier, 1);
> +        remove_notifier->n.notify = virtio_scsi_blk_remove_notifier;
> +        remove_notifier->s = s;
> +        remove_notifier->sd = sd;
> +        blk_add_remove_bs_notifier(sd->conf.blk, &remove_notifier->n);
> +        QTAILQ_INSERT_TAIL(&s->remove_notifiers, remove_notifier, next);
> +
>           if (blk_op_is_blocked(sd->conf.blk, BLOCK_OP_TYPE_DATAPLANE, errp)) {
>               return;
>           }
> -        assert(!s->blocker);
> -        error_setg(&s->blocker, "block device is in use by data plane");

Same for these two lines removed (they just don't exist any more).

> -        blk_op_block_all(sd->conf.blk, s->blocker);
> +        virtio_scsi_set_up_op_blockers(s, sd);
>       }
>   
>       if ((vdev->guest_features >> VIRTIO_SCSI_F_HOTPLUG) & 1) {
> @@ -784,6 +829,7 @@ static void virtio_scsi_hotunplug(HotplugHandler *hotplug_dev, DeviceState *dev,
>       VirtIODevice *vdev = VIRTIO_DEVICE(hotplug_dev);
>       VirtIOSCSI *s = VIRTIO_SCSI(vdev);
>       SCSIDevice *sd = SCSI_DEVICE(dev);
> +    VirtIOSCSIBlkChangeNotifier *insert_notifier, *remove_notifier;
>   
>       if ((vdev->guest_features >> VIRTIO_SCSI_F_HOTPLUG) & 1) {
>           virtio_scsi_push_event(s, sd,
> @@ -791,11 +837,39 @@ static void virtio_scsi_hotunplug(HotplugHandler *hotplug_dev, DeviceState *dev,
>                                  VIRTIO_SCSI_EVT_RESET_REMOVED);
>       }
>   
> -    if (s->ctx && s->blocker) {
> -        blk_op_unblock_all(sd->conf.blk, s->blocker);
> +    if (s->ctx) {
> +        virtio_scsi_remove_op_blockers(s, sd);
> +    }
> +
> +    QTAILQ_FOREACH(insert_notifier, &s->insert_notifiers, next) {
> +        if (insert_notifier->sd == sd) {
> +            break;
> +        }
> +    }
> +    if (insert_notifier) {
> +        notifier_remove(&insert_notifier->n);
> +        QTAILQ_REMOVE(&s->insert_notifiers, insert_notifier, next);
> +        g_free(insert_notifier);
> +    }
> +
> +    QTAILQ_FOREACH(remove_notifier, &s->remove_notifiers, next) {
> +        if (remove_notifier->sd == sd) {
> +            break;
> +        }
> +    }
> +    if (remove_notifier) {
> +        notifier_remove(&remove_notifier->n);
> +        QTAILQ_REMOVE(&s->remove_notifiers, remove_notifier, next);
> +        g_free(remove_notifier);
> +    }
> +
> +    if (QTAILQ_EMPTY(&s->insert_notifiers) &&
> +        QTAILQ_EMPTY(&s->remove_notifiers))
> +    {
>           error_free(s->blocker);
>           s->blocker = NULL;
>       }
> +

And for this last block.

Max

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v4 05/11] block: Move BDS close notifiers into BB
  2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 05/11] block: Move BDS close notifiers into BB Max Reitz
  2015-02-27 20:08   ` Max Reitz
@ 2015-02-28  2:55   ` Fam Zheng
  2015-03-02 15:18     ` Max Reitz
  1 sibling, 1 reply; 20+ messages in thread
From: Fam Zheng @ 2015-02-28  2:55 UTC (permalink / raw)
  To: Max Reitz
  Cc: Kevin Wolf, qemu-block, qemu-devel, Markus Armbruster,
	Stefan Hajnoczi, Paolo Bonzini

On Fri, 02/27 11:43, Max Reitz wrote:
>  static void virtio_scsi_hotplug(HotplugHandler *hotplug_dev, DeviceState *dev,
>                                  Error **errp)
>  {
> @@ -763,12 +794,26 @@ static void virtio_scsi_hotplug(HotplugHandler *hotplug_dev, DeviceState *dev,
>      SCSIDevice *sd = SCSI_DEVICE(dev);
>  
>      if (s->ctx && !s->dataplane_disabled) {
> +        VirtIOSCSIBlkChangeNotifier *insert_notifier, *remove_notifier;
> +
> +        insert_notifier = g_new0(VirtIOSCSIBlkChangeNotifier, 1);
> +        insert_notifier->n.notify = virtio_scsi_blk_insert_notifier;
> +        insert_notifier->s = s;
> +        insert_notifier->sd = sd;
> +        blk_add_insert_bs_notifier(sd->conf.blk, &insert_notifier->n);
> +        QTAILQ_INSERT_TAIL(&s->insert_notifiers, insert_notifier, next);

Could you instead embed a Notifier into SCSIDevice, similarly? That way there
is no need to maintain a list in VirtIOSCSI.
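For illustration, the embedded-notifier pattern being suggested might look roughly like this. The Notifier struct and container_of are modeled on QEMU's (include/qemu/notify.h), but SCSIDevice and the callbacks here are minimal stand-ins, not the real definitions:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for QEMU's Notifier. */
typedef struct Notifier Notifier;
struct Notifier {
    void (*notify)(Notifier *notifier, void *data);
};

/* container_of as used throughout QEMU. */
#define container_of(ptr, type, member) \
    ((type *) ((char *) (ptr) - offsetof(type, member)))

/* Hypothetical: SCSIDevice grows two embedded notifiers, so no
 * externally allocated list in VirtIOSCSI is needed. */
typedef struct SCSIDevice {
    int op_blockers_set;       /* stand-in for the real device state */
    Notifier insert_notifier;  /* embedded, no separate allocation */
    Notifier remove_notifier;
} SCSIDevice;

static void blk_insert_notify(Notifier *n, void *data)
{
    /* Recover the device from the embedded notifier. */
    SCSIDevice *sd = container_of(n, SCSIDevice, insert_notifier);
    (void)data;
    sd->op_blockers_set = 1;
}

static void blk_remove_notify(Notifier *n, void *data)
{
    SCSIDevice *sd = container_of(n, SCSIDevice, remove_notifier);
    (void)data;
    sd->op_blockers_set = 0;
}
```

Because the notifiers live inside SCSIDevice, hotunplug could call notifier_remove() on them directly instead of searching a list.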

> +
> +        remove_notifier = g_new0(VirtIOSCSIBlkChangeNotifier, 1);
> +        remove_notifier->n.notify = virtio_scsi_blk_remove_notifier;
> +        remove_notifier->s = s;
> +        remove_notifier->sd = sd;
> +        blk_add_remove_bs_notifier(sd->conf.blk, &remove_notifier->n);
> +        QTAILQ_INSERT_TAIL(&s->remove_notifiers, remove_notifier, next);
> +
>          if (blk_op_is_blocked(sd->conf.blk, BLOCK_OP_TYPE_DATAPLANE, errp)) {
>              return;
>          }
> -        assert(!s->blocker);
> -        error_setg(&s->blocker, "block device is in use by data plane");
> -        blk_op_block_all(sd->conf.blk, s->blocker);
> +        virtio_scsi_set_up_op_blockers(s, sd);
>      }

Fam

* Re: [Qemu-devel] [PATCH v4 00/11] block: Rework bdrv_close_all()
  2015-02-27 16:43 [Qemu-devel] [PATCH v4 00/11] block: Rework bdrv_close_all() Max Reitz
                   ` (10 preceding siblings ...)
  2015-02-27 16:44 ` [Qemu-devel] [PATCH v4 11/11] iotests: Add test for multiple BB on BDS tree Max Reitz
@ 2015-03-02  9:25 ` Kevin Wolf
  2015-03-02 15:15   ` Max Reitz
  11 siblings, 1 reply; 20+ messages in thread
From: Kevin Wolf @ 2015-03-02  9:25 UTC (permalink / raw)
  To: Max Reitz
  Cc: Fam Zheng, qemu-block, qemu-devel, Markus Armbruster,
	Stefan Hajnoczi, Paolo Bonzini

Am 27.02.2015 um 17:43 hat Max Reitz geschrieben:
> Currently, bdrv_close_all() force-closes all BDSs with a BlockBackend,
> which can lead to data corruption (see the iotest added in the final
> patch of this series) and is most certainly very ugly.
> 
> This series reworks bdrv_close_all() to instead eject the BDS trees from
> all BlockBackends and then close the monitor-owned BDS trees, which are
> the only BDSs without a BB. In effect, all BDSs are closed just by
> getting closed automatically due to their reference count becoming 0.
> 
> The benefit over the approach taken in v1 and v2 is that in device
> models we often cannot simply drop the reference to a BB because there
> may be some user which we forgot about. By ejecting the BDS trees from
> the BB, the BB itself becomes unusable, but in a clean way (it will
> return errors when accessed, but nothing will crash). Also, it is much
> simpler (no reference tracking necessary).
> 
> The only disadvantage (I can see) is that the BBs are leaked; but this
> does not matter because the qemu process is about to exit anyway.

I haven't looked at the actual patches yet, but just from this
description and the diffstat: We need to make sure that the refcount
really drops to 0. That is, if there are NBD servers, block jobs, etc.
that hold an additional reference, we must make sure to stop them. It
doesn't look like this series takes care of this, does it?

Hm... Perhaps we could even install an atexit handler that asserts that
all BDSes are really gone in the end?

Kevin

* Re: [Qemu-devel] [PATCH v4 00/11] block: Rework bdrv_close_all()
  2015-03-02  9:25 ` [Qemu-devel] [PATCH v4 00/11] block: Rework bdrv_close_all() Kevin Wolf
@ 2015-03-02 15:15   ` Max Reitz
  2015-03-02 15:57     ` Kevin Wolf
  0 siblings, 1 reply; 20+ messages in thread
From: Max Reitz @ 2015-03-02 15:15 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Fam Zheng, qemu-block, qemu-devel, Markus Armbruster,
	Stefan Hajnoczi, Paolo Bonzini

On 2015-03-02 at 04:25, Kevin Wolf wrote:
> Am 27.02.2015 um 17:43 hat Max Reitz geschrieben:
>> Currently, bdrv_close_all() force-closes all BDSs with a BlockBackend,
>> which can lead to data corruption (see the iotest added in the final
>> patch of this series) and is most certainly very ugly.
>>
>> This series reworks bdrv_close_all() to instead eject the BDS trees from
>> all BlockBackends and then close the monitor-owned BDS trees, which are
>> the only BDSs without a BB. In effect, all BDSs are closed just by
>> getting closed automatically due to their reference count becoming 0.
>>
>> The benefit over the approach taken in v1 and v2 is that in device
>> models we often cannot simply drop the reference to a BB because there
>> may be some user which we forgot about. By ejecting the BDS trees from
>> the BB, the BB itself becomes unusable, but in a clean way (it will
>> return errors when accessed, but nothing will crash). Also, it is much
>> simpler (no reference tracking necessary).
>>
>> The only disadvantage (I can see) is that the BBs are leaked; but this
>> does not matter because the qemu process is about to exit anyway.
> I haven't looked at the actual patches yet, but just from this
> description and the diffstat: We need to make sure that the refcount
> really drops to 0.

If you mean the BBs: Tried that in v2, doesn't seem to work. I had my 
fun tracing around where the handles to the BBs for device models stay 
and who might have access to them, but when I say "fun" I mean that 
ironically.

If you mean the BDSs: See below (below below).

> That is, if there are NBD servers, block jobs, etc.
> that hold an additional reference, we must make sure to stop them. It
> doesn't look like this series takes care of this, does it?

Everyone using BBs is fine; for instance, NBD servers are stopped (they 
register a notifier for when the BDS tree is ejected from a BB).

So block jobs are not handled, first because I still didn't get around to 
making them work on BBs instead of BDSs (and I probably never will, 
although they really should), and second because I missed the bdrv_ref() 
in block_job_create().

I guess that means manually cancelling all block jobs in bdrv_close_all()...

> Hm... Perhaps we could even install an atexit handler that asserts that
> all BDSes are really gone in the end?

I had that for BBs in v1 and v2, it was just a function that was called 
at the end of bdrv_close_all(). For that to work for BDSs, however, we'd 
need a list of all BDSs and assert that it's empty. I found that a bit 
too much.

A simpler but uglier way would be a counter that's incremented every 
time a BDS is created and decremented every time one is deleted (or 
incremented once per refcount increment and decremented once per refcount 
decrement). We'd then assert that this counter is 0.
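Purely for illustration, that counter (combined with the atexit handler Kevin suggested) could be sketched as below. The function names and bodies are stand-ins, not QEMU's real bdrv_ref()/bdrv_unref():

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical global: +1 per refcount increment, -1 per decrement. */
static long bds_refcount_total;

static void bdrv_ref_counted(void)
{
    bds_refcount_total++;
}

static void bdrv_unref_counted(void)
{
    bds_refcount_total--;
}

/* Runs at process exit: if any BDS reference is still held,
 * the assertion fires and the leak is caught. */
static void assert_no_bds_left(void)
{
    assert(bds_refcount_total == 0);
}

static void bds_leak_check_init(void)
{
    atexit(assert_no_bds_left);
}
```

With balanced ref/unref calls the counter returns to 0 and the exit-time assertion passes; a leaked reference would abort the process.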

Max

* Re: [Qemu-devel] [PATCH v4 05/11] block: Move BDS close notifiers into BB
  2015-02-28  2:55   ` Fam Zheng
@ 2015-03-02 15:18     ` Max Reitz
  2015-03-03  2:08       ` Fam Zheng
  0 siblings, 1 reply; 20+ messages in thread
From: Max Reitz @ 2015-03-02 15:18 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, qemu-block, qemu-devel, Markus Armbruster,
	Stefan Hajnoczi, Paolo Bonzini

On 2015-02-27 at 21:55, Fam Zheng wrote:
> On Fri, 02/27 11:43, Max Reitz wrote:
>>   static void virtio_scsi_hotplug(HotplugHandler *hotplug_dev, DeviceState *dev,
>>                                   Error **errp)
>>   {
>> @@ -763,12 +794,26 @@ static void virtio_scsi_hotplug(HotplugHandler *hotplug_dev, DeviceState *dev,
>>       SCSIDevice *sd = SCSI_DEVICE(dev);
>>   
>>       if (s->ctx && !s->dataplane_disabled) {
>> +        VirtIOSCSIBlkChangeNotifier *insert_notifier, *remove_notifier;
>> +
>> +        insert_notifier = g_new0(VirtIOSCSIBlkChangeNotifier, 1);
>> +        insert_notifier->n.notify = virtio_scsi_blk_insert_notifier;
>> +        insert_notifier->s = s;
>> +        insert_notifier->sd = sd;
>> +        blk_add_insert_bs_notifier(sd->conf.blk, &insert_notifier->n);
>> +        QTAILQ_INSERT_TAIL(&s->insert_notifiers, insert_notifier, next);
> Could you instead embed a Notifier into SCSIDevice, similarly? That way there
> is no need to maintain a list in VirtIOSCSI.

I thought about that, but somehow SCSIDevice seemed so generic to me... 
I didn't really want to add a VirtIOSCSI pointer to it (or even just a 
notifier only to be used by virtio-scsi).

So I take it you want me to?

Max

>> +
>> +        remove_notifier = g_new0(VirtIOSCSIBlkChangeNotifier, 1);
>> +        remove_notifier->n.notify = virtio_scsi_blk_remove_notifier;
>> +        remove_notifier->s = s;
>> +        remove_notifier->sd = sd;
>> +        blk_add_remove_bs_notifier(sd->conf.blk, &remove_notifier->n);
>> +        QTAILQ_INSERT_TAIL(&s->remove_notifiers, remove_notifier, next);
>> +
>>           if (blk_op_is_blocked(sd->conf.blk, BLOCK_OP_TYPE_DATAPLANE, errp)) {
>>               return;
>>           }
>> -        assert(!s->blocker);
>> -        error_setg(&s->blocker, "block device is in use by data plane");
>> -        blk_op_block_all(sd->conf.blk, s->blocker);
>> +        virtio_scsi_set_up_op_blockers(s, sd);
>>       }
> Fam

* Re: [Qemu-devel] [PATCH v4 00/11] block: Rework bdrv_close_all()
  2015-03-02 15:15   ` Max Reitz
@ 2015-03-02 15:57     ` Kevin Wolf
  2015-03-02 16:03       ` Max Reitz
  0 siblings, 1 reply; 20+ messages in thread
From: Kevin Wolf @ 2015-03-02 15:57 UTC (permalink / raw)
  To: Max Reitz
  Cc: Fam Zheng, qemu-block, qemu-devel, Markus Armbruster,
	Stefan Hajnoczi, Paolo Bonzini

Am 02.03.2015 um 16:15 hat Max Reitz geschrieben:
> On 2015-03-02 at 04:25, Kevin Wolf wrote:
> >Am 27.02.2015 um 17:43 hat Max Reitz geschrieben:
> >>Currently, bdrv_close_all() force-closes all BDSs with a BlockBackend,
> >>which can lead to data corruption (see the iotest added in the final
> >>patch of this series) and is most certainly very ugly.
> >>
> >>This series reworks bdrv_close_all() to instead eject the BDS trees from
> >>all BlockBackends and then close the monitor-owned BDS trees, which are
> >>the only BDSs without a BB. In effect, all BDSs are closed just by
> >>getting closed automatically due to their reference count becoming 0.
> >>
> >>The benefit over the approach taken in v1 and v2 is that in device
> >>models we often cannot simply drop the reference to a BB because there
> >>may be some user which we forgot about. By ejecting the BDS trees from
> >>the BB, the BB itself becomes unusable, but in a clean way (it will
> >>return errors when accessed, but nothing will crash). Also, it is much
> >>simpler (no reference tracking necessary).
> >>
> >>The only disadvantage (I can see) is that the BBs are leaked; but this
> >>does not matter because the qemu process is about to exit anyway.
> >I haven't looked at the actual patches yet, but just from this
> >description and the diffstat: We need to make sure that the refcount
> >really drops to 0.
> 
> If you mean the BBs: Tried that in v2, doesn't seem to work. I had
> my fun tracing around where the handles to the BBs for device models
> stay and who might have access to them, but when I say "fun" I mean
> that ironically.
> 
> If you mean the BDSs: See below (below below).

Yes, I mean BDSes.

> >That is, if there are NBD servers, block jobs, etc.
> >that hold an additional reference, we must make sure to stop them. It
> >doesn't look like this series takes care of this, does it?
> 
> Everyone using BBs is fine; for instance, NBD servers are stopped
> (they register a notifier for when the BDS tree is ejected from a
> BB).

Good point. Even if the NBD server were not stopped, it still would
hold a reference only to the BB and not to the BDS, so the BDS would
correctly be closed.

> So block jobs are not handled, first because I still didn't get
> around to making them work on BBs instead of BDSs (and I probably
> never will, although they really should), and second because I missed
> the bdrv_ref() in block_job_create().
> 
> I guess that means manually cancelling all block jobs in bdrv_close_all()...

Even in the far future, will block jobs really always work on a BB? Even
if they modify some node in the middle of the chain?

Anyway, for now, we probably need to cancel them, yes. Essentially you
end up calling everything manually that had to register a notifier in
the earlier versions.

> >Hm... Perhaps we could even install an atexit handler that asserts that
> >all BDSes are really gone in the end?
> 
> I had that for BBs in v1 and v2, it was just a function that was
> called at the end of bdrv_close_all(). For that to work for BDSs,
> however, we'd need a list of all BDSs and assert that it's empty. I
> found that a bit too much.

The list of all relevant BDSes should be the union of bdrv_states and
bdrv_graph_states, and I think it doesn't have to be empty, but all of
its entries have to be closed (bs->drv == NULL).
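As a sketch of that check (stand-in types only; the real bdrv_states and bdrv_graph_states are QTAILQ lists of BlockDriverState):

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for BlockDriverState with just the field the check needs:
 * bs->drv becomes NULL once bdrv_close() has run. */
typedef struct BDS {
    void *drv;
    struct BDS *next;
} BDS;

/* Returns 1 if every BDS in the list is closed, 0 otherwise. */
static int all_bds_closed(const BDS *head)
{
    for (const BDS *bs = head; bs != NULL; bs = bs->next) {
        if (bs->drv != NULL) {
            return 0;   /* still open: some reference was not dropped */
        }
    }
    return 1;
}
```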

It's not 100% complete because we have these stupid QMP commands that
address BDSes using filenames instead of node or BB names, but it's
probably a useful subset to check.

> A simpler but uglier way would be a counter that's incremented every
> time a BDS is created and decremented every time one is deleted (or
> incremented once per refcount increment and decremented once per
> refcount decrement). We'd then assert that this counter is 0.

Ugh. Well, it does the job, so holding my nose while applying should
do...

Kevin

* Re: [Qemu-devel] [PATCH v4 00/11] block: Rework bdrv_close_all()
  2015-03-02 15:57     ` Kevin Wolf
@ 2015-03-02 16:03       ` Max Reitz
  0 siblings, 0 replies; 20+ messages in thread
From: Max Reitz @ 2015-03-02 16:03 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Fam Zheng, qemu-block, qemu-devel, Markus Armbruster,
	Stefan Hajnoczi, Paolo Bonzini

On 2015-03-02 at 10:57, Kevin Wolf wrote:
> Am 02.03.2015 um 16:15 hat Max Reitz geschrieben:
>> On 2015-03-02 at 04:25, Kevin Wolf wrote:
>>> Am 27.02.2015 um 17:43 hat Max Reitz geschrieben:
>>>> Currently, bdrv_close_all() force-closes all BDSs with a BlockBackend,
>>>> which can lead to data corruption (see the iotest added in the final
>>>> patch of this series) and is most certainly very ugly.
>>>>
>>>> This series reworks bdrv_close_all() to instead eject the BDS trees from
>>>> all BlockBackends and then close the monitor-owned BDS trees, which are
>>>> the only BDSs without a BB. In effect, all BDSs are closed just by
>>>> getting closed automatically due to their reference count becoming 0.
>>>>
>>>> The benefit over the approach taken in v1 and v2 is that in device
>>>> models we often cannot simply drop the reference to a BB because there
>>>> may be some user which we forgot about. By ejecting the BDS trees from
>>>> the BB, the BB itself becomes unusable, but in a clean way (it will
>>>> return errors when accessed, but nothing will crash). Also, it is much
>>>> simpler (no reference tracking necessary).
>>>>
>>>> The only disadvantage (I can see) is that the BBs are leaked; but this
>>>> does not matter because the qemu process is about to exit anyway.
>>> I haven't looked at the actual patches yet, but just from this
>>> description and the diffstat: We need to make sure that the refcount
>>> really drops to 0.
>> If you mean the BBs: Tried that in v2, doesn't seem to work. I had
>> my fun tracing around where the handles to the BBs for device models
>> stay and who might have access to them, but when I say "fun" I mean
>> that ironically.
>>
>> If you mean the BDSs: See below (below below).
> Yes, I mean BDSes.
>
>>> That is, if there are NBD servers, block jobs, etc.
>>> that hold an additional reference, we must make sure to stop them. It
>>> doesn't look like this series takes care of this, does it?
>> Everyone using BBs is fine; for instance, NBD servers are stopped
>> (they register a notifier for when the BDS tree is ejected from a
>> BB).
> Good point. Even if the NBD server were not stopped, it still would
> hold a reference only to the BB and not to the BDS, so the BDS would
> correctly be closed.
>
>> So block jobs are not handled, first because I still didn't get
>> around to making them work on BBs instead of BDSs (and I probably
>> never will, although they really should), and second because I missed
>> the bdrv_ref() in block_job_create().
>>
>> I guess that means manually cancelling all block jobs in bdrv_close_all()...
> Even in the far future, will block jobs really always work on a BB? Even
> if they modify some node in the middle of the chain?

I think they should create an unnamed BB for that case.

But if we start to work on that, we'll have to create a BB for every 
block job, and that means supporting multiple BBs per BDS (if a block 
job is started on a root BDS).

> Anyway, for now, we probably need to cancel them, yes. Essentially you
> end up calling everything manually that had to register a notifier in
> the earlier versions.

Well, not absolutely everything, only everything that does not use a BB 
(and as far as I'm aware, those are only the monitor (which can hold 
references to BDSs without BBs) and block jobs).

>>> Hm... Perhaps we could even install an atexit handler that asserts that
>>> all BDSes are really gone in the end?
>> I had that for BBs in v1 and v2, it was just a function that was
>> called at the end of bdrv_close_all(). For that to work for BDSs,
>> however, we'd need a list of all BDSs and assert that it's empty. I
>> found that a bit too much.
> The list of all relevant BDSes should be the union of bdrv_states and
> bdrv_graph_states, and I think it doesn't have to be empty, but all of
> its entries have to be closed (bs->drv == NULL).

The idea of this series is that bdrv_close() should not be called 
outside of bdrv_delete() (and bdrv_open(), if that failed), so whenever 
a BDS is closed, it should be deleted.

> It's not 100% complete because we have these stupid QMP commands that
> address BDSes using filenames instead of node or BB names, but it's
> probably a useful subset to check.
>
>> A simpler but uglier way would be a counter that's incremented every
>> time a BDS is created and decremented every time one is deleted (or
>> incremented once per refcount increment and decremented once per
>> refcount decrement). We'd then assert that this counter is 0.
> Ugh. Well, it does the job, so holding my nose while applying should
> do...

OK.

Max

* Re: [Qemu-devel] [PATCH v4 05/11] block: Move BDS close notifiers into BB
  2015-03-02 15:18     ` Max Reitz
@ 2015-03-03  2:08       ` Fam Zheng
  0 siblings, 0 replies; 20+ messages in thread
From: Fam Zheng @ 2015-03-03  2:08 UTC (permalink / raw)
  To: Max Reitz
  Cc: Kevin Wolf, qemu-block, qemu-devel, Markus Armbruster,
	Stefan Hajnoczi, Paolo Bonzini

On Mon, 03/02 10:18, Max Reitz wrote:
> On 2015-02-27 at 21:55, Fam Zheng wrote:
> >On Fri, 02/27 11:43, Max Reitz wrote:
> >>  static void virtio_scsi_hotplug(HotplugHandler *hotplug_dev, DeviceState *dev,
> >>                                  Error **errp)
> >>  {
> >>@@ -763,12 +794,26 @@ static void virtio_scsi_hotplug(HotplugHandler *hotplug_dev, DeviceState *dev,
> >>      SCSIDevice *sd = SCSI_DEVICE(dev);
> >>      if (s->ctx && !s->dataplane_disabled) {
> >>+        VirtIOSCSIBlkChangeNotifier *insert_notifier, *remove_notifier;
> >>+
> >>+        insert_notifier = g_new0(VirtIOSCSIBlkChangeNotifier, 1);
> >>+        insert_notifier->n.notify = virtio_scsi_blk_insert_notifier;
> >>+        insert_notifier->s = s;
> >>+        insert_notifier->sd = sd;
> >>+        blk_add_insert_bs_notifier(sd->conf.blk, &insert_notifier->n);
> >>+        QTAILQ_INSERT_TAIL(&s->insert_notifiers, insert_notifier, next);
> >Could you instead embed a Notifier into SCSIDevice, similarly? That way there
> >is no need to maintain a list in VirtIOSCSI.
> 
> I thought about that, but somehow SCSIDevice seemed so generic to me... I
> didn't really want to add a VirtIOSCSI pointer to it (or even just a
> notifier only to be used by virtio-scsi).
> 

I see. Then it is better kept the way it is now.

Fam

Thread overview: 20+ messages
2015-02-27 16:43 [Qemu-devel] [PATCH v4 00/11] block: Rework bdrv_close_all() Max Reitz
2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 01/11] iotests: Move _filter_nbd into common.filter Max Reitz
2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 02/11] iotests: Make redirecting qemu's stderr optional Max Reitz
2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 03/11] iotests: Add test for eject under NBD server Max Reitz
2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 04/11] quorum: Fix close path Max Reitz
2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 05/11] block: Move BDS close notifiers into BB Max Reitz
2015-02-27 20:08   ` Max Reitz
2015-02-28  2:55   ` Fam Zheng
2015-03-02 15:18     ` Max Reitz
2015-03-03  2:08       ` Fam Zheng
2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 06/11] block: Use blk_remove_bs() in blk_delete() Max Reitz
2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 07/11] blockdev: Use blk_remove_bs() in do_drive_del() Max Reitz
2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 08/11] block: Make bdrv_close() static Max Reitz
2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 09/11] blockdev: Keep track of monitor-owned BDS Max Reitz
2015-02-27 16:43 ` [Qemu-devel] [PATCH v4 10/11] block: Eject BDS tree from BB at bdrv_close_all() Max Reitz
2015-02-27 16:44 ` [Qemu-devel] [PATCH v4 11/11] iotests: Add test for multiple BB on BDS tree Max Reitz
2015-03-02  9:25 ` [Qemu-devel] [PATCH v4 00/11] block: Rework bdrv_close_all() Kevin Wolf
2015-03-02 15:15   ` Max Reitz
2015-03-02 15:57     ` Kevin Wolf
2015-03-02 16:03       ` Max Reitz
