* [Qemu-devel] [PATCH 0/3] Early release of -drive QemuOpts
@ 2016-02-22 14:39 Paolo Bonzini
  2016-02-22 14:39 ` [Qemu-devel] [PATCH 1/3] block: detach devices from DriveInfo at unrealize time Paolo Bonzini
                   ` (4 more replies)
  0 siblings, 5 replies; 32+ messages in thread
From: Paolo Bonzini @ 2016-02-22 14:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Markus Armbruster, Max Reitz

In short, this series gets rid of blockdev_mark_auto_del and
blockdev_auto_del.

With these patches, it is possible to create a new -drive with the same
id as soon as the DEVICE_DELETED event is delivered (which coincides
with unrealize).

I'm sorry I'm not able to explain the history (and probably do not
understand the full ramifications) of this.  That's why this is just
an RFC.

The idea here is that reference counting the BlockBackend is enough to
defer the deletion of the block device as long as necessary; destroying
the DriveInfo earlier is not a problem, and has the desired effect of
freeing the QemuOpts.
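
The idea can be illustrated with a small standalone model.  This is only
a sketch under stated assumptions: ToyBackend, ToyDriveInfo and all toy_*
functions are hypothetical stand-ins, not QEMU's BlockBackend, DriveInfo
or their API.  It shows the drive info (and thus the id) being released
early while the backend itself survives until the last reference is
dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; not QEMU's real types. */
typedef struct ToyDriveInfo {
    char *id;                  /* models the id held by the -drive options */
} ToyDriveInfo;

typedef struct ToyBackend {
    int refcnt;
    ToyDriveInfo *dinfo;       /* NULL once the drive info was released */
} ToyBackend;

/* Create a backend; the drive info owns the initial reference. */
static ToyBackend *toy_backend_new(const char *id)
{
    ToyBackend *blk = calloc(1, sizeof(*blk));
    ToyDriveInfo *dinfo = calloc(1, sizeof(*dinfo));
    char *copy = malloc(strlen(id) + 1);

    if (!blk || !dinfo || !copy) {
        abort();
    }
    strcpy(copy, id);
    dinfo->id = copy;
    blk->dinfo = dinfo;
    blk->refcnt = 1;
    return blk;
}

static void toy_backend_ref(ToyBackend *blk)
{
    blk->refcnt++;
}

/* Drop one reference; returns true if the backend was freed. */
static bool toy_backend_unref(ToyBackend *blk)
{
    assert(blk->refcnt > 0);
    if (--blk->refcnt > 0) {
        return false;
    }
    assert(!blk->dinfo);       /* drive info must be gone by now */
    free(blk);
    return true;
}

/*
 * Release the drive info right away, which makes the id reusable,
 * and drop the reference it held.  Returns true if that was the
 * last reference and the backend was freed as well.
 */
static bool toy_release_dinfo(ToyBackend *blk)
{
    if (!blk->dinfo) {
        return false;
    }
    free(blk->dinfo->id);
    free(blk->dinfo);
    blk->dinfo = NULL;
    return toy_backend_unref(blk);
}

/* True while the backend still occupies the given id. */
static bool toy_id_in_use(const ToyBackend *blk, const char *id)
{
    return blk->dinfo && strcmp(blk->dinfo->id, id) == 0;
}
```

In this model, a device that took its own reference with
toy_backend_ref() keeps the backend alive after toy_release_dinfo(),
yet toy_id_in_use() already reports the id as free.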

Patches 1 and 3 are mostly similar to the version I had earlier sent as
RFC, but they now pass all unit tests.  Patch 2 is new, but I don't know
of a test that fails it.

Paolo

Paolo Bonzini (3):
  block: detach devices from DriveInfo at unrealize time
  block: keep BlockBackend alive until device finalize time
  block: remove legacy_dinfo at blk_detach_dev time

 block/block-backend.c            | 13 +++++++++----
 blockdev.c                       | 28 +++++++++-------------------
 hw/block/virtio-blk.c            |  4 +++-
 hw/block/xen_disk.c              |  1 +
 hw/core/qdev-properties-system.c | 13 +++++++++++--
 hw/ide/piix.c                    |  3 +++
 hw/scsi/scsi-bus.c               |  4 +++-
 hw/usb/dev-storage.c             |  3 ++-
 include/sysemu/blockdev.h        |  5 ++---
 9 files changed, 43 insertions(+), 31 deletions(-)

-- 
2.5.0


* [Qemu-devel] [PATCH 1/3] block: detach devices from DriveInfo at unrealize time
  2016-02-22 14:39 [Qemu-devel] [PATCH 0/3] Early release of -drive QemuOpts Paolo Bonzini
@ 2016-02-22 14:39 ` Paolo Bonzini
  2016-03-21 15:13   ` Markus Armbruster
  2016-02-22 14:39 ` [Qemu-devel] [PATCH 2/3] block: keep BlockBackend alive until device finalize time Paolo Bonzini
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 32+ messages in thread
From: Paolo Bonzini @ 2016-02-22 14:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Markus Armbruster, Max Reitz

Instead of delaying blk_detach_dev and blockdev_auto_del until
the object is finalized and properties are released, do that
as soon as possible.

This patch replaces blockdev_mark_auto_del calls with blk_detach_dev
and blockdev_del_drive (the latter is a combination of the former
blockdev_mark_auto_del and blockdev_auto_del).

release_drive's call to blockdev_auto_del can then be removed completely.
This is of course okay in the case where the device has been unrealized
before and unrealize took care of calling blockdev_del_drive.  However,
it is also okay if the device has failed to be realized.  In that case,
blockdev_mark_auto_del was never called (because it is called during
unrealize) and thus release_drive's blockdev_auto_del call did nothing.
The drive-del-test qtest covers this case.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 blockdev.c                       | 21 +++++----------------
 hw/block/virtio-blk.c            |  4 +++-
 hw/block/xen_disk.c              |  1 +
 hw/core/qdev-properties-system.c |  8 ++++++--
 hw/ide/piix.c                    |  3 +++
 hw/scsi/scsi-bus.c               |  4 +++-
 hw/usb/dev-storage.c             |  3 ++-
 include/sysemu/blockdev.h        |  4 +---
 8 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 1f73478..2dfb2d8 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -114,20 +114,16 @@ void override_max_devs(BlockInterfaceType type, int max_devs)
 /*
  * We automatically delete the drive when a device using it gets
  * unplugged.  Questionable feature, but we can't just drop it.
- * Device models call blockdev_mark_auto_del() to schedule the
- * automatic deletion, and generic qdev code calls blockdev_auto_del()
- * when deletion is actually safe.
+ * Device models call blockdev_del_drive() to schedule the
+ * automatic deletion, and generic block layer code uses the
+ * refcount to do the deletion when it is actually safe.
  */
-void blockdev_mark_auto_del(BlockBackend *blk)
+void blockdev_del_drive(BlockBackend *blk)
 {
     DriveInfo *dinfo = blk_legacy_dinfo(blk);
     BlockDriverState *bs = blk_bs(blk);
     AioContext *aio_context;
 
-    if (!dinfo) {
-        return;
-    }
-
     if (bs) {
         aio_context = bdrv_get_aio_context(bs);
         aio_context_acquire(aio_context);
@@ -139,14 +135,7 @@ void blockdev_mark_auto_del(BlockBackend *blk)
         aio_context_release(aio_context);
     }
 
-    dinfo->auto_del = 1;
-}
-
-void blockdev_auto_del(BlockBackend *blk)
-{
-    DriveInfo *dinfo = blk_legacy_dinfo(blk);
-
-    if (dinfo && dinfo->auto_del) {
+    if (dinfo) {
         blk_unref(blk);
     }
 }
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index c427698..0582787 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -945,7 +945,9 @@ static void virtio_blk_device_unrealize(DeviceState *dev, Error **errp)
     s->dataplane = NULL;
     qemu_del_vm_change_state_handler(s->change);
     unregister_savevm(dev, "virtio-blk", s);
-    blockdev_mark_auto_del(s->blk);
+    blk_detach_dev(s->blk, dev);
+    blockdev_del_drive(s->blk);
+    s->blk = NULL;
     virtio_cleanup(vdev);
 }
 
diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index 7bd5bde..39a72e4 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -1041,6 +1041,7 @@ static void blk_disconnect(struct XenDevice *xendev)
 
     if (blkdev->blk) {
         blk_detach_dev(blkdev->blk, blkdev);
+        blockdev_del_drive(blkdev->blk);
         blk_unref(blkdev->blk);
         blkdev->blk = NULL;
     }
diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
index e10cede..469ba8a 100644
--- a/hw/core/qdev-properties-system.c
+++ b/hw/core/qdev-properties-system.c
@@ -101,9 +101,13 @@ static void release_drive(Object *obj, const char *name, void *opaque)
     Property *prop = opaque;
     BlockBackend **ptr = qdev_get_prop_ptr(dev, prop);
 
-    if (*ptr) {
+    if (*ptr && blk_get_attached_dev(*ptr) != NULL) {
+        /* Unrealize has already called blk_detach_dev and blockdev_del_drive
+         * if the device has been realized; in that case, blk_get_attached_dev
+         * returns NULL.  Thus, we get here if the device failed to realize,
+         * and the -drive must not be released.
+         */
         blk_detach_dev(*ptr, dev);
-        blockdev_auto_del(*ptr);
     }
 }
 
diff --git a/hw/ide/piix.c b/hw/ide/piix.c
index df46147..cf8fa58 100644
--- a/hw/ide/piix.c
+++ b/hw/ide/piix.c
@@ -182,6 +182,9 @@ int pci_piix3_xen_ide_unplug(DeviceState *dev)
             if (ds) {
                 blk_detach_dev(blk, ds);
             }
+            if (pci_ide->bus[di->bus].ifs[di->unit].blk) {
+                blockdev_del_drive(blk);
+            }
             pci_ide->bus[di->bus].ifs[di->unit].blk = NULL;
             if (!(i % 2)) {
                 idedev = pci_ide->bus[di->bus].master;
diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index 6dcdbc0..3b2b766 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -214,7 +214,9 @@ static void scsi_qdev_unrealize(DeviceState *qdev, Error **errp)
     }
 
     scsi_device_purge_requests(dev, SENSE_CODE(NO_SENSE));
-    blockdev_mark_auto_del(dev->conf.blk);
+    blk_detach_dev(dev->conf.blk, qdev);
+    blockdev_del_drive(dev->conf.blk);
+    dev->conf.blk = NULL;
 }
 
 /* handle legacy '-drive if=scsi,...' cmd line args */
diff --git a/hw/usb/dev-storage.c b/hw/usb/dev-storage.c
index 5ae0424..1c00211 100644
--- a/hw/usb/dev-storage.c
+++ b/hw/usb/dev-storage.c
@@ -643,7 +643,8 @@ static void usb_msd_realize_storage(USBDevice *dev, Error **errp)
      * blockdev, or else scsi_bus_legacy_add_drive() dies when it
      * attaches again.
      *
-     * The hack is probably a bad idea.
+     * The hack is probably a bad idea.  Anyway, this is why this does not
+     * call blockdev_del_drive.
      */
     blk_detach_dev(blk, &s->dev.qdev);
     s->conf.blk = NULL;
diff --git a/include/sysemu/blockdev.h b/include/sysemu/blockdev.h
index b06a060..ae7ad67 100644
--- a/include/sysemu/blockdev.h
+++ b/include/sysemu/blockdev.h
@@ -14,8 +14,7 @@
 #include "qapi/error.h"
 #include "qemu/queue.h"
 
-void blockdev_mark_auto_del(BlockBackend *blk);
-void blockdev_auto_del(BlockBackend *blk);
+void blockdev_del_drive(BlockBackend *blk);
 
 typedef enum {
     IF_DEFAULT = -1,            /* for use with drive_add() only */
@@ -34,7 +33,6 @@ struct DriveInfo {
     BlockInterfaceType type;
     int bus;
     int unit;
-    int auto_del;               /* see blockdev_mark_auto_del() */
     bool is_default;            /* Added by default_drive() ?  */
     int media_cd;
     int cyls, heads, secs, trans;
-- 
2.5.0


* [Qemu-devel] [PATCH 2/3] block: keep BlockBackend alive until device finalize time
  2016-02-22 14:39 [Qemu-devel] [PATCH 0/3] Early release of -drive QemuOpts Paolo Bonzini
  2016-02-22 14:39 ` [Qemu-devel] [PATCH 1/3] block: detach devices from DriveInfo at unrealize time Paolo Bonzini
@ 2016-02-22 14:39 ` Paolo Bonzini
  2016-03-21 15:22   ` Markus Armbruster
  2016-02-22 14:39 ` [Qemu-devel] [PATCH 3/3] block: remove legacy_dinfo at blk_detach_dev time Paolo Bonzini
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 32+ messages in thread
From: Paolo Bonzini @ 2016-02-22 14:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Markus Armbruster, Max Reitz

While the next patch will bring forward the death of the DriveInfo
data structure, the BlockBackend must survive after unrealize,
for example in case there are outstanding operations on it.
The good thing is that we can just use reference counting to
do it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/core/qdev-properties-system.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
index 469ba8a..5e84b55 100644
--- a/hw/core/qdev-properties-system.c
+++ b/hw/core/qdev-properties-system.c
@@ -93,6 +93,7 @@ static void parse_drive(DeviceState *dev, const char *str, void **ptr,
         return;
     }
     *ptr = blk;
+    blk_ref(blk);
 }
 
 static void release_drive(Object *obj, const char *name, void *opaque)
@@ -101,13 +102,17 @@ static void release_drive(Object *obj, const char *name, void *opaque)
     Property *prop = opaque;
     BlockBackend **ptr = qdev_get_prop_ptr(dev, prop);
 
-    if (*ptr && blk_get_attached_dev(*ptr) != NULL) {
-        /* Unrealize has already called blk_detach_dev and blockdev_del_drive
-         * if the device has been realized; in that case, blk_get_attached_dev
-         * returns NULL.  Thus, we get here if the device failed to realize,
-         * and the -drive must not be released.
-         */
-        blk_detach_dev(*ptr, dev);
+    if (*ptr) {
+        if (blk_get_attached_dev(*ptr) != NULL) {
+            /* Unrealize has already called blk_detach_dev and
+             * blockdev_del_drive if the device has been realized;
+             * in that case, blk_get_attached_dev returns NULL.  Thus,
+             * we get here if the device failed to realize, and the
+             * -drive must not be released.
+             */
+            blk_detach_dev(*ptr, dev);
+        }
+        blk_unref(*ptr);
     }
 }
 
-- 
2.5.0


* [Qemu-devel] [PATCH 3/3] block: remove legacy_dinfo at blk_detach_dev time
  2016-02-22 14:39 [Qemu-devel] [PATCH 0/3] Early release of -drive QemuOpts Paolo Bonzini
  2016-02-22 14:39 ` [Qemu-devel] [PATCH 1/3] block: detach devices from DriveInfo at unrealize time Paolo Bonzini
  2016-02-22 14:39 ` [Qemu-devel] [PATCH 2/3] block: keep BlockBackend alive until device finalize time Paolo Bonzini
@ 2016-02-22 14:39 ` Paolo Bonzini
  2016-03-21 16:15   ` Markus Armbruster
  2016-03-09 12:20 ` [Qemu-devel] [PATCH 0/3] Early release of -drive QemuOpts Paolo Bonzini
  2016-03-17 17:00 ` Markus Armbruster
  4 siblings, 1 reply; 32+ messages in thread
From: Paolo Bonzini @ 2016-02-22 14:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Markus Armbruster, Max Reitz

Currently, blockdev_del_drive (and before it blockdev_auto_del) does a
blk_unref that will cause blk_delete to be called and the DriveInfo to
be freed.
But really, we want to free the drive info as soon as the device is
detached, even if there are other references for whatever reason, so
that the QemuOpts are freed as well and the id can be reused.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/block-backend.c     | 13 +++++++++----
 blockdev.c                |  9 +++++----
 include/sysemu/blockdev.h |  1 +
 3 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index ebdf78a..0a85c6a 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -65,8 +65,6 @@ static const AIOCBInfo block_backend_aiocb_info = {
     .aiocb_size = sizeof(BlockBackendAIOCB),
 };
 
-static void drive_info_del(DriveInfo *dinfo);
-
 /* All the BlockBackends (except for hidden ones) */
 static QTAILQ_HEAD(, BlockBackend) blk_backends =
     QTAILQ_HEAD_INITIALIZER(blk_backends);
@@ -165,6 +163,7 @@ static void blk_delete(BlockBackend *blk)
 {
     assert(!blk->refcnt);
     assert(!blk->dev);
+    assert(!blk->legacy_dinfo);
     if (blk->bs) {
         blk_remove_bs(blk);
     }
@@ -179,19 +178,25 @@ static void blk_delete(BlockBackend *blk)
         QTAILQ_REMOVE(&blk_backends, blk, link);
     }
     g_free(blk->name);
-    drive_info_del(blk->legacy_dinfo);
     block_acct_cleanup(&blk->stats);
     g_free(blk);
 }
 
-static void drive_info_del(DriveInfo *dinfo)
+void blk_release_legacy_dinfo(BlockBackend *blk)
 {
+    DriveInfo *dinfo = blk->legacy_dinfo;
+
     if (!dinfo) {
         return;
     }
     qemu_opts_del(dinfo->opts);
     g_free(dinfo->serial);
     g_free(dinfo);
+    blk->legacy_dinfo = NULL;
+    /* We are not interested anymore in retrieving the BlockBackend
+     * via blk_by_legacy_dinfo, so let it die.
+     */
+    blk_unref(blk);
 }
 
 int blk_get_refcnt(BlockBackend *blk)
diff --git a/blockdev.c b/blockdev.c
index 2dfb2d8..85f0cb5 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -120,10 +120,10 @@ void override_max_devs(BlockInterfaceType type, int max_devs)
  */
 void blockdev_del_drive(BlockBackend *blk)
 {
-    DriveInfo *dinfo = blk_legacy_dinfo(blk);
     BlockDriverState *bs = blk_bs(blk);
     AioContext *aio_context;
 
+    blk_ref(blk);
     if (bs) {
         aio_context = bdrv_get_aio_context(bs);
         aio_context_acquire(aio_context);
@@ -135,9 +135,10 @@ void blockdev_del_drive(BlockBackend *blk)
         aio_context_release(aio_context);
     }
 
-    if (dinfo) {
-        blk_unref(blk);
+    if (blk_legacy_dinfo(blk)) {
+        blk_release_legacy_dinfo(blk);
     }
+    blk_unref(blk);
 }
 
 /**
@@ -2811,7 +2812,7 @@ void hmp_drive_del(Monitor *mon, const QDict *qdict)
         blk_set_on_error(blk, BLOCKDEV_ON_ERROR_REPORT,
                          BLOCKDEV_ON_ERROR_REPORT);
     } else {
-        blk_unref(blk);
+        blockdev_del_drive(blk);
     }
 
     aio_context_release(aio_context);
diff --git a/include/sysemu/blockdev.h b/include/sysemu/blockdev.h
index ae7ad67..5722b9f 100644
--- a/include/sysemu/blockdev.h
+++ b/include/sysemu/blockdev.h
@@ -44,6 +44,7 @@ struct DriveInfo {
 DriveInfo *blk_legacy_dinfo(BlockBackend *blk);
 DriveInfo *blk_set_legacy_dinfo(BlockBackend *blk, DriveInfo *dinfo);
 BlockBackend *blk_by_legacy_dinfo(DriveInfo *dinfo);
+void blk_release_legacy_dinfo(BlockBackend *blk);
 
 void override_max_devs(BlockInterfaceType type, int max_devs);
 
-- 
2.5.0


* Re: [Qemu-devel] [PATCH 0/3] Early release of -drive QemuOpts
  2016-02-22 14:39 [Qemu-devel] [PATCH 0/3] Early release of -drive QemuOpts Paolo Bonzini
                   ` (2 preceding siblings ...)
  2016-02-22 14:39 ` [Qemu-devel] [PATCH 3/3] block: remove legacy_dinfo at blk_detach_dev time Paolo Bonzini
@ 2016-03-09 12:20 ` Paolo Bonzini
  2016-03-09 12:30   ` Kevin Wolf
  2016-03-17 17:00 ` Markus Armbruster
  4 siblings, 1 reply; 32+ messages in thread
From: Paolo Bonzini @ 2016-03-09 12:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Markus Armbruster, qemu block, Max Reitz



On 22/02/2016 15:39, Paolo Bonzini wrote:
> In short, this patch gets rid of blockdev_mark_auto_del and
> blockdev_auto_del.
> 
> With these patches, it is possible to create a new -drive with the same
> id as soon as the DEVICE_DELETED event is delivered (which equals to
> unrealize).
> 
> I'm sorry I'm not able to explain the history (and probably do not
> understand the full ramifications) of this.  That's why this is just
> an RFC.
> 
> The idea here is that reference counting the BlockBackend is enough to
> defer the deletion of the block device as much as necessary; anticipating
> the destruction of the DriveInfo is not a problem, and has the desired
> effect of freeing the QemuOpts.
> 
> Patches 1 and 3 are mostly similar to the version I had earlier sent as
> RFC, but they now pass all unit tests.  Patch 2 is new, but I don't know
> of a test that fails it.
> 
> Paolo
> 
> Paolo Bonzini (3):
>   block: detach devices from DriveInfo at unrealize time
>   block: keep BlockBackend alive until device finalize time
>   block: remove legacy_dinfo at blk_detach_dev time
> 
>  block/block-backend.c            | 13 +++++++++----
>  blockdev.c                       | 28 +++++++++-------------------
>  hw/block/virtio-blk.c            |  4 +++-
>  hw/block/xen_disk.c              |  1 +
>  hw/core/qdev-properties-system.c | 13 +++++++++++--
>  hw/ide/piix.c                    |  3 +++
>  hw/scsi/scsi-bus.c               |  4 +++-
>  hw/usb/dev-storage.c             |  3 ++-
>  include/sysemu/blockdev.h        |  5 ++---
>  9 files changed, 43 insertions(+), 31 deletions(-)
> 

Ping?!?


* Re: [Qemu-devel] [PATCH 0/3] Early release of -drive QemuOpts
  2016-03-09 12:20 ` [Qemu-devel] [PATCH 0/3] Early release of -drive QemuOpts Paolo Bonzini
@ 2016-03-09 12:30   ` Kevin Wolf
  2016-03-09 12:53     ` Markus Armbruster
  0 siblings, 1 reply; 32+ messages in thread
From: Kevin Wolf @ 2016-03-09 12:30 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Max Reitz, qemu-devel, qemu block, Markus Armbruster

Am 09.03.2016 um 13:20 hat Paolo Bonzini geschrieben:
> On 22/02/2016 15:39, Paolo Bonzini wrote:
> > In short, this patch gets rid of blockdev_mark_auto_del and
> > blockdev_auto_del.
> > 
> > With these patches, it is possible to create a new -drive with the same
> > id as soon as the DEVICE_DELETED event is delivered (which equals to
> > unrealize).
> > 
> > I'm sorry I'm not able to explain the history (and probably do not
> > understand the full ramifications) of this.  That's why this is just
> > an RFC.
> > 
> > The idea here is that reference counting the BlockBackend is enough to
> > defer the deletion of the block device as much as necessary; anticipating
> > the destruction of the DriveInfo is not a problem, and has the desired
> > effect of freeing the QemuOpts.
> > 
> > Patches 1 and 3 are mostly similar to the version I had earlier sent as
> > RFC, but they now pass all unit tests.  Patch 2 is new, but I don't know
> > of a test that fails it.
> > 
> > Paolo
> > 
> > Paolo Bonzini (3):
> >   block: detach devices from DriveInfo at unrealize time
> >   block: keep BlockBackend alive until device finalize time
> >   block: remove legacy_dinfo at blk_detach_dev time
> > 
> >  block/block-backend.c            | 13 +++++++++----
> >  blockdev.c                       | 28 +++++++++-------------------
> >  hw/block/virtio-blk.c            |  4 +++-
> >  hw/block/xen_disk.c              |  1 +
> >  hw/core/qdev-properties-system.c | 13 +++++++++++--
> >  hw/ide/piix.c                    |  3 +++
> >  hw/scsi/scsi-bus.c               |  4 +++-
> >  hw/usb/dev-storage.c             |  3 ++-
> >  include/sysemu/blockdev.h        |  5 ++---
> >  9 files changed, 43 insertions(+), 31 deletions(-)
> > 
> 
> Ping?!?

Markus, can you please review this? You seem to have a better
understanding of DriveInfo and related magic.

Kevin


* Re: [Qemu-devel] [PATCH 0/3] Early release of -drive QemuOpts
  2016-03-09 12:30   ` Kevin Wolf
@ 2016-03-09 12:53     ` Markus Armbruster
  0 siblings, 0 replies; 32+ messages in thread
From: Markus Armbruster @ 2016-03-09 12:53 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Paolo Bonzini, qemu-devel, qemu block, Max Reitz

Kevin Wolf <kwolf@redhat.com> writes:

> Am 09.03.2016 um 13:20 hat Paolo Bonzini geschrieben:
>> On 22/02/2016 15:39, Paolo Bonzini wrote:
>> > In short, this patch gets rid of blockdev_mark_auto_del and
>> > blockdev_auto_del.
>> > 
>> > With these patches, it is possible to create a new -drive with the same
>> > id as soon as the DEVICE_DELETED event is delivered (which equals to
>> > unrealize).
>> > 
>> > I'm sorry I'm not able to explain the history (and probably do not
>> > understand the full ramifications) of this.  That's why this is just
>> > an RFC.
>> > 
>> > The idea here is that reference counting the BlockBackend is enough to
>> > defer the deletion of the block device as much as necessary; anticipating
>> > the destruction of the DriveInfo is not a problem, and has the desired
>> > effect of freeing the QemuOpts.
>> > 
>> > Patches 1 and 3 are mostly similar to the version I had earlier sent as
>> > RFC, but they now pass all unit tests.  Patch 2 is new, but I don't know
>> > of a test that fails it.
>> > 
>> > Paolo
>> > 
>> > Paolo Bonzini (3):
>> >   block: detach devices from DriveInfo at unrealize time
>> >   block: keep BlockBackend alive until device finalize time
>> >   block: remove legacy_dinfo at blk_detach_dev time
>> > 
>> >  block/block-backend.c            | 13 +++++++++----
>> >  blockdev.c                       | 28 +++++++++-------------------
>> >  hw/block/virtio-blk.c            |  4 +++-
>> >  hw/block/xen_disk.c              |  1 +
>> >  hw/core/qdev-properties-system.c | 13 +++++++++++--
>> >  hw/ide/piix.c                    |  3 +++
>> >  hw/scsi/scsi-bus.c               |  4 +++-
>> >  hw/usb/dev-storage.c             |  3 ++-
>> >  include/sysemu/blockdev.h        |  5 ++---
>> >  9 files changed, 43 insertions(+), 31 deletions(-)
>> > 
>> 
>> Ping?!?
>
> Markus, can you please review this? You seem to have a better
> understanding of DriveInfo and related magic.

Okay, I'll try to get to it this week.


* Re: [Qemu-devel] [PATCH 0/3] Early release of -drive QemuOpts
  2016-02-22 14:39 [Qemu-devel] [PATCH 0/3] Early release of -drive QemuOpts Paolo Bonzini
                   ` (3 preceding siblings ...)
  2016-03-09 12:20 ` [Qemu-devel] [PATCH 0/3] Early release of -drive QemuOpts Paolo Bonzini
@ 2016-03-17 17:00 ` Markus Armbruster
  4 siblings, 0 replies; 32+ messages in thread
From: Markus Armbruster @ 2016-03-17 17:00 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Kevin Wolf, qemu-devel, Max Reitz

Paolo Bonzini <pbonzini@redhat.com> writes:

> In short, this patch gets rid of blockdev_mark_auto_del and
> blockdev_auto_del.
>
> With these patches, it is possible to create a new -drive with the same
> id as soon as the DEVICE_DELETED event is delivered (which equals to
> unrealize).
>
> I'm sorry I'm not able to explain the history (and probably do not
> understand the full ramifications) of this.

I'm going to try, because I need it myself to understand the
ramifications.

>                                              That's why this is just
> an RFC.

It's technically not an RFC anymore.

> The idea here is that reference counting the BlockBackend is enough to
> defer the deletion of the block device as much as necessary; anticipating
> the destruction of the DriveInfo is not a problem, and has the desired
> effect of freeing the QemuOpts.

Let me explain how we got ourselves into this mess, how it works, and
where it fails.

In the beginning, block frontends and backends got created during
startup and lived as long as the process.  Life was simple, if a bit
dull.

Enter hot plug / unplug, commit 6f338c3 "qemu: PCI device, disk and host
network hot-add / hot-remove (Marcelo Tosatti)", Feb 2009, v0.10.  In
this early form, the interface mushed frontends and backends together,
similar to the command line at that time.  The interface we use today
appeared in commit 3418bd2 "qdev hotplug: infrastructure and monitor
commands.", Sep 2009, v0.12.  Here's how it works:

* Dynamically create a block backend:

    drive_add "" if=none,id=BE-ID,...

* Hot plug a block frontend:

    device_add driver=DRV,id=FE-ID,drive=BE-ID,...

* Hot unplug a block frontend:

    device_del FE-ID

  For virtio-blk and SCSI devices, this also destroys the block backend.
  Back then, these were the only devices that could be unplugged, and
  this was the only way to destroy a block backend.  This automatic
  backend destruction was a mistake.  No other kind of backend gets
  destroyed that way.  The (still experimental) blockdev-add gives us
  the opportunity to correct the mistake: backends created with it are
  not destroyed by frontend unplug.

  For some kinds of devices, such as USB, hot unplug is synchronous.
  For others, such as PCI, it's not: device_del merely initiates the
  unplug, which then completes in its own sweet time.  Or doesn't: PCI
  unplug via ACPI actually requires guest cooperation.

  While there's some complexity in orchestrating the PCI-ACPI-dance, the
  block code proper is still simple: frontend destruction triggers
  backend destruction.

Not only is automatic backend destruction a mistake, it's also not quite
trivial.  The backend gets destroyed by the frontend's qdev exit()
method (which has since become its QOM unrealize()).  This leaves the
frontend's pointer to the backend dangling until the frontend finishes
dying.  Works as long as it doesn't get dereferenced during that time,
but relying on that is a bit brittle.  When it got in the way, it led to
commit 14bafc5 "blockdev: Clean up automatic drive deletion", Jun 2010,
v0.13.  This is where blockdev_mark_auto_del() and blockdev_auto_del()
come from.  Instead of destroying the backend, we merely mark it, and
destroy it when it's safe.
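
The mark-then-destroy scheme can be sketched in isolation as follows.
This is a simplified model, not the actual code: MarkedBackend,
toy_mark_auto_del() and toy_auto_del() are hypothetical names that only
mirror the roles of blockdev_mark_auto_del() and blockdev_auto_del()
described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a backend with optional drive info. */
typedef struct MarkedBackend {
    bool has_dinfo;   /* true for backends created by -drive / drive_add */
    bool auto_del;    /* set when the frontend goes away */
} MarkedBackend;

/* Called from the frontend's exit/unrealize: only mark, never free. */
static void toy_mark_auto_del(MarkedBackend *blk)
{
    if (!blk->has_dinfo) {
        return;
    }
    blk->auto_del = true;
}

/*
 * Called later, when the frontend's pointer is finally released.
 * Destroys the backend only if it was marked.  Returns true if the
 * backend was destroyed; the caller's pointer is cleared in that case.
 */
static bool toy_auto_del(MarkedBackend **blkp)
{
    MarkedBackend *blk = *blkp;

    if (blk && blk->has_dinfo && blk->auto_del) {
        free(blk);
        *blkp = NULL;
        return true;
    }
    return false;
}
```

Between the two calls the frontend's pointer stays valid, which is the
point of the split: nothing dangles while the frontend finishes dying.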

The next piece of the puzzle is drive_del.  While an image is in use by
QEMU, it's generally unsafe to use by anything else.  If you need it for
something else, you need to first make QEMU relinquish it.  Easy enough:
hot unplug the frontend, thus destroy the backend, done.

Except when the unplug needs guest cooperation, and the guest doesn't
cooperate.  This motivated commit 9063f81 "Implement drive_del to
decouple block removal from device removal", Nov 2010, v0.14.  The guest
gets no say, and afterwards sees a terminally broken block device.

In its initial form, drive_del simply closed and destroyed the block
backend right away.  It tried to zap the pointers first, but failed.  We
fixed the resulting use-after-free in commit d22b2f4 "Do not delete
BlockDriverState when deleting the drive", Mar 2011, v0.15: we hide
instead of destroy the closed backend when it's being used by a
frontend.

Why hide it?  Backward compatibility.  v0.14's drive_del made the
backend's ID available for reuse immediately.  What we can't destroy
immediately, we need to hide.
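
A toy model of that rule, assuming nothing about the real implementation
(HideableBackend, toy_hideable_new() and toy_drive_del() are hypothetical
and exist only for this illustration):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a named backend. */
typedef struct HideableBackend {
    char id[32];      /* empty string means hidden: the ID is free again */
    bool open;        /* false once the image has been closed */
    bool attached;    /* true while a frontend still points at us */
} HideableBackend;

typedef enum { TOY_DESTROYED, TOY_HIDDEN } ToyDelResult;

static HideableBackend *toy_hideable_new(const char *id, bool attached)
{
    HideableBackend *blk = calloc(1, sizeof(*blk));

    if (!blk) {
        abort();
    }
    /* calloc zeroed the buffer, so the copy stays NUL-terminated */
    strncpy(blk->id, id, sizeof(blk->id) - 1);
    blk->open = true;
    blk->attached = attached;
    return blk;
}

/*
 * Model of drive_del: always close the image.  If a frontend still
 * uses the backend, hide it by giving up its ID; otherwise destroy
 * it and clear the caller's pointer.
 */
static ToyDelResult toy_drive_del(HideableBackend **blkp)
{
    HideableBackend *blk = *blkp;

    blk->open = false;
    if (blk->attached) {
        blk->id[0] = '\0';
        return TOY_HIDDEN;
    }
    free(blk);
    *blkp = NULL;
    return TOY_DESTROYED;
}
```

Either way the ID becomes available immediately, which is the
compatibility constraint; only the lifetime of the object differs.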

The final piece of the puzzle is event DEVICE_DELETED, from commit
0402a5d "qdev: DEVICE_DELETED event", Mar 2013, v1.5.0.  It gets emitted
when unplug completes.

Meanwhile and mostly independently, backend reference counting evolved.
When backends still lived as long as the process, there was none,
obviously.

We probably should've introduced it for drive_del in v0.14, but that
didn't happen, because we (incorrectly) thought drive_del could simply
delete and be done.

Instead, it got introduced for block migration, in commit 84fb392
"blockdev: add refcount to DriveInfo", Jan 2011, v0.15.

Note that DriveInfo is *not* the block backend.  It captures a -drive /
drive_add.  Reference counting the block backend there pressed DriveInfo
into use as owner of the block backend.  It took a while to clean that up,
until commit fa510eb "block: use BDS ref for block jobs", Aug 2013,
v1.7.  DriveInfo's reference count even lingered until commit ae60e8e,
v2.1.

Note that reference counting began only after the unplug work was done
except for the bug fixes.

When I worked on BlockBackend, I wanted to do what your series does: put
the reference counting to use to clean up the unplug mess.  But then the
BlockBackend job became too complex and I had to cut that part.

So, what do we learn from my lengthy history lesson?  I think we can
learn what to watch out for.  Here's my list:

* Block backend auto-destroy wart: we need to destroy block backends on
  frontend unplug exactly as before.

* The "drive_del hides the backend" wart: what we can't destroy
  immediately, we need to hide.

* drive_del is where dragons be.  I've fixed enough bugs there, and
  spent enough hours racking my brain over how to change things without
  messing them up, to be on my toes there.

> Patches 1 and 3 are mostly similar to the version I had earlier sent as
> RFC, but they now pass all unit tests.  Patch 2 is new, but I don't know
> of a test that fails it.

I'll review the actual patches next, hopefully tomorrow.


* Re: [Qemu-devel] [PATCH 1/3] block: detach devices from DriveInfo at unrealize time
  2016-02-22 14:39 ` [Qemu-devel] [PATCH 1/3] block: detach devices from DriveInfo at unrealize time Paolo Bonzini
@ 2016-03-21 15:13   ` Markus Armbruster
  2016-03-21 15:31     ` Paolo Bonzini
  0 siblings, 1 reply; 32+ messages in thread
From: Markus Armbruster @ 2016-03-21 15:13 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Kevin Wolf, qemu-devel, Max Reitz

Quick recap of the analysis of the status quo I sent last week:

* virtio-blk and SCSI devices delete their block backend on unplug,
  except when it was created with blockdev-add.  No other device deletes
  backends.

* drive_del deletes a backend when it can, else it closes and hides it.

* drive_del has been a fertile source of headaches.  Care advised.

Paolo Bonzini <pbonzini@redhat.com> writes:

> Instead of delaying blk_detach_dev and blockdev_auto_del until
> the object is finalized and properties are released, do that
> as soon as possible.
>
> This patch replaces blockdev_mark_auto_del calls with blk_detach_dev
> and blockdev_del_drive (the latter is a combination of the former
> blockdev_mark_auto_del and blockdev_auto_del).
>
> release_drive's call to blockdev_auto_del can then be removed completely.
> This is of course okay in the case where the device has been unrealized
> before and unrealize took care of calling blockdev_del_drive.  However,
> it is also okay if the device has failed to be realized.  In that case,
> blockdev_mark_auto_del was never called (because it is called during
> unrealize) and thus release_drive's blockdev_auto_del call did nothing.
> The drive-del-test qtest covers this case.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  blockdev.c                       | 21 +++++----------------
>  hw/block/virtio-blk.c            |  4 +++-
>  hw/block/xen_disk.c              |  1 +
>  hw/core/qdev-properties-system.c |  8 ++++++--
>  hw/ide/piix.c                    |  3 +++
>  hw/scsi/scsi-bus.c               |  4 +++-
>  hw/usb/dev-storage.c             |  3 ++-
>  include/sysemu/blockdev.h        |  4 +---
>  8 files changed, 24 insertions(+), 24 deletions(-)
>
> diff --git a/blockdev.c b/blockdev.c
> index 1f73478..2dfb2d8 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -114,20 +114,16 @@ void override_max_devs(BlockInterfaceType type, int max_devs)
>  /*
>   * We automatically delete the drive when a device using it gets
>   * unplugged.  Questionable feature, but we can't just drop it.
> - * Device models call blockdev_mark_auto_del() to schedule the
> - * automatic deletion, and generic qdev code calls blockdev_auto_del()
> - * when deletion is actually safe.
> + * Device models call blockdev_del_drive() to schedule the
> + * automatic deletion, and generic block layer code uses the
> + * refcount to do the deletion when it is actually safe.
>   */
> -void blockdev_mark_auto_del(BlockBackend *blk)
> +void blockdev_del_drive(BlockBackend *blk)
>  {
>      DriveInfo *dinfo = blk_legacy_dinfo(blk);
>      BlockDriverState *bs = blk_bs(blk);
>      AioContext *aio_context;
>  
> -    if (!dinfo) {
> -        return;
> -    }
> -
>      if (bs) {
>          aio_context = bdrv_get_aio_context(bs);
>          aio_context_acquire(aio_context);
> @@ -139,14 +135,7 @@ void blockdev_mark_auto_del(BlockBackend *blk)
>          aio_context_release(aio_context);
>      }
>  
> -    dinfo->auto_del = 1;
> -}
> -
> -void blockdev_auto_del(BlockBackend *blk)
> -{
> -    DriveInfo *dinfo = blk_legacy_dinfo(blk);
> -
> -    if (dinfo && dinfo->auto_del) {
> +    if (dinfo) {
>          blk_unref(blk);
>      }
>  }

Initially (commit 14bafc5), blockdev_mark_auto_del() and
blockdev_auto_del() simply did what their names promised:

    void blockdev_mark_auto_del(BlockDriverState *bs)
    {
        DriveInfo *dinfo = drive_get_by_blockdev(bs);

        dinfo->auto_del = 1;
    }

    void blockdev_auto_del(BlockDriverState *bs)
    {
        DriveInfo *dinfo = drive_get_by_blockdev(bs);
    
        if (dinfo->auto_del) {
            drive_uninit(dinfo);
        }
    }

Since we didn't want to perpetuate the "automatic deletion" wart for
backends created with the new (& still experimental) blockdev-add, we
prepended

    if (!dinfo) {
        return;
    }

to the marking in commits 2d246f0 and 26f8b3a.

Meanwhile, commit 12bde0e added block job cancellation:

    block: cancel jobs when a device is ready to go away
    
    We do not want jobs to keep a device busy for a possibly very long
    time, and management could become confused because they thought a
    device was not even there anymore.  So, cancel long-running jobs
    as soon as their device is going to disappear.

The automatic block job cancellation is an extension of the automatic
deletion wart.  We cancel exactly when we schedule the warty deletion.
Note that this made blockdev_mark_auto_del() do more than its name
promises.  I never liked that, and always wondered why we don't cancel
in blockdev_auto_del() instead.

Also meanwhile, blockdev_auto_del() slowly morphed from "delete" to
"drop a reference".

Anyway, code before your patch, with // additional annotations

    void blockdev_mark_auto_del(BlockBackend *blk)
    {
        DriveInfo *dinfo = blk_legacy_dinfo(blk);
        BlockDriverState *bs = blk_bs(blk);
        AioContext *aio_context;

        // Limit the auto-deletion wart to pre-blockdev-add
        if (!dinfo) {
            return;
        }

        // Warty automatic job cancellation
        if (bs) {
            aio_context = bdrv_get_aio_context(bs);
            aio_context_acquire(aio_context);

            if (bs->job) {
                block_job_cancel(bs->job);
            }

            aio_context_release(aio_context);
        }

        // Schedule warty automatic deletion
        dinfo->auto_del = 1;
    }

    void blockdev_auto_del(BlockBackend *blk)
    {
        DriveInfo *dinfo = blk_legacy_dinfo(blk);

        // Execute scheduled warty automatic deletion, if any
        // This drops the reference block-backend.c holds in trust for
        // lookup by name.  The device already dropped its own
        // reference.  The backend is deleted unless more references
        // exist (not sure that's possible).  If they do, management
        // applications that expect auto-deletion may get confused.
        if (dinfo && dinfo->auto_del) {
            blk_unref(blk);
        }
    }

Code after your patch:

    void blockdev_del_drive(BlockBackend *blk)
    {
        DriveInfo *dinfo = blk_legacy_dinfo(blk);
        BlockDriverState *bs = blk_bs(blk);
        AioContext *aio_context;

        // warty automatic job cancellation
        if (bs) {
            aio_context = bdrv_get_aio_context(bs);
            aio_context_acquire(aio_context);

            if (bs->job) {
                block_job_cancel(bs->job);
            }

            aio_context_release(aio_context);
        }

        if (dinfo) {
            // Drop the reference held in trust for lookup by name.  The
            // device still holds another one (if qdevified, the
            // property holds it).  Unless more references exist, the
            // backend will be auto-deleted when the device drops its
            // reference.
            blk_unref(blk);
        }
    }

Instead of delaying the unref to blockdev_auto_del(), which made sense
back when it was a hard delete, you unref right away, leaving
blockdev_auto_del() with nothing to do.  Swaps the order of the two
unrefs, but that's just fine.
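
To illustrate why the order of the two unrefs does not matter, here is a toy model. It is only a sketch: ToyBackend and the toy_* functions are hypothetical names, not QEMU code, and the model assumes exactly two references exist (the one held in trust for lookup by name and the one held by the device).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reference-counted backend; not QEMU's
 * BlockBackend.  "deleted" is set instead of freeing memory so that the
 * outcome can be inspected afterwards. */
typedef struct ToyBackend {
    bool name_ref;    /* reference held in trust for lookup by name */
    bool device_ref;  /* reference held by the attached device */
    int refcnt;
    bool deleted;
} ToyBackend;

void toy_init(ToyBackend *blk)
{
    blk->name_ref = true;
    blk->device_ref = true;
    blk->refcnt = 2;
    blk->deleted = false;
}

/* Drop one reference; the backend goes away with the last one. */
void toy_unref(ToyBackend *blk)
{
    assert(blk->refcnt > 0);
    if (--blk->refcnt == 0) {
        blk->deleted = true;
    }
}

void toy_drop_name_ref(ToyBackend *blk)
{
    assert(blk->name_ref);
    blk->name_ref = false;
    toy_unref(blk);
}

void toy_drop_device_ref(ToyBackend *blk)
{
    assert(blk->device_ref);
    blk->device_ref = false;
    toy_unref(blk);
}
```

Whichever of toy_drop_name_ref() and toy_drop_device_ref() runs first, the backend is marked deleted only by the second call.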

We now cancel even when !dinfo, i.e. even when we won't delete.  Are you
sure that's correct?  If it is, then it needs to be explained in the
commit message.

> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index c427698..0582787 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -945,7 +945,9 @@ static void virtio_blk_device_unrealize(DeviceState *dev, Error **errp)
>      s->dataplane = NULL;
>      qemu_del_vm_change_state_handler(s->change);
>      unregister_savevm(dev, "virtio-blk", s);
> -    blockdev_mark_auto_del(s->blk);
> +    blk_detach_dev(s->blk, dev);
> +    blockdev_del_drive(s->blk);
> +    s->blk = NULL;
>      virtio_cleanup(vdev);
>  }

Before your patch, we leave finalization of the property to its
release() callback release_drive(), as we should.  All we do here is
schedule warty deletion.  And that we must do here, because only here we
know that warty deletion is wanted.

Your patch inserts a copy of release_drive() and hacks it up a bit.  Two
hunks down, release_drive() gets hacked up to conditionally avoid
repeating the job.

This feels rather dirty to me.

>  
> diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
> index 7bd5bde..39a72e4 100644
> --- a/hw/block/xen_disk.c
> +++ b/hw/block/xen_disk.c
> @@ -1041,6 +1041,7 @@ static void blk_disconnect(struct XenDevice *xendev)
>  
>      if (blkdev->blk) {
>          blk_detach_dev(blkdev->blk, blkdev);
> +        blockdev_del_drive(blkdev->blk);
>          blk_unref(blkdev->blk);
>          blkdev->blk = NULL;
>      }

This is a non-qdevified device, where the link to the backend is not a
property, and the link to the backend has to be dismantled by the device
itself.

I believe inserting blockdev_del_drive() extends the automatic deletion
wart to this device.  That's an incompatible change, isn't it?

> diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
> index e10cede..469ba8a 100644
> --- a/hw/core/qdev-properties-system.c
> +++ b/hw/core/qdev-properties-system.c
> @@ -101,9 +101,13 @@ static void release_drive(Object *obj, const char *name, void *opaque)
>      Property *prop = opaque;
>      BlockBackend **ptr = qdev_get_prop_ptr(dev, prop);
>  
> -    if (*ptr) {
> +    if (*ptr && blk_get_attached_dev(*ptr) != NULL) {
> +        /* Unrealize has already called blk_detach_dev and blockdev_del_drive
> +         * if the device has been realized; in that case, blk_get_attached_dev
> +         * returns NULL.  Thus, we get here if the device failed to realize,
> +         * and the -drive must not be released.
> +         */
>          blk_detach_dev(*ptr, dev);
> -        blockdev_auto_del(*ptr);
>      }
>  }

Two changes:

* The change to the condition suppresses the code you copied to
  unrealize() methods when it already ran there.

* blockdev_auto_del() is gone.

> diff --git a/hw/ide/piix.c b/hw/ide/piix.c
> index df46147..cf8fa58 100644
> --- a/hw/ide/piix.c
> +++ b/hw/ide/piix.c
> @@ -182,6 +182,9 @@ int pci_piix3_xen_ide_unplug(DeviceState *dev)
>              if (ds) {
>                  blk_detach_dev(blk, ds);
>              }
> +            if (pci_ide->bus[di->bus].ifs[di->unit].blk) {
> +                blockdev_del_drive(blk);
> +            }
>              pci_ide->bus[di->bus].ifs[di->unit].blk = NULL;
>              if (!(i % 2)) {
>                  idedev = pci_ide->bus[di->bus].master;

Same comment as for xen_disk.c.

> diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
> index 6dcdbc0..3b2b766 100644
> --- a/hw/scsi/scsi-bus.c
> +++ b/hw/scsi/scsi-bus.c
> @@ -214,7 +214,9 @@ static void scsi_qdev_unrealize(DeviceState *qdev, Error **errp)
>      }
>  
>      scsi_device_purge_requests(dev, SENSE_CODE(NO_SENSE));
> -    blockdev_mark_auto_del(dev->conf.blk);
> +    blk_detach_dev(dev->conf.blk, qdev);
> +    blockdev_del_drive(dev->conf.blk);
> +    dev->conf.blk = NULL;
>  }
>  
>  /* handle legacy '-drive if=scsi,...' cmd line args */

Same comment as for virtio-blk.c.

> diff --git a/hw/usb/dev-storage.c b/hw/usb/dev-storage.c
> index 5ae0424..1c00211 100644
> --- a/hw/usb/dev-storage.c
> +++ b/hw/usb/dev-storage.c
> @@ -643,7 +643,8 @@ static void usb_msd_realize_storage(USBDevice *dev, Error **errp)
>       * blockdev, or else scsi_bus_legacy_add_drive() dies when it
>       * attaches again.
>       *
> -     * The hack is probably a bad idea.
> +     * The hack is probably a bad idea.  Anyway, this is why this does not
> +     * call blockdev_del_drive.
>       */
>      blk_detach_dev(blk, &s->dev.qdev);
>      s->conf.blk = NULL;

Note that other qdevified block devices (such as nvme) are *not*
touched.  Warty auto deletion is extended only to some, but not all
cases.

> diff --git a/include/sysemu/blockdev.h b/include/sysemu/blockdev.h
> index b06a060..ae7ad67 100644
> --- a/include/sysemu/blockdev.h
> +++ b/include/sysemu/blockdev.h
> @@ -14,8 +14,7 @@
>  #include "qapi/error.h"
>  #include "qemu/queue.h"
>  
> -void blockdev_mark_auto_del(BlockBackend *blk);
> -void blockdev_auto_del(BlockBackend *blk);
> +void blockdev_del_drive(BlockBackend *blk);
>  
>  typedef enum {
>      IF_DEFAULT = -1,            /* for use with drive_add() only */
> @@ -34,7 +33,6 @@ struct DriveInfo {
>      BlockInterfaceType type;
>      int bus;
>      int unit;
> -    int auto_del;               /* see blockdev_mark_auto_del() */
>      bool is_default;            /* Added by default_drive() ?  */
>      int media_cd;
>      int cyls, heads, secs, trans;

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [PATCH 2/3] block: keep BlockBackend alive until device finalize time
  2016-02-22 14:39 ` [Qemu-devel] [PATCH 2/3] block: keep BlockBackend alive until device finalize time Paolo Bonzini
@ 2016-03-21 15:22   ` Markus Armbruster
  2016-03-21 15:37     ` Paolo Bonzini
  0 siblings, 1 reply; 32+ messages in thread
From: Markus Armbruster @ 2016-03-21 15:22 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Kevin Wolf, qemu-devel, Max Reitz

Paolo Bonzini <pbonzini@redhat.com> writes:

> While the next patch will anticipate the death of the DriveInfo
> data structure, the BlockBackend must survive after unrealize,
> for example in case there are outstanding operations on it.
> The good thing is that we can just use reference counting to
> do it.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  hw/core/qdev-properties-system.c | 19 ++++++++++++-------
>  1 file changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
> index 469ba8a..5e84b55 100644
> --- a/hw/core/qdev-properties-system.c
> +++ b/hw/core/qdev-properties-system.c
> @@ -93,6 +93,7 @@ static void parse_drive(DeviceState *dev, const char *str, void **ptr,
       if (blk_attach_dev(blk, dev) < 0) {
           DriveInfo *dinfo = blk_legacy_dinfo(blk);

           if (dinfo->type != IF_NONE) {
               error_setg(errp, "Drive '%s' is already in use because "
                          "it has been automatically connected to another "
                          "device (did you need 'if=none' in the drive options?)",
                          str);
           } else {
               error_setg(errp, "Drive '%s' is already in use by another device",
                          str);
           }
>          return;
>      }
>      *ptr = blk;
> +    blk_ref(blk);

blk_attach_dev() already takes a reference.  I'm not sure I understand
why you need to take a second one.  You say "in case there are
outstanding operations on it."  What operations could that be?  And
shouldn't they take their own reference?

I hasten to add that I'm not going to demand you fix them to take their
own references.  It's okay to take a hacky second reference here, then
fix "them" at our leisure.  But I need to understand what exactly this
second reference protects.  It probably needs to be explained in the
source, too.

>  }
>  
>  static void release_drive(Object *obj, const char *name, void *opaque)
> @@ -101,13 +102,17 @@ static void release_drive(Object *obj, const char *name, void *opaque)
>      Property *prop = opaque;
>      BlockBackend **ptr = qdev_get_prop_ptr(dev, prop);
>  
> -    if (*ptr && blk_get_attached_dev(*ptr) != NULL) {
> -        /* Unrealize has already called blk_detach_dev and blockdev_del_drive
> -         * if the device has been realized; in that case, blk_get_attached_dev
> -         * returns NULL.  Thus, we get here if the device failed to realize,
> -         * and the -drive must not be released.
> -         */
> -        blk_detach_dev(*ptr, dev);
> +    if (*ptr) {
> +        if (blk_get_attached_dev(*ptr) != NULL) {
> +            /* Unrealize has already called blk_detach_dev and
> +             * blockdev_del_drive if the device has been realized;
> +             * in that case, blk_get_attached_dev returns NULL.  Thus,
> +             * we get here if the device failed to realize, and the
> +             * -drive must not be released.
> +             */
> +            blk_detach_dev(*ptr, dev);
> +        }
> +        blk_unref(*ptr);
>      }
>  }

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] block: detach devices from DriveInfo at unrealize time
  2016-03-21 15:13   ` Markus Armbruster
@ 2016-03-21 15:31     ` Paolo Bonzini
  2016-03-21 17:19       ` Markus Armbruster
  0 siblings, 1 reply; 32+ messages in thread
From: Paolo Bonzini @ 2016-03-21 15:31 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Kevin Wolf, qemu-devel, Max Reitz



On 21/03/2016 16:13, Markus Armbruster wrote:
> Meanwhile, commit 12bde0e added block job cancellation:
> 
>     block: cancel jobs when a device is ready to go away
>     
>     We do not want jobs to keep a device busy for a possibly very long
>     time, and management could become confused because they thought a
>     device was not even there anymore.  So, cancel long-running jobs
>     as soon as their device is going to disappear.
> 
> The automatic block job cancellation is an extension of the automatic
> deletion wart.  We cancel exactly when we schedule the warty deletion.
> Note that this made blockdev_mark_auto_del() do more than its name
> promises.  I never liked that, and always wondered why we don't cancel
> in blockdev_auto_del() instead.

Because management would fall prey to exactly the bug we're trying to
fix, for example by getting a BLOCK_JOB_CANCELLED event for a block
device that (according to the earlier DEVICE_DELETED event) should have
gone away already.

> Instead of delaying the unref to blockdev_auto_del(), which made sense
> back when it was a hard delete, you unref right away, leaving
> blockdev_auto_del() with nothing to do.  Swaps the order of the two
> unrefs, but that's just fine.
> 
> We now cancel even when !dinfo, i.e. even when we won't delete.  Are you
> sure that's correct?  If it is, then it needs to be explained in the
> commit message.

Will avoid that.

>> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
>> index c427698..0582787 100644
>> --- a/hw/block/virtio-blk.c
>> +++ b/hw/block/virtio-blk.c
>> @@ -945,7 +945,9 @@ static void virtio_blk_device_unrealize(DeviceState *dev, Error **errp)
>>      s->dataplane = NULL;
>>      qemu_del_vm_change_state_handler(s->change);
>>      unregister_savevm(dev, "virtio-blk", s);
>> -    blockdev_mark_auto_del(s->blk);
>> +    blk_detach_dev(s->blk, dev);
>> +    blockdev_del_drive(s->blk);
>> +    s->blk = NULL;
>>      virtio_cleanup(vdev);
>>  }
> 
> Before your patch, we leave finalization of the property to its
> release() callback release_drive(), as we should.  All we do here is
> schedule warty deletion.  And that we must do here, because only here we
> know that warty deletion is wanted.
> 
> Your patch inserts a copy of release_drive() and hacks it up a bit.  Two
> hunks down, release_drive() gets hacked up to conditionally avoid
> repeating the job.
> 
> This feels rather dirty to me.

The other possibility is to make blk_detach_dev do nothing if blk->dev
== NULL, i.e. make it idempotent.  On one hand, who doesn't like
idempotency; on the other hand, removing an assertion is also dirty.

I chose the easy way here (changing as few contracts as possible).
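
The two options can be compared in a toy model. This is a sketch under stated assumptions: ToyDev, ToyBlk and the toy_* functions are hypothetical names, not QEMU code, and the strict detach is assumed to assert that the given device is the one currently attached.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a device and its backend; not QEMU types. */
typedef struct ToyDev {
    int id;
} ToyDev;

typedef struct ToyBlk {
    ToyDev *dev;        /* attached device, or NULL */
    int detach_calls;   /* how many times a detach really happened */
} ToyBlk;

/* Strict detach: the caller must pass the attached device. */
void toy_detach(ToyBlk *blk, ToyDev *dev)
{
    assert(blk->dev == dev);
    blk->dev = NULL;
    blk->detach_calls++;
}

/* Option 1: unrealize detaches early ... */
void toy_unrealize(ToyBlk *blk, ToyDev *dev)
{
    toy_detach(blk, dev);
}

/* ... and the property release detaches only if something is still
 * attached, which in this model means unrealize never ran. */
void toy_release_guarded(ToyBlk *blk, ToyDev *dev)
{
    if (blk && blk->dev != NULL) {
        toy_detach(blk, dev);
    }
}

/* Option 2: an idempotent detach that does nothing when no device is
 * attached.  Callers no longer need the guard, but a second detach is
 * silently accepted instead of tripping the assertion. */
void toy_detach_idempotent(ToyBlk *blk, ToyDev *dev)
{
    if (blk->dev == NULL) {
        return;
    }
    toy_detach(blk, dev);
}
```

In both variants the detach happens exactly once per device; the difference is only whether the check lives in the caller or in the detach function.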

>> diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
>> index 7bd5bde..39a72e4 100644
>> --- a/hw/block/xen_disk.c
>> +++ b/hw/block/xen_disk.c
>> @@ -1041,6 +1041,7 @@ static void blk_disconnect(struct XenDevice *xendev)
>>  
>>      if (blkdev->blk) {
>>          blk_detach_dev(blkdev->blk, blkdev);
>> +        blockdev_del_drive(blkdev->blk);
>>          blk_unref(blkdev->blk);
>>          blkdev->blk = NULL;
>>      }
> 
> This is a non-qdevified device, where the link to the backend is not a
> property, and the link to the backend has to be dismantled by the device
> itself.
> 
> I believe inserting blockdev_del_drive() extends the automatic deletion
> wart to this device.  That's an incompatible change, isn't it?

This is why I wanted a careful review. :)  I can surely get rid of it.

>> diff --git a/hw/ide/piix.c b/hw/ide/piix.c
>> index df46147..cf8fa58 100644
>> --- a/hw/ide/piix.c
>> +++ b/hw/ide/piix.c
>> @@ -182,6 +182,9 @@ int pci_piix3_xen_ide_unplug(DeviceState *dev)
>>              if (ds) {
>>                  blk_detach_dev(blk, ds);
>>              }
>> +            if (pci_ide->bus[di->bus].ifs[di->unit].blk) {
>> +                blockdev_del_drive(blk);
>> +            }
>>              pci_ide->bus[di->bus].ifs[di->unit].blk = NULL;
>>              if (!(i % 2)) {
>>                  idedev = pci_ide->bus[di->bus].master;
> 
> Same comment as for xen_disk.c.

Same here.

>> diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
>> index 6dcdbc0..3b2b766 100644
>> --- a/hw/scsi/scsi-bus.c
>> +++ b/hw/scsi/scsi-bus.c
>> @@ -214,7 +214,9 @@ static void scsi_qdev_unrealize(DeviceState *qdev, Error **errp)
>>      }
>>  
>>      scsi_device_purge_requests(dev, SENSE_CODE(NO_SENSE));
>> -    blockdev_mark_auto_del(dev->conf.blk);
>> +    blk_detach_dev(dev->conf.blk, qdev);
>> +    blockdev_del_drive(dev->conf.blk);
>> +    dev->conf.blk = NULL;
>>  }
>>  
>>  /* handle legacy '-drive if=scsi,...' cmd line args */
> 
> Same comment as for virtio-blk.c.

Same answer...

>> diff --git a/hw/usb/dev-storage.c b/hw/usb/dev-storage.c
>> index 5ae0424..1c00211 100644
>> --- a/hw/usb/dev-storage.c
>> +++ b/hw/usb/dev-storage.c
>> @@ -643,7 +643,8 @@ static void usb_msd_realize_storage(USBDevice *dev, Error **errp)
>>       * blockdev, or else scsi_bus_legacy_add_drive() dies when it
>>       * attaches again.
>>       *
>> -     * The hack is probably a bad idea.
>> +     * The hack is probably a bad idea.  Anyway, this is why this does not
>> +     * call blockdev_del_drive.
>>       */
>>      blk_detach_dev(blk, &s->dev.qdev);
>>      s->conf.blk = NULL;
> 
> Note that other qdevified block devices (such as nvme) are *not*
> touched.  Warty auto deletion is extended only to some, but not all
> cases.

I wonder if we actually _should_ extend to all of them, i.e. which way
is the bug.  That would of course change what to do with Xen and IDE.

Paolo

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [PATCH 2/3] block: keep BlockBackend alive until device finalize time
  2016-03-21 15:22   ` Markus Armbruster
@ 2016-03-21 15:37     ` Paolo Bonzini
  0 siblings, 0 replies; 32+ messages in thread
From: Paolo Bonzini @ 2016-03-21 15:37 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Kevin Wolf, qemu-devel, Max Reitz



On 21/03/2016 16:22, Markus Armbruster wrote:
> Paolo Bonzini <pbonzini@redhat.com> writes:
> 
>> While the next patch will anticipate the death of the DriveInfo
>> data structure, the BlockBackend must survive after unrealize,
>> for example in case there are outstanding operations on it.
>> The good thing is that we can just use reference counting to
>> do it.
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>  hw/core/qdev-properties-system.c | 19 ++++++++++++-------
>>  1 file changed, 12 insertions(+), 7 deletions(-)
>>
>> diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
>> index 469ba8a..5e84b55 100644
>> --- a/hw/core/qdev-properties-system.c
>> +++ b/hw/core/qdev-properties-system.c
>> @@ -93,6 +93,7 @@ static void parse_drive(DeviceState *dev, const char *str, void **ptr,
>        if (blk_attach_dev(blk, dev) < 0) {
>            DriveInfo *dinfo = blk_legacy_dinfo(blk);
> 
>            if (dinfo->type != IF_NONE) {
>                error_setg(errp, "Drive '%s' is already in use because "
>                           "it has been automatically connected to another "
>                           "device (did you need 'if=none' in the drive options?)",
>                           str);
>            } else {
>                error_setg(errp, "Drive '%s' is already in use by another device",
>                           str);
>            }
>>          return;
>>      }
>>      *ptr = blk;
>> +    blk_ref(blk);
> 
> blk_attach_dev() already takes a reference.  I'm not sure I understand
> why you need to take a second one.  You say "in case there are
> outstanding operations on it."  What operations could that be?

There could be asynchronous I/O operations which are still active after
unrealize.  The device would not be finalized until they are completed.

> And shouldn't they take their own reference?

Generally the block layer doesn't try to ref/unref on every use.  It
assumes that someone else does it for you.  A better justification for
this patch is that blk_attach_dev/blk_detach_dev actually does not need
to take a reference, so we can add it to parse_drive/release_drive and
remove it from blk_attach_dev/blk_detach_dev instead.
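
A toy model of that alternative, as a sketch only: ToyBlk and the toy_* functions are hypothetical names, not QEMU code. Attach and detach merely track the attachment, while the property takes its reference when it is set and drops it when it is released.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a backend; "deleted" is set instead of
 * freeing memory so that the outcome can be inspected afterwards. */
typedef struct ToyBlk {
    int refcnt;
    bool attached;
    bool deleted;
} ToyBlk;

void toy_ref(ToyBlk *blk)
{
    blk->refcnt++;
}

void toy_unref(ToyBlk *blk)
{
    assert(blk->refcnt > 0);
    if (--blk->refcnt == 0) {
        blk->deleted = true;
    }
}

/* Attach and detach do not touch the reference count in this model. */
void toy_attach(ToyBlk *blk)
{
    assert(!blk->attached);
    blk->attached = true;
}

void toy_detach(ToyBlk *blk)
{
    assert(blk->attached);
    blk->attached = false;
}

/* Setting the property attaches and takes the property's reference. */
void toy_parse_drive(ToyBlk **prop, ToyBlk *blk)
{
    toy_attach(blk);
    toy_ref(blk);
    *prop = blk;
}

/* Unrealize detaches and drops the reference held for lookup by name. */
void toy_unrealize(ToyBlk *blk)
{
    toy_detach(blk);
    toy_unref(blk);
}

/* Releasing the property (at finalize time) drops the last reference. */
void toy_release_drive(ToyBlk **prop)
{
    if (*prop) {
        if ((*prop)->attached) {
            toy_detach(*prop);
        }
        toy_unref(*prop);
        *prop = NULL;
    }
}
```

In this model the backend outlives unrealize and is only marked deleted when the property is released.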

Paolo

> I hasten to add that I'm not going to demand you fix them to take their
> own references.  It's okay to take a hacky second reference here, then
> fix "them" at our leisure.  But I need to understand what exactly this
> second reference protects.  It probably needs to be explained in the
> source, too.
> 
>>  }
>>  
>>  static void release_drive(Object *obj, const char *name, void *opaque)
>> @@ -101,13 +102,17 @@ static void release_drive(Object *obj, const char *name, void *opaque)
>>      Property *prop = opaque;
>>      BlockBackend **ptr = qdev_get_prop_ptr(dev, prop);
>>  
>> -    if (*ptr && blk_get_attached_dev(*ptr) != NULL) {
>> -        /* Unrealize has already called blk_detach_dev and blockdev_del_drive
>> -         * if the device has been realized; in that case, blk_get_attached_dev
>> -         * returns NULL.  Thus, we get here if the device failed to realize,
>> -         * and the -drive must not be released.
>> -         */
>> -        blk_detach_dev(*ptr, dev);
>> +    if (*ptr) {
>> +        if (blk_get_attached_dev(*ptr) != NULL) {
>> +            /* Unrealize has already called blk_detach_dev and
>> +             * blockdev_del_drive if the device has been realized;
>> +             * in that case, blk_get_attached_dev returns NULL.  Thus,
>> +             * we get here if the device failed to realize, and the
>> +             * -drive must not be released.
>> +             */
>> +            blk_detach_dev(*ptr, dev);
>> +        }
>> +        blk_unref(*ptr);
>>      }
>>  }

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] block: remove legacy_dinfo at blk_detach_dev time
  2016-02-22 14:39 ` [Qemu-devel] [PATCH 3/3] block: remove legacy_dinfo at blk_detach_dev time Paolo Bonzini
@ 2016-03-21 16:15   ` Markus Armbruster
  2016-03-21 16:21     ` Paolo Bonzini
  0 siblings, 1 reply; 32+ messages in thread
From: Markus Armbruster @ 2016-03-21 16:15 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Kevin Wolf, qemu-devel, Max Reitz

Paolo Bonzini <pbonzini@redhat.com> writes:

> Currently, blockdev_del_drive (and before it blk_auto_del) does a blk_unref
> that will cause blk_delete to be called and the DriveInfo to be freed.
> But really, we want to free the drive info as soon as the device is
> detached, even if there are other references for whatever reason, so
> that the QemuOpts are freed as well and the id can be reused.

Let me write up how I understand this, so you can correct my
misunderstandings.

Management applications expect that on receipt of event DEVICE_DELETED

* the frontend is fully gone, and

* any of its warty block backends are fully gone.

A frontend's block backend is warty if it gets automatically deleted
along with the frontend.

"Fully gone" implies the ID can safely be reused.

The whole series is about warty automatic block backend deletion.

Non-warty block backends need to be deleted explicitly with
x-blockdev-del, which fails when there are other references.

We can't do the same for warty deletion, because the *frontend* got
deleted just fine (thus DEVICE_DELETED must be sent), even when
something else is keeping the backend alive.  Back when the wart was
born, this wasn't possible.

So, what to do?  The commit message sounds like the patch hides the
backend then, so it's "fully gone" from the management application's
view.

But isn't that an impossible mission if the "something else" is
something the management application can see?  For instance, what if
there's a block job tied to the backend?  The management application can
see that tie.  If we succeed in hiding the backend, we have a dangling
tie.  If we keep the tie, we failed at hiding.

We avoid the case of a block job the hamfisted way: we cancel it.  Okay,
but that raises the question of what else could hold a reference, whether
it's similarly exposed to the management application, and whether it
needs to be "cancelled" as well.

I'm sure Shrek would love this swamp.

> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  block/block-backend.c     | 13 +++++++++----
>  blockdev.c                |  9 +++++----
>  include/sysemu/blockdev.h |  1 +
>  3 files changed, 15 insertions(+), 8 deletions(-)
>
> diff --git a/block/block-backend.c b/block/block-backend.c
> index ebdf78a..0a85c6a 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -65,8 +65,6 @@ static const AIOCBInfo block_backend_aiocb_info = {
>      .aiocb_size = sizeof(BlockBackendAIOCB),
>  };
>  
> -static void drive_info_del(DriveInfo *dinfo);
> -
>  /* All the BlockBackends (except for hidden ones) */
>  static QTAILQ_HEAD(, BlockBackend) blk_backends =
>      QTAILQ_HEAD_INITIALIZER(blk_backends);
> @@ -165,6 +163,7 @@ static void blk_delete(BlockBackend *blk)
>  {
>      assert(!blk->refcnt);
>      assert(!blk->dev);
> +    assert(!blk->legacy_dinfo);
>      if (blk->bs) {
>          blk_remove_bs(blk);
>      }
> @@ -179,19 +178,25 @@ static void blk_delete(BlockBackend *blk)
>          QTAILQ_REMOVE(&blk_backends, blk, link);
>      }
>      g_free(blk->name);
> -    drive_info_del(blk->legacy_dinfo);
>      block_acct_cleanup(&blk->stats);
>      g_free(blk);
>  }

This and the assertion above effectively say "you must delete a
BlockBackend's legacy_dinfo before you drop its last reference."
We'll see below why that works.

>  
> -static void drive_info_del(DriveInfo *dinfo)
> +void blk_release_legacy_dinfo(BlockBackend *blk)
>  {
> +    DriveInfo *dinfo = blk->legacy_dinfo;
> +
>      if (!dinfo) {
>          return;
>      }
>      qemu_opts_del(dinfo->opts);
>      g_free(dinfo->serial);
>      g_free(dinfo);
> +    blk->legacy_dinfo = NULL;
> +    /* We are not interested anymore in retrieving the BlockBackend
> +     * via blk_by_legacy_dinfo, so let it die.
> +     */
> +    blk_unref(blk);
>  }
>  

This looks like DriveInfo now owns a reference to BlockBackend, even
though the pointer still goes in the other direction.

>  int blk_get_refcnt(BlockBackend *blk)
> diff --git a/blockdev.c b/blockdev.c
> index 2dfb2d8..85f0cb5 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -120,10 +120,10 @@ void override_max_devs(BlockInterfaceType type, int max_devs)
>   */
>  void blockdev_del_drive(BlockBackend *blk)
>  {
> -    DriveInfo *dinfo = blk_legacy_dinfo(blk);
>      BlockDriverState *bs = blk_bs(blk);
>      AioContext *aio_context;
>  
> +    blk_ref(blk);
>      if (bs) {
>          aio_context = bdrv_get_aio_context(bs);
>          aio_context_acquire(aio_context);
> @@ -135,9 +135,10 @@ void blockdev_del_drive(BlockBackend *blk)
>          aio_context_release(aio_context);
>      }
>  
> -    if (dinfo) {
> -        blk_unref(blk);
> +    if (blk_legacy_dinfo(blk)) {
> +        blk_release_legacy_dinfo(blk);
>      }
> +    blk_unref(blk);
>  }

Before: we drop a reference here if we have a DriveInfo.

After: we drop a reference in blk_release_legacy_dinfo() if we have a
DriveInfo.  We also take a temporary reference here.  I guess the only
reason is to avoid tripping the assertion we just discussed.
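
The before/after behaviour can be sketched in a toy model. ToyBlk and the toy_* functions are hypothetical names, not QEMU code; has_dinfo stands in for a non-NULL legacy_dinfo. Whether the temporary reference is strictly required is not settled here; the model only shows that it keeps the backend alive for the duration of the call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a backend with an optional legacy DriveInfo. */
typedef struct ToyBlk {
    int refcnt;
    bool has_dinfo;
    bool deleted;   /* set instead of freeing, so it can be inspected */
} ToyBlk;

void toy_ref(ToyBlk *blk)
{
    blk->refcnt++;
}

void toy_unref(ToyBlk *blk)
{
    assert(blk->refcnt > 0);
    if (--blk->refcnt == 0) {
        /* Invariant: the DriveInfo must be released before the last
         * reference is dropped. */
        assert(!blk->has_dinfo);
        blk->deleted = true;
    }
}

/* Release the DriveInfo and drop the reference that went with it. */
void toy_release_dinfo(ToyBlk *blk)
{
    if (!blk->has_dinfo) {
        return;
    }
    blk->has_dinfo = false;
    toy_unref(blk);
}

/* Model of the "after" code: temporary reference around the release. */
void toy_del_drive(ToyBlk *blk)
{
    toy_ref(blk);
    toy_release_dinfo(blk);
    toy_unref(blk);
}
```

With a single reference, toy_del_drive() releases the DriveInfo and deletes the backend. With a second reference held elsewhere, the DriveInfo is still released immediately, but the backend stays alive until that other reference is dropped.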

>  
>  /**
> @@ -2811,7 +2812,7 @@ void hmp_drive_del(Monitor *mon, const QDict *qdict)
       /* if we have a device attached to this BlockDriverState
        * then we need to make the drive anonymous until the device
        * can be removed.  If this is a drive with no device backing
        * then we can just get rid of the block driver state right here.
        */
       if (blk_get_attached_dev(blk)) {
           blk_hide_on_behalf_of_hmp_drive_del(blk);
           /* Further I/O must not pause the guest */
>          blk_set_on_error(blk, BLOCKDEV_ON_ERROR_REPORT,
>                           BLOCKDEV_ON_ERROR_REPORT);
>      } else {
> -        blk_unref(blk);
> +        blockdev_del_drive(blk);

Now we come to the spot I've dreaded all along...

If blk_get_attached_dev(), we can't delete the backend, so we must hide
it.  Not changed by your patch.

Else, we delete the backend.

Before your patch, we delete the obvious way: drop the reference.  If
something else holds another reference, we fail to delete.  If this can
happen, it's a bug.  I'm not sure it can't happen.

After your patch, we can't just drop the reference, because that would
trip the assertion we just discussed.  So we call blockdev_del_drive()
instead.  In addition to dropping the reference, this

* Cancels block jobs.  Could well be a fix for the bug I just described,
  but it needs to be featured in the commit message then.

* blk_release_legacy_dinfo().  Same addition as in blockdev_del_drive()
  above.
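
The two branches of hmp_drive_del() described above can be modelled
roughly as follows.  This is only an illustrative sketch: Drive,
DriveDelOutcome and model_drive_del() are made-up names, and block job
cancellation is left out.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { DRIVE_HIDDEN, DRIVE_DELETED } DriveDelOutcome;

/* Hypothetical model of a backend as seen by drive_del. */
typedef struct Drive {
    bool frontend_attached; /* a device still uses the backend */
    bool visible;           /* can be looked up by name */
    bool has_dinfo;         /* legacy DriveInfo still attached */
    int refcnt;
} Drive;

static DriveDelOutcome model_drive_del(Drive *d)
{
    if (d->frontend_attached) {
        /* Cannot delete while a device is attached: only hide it. */
        d->visible = false;
        return DRIVE_HIDDEN;
    }
    /* No frontend: release the dinfo first, then drop the reference. */
    d->has_dinfo = false;
    d->visible = false;
    d->refcnt--;
    return DRIVE_DELETED;
}
```

The hidden case leaves both the dinfo and the reference in place, which
is where the question of who releases the legacy_dinfo later comes from.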

Now back to why "you must delete a BlockBackend's legacy_dinfo before
you drop its last reference" works.  A BlockBackend has a legacy_dinfo
exactly when it was created with -drive or drive_add.  The block layer
holds a reference for lookup by name then.  This reference goes away
only when the backend is deleted by the user, in one of the following
ways:

* Explicitly with drive_del

  Deletes right away if no frontend is attached.  Your patch has the
  necessary replacement of blk_unref(blk) by blockdev_del_drive(blk), in
  hmp_drive_del().

  Else, deletion is delayed until the frontend detaches.  Where is the
  legacy_dinfo released then?

* Explicitly with x-blockdev-del

  Fails unless no other reference exists.  Where is the legacy_dinfo
  released?

* Implicitly via warty automatic deletion

  Your PATCH 01 has the necessary replacement of blk_unref(blk) by
  blockdev_del_drive(blk) for some devices (virtio-blk.c, scsi-bus.c,
  xen_disk.c, piix.c), but as far as I can see not for others such as
  nvme.c.

>      }
>  
>      aio_context_release(aio_context);
> diff --git a/include/sysemu/blockdev.h b/include/sysemu/blockdev.h
> index ae7ad67..5722b9f 100644
> --- a/include/sysemu/blockdev.h
> +++ b/include/sysemu/blockdev.h
> @@ -44,6 +44,7 @@ struct DriveInfo {
>  DriveInfo *blk_legacy_dinfo(BlockBackend *blk);
>  DriveInfo *blk_set_legacy_dinfo(BlockBackend *blk, DriveInfo *dinfo);
>  BlockBackend *blk_by_legacy_dinfo(DriveInfo *dinfo);
> +void blk_release_legacy_dinfo(BlockBackend *blk);
>  
>  void override_max_devs(BlockInterfaceType type, int max_devs);

* Re: [Qemu-devel] [PATCH 3/3] block: remove legacy_dinfo at blk_detach_dev time
  2016-03-21 16:15   ` Markus Armbruster
@ 2016-03-21 16:21     ` Paolo Bonzini
  2016-03-21 17:30       ` Markus Armbruster
  0 siblings, 1 reply; 32+ messages in thread
From: Paolo Bonzini @ 2016-03-21 16:21 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Kevin Wolf, qemu-devel, Max Reitz



On 21/03/2016 17:15, Markus Armbruster wrote:
> * Explicitly with x-blockdev-del
> 
>   Fails unless no other reference exists.  Where is the legacy_dinfo
>   released?

Can a -drive block device be deleted with x-blockdev-del even?!?

In other words, you said "This looks like DriveInfo now owns a reference
to BlockBackend, even though the pointer still goes in the other
direction".  I say, "I thought this was the idea all along"...

Shall I add a check to x-blockdev-del that gives an error if the
BlockBackend has a DriveInfo attached?

Paolo

> * Implicitly via warty automatic deletion
> 
>   Your PATCH 01 has the necessary replacement of blk_unref(blk) by
>   blockdev_del_drive(blk) for some devices (virtio-blk.c, scsi-bus.c,
>   xen_disk.c, piix.c), but as far as I can see not for others such as
>   nvme.c.

* Re: [Qemu-devel] [PATCH 1/3] block: detach devices from DriveInfo at unrealize time
  2016-03-21 15:31     ` Paolo Bonzini
@ 2016-03-21 17:19       ` Markus Armbruster
  2016-03-21 17:30         ` Paolo Bonzini
  2016-03-22 22:15         ` Paolo Bonzini
  0 siblings, 2 replies; 32+ messages in thread
From: Markus Armbruster @ 2016-03-21 17:19 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Kevin Wolf, qemu-devel, Max Reitz

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 21/03/2016 16:13, Markus Armbruster wrote:
>> Meanwhile, commit 12bde0e added block job cancellation:
>> 
>>     block: cancel jobs when a device is ready to go away
>>     
>>     We do not want jobs to keep a device busy for a possibly very long
>>     time, and management could become confused because they thought a
>>     device was not even there anymore.  So, cancel long-running jobs
>>     as soon as their device is going to disappear.
>> 
>> The automatic block job cancellation is an extension of the automatic
>> deletion wart.  We cancel exactly when we schedule the warty deletion.
>> Note that this made blockdev_mark_auto_del() do more than its name
>> promises.  I never liked that, and always wondered why we don't cancel
>> in blockdev_auto_del() instead.
>
> Because management would fall prey to exactly the bug we're trying to
> fix.  For example by getting a BLOCK_JOB_CANCELLED event for a block
> device that (according to the earlier DEVICE_DELETED event) should have
> gone away already.
>
>> Instead of delaying the unref to blockdev_auto_del(), which made sense
>> back when it was a hard delete, you unref right away, leaving
>> blockdev_auto_del() with nothing to do.  Swaps the order of the two
>> unrefs, but that's just fine.
>> 
>> We now cancel even when !dinfo, i.e. even when we won't delete.  Are you
>> sure that's correct?  If it is, then it needs to be explained in the
>> commit message.
>
> Will avoid that.
>
>>> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
>>> index c427698..0582787 100644
>>> --- a/hw/block/virtio-blk.c
>>> +++ b/hw/block/virtio-blk.c
>>> @@ -945,7 +945,9 @@ static void virtio_blk_device_unrealize(DeviceState *dev, Error **errp)
>>>      s->dataplane = NULL;
>>>      qemu_del_vm_change_state_handler(s->change);
>>>      unregister_savevm(dev, "virtio-blk", s);
>>> -    blockdev_mark_auto_del(s->blk);
>>> +    blk_detach_dev(s->blk, dev);
>>> +    blockdev_del_drive(s->blk);
>>> +    s->blk = NULL;
>>>      virtio_cleanup(vdev);
>>>  }
>> 
>> Before your patch, we leave finalization of the property to its
>> release() callback release_drive(), as we should.  All we do here is
>> schedule warty deletion.  And that we must do here, because only here we
>> know that warty deletion is wanted.
>> 
>> Your patch inserts a copy of release_drive() and hacks it up a bit.  Two
>> hunks down, release_drive() gets hacked up to conditionally avoid
>> repeating the job.
>> 
>> This feels rather dirty to me.
>
> The other possibility is to make blk_detach_dev do nothing if blk->dev
> == NULL, i.e. make it idempotent.  On one hand, who doesn't like
> idempotency; on the other hand, removing an assertion is also dirty.
>
> I chose the easy way here (changing as few contracts as possible).

Why can't we keep the work in the property release() method
release_drive()?

The only reason blockdev_mark_auto_del() isn't there is that the device
decides whether to call it, not the property.
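
For reference, the idempotency alternative mentioned in the quoted text
could look like the sketch below.  Backend, detach_strict() and
detach_idempotent() are hypothetical names, not the real
blk_detach_dev(); the point is only the difference in contract.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical backend that records which device is attached. */
typedef struct Backend {
    void *dev;
} Backend;

/* Strict contract: the caller must pass the currently attached device. */
static void detach_strict(Backend *b, void *dev)
{
    assert(b->dev == dev);
    b->dev = NULL;
}

/* Idempotent variant: a second call after detaching is a no-op. */
static void detach_idempotent(Backend *b, void *dev)
{
    if (b->dev == NULL) {
        return;
    }
    assert(b->dev == dev);
    b->dev = NULL;
}
```

With the idempotent variant, both unrealize() and the property's
release() could call detach without coordinating, at the cost of a
weaker assertion.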

>>> diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
>>> index 7bd5bde..39a72e4 100644
>>> --- a/hw/block/xen_disk.c
>>> +++ b/hw/block/xen_disk.c
>>> @@ -1041,6 +1041,7 @@ static void blk_disconnect(struct XenDevice *xendev)
>>>  
>>>      if (blkdev->blk) {
>>>          blk_detach_dev(blkdev->blk, blkdev);
>>> +        blockdev_del_drive(blkdev->blk);
>>>          blk_unref(blkdev->blk);
>>>          blkdev->blk = NULL;
>>>      }
>> 
>> This is a non-qdevified device, where the link to the backend is not a
>> property, and the link to the backend has to be dismantled by the device
>> itself.
>> 
>> I believe inserting blockdev_del_drive() extends the automatic deletion
>> wart to this device.  That's an incompatible change, isn't it?
>
> This is why I wanted a careful review. :)  I can surely get rid of it.
>
>>> diff --git a/hw/ide/piix.c b/hw/ide/piix.c
>>> index df46147..cf8fa58 100644
>>> --- a/hw/ide/piix.c
>>> +++ b/hw/ide/piix.c
>>> @@ -182,6 +182,9 @@ int pci_piix3_xen_ide_unplug(DeviceState *dev)
>>>              if (ds) {
>>>                  blk_detach_dev(blk, ds);
>>>              }
>>> +            if (pci_ide->bus[di->bus].ifs[di->unit].blk) {
>>> +                blockdev_del_drive(blk);
>>> +            }
>>>              pci_ide->bus[di->bus].ifs[di->unit].blk = NULL;
>>>              if (!(i % 2)) {
>>>                  idedev = pci_ide->bus[di->bus].master;
>> 
>> Same comment as for xen_disk.c.
>
> Same here.
>
>>> diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
>>> index 6dcdbc0..3b2b766 100644
>>> --- a/hw/scsi/scsi-bus.c
>>> +++ b/hw/scsi/scsi-bus.c
>>> @@ -214,7 +214,9 @@ static void scsi_qdev_unrealize(DeviceState *qdev, Error **errp)
>>>      }
>>>  
>>>      scsi_device_purge_requests(dev, SENSE_CODE(NO_SENSE));
>>> -    blockdev_mark_auto_del(dev->conf.blk);
>>> +    blk_detach_dev(dev->conf.blk, qdev);
>>> +    blockdev_del_drive(dev->conf.blk);
>>> +    dev->conf.blk = NULL;
>>>  }
>>>  
>>>  /* handle legacy '-drive if=scsi,...' cmd line args */
>> 
>> Same comment as for virtio-blk.c.
>
> Same answer...
>
>>> diff --git a/hw/usb/dev-storage.c b/hw/usb/dev-storage.c
>>> index 5ae0424..1c00211 100644
>>> --- a/hw/usb/dev-storage.c
>>> +++ b/hw/usb/dev-storage.c
>>> @@ -643,7 +643,8 @@ static void usb_msd_realize_storage(USBDevice *dev, Error **errp)
>>>       * blockdev, or else scsi_bus_legacy_add_drive() dies when it
>>>       * attaches again.
>>>       *
>>> -     * The hack is probably a bad idea.
>>> +     * The hack is probably a bad idea.  Anyway, this is why this does not
>>> +     * call blockdev_del_drive.
>>>       */
>>>      blk_detach_dev(blk, &s->dev.qdev);
>>>      s->conf.blk = NULL;
>> 
>> Note that other qdevified block devices (such as nvme) are *not*
>> touched.  Warty auto deletion is extended only to some, but not all
>> cases.
>
> I wonder if we actually _should_ extend to all of them, i.e. which way
> is the bug.  That would of course change what to do with Xen and IDE.

Certainly debatable.

Current wart: virtio-blk and SCSI devices auto-delete block backends
with DriveInfo.

Possible alternate wart: all devices auto-delete block backends with
DriveInfo.  This is more regular, but the more regular wart is still a
wart.

I think only if some of our users actually expect the alternate wart can we
seriously consider switching, because then we have to choose between two
breakages anyway:

* We can stick to the current wart, and leave these users broken.

* We can switch to the alternate wart, unbreak these users, and break
  the users that expect the current wart.

Without further evidence on who expects what, I'd stick to the current
wart.
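
Spelled out as a predicate, the two warts differ like this.  The enum
and function names are invented for illustration; only the rule itself
is taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device kinds, just enough to show the difference. */
typedef enum { DEV_VIRTIO_BLK, DEV_SCSI, DEV_NVME, DEV_OTHER } DevKind;
typedef enum { WART_CURRENT, WART_ALTERNATE } WartPolicy;

/* Does unplugging the device automatically delete its backend? */
static bool auto_deletes_backend(WartPolicy policy, DevKind kind,
                                 bool has_dinfo)
{
    if (!has_dinfo) {
        return false;   /* backends without DriveInfo are never auto-deleted */
    }
    if (policy == WART_ALTERNATE) {
        return true;    /* all devices */
    }
    return kind == DEV_VIRTIO_BLK || kind == DEV_SCSI;
}
```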

* Re: [Qemu-devel] [PATCH 1/3] block: detach devices from DriveInfo at unrealize time
  2016-03-21 17:19       ` Markus Armbruster
@ 2016-03-21 17:30         ` Paolo Bonzini
  2016-03-23  8:35           ` Markus Armbruster
  2016-03-22 22:15         ` Paolo Bonzini
  1 sibling, 1 reply; 32+ messages in thread
From: Paolo Bonzini @ 2016-03-21 17:30 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Kevin Wolf, qemu-devel, Max Reitz



On 21/03/2016 18:19, Markus Armbruster wrote:
>> >
>> > The other possibility is to make blk_detach_dev do nothing if blk->dev
>> > == NULL, i.e. make it idempotent.  On one hand, who doesn't like
>> > idempotency; on the other hand, removing an assertion is also dirty.
>> >
>> > I chose the easy way here (changing as few contracts as possible).
> Why can't we keep the work in the property release() method
> release_drive()?
> 
> The only reason blockdev_mark_auto_del() isn't there is that the device
> decides whether to call it, not the property.

DEVICE_DELETED is currently sent right after setting realized to false
(see device_unparent), and you cannot send it later than that.  In
particular, doing the work in release_drive would mean sending the event
when properties are removed in instance_finalize; by that time you no
longer have a QOM path to include in the event.
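
A toy model of the ordering constraint, assuming (as described above)
that the path is lost at unparent time.  Dev, emit_deleted(),
dev_unparent() and dev_can_emit_at_finalize() are hypothetical names,
not QOM functions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical device: it has a path only while it is parented. */
typedef struct Dev {
    char path[64];        /* empty once unparented */
    char event_path[64];  /* path recorded in the emitted event */
    int event_sent;
} Dev;

/* The event carries the path, so it needs one to exist. */
static void emit_deleted(Dev *d)
{
    assert(d->path[0] != '\0');
    snprintf(d->event_path, sizeof(d->event_path), "%s", d->path);
    d->event_sent = 1;
}

/* Unparent: anything the event promises (such as the drive id being
 * free again) has to be done before this point. */
static void dev_unparent(Dev *d)
{
    emit_deleted(d);
    d->path[0] = '\0';
}

/* At finalize time the path is already gone. */
static int dev_can_emit_at_finalize(const Dev *d)
{
    return d->path[0] != '\0';
}
```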

Paolo

* Re: [Qemu-devel] [PATCH 3/3] block: remove legacy_dinfo at blk_detach_dev time
  2016-03-21 16:21     ` Paolo Bonzini
@ 2016-03-21 17:30       ` Markus Armbruster
  2016-03-21 17:34         ` Paolo Bonzini
  2016-03-21 17:39         ` Kevin Wolf
  0 siblings, 2 replies; 32+ messages in thread
From: Markus Armbruster @ 2016-03-21 17:30 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Kevin Wolf, qemu-devel, Max Reitz

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 21/03/2016 17:15, Markus Armbruster wrote:
>> * Explicitly with x-blockdev-del
>> 
>>   Fails unless no other reference exists.  Where is the legacy_dinfo
>>   released?
>
> Can a -drive block device be deleted with x-blockdev-del even?!?

When I wrote my review, I forgot that I expect x-blockdev-del to accept
only backends created with blockdev-add.  With that, my question is
indeed moot.

However, I've now tested my expectation, and it turned out to be wrong.
I'm inclined to call that a bug.

> In other words, you said "This looks like DriveInfo now owns a reference
> to BlockBackend, even though the pointer still goes in the other
> direction".  I say, "I thought this was the idea all along"...

For me, the DriveInfo doesn't own anything, but a BlockBackend may have
a DriveInfo.  Evidence:

* The pointer goes from the BlockBackend to the DriveInfo

* To go back, you search the blk_backends for the one that has the
  DriveInfo.  See blk_by_legacy_dinfo().

* There is no list of DriveInfo.  If you want to find one, you search
  blk_backends.  See drive_get() & friends.
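
In simplified form, the two lookups work like the sketch below.  The
types and the functions backend_by_dinfo() and dinfo_get() are
illustrative stand-ins with reduced signatures, not the actual
blk_by_legacy_dinfo() and drive_get().

```c
#include <assert.h>
#include <stddef.h>

typedef struct DriveInfo {
    int bus;
    int unit;
} DriveInfo;

/* Hypothetical backend list entry: the pointer goes from the backend
 * to its (optional) DriveInfo, never the other way round. */
typedef struct Backend {
    const char *name;
    DriveInfo *legacy_dinfo;   /* NULL for backends without one */
    struct Backend *next;
} Backend;

/* Going back from a DriveInfo to its backend means searching the list. */
static Backend *backend_by_dinfo(Backend *head, const DriveInfo *dinfo)
{
    for (Backend *b = head; b; b = b->next) {
        if (dinfo && b->legacy_dinfo == dinfo) {
            return b;
        }
    }
    return NULL;
}

/* There is no DriveInfo list: finding one also walks the backends. */
static DriveInfo *dinfo_get(Backend *head, int bus, int unit)
{
    for (Backend *b = head; b; b = b->next) {
        DriveInfo *di = b->legacy_dinfo;
        if (di && di->bus == bus && di->unit == unit) {
            return di;
        }
    }
    return NULL;
}
```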

> Shall I add a check to x-blockdev-del that gives an error if the
> BlockBackend has a DriveInfo attached?

Yes, please.  But do double-check with Kevin & Max, who might have
different ideas on blockdev-add/del than I do.

>
> Paolo
>
>> * Implicitly via warty automatic deletion
>> 
>>   Your PATCH 01 has the necessary replacement of blk_unref(blk) by
>>   blockdev_del_drive(blk) for some devices (virtio-blk.c, scsi-bus.c,
>>   xen_disk.c, piix.c), but as far as I can see not for others such as
>>   nvme.c.

* Re: [Qemu-devel] [PATCH 3/3] block: remove legacy_dinfo at blk_detach_dev time
  2016-03-21 17:30       ` Markus Armbruster
@ 2016-03-21 17:34         ` Paolo Bonzini
  2016-03-21 18:14           ` Markus Armbruster
  2016-03-21 17:39         ` Kevin Wolf
  1 sibling, 1 reply; 32+ messages in thread
From: Paolo Bonzini @ 2016-03-21 17:34 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Kevin Wolf, qemu-devel, Max Reitz



On 21/03/2016 18:30, Markus Armbruster wrote:
> However, I've now tested my expectation, and it turned out to be wrong.
> I'm inclined to call that a bug.

--verbose, what is wrong and what was your expectation?

> > In other words, you said "This looks like DriveInfo now owns a reference
> > to BlockBackend, even though the pointer still goes in the other
> > direction".  I say, "I thought this was the idea all along"...
> 
> For me, the DriveInfo doesn't own anything, but a BlockBackend may have
> a DriveInfo.  Evidence:
> 
> * The pointer goes from the BlockBackend to the DriveInfo
> 
> * To go back, you search the blk_backends for the one that has the
>   DriveInfo.  See blk_by_legacy_dinfo().
> 
> * There is no list of DriveInfo.  If you want to find one, you search
>   blk_backends.  See drive_get() & friends.

That's from the point of view of the code.  But from the point of view
of the user, he specifies a drive=... and the device converts that under
the hood to a BlockBackend; and when he calls drive_del on an unassigned
drive, the BlockBackend is destroyed.

There is no action on a BlockBackend that destroys the
DriveInfo---except auto-deletion on unplug, but even then the user in
the first place had provided a DriveInfo.  So from the point of view of
the user it's always been the DriveInfo that owned a BlockBackend.  The
lack of a list of DriveInfo is just an implementation detail.

Paolo

* Re: [Qemu-devel] [PATCH 3/3] block: remove legacy_dinfo at blk_detach_dev time
  2016-03-21 17:30       ` Markus Armbruster
  2016-03-21 17:34         ` Paolo Bonzini
@ 2016-03-21 17:39         ` Kevin Wolf
  2016-03-21 18:02           ` Markus Armbruster
  2016-03-22 22:10           ` Paolo Bonzini
  1 sibling, 2 replies; 32+ messages in thread
From: Kevin Wolf @ 2016-03-21 17:39 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Paolo Bonzini, qemu-devel, Max Reitz

Am 21.03.2016 um 18:30 hat Markus Armbruster geschrieben:
> Paolo Bonzini <pbonzini@redhat.com> writes:
> 
> > On 21/03/2016 17:15, Markus Armbruster wrote:
> >> * Explicitly with x-blockdev-del
> >> 
> >>   Fails unless no other reference exists.  Where is the legacy_dinfo
> >>   released?
> >
> > Can a -drive block device be deleted with x-blockdev-del even?!?
> 
> When I wrote my review, I forgot that I expect x-blockdev-del to accept
> only backends created with blockdev-add.  With that, my question is
> indeed moot.
> 
> However, I've now tested my expectation, and it turned out to be wrong.
> I'm inclined to call that a bug.

Yes.

> > Shall I add a check to x-blockdev-del that gives an error if the
> > BlockBackend has a DriveInfo attached?
> 
> Yes, please.  But do double-check with Kevin & Max, who might have
> different ideas on blockdev-add/del than I do.

I'm pretty sure that I said that failing on -drive/drive_add created
BlockBackends was a requirement for x-blockdev-del. Apparently I failed
to catch the bug in the review then.

So go ahead and let's fix it now.

Kevin

* Re: [Qemu-devel] [PATCH 3/3] block: remove legacy_dinfo at blk_detach_dev time
  2016-03-21 17:39         ` Kevin Wolf
@ 2016-03-21 18:02           ` Markus Armbruster
  2016-03-22 22:10           ` Paolo Bonzini
  1 sibling, 0 replies; 32+ messages in thread
From: Markus Armbruster @ 2016-03-21 18:02 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Paolo Bonzini, qemu-devel, Max Reitz

Kevin Wolf <kwolf@redhat.com> writes:

> Am 21.03.2016 um 18:30 hat Markus Armbruster geschrieben:
>> Paolo Bonzini <pbonzini@redhat.com> writes:
>> 
>> > On 21/03/2016 17:15, Markus Armbruster wrote:
>> >> * Explicitly with x-blockdev-del
>> >> 
>> >>   Fails unless no other reference exists.  Where is the legacy_dinfo
>> >>   released?
>> >
>> > Can a -drive block device be deleted with x-blockdev-del even?!?
>> 
>> When I wrote my review, I forgot that I expect x-blockdev-del to accept
>> only backends created with blockdev-add.  With that, my question is
>> indeed moot.
>> 
>> However, I've now tested my expectation, and it turned out to be wrong.
>> I'm inclined to call that a bug.
>
> Yes.
>
>> > Shall I add a check to x-blockdev-del that gives an error if the
>> > BlockBackend has a DriveInfo attached?
>> 
>> Yes, please.  But do double-check with Kevin & Max, who might have
>> different ideas on blockdev-add/del than I do.
>
> I'm pretty sure that I said that failing on -drive/drive_add created
> BlockBackends was a requirement for x-blockdev-del. Apparently I failed
> to catch the bug in the review then.
>
> So go ahead and let's fix it now.

Yes, please.

* Re: [Qemu-devel] [PATCH 3/3] block: remove legacy_dinfo at blk_detach_dev time
  2016-03-21 17:34         ` Paolo Bonzini
@ 2016-03-21 18:14           ` Markus Armbruster
  2016-03-22  8:19             ` Kevin Wolf
  0 siblings, 1 reply; 32+ messages in thread
From: Markus Armbruster @ 2016-03-21 18:14 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Kevin Wolf, qemu-devel, Max Reitz

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 21/03/2016 18:30, Markus Armbruster wrote:
>> However, I've now tested my expectation, and it turned out to be wrong.
>> I'm inclined to call that a bug.
>
> --verbose, what is wrong and what was your expectation?

x-blockdev-del should refuse to touch anything not created with
blockdev-add.

>> > In other words, you said "This looks like DriveInfo now owns a reference
>> > to BlockBackend, even though the pointer still goes in the other
>> > direction".  I say, "I thought this was the idea all along"...
>> 
>> For me, the DriveInfo doesn't own anything, but a BlockBackend may have
>> a DriveInfo.  Evidence:
>> 
>> * The pointer goes from the BlockBackend to the DriveInfo
>> 
>> * To go back, you search the blk_backends for the one that has the
>>   DriveInfo.  See blk_by_legacy_dinfo().
>> 
>> * There is no list of DriveInfo.  If you want to find one, you search
>>   blk_backends.  See drive_get() & friends.
>
> That's from the point of view of the code.  But from the point of view
> of the user, he specifies a drive=... and the device converts that under
> the hood to a BlockBackend; and when he calls drive_del on an unassigned
> drive, the BlockBackend is destroyed.
>
> There is no action on a BlockBackend that destroys the
> DriveInfo---except auto-deletion on unplug, but even then the user in
> the first place had provided a DriveInfo.  So from the point of view of
> the user it's always been the DriveInfo that owned a BlockBackend.  The
> lack of a list of DriveInfo is just an implementation detail.

From the user's point of view, neither BlockBackend nor DriveInfo are
visible :)

A BlockBackend may have a DriveInfo.  If it has one, then destroying the
BlockBackend also destroys its DriveInfo.

DriveInfo exists only to capture a -drive in a more convenient form than
its QemuOpts.  We use it for creating a BlockBackend.  It lives on after
that only because -drive mixes up front- and backend matter.  Keeping
DriveInfo around hanging off BlockBackend lets us keep frontend matter
out of BlockBackend: if we need to access mixed-up frontend matter for
back-compat, we find it in the BlockBackend's DriveInfo.

Imagine a future where we drop -drive / drive_add, or at least its mixed
up aspects (doesn't have to be practical for imagining it).  In that
future, we'd also drop DriveInfo.

* Re: [Qemu-devel] [PATCH 3/3] block: remove legacy_dinfo at blk_detach_dev time
  2016-03-21 18:14           ` Markus Armbruster
@ 2016-03-22  8:19             ` Kevin Wolf
  2016-03-22 10:25               ` Markus Armbruster
  0 siblings, 1 reply; 32+ messages in thread
From: Kevin Wolf @ 2016-03-22  8:19 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Paolo Bonzini, qemu-devel, Max Reitz

Am 21.03.2016 um 19:14 hat Markus Armbruster geschrieben:
> Paolo Bonzini <pbonzini@redhat.com> writes:
> 
> > On 21/03/2016 18:30, Markus Armbruster wrote:
> >> However, I've now tested my expectation, and it turned out to be wrong.
> >> I'm inclined to call that a bug.
> >
> > --verbose, what is wrong and what was your expectation?
> 
> x-blockdev-del should refuse to touch anything not created with
> blockdev-add.
> 
> >> > In other words, you said "This looks like DriveInfo now owns a reference
> >> > to BlockBackend, even though the pointer still goes in the other
> >> > direction".  I say, "I thought this was the idea all along"...
> >> 
> >> For me, the DriveInfo doesn't own anything, but a BlockBackend may have
> >> a DriveInfo.  Evidence:
> >> 
> >> * The pointer goes from the BlockBackend to the DriveInfo
> >> 
> >> * To go back, you search the blk_backends for the one that has the
> >>   DriveInfo.  See blk_by_legacy_dinfo().
> >> 
> >> * There is no list of DriveInfo.  If you want to find one, you search
> >>   blk_backends.  See drive_get() & friends.
> >
> > That's from the point of view of the code.  But from the point of view
> > of the user, he specifies a drive=... and the device converts that under
> > the hood to a BlockBackend; and when he calls drive_del on an unassigned
> > drive, the BlockBackend is destroyed.
> >
> > There is no action on a BlockBackend that destroys the
> > DriveInfo---except auto-deletion on unplug, but even then the user in
> > the first place had provided a DriveInfo.  So from the point of view of
> > the user it's always been the DriveInfo that owned a BlockBackend.  The
> > lack of a list of DriveInfo is just an implementation detail.
> 
> From the user's point of view, neither BlockBackend nor DriveInfo are
> visible :)
> 
> A BlockBackend may have a DriveInfo.  If it has one, then destroying the
> BlockBackend also destroys its DriveInfo.
> 
> DriveInfo exists only to capture a -drive in a more convenient form than
> its QemuOpts.  We use it for creating a BlockBackend.  It lives on after
> that only because -drive mixes up front- and backend matter.  Keeping
> DriveInfo around hanging off BlockBackend lets us keep frontend matter
> out of BlockBackend: if need to access mixed up frontend matter for
> back-compat, we find it in the BlockBackend's DriveInfo.
> 
> Imagine a future where we drop -drive / drive_add, or at least its mixed
> up aspects (doesn't have to be practical for imagining it).  In that
> future, we'd also drop DriveInfo.

While we're dreaming up things... Imagine a future where users don't
have to know about BlockBackend, but devices automatically create their
BlockBackend. Which happens to be something that I'd really like to have
and at least I haven't seen the show stopper for it yet. Which might
just be because we never really looked much into the details, but
anyway...

In this case, the relationship between DriveInfo and BlockBackend
couldn't be "BB owns DriveInfo" any more, because the BB would be
created later than the DriveInfo. (DriveInfo controls the creation of
the guest device which would create the BB during its initialisation.)

By the way, what's the reason again for keeping DriveInfo around even
after having created the guest device?

Kevin

* Re: [Qemu-devel] [PATCH 3/3] block: remove legacy_dinfo at blk_detach_dev time
  2016-03-22  8:19             ` Kevin Wolf
@ 2016-03-22 10:25               ` Markus Armbruster
  2016-03-22 22:07                 ` Paolo Bonzini
  0 siblings, 1 reply; 32+ messages in thread
From: Markus Armbruster @ 2016-03-22 10:25 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Paolo Bonzini, qemu-devel, Max Reitz

Kevin Wolf <kwolf@redhat.com> writes:

> Am 21.03.2016 um 19:14 hat Markus Armbruster geschrieben:
>> Paolo Bonzini <pbonzini@redhat.com> writes:
>> 
>> > On 21/03/2016 18:30, Markus Armbruster wrote:
>> >> However, I've now tested my expectation, and it turned out to be wrong.
>> >> I'm inclined to call that a bug.
>> >
>> > --verbose, what is wrong and what was your expectation?
>> 
>> x-blockdev-del should refuse to touch anything not created with
>> blockdev-add.
>> 
>> >> > In other words, you said "This looks like DriveInfo now owns a reference
>> >> > to BlockBackend, even though the pointer still goes in the other
>> >> > direction".  I say, "I thought this was the idea all along"...
>> >> 
>> >> For me, the DriveInfo doesn't own anything, but a BlockBackend may have
>> >> a DriveInfo.  Evidence:
>> >> 
>> >> * The pointer goes from the BlockBackend to the DriveInfo
>> >> 
>> >> * To go back, you search the blk_backends for the one that has the
>> >>   DriveInfo.  See blk_by_legacy_dinfo().
>> >> 
>> >> * There is no list of DriveInfo.  If you want to find one, you search
>> >>   blk_backends.  See drive_get() & friends.
>> >
>> > That's from the point of view of the code.  But from the point of view
>> > of the user, he specifies a drive=... and the device converts that under
>> > the hood to a BlockBackend; and when he calls drive_del on an unassigned
>> > drive, the BlockBackend is destroyed.
>> >
>> > There is no action on a BlockBackend that destroys the
>> > DriveInfo---except auto-deletion on unplug, but even then the user in
>> > the first place had provided a DriveInfo.  So from the point of view of
>> > the user it's always been the DriveInfo that owned a BlockBackend.  The
>> > lack of a list of DriveInfo is just an implementation detail.
>> 
>> From the user's point of view, neither BlockBackend nor DriveInfo are
>> visible :)
>> 
>> A BlockBackend may have a DriveInfo.  If it has one, then destroying the
>> BlockBackend also destroys its DriveInfo.
>> 
>> DriveInfo exists only to capture a -drive in a more convenient form than
>> its QemuOpts.  We use it for creating a BlockBackend.  It lives on after
>> that only because -drive mixes up front- and backend matter.  Keeping
>> DriveInfo around hanging off BlockBackend lets us keep frontend matter
>> out of BlockBackend: if need to access mixed up frontend matter for
>> back-compat, we find it in the BlockBackend's DriveInfo.
>> 
>> Imagine a future where we drop -drive / drive_add, or at least its mixed
>> up aspects (doesn't have to be practical for imagining it).  In that
>> future, we'd also drop DriveInfo.
>
> While we're dreaming up things... Imagine a future where users don't
> have to know about BlockBackend, but devices automatically create their
> BlockBackend. Which happens to be something that I'd really like to have
> and at least I haven't seen the show stopper for it yet. Which might
> just be because we never really looked much into the details, but
> anyway...
>
> In this case, the relationship between DriveInfo and BlockBackend
> couldn't be "BB owns DriveInfo" any more, because the BB would be
> created later than the DriveInfo. (DriveInfo controls the creation of
> the guest device which would create the BB during its initialisation.)

I don't think that would necessarily affect ownership.

Regardless of how and when we create BlockBackend, we'll want to keep
the clean separation between frontend and backend internally and at the
user interface.

DriveInfo has no role in the cleanly separated creation of frontend and
backend now, and it shouldn't get one in the future.  Its purpose is to
support the legacy user interface that has frontend and backend matters
mixed up.  Two things, actually:

* Letting board code find legacy drive configuration information, so it
  can create their frontends.  This code lingers because we still
  haven't reduced legacy drives to sugar for modern configuration.  I
  believe we haven't tried because we got a number of boards nobody
  wants to touch.  That's our old "stuff is too precious to retire, yet
  too worthless for anybody spending the time it takes to update it to
  modern interfaces" problem.

* Letting devices fall back to legacy configuration.  Things like "if
  qdev property "serial" isn't given, try getting the serial number from
  DriveInfo".
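
The second kind of use can be sketched like this.  The struct layout and
effective_serial() are assumptions made for illustration; real devices
do this inside their realize() methods with their own property types.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced versions of the two structures. */
typedef struct DriveInfo {
    const char *serial;        /* from the legacy drive configuration, may be NULL */
} DriveInfo;

typedef struct Backend {
    DriveInfo *legacy_dinfo;   /* NULL unless a legacy drive created it */
} Backend;

/* Prefer the qdev property; fall back to the legacy DriveInfo. */
static const char *effective_serial(const char *prop_serial,
                                    const Backend *b)
{
    if (prop_serial) {
        return prop_serial;
    }
    if (b && b->legacy_dinfo && b->legacy_dinfo->serial) {
        return b->legacy_dinfo->serial;
    }
    return NULL;
}
```

The fallback is only needed while the device is being configured, which
is why it matters whether anything reads DriveInfo after realize().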

The way we do this now is to package up the relevant parts of QemuOpts
as a DriveInfo and stick them to the backend when we create the backend
from the QemuOpts.  "The backend" here is a BlockBackend.  If we change
that to be something else, I guess we'll tack the DriveInfo to whatever
that something else may be.

Reversing the "owns" relationship to make DriveInfo own BlockBackend (or
whatever) makes no sense to me, because DriveInfo exists only for *some*
backends.  The others need an owner too.  I think that owner should own
all of them, not just the ones without a DriveInfo.

> By the way, what's the reason again for keeping DriveInfo around even
> after having created the guest device?

Inertia?

I know it's accessed from some realize() methods.  We'd have to review
whether it's accessed after realize().

* Re: [Qemu-devel] [PATCH 3/3] block: remove legacy_dinfo at blk_detach_dev time
  2016-03-22 10:25               ` Markus Armbruster
@ 2016-03-22 22:07                 ` Paolo Bonzini
  2016-03-23  9:18                   ` Markus Armbruster
  0 siblings, 1 reply; 32+ messages in thread
From: Paolo Bonzini @ 2016-03-22 22:07 UTC (permalink / raw)
  To: Markus Armbruster, Kevin Wolf; +Cc: qemu-devel, Max Reitz



On 22/03/2016 11:25, Markus Armbruster wrote:
> Regardless of how and when we create BlockBackend, we'll want to keep
> the clean separation between frontend and backend internally and at the
> user interface.

This means that the BlockBackend should not own the DriveInfo.  The
backend and frontend need not know of the object that mixes concepts
from both of them.  Instead, the DriveInfo can instantiate itself into a
BlockBackend and the board can (if required) use the frontend parts of
DriveInfo to instantiate a device and connect it to the BlockBackend.

In Kevin's idea there would be no ownership either way.  Until then, I
think my patch actually gets us closer to the ideal.

Paolo

> DriveInfo has no role in cleanly separate creation of frontend and
> backend now, and it shouldn't get one in the future.  Its purpose is to
> support the legacy user interface that has frontend and backend matters
> mixed up. 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] block: remove legacy_dinfo at blk_detach_dev time
  2016-03-21 17:39         ` Kevin Wolf
  2016-03-21 18:02           ` Markus Armbruster
@ 2016-03-22 22:10           ` Paolo Bonzini
  2016-03-23  8:37             ` Markus Armbruster
  1 sibling, 1 reply; 32+ messages in thread
From: Paolo Bonzini @ 2016-03-22 22:10 UTC (permalink / raw)
  To: Kevin Wolf, Markus Armbruster; +Cc: qemu-devel, Max Reitz



On 21/03/2016 18:39, Kevin Wolf wrote:
> > When I wrote my review, I forgot that I expect x-blockdev-del to accept
> > only backends created with blockdev-add.  With that, my question is
> > indeed moot.
> > 
> > However, I've now tested my expectation, and it turned out to be wrong.
> > I'm inclined to call that a bug.
> 
> Yes.

Like this?

diff --git a/blockdev.c b/blockdev.c
index 3eb05d1..0bc7ea2 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -4023,6 +4023,11 @@ void qmp_x_blockdev_del(bool has_id, const char *id,
             error_setg(errp, "Cannot find block backend %s", id);
             return;
         }
+        if (blk_legacy_dinfo(blk)) {
+            error_setg(errp, "Deleting block backend added with drive-add"
+                       " is not supported");
+            return;
+        }
         if (blk_get_refcnt(blk) > 1) {
             error_setg(errp, "Block backend %s is in use", id);
             return;
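
For readers following the hunk above, here is a minimal standalone sketch
of the order of checks it produces.  The types and the function
model_blockdev_del() are hypothetical stand-ins, not QEMU's real
definitions; only the three conditions (backend not found, backend has a
legacy DriveInfo, backend still referenced elsewhere) are taken from the
diff.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for QEMU's types; only what the sketch needs. */
typedef struct DriveInfo {
    int unused;
} DriveInfo;

typedef struct BlockBackend {
    int refcnt;               /* 1 means nobody else holds a reference */
    DriveInfo *legacy_dinfo;  /* non-NULL only for legacy (-drive style) backends */
} BlockBackend;

typedef enum {
    DEL_OK,
    DEL_ERR_NOT_FOUND,  /* "Cannot find block backend" */
    DEL_ERR_LEGACY,     /* the check added by the hunk */
    DEL_ERR_IN_USE,     /* "Block backend ... is in use" */
} DelResult;

/* Mirrors the order of the checks in the hunk; blk == NULL models a failed lookup. */
static DelResult model_blockdev_del(const BlockBackend *blk)
{
    if (!blk) {
        return DEL_ERR_NOT_FOUND;
    }
    assert(blk->refcnt >= 1);
    if (blk->legacy_dinfo) {
        return DEL_ERR_LEGACY;
    }
    if (blk->refcnt > 1) {
        return DEL_ERR_IN_USE;
    }
    return DEL_OK;
}
```

Note that in this ordering a legacy backend is rejected even when nothing
else references it, because the DriveInfo check comes before the refcount
check.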

Paolo

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] block: detach devices from DriveInfo at unrealize time
  2016-03-21 17:19       ` Markus Armbruster
  2016-03-21 17:30         ` Paolo Bonzini
@ 2016-03-22 22:15         ` Paolo Bonzini
  1 sibling, 0 replies; 32+ messages in thread
From: Paolo Bonzini @ 2016-03-22 22:15 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Kevin Wolf, qemu-devel, Max Reitz



On 21/03/2016 18:19, Markus Armbruster wrote:
> I think only if some of our users actually expect the alternate wart can we
> seriously consider switching, because then we have to choose between two
> breakages anyway:
> 
> * We can stick to the current wart, and leave these users broken.
> 
> * We can switch to the alternate wart, unbreak these users, and break
>   the users that expect the current wart.
> 
> Without further evidence on who expects what, I'd stick to the current
> wart.

I certainly would expect nvme to behave the same as virtio-blk, for one.

Paolo

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] block: detach devices from DriveInfo at unrealize time
  2016-03-21 17:30         ` Paolo Bonzini
@ 2016-03-23  8:35           ` Markus Armbruster
  2016-03-23  9:35             ` Paolo Bonzini
  0 siblings, 1 reply; 32+ messages in thread
From: Markus Armbruster @ 2016-03-23  8:35 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Kevin Wolf, qemu-devel, Max Reitz

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 21/03/2016 18:19, Markus Armbruster wrote:
>> Paolo Bonzini <pbonzini@redhat.com> writes:
>>
>>> On 21/03/2016 16:13, Markus Armbruster wrote:
>>>> Before your patch, we leave finalization of the property to its
>>>> release() callback release_drive(), as we should.  All we do here is
>>>> schedule warty deletion.  And that we must do here, because only here we
>>>> know that warty deletion is wanted.
>>>> 
>>>> Your patch inserts a copy of release_drive() and hacks it up a bit.  Two
>>>> hunks down, release_drive() gets hacked up to conditionally avoid
>>>> repeating the job.
>>>> 
>>>> This feels rather dirty to me.
>>>
>>> The other possibility is to make blk_detach_dev do nothing if blk->dev
>>> == NULL, i.e. make it idempotent.  On one hand, who doesn't like
>>> idempotency; on the other hand, removing an assertion is also dirty.
>>>
>>> I chose the easy way here (changing as few contracts as possible).
>>
>> Why can't we keep the work in the property release() method
>> release_drive()?
>> 
>> The only reason blockdev_mark_auto_del() isn't there is that the device

s/isn't there/exists/ (oops)

>> decides whether to call it, not the property.
>
> DEVICE_DELETED is currently sent right after setting realized to false
> (see device_unparent), and you cannot send it later than that.  In
> particular release_drive would mean sending the event when properties
> are removed in instance_finalize; by that time you don't have anymore a
> QOM path to include in the event.

I see.  To delay DEVICE_DELETED, we'd have to save the QOM path, and
that would be bothersome.

Still, copying code from property to devices with that property is
undesirable.  It's not that bad in this patch, because we copy only to
the devices that do warty backend deletion, and that's just the two
places where we call blockdev_mark_auto_del() now.  However, it gets
worse if we decide to extend warty backend deletion to *all* devices:
more places, and a new need to consistently copy it to every new user of
the drive property.

When you find yourself copying code from a property callback into every
device using it, the real problem might be you're missing a callback.
In this case, one that runs at unrealize time.  The existing release()
runs at finalize time.
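
A rough sketch of what such a callback could look like, as a standalone
toy model rather than QEMU code.  The unrealize member of PropertyInfo is
hypothetical (it is the missing callback), and the split of work is an
assumption taken from the patch titles: detach at unrealize time, drop the
backend reference at finalize time.  Device, attached and backend_refs are
invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Device Device;

/* Simplified property descriptor. */
typedef struct PropertyInfo {
    void (*unrealize)(Device *dev);  /* hypothetical: runs at unrealize time */
    void (*release)(Device *dev);    /* existing hook: runs at finalize time */
} PropertyInfo;

struct Device {
    const PropertyInfo *drive_prop;
    int attached;      /* 1 while the backend is attached to the device */
    int backend_refs;  /* references the device holds on its backend */
};

/* Early work: detach, so the legacy configuration can go away. */
static void drive_unrealize(Device *dev)
{
    dev->attached = 0;
}

/* Late work: drop the reference that kept the backend alive. */
static void drive_release(Device *dev)
{
    assert(!dev->attached);
    dev->backend_refs--;
}

static const PropertyInfo drive_prop_info = {
    .unrealize = drive_unrealize,
    .release = drive_release,
};

/* Generic device code calls the hooks, so no device has to copy them. */
static void device_unrealize(Device *dev)
{
    if (dev->drive_prop && dev->drive_prop->unrealize) {
        dev->drive_prop->unrealize(dev);
    }
}

static void device_finalize(Device *dev)
{
    if (dev->drive_prop && dev->drive_prop->release) {
        dev->drive_prop->release(dev);
    }
}
```

With the hook in the property, a new user of the drive property gets the
unrealize-time behavior without any per-device code.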

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] block: remove legacy_dinfo at blk_detach_dev time
  2016-03-22 22:10           ` Paolo Bonzini
@ 2016-03-23  8:37             ` Markus Armbruster
  0 siblings, 0 replies; 32+ messages in thread
From: Markus Armbruster @ 2016-03-23  8:37 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Kevin Wolf, qemu-devel, Max Reitz

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 21/03/2016 18:39, Kevin Wolf wrote:
>> > When I wrote my review, I forgot that I expect x-blockdev-del to accept
>> > only backends created with blockdev-add.  With that, my question is
>> > indeed moot.
>> > 
>> > However, I've now tested my expectation, and it turned out to be wrong.
>> > I'm inclined to call that a bug.
>> 
>> Yes.
>
> Like this?
>
> diff --git a/blockdev.c b/blockdev.c
> index 3eb05d1..0bc7ea2 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -4023,6 +4023,11 @@ void qmp_x_blockdev_del(bool has_id, const char *id,
>              error_setg(errp, "Cannot find block backend %s", id);
>              return;
>          }
> +        if (blk_legacy_dinfo(blk)) {
> +            error_setg(errp, "Deleting block backend added with drive-add"
> +                       " is not supported");
> +            return;
> +        }
>          if (blk_get_refcnt(blk) > 1) {
>              error_setg(errp, "Block backend %s is in use", id);
>              return;

Matches hmp_drive_del().

Reviewed-by: Markus Armbruster <armbru@redhat.com>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] block: remove legacy_dinfo at blk_detach_dev time
  2016-03-22 22:07                 ` Paolo Bonzini
@ 2016-03-23  9:18                   ` Markus Armbruster
  2016-03-23  9:40                     ` Paolo Bonzini
  0 siblings, 1 reply; 32+ messages in thread
From: Markus Armbruster @ 2016-03-23  9:18 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Kevin Wolf, qemu-devel, Max Reitz

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 22/03/2016 11:25, Markus Armbruster wrote:
>> Regardless of how and when we create BlockBackend, we'll want to keep
>> the clean separation between frontend and backend internally and at the
>> user interface.
>
> This means that the BlockBackend should not own the DriveInfo.  The
> backend and frontend need not know of the object that mixes concepts
> from both of them.  Instead, the DriveInfo can instantiate itself into a
> BlockBackend and the board can (if required) use the frontend parts of
> DriveInfo to instantiate a device and connect it to the BlockBackend.

You missed or are glossing over the "letting devices fall back to legacy
configuration" part.  Let me explain it in more detail, using frontend
property "serial" as example.  You can use -drive parameter serial to
control it for many devices.

Example: -drive if=ide,index=3,file=tmp.qcow2,serial=Stockhausen

The board examines the if=ide drives and creates frontends for them.  It
could certainly recognize -drive parameter serial and configure the
frontend accordingly.  However, it doesn't.  To show you why, I need
another example.

Example: -drive if=none,id=ide3,file=tmp.qcow2,serial=Stockhausen \
-device ide-hd,bus=ide.1,unit=1,drive=ide3

The board is not involved here.  Instead, the *frontend* implements the
legacy fallback: if its property "serial" isn't set, it checks whether
its backend's DriveInfo has a serial, and if yes, it uses that.  Only
some frontends do that, namely the ones where the legacy configuration
actually needs to be preserved.  Newer ones don't.  Look for
blkconf_serial().
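
To make the fallback concrete, here is a simplified standalone model of
the logic described above.  It is a sketch, not the real blkconf_serial():
the struct layouts, the Frontend type and model_blkconf_serial() are
assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, stripped-down stand-ins for the real types. */
typedef struct DriveInfo {
    const char *serial;       /* from -drive ...,serial=... (may be NULL) */
} DriveInfo;

typedef struct BlockBackend {
    DriveInfo *legacy_dinfo;  /* NULL for backends without legacy config */
} BlockBackend;

typedef struct Frontend {
    BlockBackend *blk;
    const char *serial;       /* qdev property "serial"; NULL if not given */
} Frontend;

/*
 * Legacy fallback: if the frontend's own property is unset, take the
 * serial from the backend's DriveInfo, if there is one.  The sketch only
 * borrows the pointer; real code would need to manage ownership.
 */
static void model_blkconf_serial(Frontend *fe)
{
    if (fe->serial) {
        return;  /* an explicit property always wins */
    }
    if (fe->blk && fe->blk->legacy_dinfo && fe->blk->legacy_dinfo->serial) {
        fe->serial = fe->blk->legacy_dinfo->serial;
    }
}
```

In the second example above, the frontend's serial starts out unset, so it
would end up as "Stockhausen"; adding serial=... to -device would override
it.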

> In Kevin's idea there would be no ownership either way.  Until then, I
> think my patch actually gets us closer to the ideal.

I'm afraid it gets us closer to where we used to be six years ago :)

Qdev drive properties used to point to a DriveInfo, and the DriveInfo
pointed to BlockDriverState.  Commit f8b6cc0 cut out the DriveInfo
middleman.  This was a tiny step towards DriveInfo-less blockdev-add.

DriveInfo is legacy configuration.  Tacking it to BlockBackend is simple
and convenient.  If it ceases to be simple and convenient, we can try to
find another home.  But it really has no life of its own!  It's
ancillary information for whatever -drive creates (currently:
BlockBackend), and therefore should simply die with whatever it is
ancillary to.  It's owned by that, not the other way round.

Now, you can certainly take a reference without being the owner.  But my
review comment wasn't so much about ownership, it was more about the
oddness of taking a reference (in the sense of incrementing the
reference count) without actually *having* a reference (in the sense of
a pointer to the reference-counted object).  I find that confusing.

Confusion can be countered with a comment.  However, I still don't
understand why we need to take this new reference.  Can you explain?

>> DriveInfo has no role in cleanly separate creation of frontend and
>> backend now, and it shouldn't get one in the future.  Its purpose is to
>> support the legacy user interface that has frontend and backend matters
>> mixed up. 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] block: detach devices from DriveInfo at unrealize time
  2016-03-23  8:35           ` Markus Armbruster
@ 2016-03-23  9:35             ` Paolo Bonzini
  0 siblings, 0 replies; 32+ messages in thread
From: Paolo Bonzini @ 2016-03-23  9:35 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Kevin Wolf, qemu-devel, Max Reitz



On 23/03/2016 09:35, Markus Armbruster wrote:
>> by that time you don't have anymore a QOM path to include in the event.
>
> I see.  To delay DEVICE_DELETED, we'd have to save the QOM path, and
> that would be bothersome.

Not just that, the QOM path goes away at the time we currently raise
DEVICE_DELETED.  If we delay it to finalization, it might even have been
reused.

> Still, copying code from property to devices with that property is
> undesirable.  It's not that bad in this patch, because we copy only to
> the devices that do warty backend deletion, and that's just the two
> places where we call blockdev_mark_auto_del() now.  However, it gets
> worse if we decide to extend warty backend deletion to *all* devices:
> more places, and a new need to consistently copy it to every new user of
> the drive property.
> 
> When you find yourself copying code from a property callback into every
> device using it, the real problem might be you're missing a callback.
> In this case, one that runs at unrealize time.  The existing release()
> runs at finalize time.

That's certainly a good thing to do if we decide to extend autodeletion
to the NVMe and SD devices.

Paolo

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] block: remove legacy_dinfo at blk_detach_dev time
  2016-03-23  9:18                   ` Markus Armbruster
@ 2016-03-23  9:40                     ` Paolo Bonzini
  2016-03-23 12:13                       ` Markus Armbruster
  0 siblings, 1 reply; 32+ messages in thread
From: Paolo Bonzini @ 2016-03-23  9:40 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Kevin Wolf, qemu-devel, Max Reitz



On 23/03/2016 10:18, Markus Armbruster wrote:
>> In Kevin's idea there would be no ownership either way.  Until then, I
>> think my patch actually gets us closer to the ideal.
> 
> I'm afraid it gets us closer to where we used to be six years ago :)
> 
> Qdev drive properties used to point to a DriveInfo, and the DriveInfo
> pointed to BlockDriverState.  Commit f8b6cc0 cut out the DriveInfo
> middleman.  This was a tiny step towards DriveInfo-less blockdev-add.
> 
> DriveInfo is legacy configuration.  Tacking it to BlockBackend is simple
> and convenient.  If it ceases to be simple and convenient, we can try to
> find another home.  But it really has no life of its own!

I disagree; the life of DriveInfo is exactly the same as the -drive
QemuOpts.  But anyway, with your idea of adding an unrealize callback to
the drive properties, I can move the extra reference within the device.
 It should become cleaner.

Paolo

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] block: remove legacy_dinfo at blk_detach_dev time
  2016-03-23  9:40                     ` Paolo Bonzini
@ 2016-03-23 12:13                       ` Markus Armbruster
  0 siblings, 0 replies; 32+ messages in thread
From: Markus Armbruster @ 2016-03-23 12:13 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Kevin Wolf, qemu-devel, Max Reitz

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 23/03/2016 10:18, Markus Armbruster wrote:
>>> In Kevin's idea there would be no ownership either way.  Until then, I
>>> think my patch actually gets us closer to the ideal.
>> 
>> I'm afraid it gets us closer to where we used to be six years ago :)
>> 
>> Qdev drive properties used to point to a DriveInfo, and the DriveInfo
>> pointed to BlockDriverState.  Commit f8b6cc0 cut out the DriveInfo
>> middleman.  This was a tiny step towards DriveInfo-less blockdev-add.
>> 
>> DriveInfo is legacy configuration.  Tacking it to BlockBackend is simple
>> and convenient.  If it ceases to be simple and convenient, we can try to
>> find another home.  But it really has no life of its own!
>
> I disagree; the life of DriveInfo is exactly the same as the -drive
> QemuOpts.  But anyway, with your idea of adding an unrealize callback to
> the drive properties, I can move the extra reference within the device.
>  It should become cleaner.

I guess discussing the finer semantic points some more wouldn't be
productive now.  Instead, you do a v2, and then we'll see.  Working code
can make philosophical differences evaporate :)  Okay?

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2016-03-23 12:13 UTC | newest]

Thread overview: 32+ messages
2016-02-22 14:39 [Qemu-devel] [PATCH 0/3] Early release of -drive QemuOpts Paolo Bonzini
2016-02-22 14:39 ` [Qemu-devel] [PATCH 1/3] block: detach devices from DriveInfo at unrealize time Paolo Bonzini
2016-03-21 15:13   ` Markus Armbruster
2016-03-21 15:31     ` Paolo Bonzini
2016-03-21 17:19       ` Markus Armbruster
2016-03-21 17:30         ` Paolo Bonzini
2016-03-23  8:35           ` Markus Armbruster
2016-03-23  9:35             ` Paolo Bonzini
2016-03-22 22:15         ` Paolo Bonzini
2016-02-22 14:39 ` [Qemu-devel] [PATCH 2/3] block: keep BlockBackend alive until device finalize time Paolo Bonzini
2016-03-21 15:22   ` Markus Armbruster
2016-03-21 15:37     ` Paolo Bonzini
2016-02-22 14:39 ` [Qemu-devel] [PATCH 3/3] block: remove legacy_dinfo at blk_detach_dev time Paolo Bonzini
2016-03-21 16:15   ` Markus Armbruster
2016-03-21 16:21     ` Paolo Bonzini
2016-03-21 17:30       ` Markus Armbruster
2016-03-21 17:34         ` Paolo Bonzini
2016-03-21 18:14           ` Markus Armbruster
2016-03-22  8:19             ` Kevin Wolf
2016-03-22 10:25               ` Markus Armbruster
2016-03-22 22:07                 ` Paolo Bonzini
2016-03-23  9:18                   ` Markus Armbruster
2016-03-23  9:40                     ` Paolo Bonzini
2016-03-23 12:13                       ` Markus Armbruster
2016-03-21 17:39         ` Kevin Wolf
2016-03-21 18:02           ` Markus Armbruster
2016-03-22 22:10           ` Paolo Bonzini
2016-03-23  8:37             ` Markus Armbruster
2016-03-09 12:20 ` [Qemu-devel] [PATCH 0/3] Early release of -drive QemuOpts Paolo Bonzini
2016-03-09 12:30   ` Kevin Wolf
2016-03-09 12:53     ` Markus Armbruster
2016-03-17 17:00 ` Markus Armbruster
