All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/9] Fix scsi devices plug/unplug races w.r.t virtio-scsi iothread
@ 2020-08-31 15:01 Maxim Levitsky
  2020-08-31 15:01 ` [PATCH v4 1/9] scsi/scsi_bus: switch search direction in scsi_device_find Maxim Levitsky
                   ` (8 more replies)
  0 siblings, 9 replies; 16+ messages in thread
From: Maxim Levitsky @ 2020-08-31 15:01 UTC (permalink / raw)
  To: qemu-devel
  Cc: Fam Zheng, Daniel P. Berrangé,
	Eduardo Habkost, Michael S. Tsirkin, Maxim Levitsky,
	Paolo Bonzini

Hi!

This is a patch series that is a result of my discussion with Paulo on
how to correctly fix the root cause of the BZ #1812399.

The root cause of this bug is the fact that IO thread is running mostly
unlocked versus main thread on which device hotplug is done.

qdev_device_add first creates the device object, then places it on the bus,
and only then realizes it.

However some drivers and currently only virtio-scsi enumerate its child bus
devices on each request that is received from the guest and that can happen on the IO
thread.

Thus we have a window when new device is on the bus but not realized and can be accessed
by the virtio-scsi driver in that state.

Fix that by doing two things:

1. Add partial RCU protection to the list of a bus's child devices.
This allows the scsi IO thread to safely enumerate the child devices
while it races with the hotplug placing the device on the bus.

2. Let scsi_device_find not return devices that are on the bus but not realized

Note that in the particular bug report the issue wasn't a race but rather due
to combination of things, the .realize code in the middle managed to trigger IO on the virtqueue
which caused the virtio-scsi driver to access the half realized device. However
since this can happen as well with real IO thread, this patch series was done,
which fixes this as well.

Changes from V3:

* Rebased to latest qemu

* Added a new patch to fix related race in scsi_target_emulate_report_luns

* Moved the non-realized device check to scsi core, since there is no
  way a device driver will want to see non realized devices on a scsi bus.
  (scsi-bus still needs this and can using an internal function)

* Splitted patch that added drain_rcu and used it, to patch that only adds it, and one
  that uses it (no other changes so I kept Reviewed-by)

*Some tweaks to commits

This series was tested by adding a virtio-scsi drive with iothread,
then running fio stress job in the guest in a loop, and then adding/removing
the scsi drive on the host in the loop.
This test was failing usually on 1st iteration withouth this patch series,
and now it seems to work smoothly.

Best regards,
	Maxim Levitsky

Maxim Levitsky (9):
  scsi/scsi_bus: switch search direction in scsi_device_find
  rcu: Implement drain_call_rcu
  device_core: use drain_call_rcu in in hmp_device_del/qmp_device_add
  device-core: use RCU for list of childs of a bus
  device-core: use atomic_set on .realized property
  scsi/scsi-bus: scsi_device_find: don't return unrealized devices
  scsi/scsi_bus: Add scsi_device_get
  virtio-scsi: use scsi_device_get
  scsi/scsi_bus: fix races in REPORT LUNS

 hw/core/bus.c          |  28 +++++---
 hw/core/qdev.c         |  56 +++++++++++----
 hw/scsi/scsi-bus.c     | 151 ++++++++++++++++++++++++++---------------
 hw/scsi/virtio-scsi.c  |  27 +++++---
 include/hw/qdev-core.h |  11 +++
 include/hw/scsi/scsi.h |   1 +
 include/qemu/rcu.h     |   1 +
 qdev-monitor.c         |  22 ++++++
 util/rcu.c             |  55 +++++++++++++++
 9 files changed, 265 insertions(+), 87 deletions(-)

-- 
2.26.2




^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v4 1/9] scsi/scsi_bus: switch search direction in scsi_device_find
  2020-08-31 15:01 [PATCH v4 0/9] Fix scsi devices plug/unplug races w.r.t virtio-scsi iothread Maxim Levitsky
@ 2020-08-31 15:01 ` Maxim Levitsky
  2020-08-31 15:01 ` [PATCH v4 2/9] rcu: Implement drain_call_rcu Maxim Levitsky
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Maxim Levitsky @ 2020-08-31 15:01 UTC (permalink / raw)
  To: qemu-devel
  Cc: Fam Zheng, Daniel P. Berrangé,
	Eduardo Habkost, Michael S. Tsirkin, Maxim Levitsky,
	Stefan Hajnoczi, Paolo Bonzini

This change will allow us to convert the bus children list to RCU,
while not changing the logic of this function

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 hw/scsi/scsi-bus.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index df65cc2223..f8adfbc2a5 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -1572,7 +1572,7 @@ SCSIDevice *scsi_device_find(SCSIBus *bus, int channel, int id, int lun)
     BusChild *kid;
     SCSIDevice *target_dev = NULL;
 
-    QTAILQ_FOREACH_REVERSE(kid, &bus->qbus.children, sibling) {
+    QTAILQ_FOREACH(kid, &bus->qbus.children, sibling) {
         DeviceState *qdev = kid->child;
         SCSIDevice *dev = SCSI_DEVICE(qdev);
 
@@ -1580,7 +1580,15 @@ SCSIDevice *scsi_device_find(SCSIBus *bus, int channel, int id, int lun)
             if (dev->lun == lun) {
                 return dev;
             }
-            target_dev = dev;
+
+            /*
+             * If we don't find exact match (channel/bus/lun),
+             * we will return the first device which matches channel/bus
+             */
+
+            if (!target_dev) {
+                target_dev = dev;
+            }
         }
     }
     return target_dev;
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v4 2/9] rcu: Implement drain_call_rcu
  2020-08-31 15:01 [PATCH v4 0/9] Fix scsi devices plug/unplug races w.r.t virtio-scsi iothread Maxim Levitsky
  2020-08-31 15:01 ` [PATCH v4 1/9] scsi/scsi_bus: switch search direction in scsi_device_find Maxim Levitsky
@ 2020-08-31 15:01 ` Maxim Levitsky
  2020-08-31 15:01 ` [PATCH v4 3/9] device_core: use drain_call_rcu in in hmp_device_del/qmp_device_add Maxim Levitsky
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Maxim Levitsky @ 2020-08-31 15:01 UTC (permalink / raw)
  To: qemu-devel
  Cc: Fam Zheng, Daniel P. Berrangé,
	Eduardo Habkost, Michael S. Tsirkin, Stefan Hajnoczi,
	Maxim Levitsky, Stefan Hajnoczi, Paolo Bonzini

This will allow is to preserve the semantics of hmp_device_del,
that the device is deleted immediatly which was changed by previos
patch that delayed this to RCU callback

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Suggested-by: Stefan Hajnoczi <stefanha@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/qemu/rcu.h |  1 +
 util/rcu.c         | 55 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 56 insertions(+)

diff --git a/include/qemu/rcu.h b/include/qemu/rcu.h
index 570aa603eb..0e375ebe13 100644
--- a/include/qemu/rcu.h
+++ b/include/qemu/rcu.h
@@ -133,6 +133,7 @@ struct rcu_head {
 };
 
 extern void call_rcu1(struct rcu_head *head, RCUCBFunc *func);
+extern void drain_call_rcu(void);
 
 /* The operands of the minus operator must have the same type,
  * which must be the one that we specify in the cast.
diff --git a/util/rcu.c b/util/rcu.c
index 60a37f72c3..c4fefa9333 100644
--- a/util/rcu.c
+++ b/util/rcu.c
@@ -293,6 +293,61 @@ void call_rcu1(struct rcu_head *node, void (*func)(struct rcu_head *node))
     qemu_event_set(&rcu_call_ready_event);
 }
 
+
+struct rcu_drain {
+    struct rcu_head rcu;
+    QemuEvent drain_complete_event;
+};
+
+static void drain_rcu_callback(struct rcu_head *node)
+{
+    struct rcu_drain *event = (struct rcu_drain *)node;
+    qemu_event_set(&event->drain_complete_event);
+}
+
+/*
+ * This function ensures that all pending RCU callbacks
+ * on the current thread are done executing
+
+ * drops big qemu lock during the wait to allow RCU thread
+ * to process the callbacks
+ *
+ */
+
+void drain_call_rcu(void)
+{
+    struct rcu_drain rcu_drain;
+    bool locked = qemu_mutex_iothread_locked();
+
+    memset(&rcu_drain, 0, sizeof(struct rcu_drain));
+    qemu_event_init(&rcu_drain.drain_complete_event, false);
+
+    if (locked) {
+        qemu_mutex_unlock_iothread();
+    }
+
+
+    /*
+     * RCU callbacks are invoked in the same order as in which they
+     * are registered, thus we can be sure that when 'drain_rcu_callback'
+     * is called, all RCU callbacks that were registered on this thread
+     * prior to calling this function are completed.
+     *
+     * Note that since we have only one global queue of the RCU callbacks,
+     * we also end up waiting for most of RCU callbacks that were registered
+     * on the other threads, but this is a side effect that shoudn't be
+     * assumed.
+     */
+
+    call_rcu1(&rcu_drain.rcu, drain_rcu_callback);
+    qemu_event_wait(&rcu_drain.drain_complete_event);
+
+    if (locked) {
+        qemu_mutex_lock_iothread();
+    }
+
+}
+
 void rcu_register_thread(void)
 {
     assert(rcu_reader.ctr == 0);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v4 3/9] device_core: use drain_call_rcu in in hmp_device_del/qmp_device_add
  2020-08-31 15:01 [PATCH v4 0/9] Fix scsi devices plug/unplug races w.r.t virtio-scsi iothread Maxim Levitsky
  2020-08-31 15:01 ` [PATCH v4 1/9] scsi/scsi_bus: switch search direction in scsi_device_find Maxim Levitsky
  2020-08-31 15:01 ` [PATCH v4 2/9] rcu: Implement drain_call_rcu Maxim Levitsky
@ 2020-08-31 15:01 ` Maxim Levitsky
  2020-08-31 15:01 ` [PATCH v4 4/9] device-core: use RCU for list of childs of a bus Maxim Levitsky
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Maxim Levitsky @ 2020-08-31 15:01 UTC (permalink / raw)
  To: qemu-devel
  Cc: Fam Zheng, Daniel P. Berrangé,
	Eduardo Habkost, Michael S. Tsirkin, Stefan Hajnoczi,
	Maxim Levitsky, Stefan Hajnoczi, Paolo Bonzini

Soon, a device removal might only happen on RCU callback execution,
but to avoid chanding HMP semantics, just drain all pending RCU
callbacks

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Suggested-by: Stefan Hajnoczi <stefanha@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 qdev-monitor.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/qdev-monitor.c b/qdev-monitor.c
index e9b7228480..c66ba13780 100644
--- a/qdev-monitor.c
+++ b/qdev-monitor.c
@@ -803,6 +803,18 @@ void qmp_device_add(QDict *qdict, QObject **ret_data, Error **errp)
         return;
     }
     dev = qdev_device_add(opts, errp);
+
+    /*
+     * Drain all pending RCU callbacks. This is done because
+     * some bus related operations can delay a device removal
+     * (in this case this can happen if device is added and then
+     * removed due to a configuration error)
+     * to a RCU callback, but user might expect that this interface
+     * will finish its job completely once qmp command returns result
+     * to the user
+     */
+    drain_call_rcu();
+
     if (!dev) {
         qemu_opts_del(opts);
         return;
@@ -894,6 +906,16 @@ void qmp_device_del(const char *id, Error **errp)
         }
 
         qdev_unplug(dev, errp);
+
+        /*
+         * Drain all pending RCU callbacks. This is done because
+         * some bus related operations can delay a device removal
+         * to a RCU callback, but user might expect that this interface
+         * will finish its job completely once qmp command returns result
+         * to the user
+         */
+
+        drain_call_rcu();
     }
 }
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v4 4/9] device-core: use RCU for list of childs of a bus
  2020-08-31 15:01 [PATCH v4 0/9] Fix scsi devices plug/unplug races w.r.t virtio-scsi iothread Maxim Levitsky
                   ` (2 preceding siblings ...)
  2020-08-31 15:01 ` [PATCH v4 3/9] device_core: use drain_call_rcu in in hmp_device_del/qmp_device_add Maxim Levitsky
@ 2020-08-31 15:01 ` Maxim Levitsky
  2020-08-31 15:01 ` [PATCH v4 5/9] device-core: use atomic_set on .realized property Maxim Levitsky
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Maxim Levitsky @ 2020-08-31 15:01 UTC (permalink / raw)
  To: qemu-devel
  Cc: Fam Zheng, Daniel P. Berrangé,
	Eduardo Habkost, Michael S. Tsirkin, Maxim Levitsky,
	Stefan Hajnoczi, Paolo Bonzini

This fixes the race between device emulation code that tries to find
a child device to dispatch the request to (e.g a scsi disk),
and hotplug of a new device to that bus.

Note that this doesn't convert all the readers of the list
but only these that might go over that list without BQL held.

This is a very small first step to make this code thread safe.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 hw/core/bus.c          | 28 ++++++++++++++++++----------
 hw/core/qdev.c         | 37 +++++++++++++++++++++++--------------
 hw/scsi/scsi-bus.c     | 17 ++++++++++++++---
 hw/scsi/virtio-scsi.c  |  6 +++++-
 include/hw/qdev-core.h |  9 +++++++++
 5 files changed, 69 insertions(+), 28 deletions(-)

diff --git a/hw/core/bus.c b/hw/core/bus.c
index 6b987b6946..385eb3ad5a 100644
--- a/hw/core/bus.c
+++ b/hw/core/bus.c
@@ -49,12 +49,14 @@ int qbus_walk_children(BusState *bus,
         }
     }
 
-    QTAILQ_FOREACH(kid, &bus->children, sibling) {
-        err = qdev_walk_children(kid->child,
-                                 pre_devfn, pre_busfn,
-                                 post_devfn, post_busfn, opaque);
-        if (err < 0) {
-            return err;
+    WITH_RCU_READ_LOCK_GUARD() {
+        QTAILQ_FOREACH_RCU(kid, &bus->children, sibling) {
+            err = qdev_walk_children(kid->child,
+                                     pre_devfn, pre_busfn,
+                                     post_devfn, post_busfn, opaque);
+            if (err < 0) {
+                return err;
+            }
         }
     }
 
@@ -90,9 +92,13 @@ static void bus_reset_child_foreach(Object *obj, ResettableChildCallback cb,
     BusState *bus = BUS(obj);
     BusChild *kid;
 
-    QTAILQ_FOREACH(kid, &bus->children, sibling) {
+    rcu_read_lock();
+
+    QTAILQ_FOREACH_RCU(kid, &bus->children, sibling) {
         cb(OBJECT(kid->child), opaque, type);
     }
+
+    rcu_read_unlock();
 }
 
 static void qbus_init(BusState *bus, DeviceState *parent, const char *name)
@@ -194,9 +200,11 @@ static void bus_set_realized(Object *obj, bool value, Error **errp)
 
         /* TODO: recursive realization */
     } else if (!value && bus->realized) {
-        QTAILQ_FOREACH(kid, &bus->children, sibling) {
-            DeviceState *dev = kid->child;
-            qdev_unrealize(dev);
+        WITH_RCU_READ_LOCK_GUARD() {
+            QTAILQ_FOREACH_RCU(kid, &bus->children, sibling) {
+                DeviceState *dev = kid->child;
+                qdev_unrealize(dev);
+            }
         }
         if (bc->unrealize) {
             bc->unrealize(bus);
diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index 96772a15bd..28e5fff5ed 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -51,6 +51,12 @@ const VMStateDescription *qdev_get_vmsd(DeviceState *dev)
     return dc->vmsd;
 }
 
+static void bus_free_bus_child(BusChild *kid)
+{
+    object_unref(OBJECT(kid->child));
+    g_free(kid);
+}
+
 static void bus_remove_child(BusState *bus, DeviceState *child)
 {
     BusChild *kid;
@@ -60,15 +66,16 @@ static void bus_remove_child(BusState *bus, DeviceState *child)
             char name[32];
 
             snprintf(name, sizeof(name), "child[%d]", kid->index);
-            QTAILQ_REMOVE(&bus->children, kid, sibling);
+            QTAILQ_REMOVE_RCU(&bus->children, kid, sibling);
 
             bus->num_children--;
 
             /* This gives back ownership of kid->child back to us.  */
             object_property_del(OBJECT(bus), name);
-            object_unref(OBJECT(kid->child));
-            g_free(kid);
-            return;
+
+            /* free the bus kid, when it is safe to do so*/
+            call_rcu(kid, bus_free_bus_child, rcu);
+            break;
         }
     }
 }
@@ -83,7 +90,7 @@ static void bus_add_child(BusState *bus, DeviceState *child)
     kid->child = child;
     object_ref(OBJECT(kid->child));
 
-    QTAILQ_INSERT_HEAD(&bus->children, kid, sibling);
+    QTAILQ_INSERT_HEAD_RCU(&bus->children, kid, sibling);
 
     /* This transfers ownership of kid->child to the property.  */
     snprintf(name, sizeof(name), "child[%d]", kid->index);
@@ -659,17 +666,19 @@ DeviceState *qdev_find_recursive(BusState *bus, const char *id)
     DeviceState *ret;
     BusState *child;
 
-    QTAILQ_FOREACH(kid, &bus->children, sibling) {
-        DeviceState *dev = kid->child;
+    WITH_RCU_READ_LOCK_GUARD() {
+        QTAILQ_FOREACH_RCU(kid, &bus->children, sibling) {
+            DeviceState *dev = kid->child;
 
-        if (dev->id && strcmp(dev->id, id) == 0) {
-            return dev;
-        }
+            if (dev->id && strcmp(dev->id, id) == 0) {
+                return dev;
+            }
 
-        QLIST_FOREACH(child, &dev->child_bus, sibling) {
-            ret = qdev_find_recursive(child, id);
-            if (ret) {
-                return ret;
+            QLIST_FOREACH(child, &dev->child_bus, sibling) {
+                ret = qdev_find_recursive(child, id);
+                if (ret) {
+                    return ret;
+                }
             }
         }
     }
diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index f8adfbc2a5..92d412b65c 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -400,7 +400,10 @@ static bool scsi_target_emulate_report_luns(SCSITargetReq *r)
     id = r->req.dev->id;
     found_lun0 = false;
     n = 0;
-    QTAILQ_FOREACH(kid, &r->req.bus->qbus.children, sibling) {
+
+    rcu_read_lock();
+
+    QTAILQ_FOREACH_RCU(kid, &r->req.bus->qbus.children, sibling) {
         DeviceState *qdev = kid->child;
         SCSIDevice *dev = SCSI_DEVICE(qdev);
 
@@ -421,7 +424,7 @@ static bool scsi_target_emulate_report_luns(SCSITargetReq *r)
     memset(r->buf, 0, len);
     stl_be_p(&r->buf[0], n);
     i = found_lun0 ? 8 : 16;
-    QTAILQ_FOREACH(kid, &r->req.bus->qbus.children, sibling) {
+    QTAILQ_FOREACH_RCU(kid, &r->req.bus->qbus.children, sibling) {
         DeviceState *qdev = kid->child;
         SCSIDevice *dev = SCSI_DEVICE(qdev);
 
@@ -430,6 +433,9 @@ static bool scsi_target_emulate_report_luns(SCSITargetReq *r)
             i += 8;
         }
     }
+
+    rcu_read_unlock();
+
     assert(i == n + 8);
     r->len = len;
     return true;
@@ -1572,12 +1578,15 @@ SCSIDevice *scsi_device_find(SCSIBus *bus, int channel, int id, int lun)
     BusChild *kid;
     SCSIDevice *target_dev = NULL;
 
-    QTAILQ_FOREACH(kid, &bus->qbus.children, sibling) {
+    rcu_read_lock();
+
+    QTAILQ_FOREACH_RCU(kid, &bus->qbus.children, sibling) {
         DeviceState *qdev = kid->child;
         SCSIDevice *dev = SCSI_DEVICE(qdev);
 
         if (dev->channel == channel && dev->id == id) {
             if (dev->lun == lun) {
+                rcu_read_unlock();
                 return dev;
             }
 
@@ -1591,6 +1600,8 @@ SCSIDevice *scsi_device_find(SCSIBus *bus, int channel, int id, int lun)
             }
         }
     }
+
+    rcu_read_unlock();
     return target_dev;
 }
 
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index 3a71ea7097..971afbb217 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -367,12 +367,16 @@ static int virtio_scsi_do_tmf(VirtIOSCSI *s, VirtIOSCSIReq *req)
     case VIRTIO_SCSI_T_TMF_I_T_NEXUS_RESET:
         target = req->req.tmf.lun[1];
         s->resetting++;
-        QTAILQ_FOREACH(kid, &s->bus.qbus.children, sibling) {
+
+        rcu_read_lock();
+        QTAILQ_FOREACH_RCU(kid, &s->bus.qbus.children, sibling) {
              d = SCSI_DEVICE(kid->child);
              if (d->channel == 0 && d->id == target) {
                 qdev_reset_all(&d->qdev);
              }
         }
+        rcu_read_unlock();
+
         s->resetting--;
         break;
 
diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index ea3f73a282..7c7728ff86 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -3,6 +3,8 @@
 
 #include "qemu/queue.h"
 #include "qemu/bitmap.h"
+#include "qemu/rcu.h"
+#include "qemu/rcu_queue.h"
 #include "qom/object.h"
 #include "hw/hotplug.h"
 #include "hw/resettable.h"
@@ -230,6 +232,7 @@ struct BusClass {
 };
 
 typedef struct BusChild {
+    struct rcu_head rcu;
     DeviceState *child;
     int index;
     QTAILQ_ENTRY(BusChild) sibling;
@@ -250,6 +253,12 @@ struct BusState {
     int max_index;
     bool realized;
     int num_children;
+
+    /*
+     * children is a RCU QTAILQ, thus readers must use RCU to access it,
+     * and writers must hold the big qemu lock
+     */
+
     QTAILQ_HEAD(, BusChild) children;
     QLIST_ENTRY(BusState) sibling;
     ResettableState reset;
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v4 5/9] device-core: use atomic_set on .realized property
  2020-08-31 15:01 [PATCH v4 0/9] Fix scsi devices plug/unplug races w.r.t virtio-scsi iothread Maxim Levitsky
                   ` (3 preceding siblings ...)
  2020-08-31 15:01 ` [PATCH v4 4/9] device-core: use RCU for list of childs of a bus Maxim Levitsky
@ 2020-08-31 15:01 ` Maxim Levitsky
  2020-08-31 15:01 ` [PATCH v4 6/9] scsi/scsi-bus: scsi_device_find: don't return unrealized devices Maxim Levitsky
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Maxim Levitsky @ 2020-08-31 15:01 UTC (permalink / raw)
  To: qemu-devel
  Cc: Fam Zheng, Daniel P. Berrangé,
	Eduardo Habkost, Michael S. Tsirkin, Maxim Levitsky,
	Stefan Hajnoczi, Paolo Bonzini

Some code might race with placement of new devices on a bus.
We currently first place a (unrealized) device on the bus
and then realize it.

As a workaround, users that scan the child device list, can
check the realized property to see if it is safe to access such a device.
Use an atomic write here too to aid with this.

A separate discussion is what to do with devices that are unrealized:
It looks like for this case we only call the hotplug handler's unplug
callback and its up to it to unrealize the device.
An atomic operation doesn't cause harm for this code path though.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 hw/core/qdev.c         | 19 ++++++++++++++++++-
 include/hw/qdev-core.h |  2 ++
 2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index 28e5fff5ed..97165a556d 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -933,7 +933,25 @@ static void device_set_realized(Object *obj, bool value, Error **errp)
             }
        }
 
+       atomic_store_release(&dev->realized, value);
+
     } else if (!value && dev->realized) {
+
+        /*
+         * Change the value so that any concurrent users are aware
+         * that the device is going to be unrealized
+         *
+         * TODO: change .realized property to enum that states
+         * each phase of the device realization/unrealization
+         */
+
+        atomic_set(&dev->realized, value);
+        /*
+         * execute full memory barrier to ensure that concurrent users
+         * see this update prior to any other changes to the device
+         */
+        smp_mb();
+
         QLIST_FOREACH(bus, &dev->child_bus, sibling) {
             qbus_unrealize(bus);
         }
@@ -948,7 +966,6 @@ static void device_set_realized(Object *obj, bool value, Error **errp)
     }
 
     assert(local_err == NULL);
-    dev->realized = value;
     return;
 
 child_realize_fail:
diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index 7c7728ff86..08e14e122c 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -165,6 +165,8 @@ struct NamedClockList {
 /**
  * DeviceState:
  * @realized: Indicates whether the device has been fully constructed.
+ *            When accessed outsize big qemu lock, must be accessed with
+ *            atomic_load_acquire()
  * @reset: ResettableState for the device; handled by Resettable interface.
  *
  * This structure should not be accessed directly.  We declare it here
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v4 6/9] scsi/scsi-bus: scsi_device_find: don't return unrealized devices
  2020-08-31 15:01 [PATCH v4 0/9] Fix scsi devices plug/unplug races w.r.t virtio-scsi iothread Maxim Levitsky
                   ` (4 preceding siblings ...)
  2020-08-31 15:01 ` [PATCH v4 5/9] device-core: use atomic_set on .realized property Maxim Levitsky
@ 2020-08-31 15:01 ` Maxim Levitsky
  2020-09-08 15:00   ` Stefan Hajnoczi
  2020-08-31 15:01 ` [PATCH v4 7/9] scsi/scsi_bus: Add scsi_device_get Maxim Levitsky
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 16+ messages in thread
From: Maxim Levitsky @ 2020-08-31 15:01 UTC (permalink / raw)
  To: qemu-devel
  Cc: Fam Zheng, Daniel P. Berrangé,
	Eduardo Habkost, Michael S. Tsirkin, Maxim Levitsky,
	Paolo Bonzini

The device core first places a device on the bus and then realizes it.
Make scsi_device_find avoid returing such devices to avoid
races in drivers that use an iothread (currently virtio-scsi)

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1812399

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/scsi/scsi-bus.c | 88 ++++++++++++++++++++++++++++------------------
 1 file changed, 53 insertions(+), 35 deletions(-)

diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index 92d412b65c..7ceae2c92b 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -51,6 +51,56 @@ static const TypeInfo scsi_bus_info = {
 };
 static int next_scsi_bus;
 
+static SCSIDevice *_scsi_device_find(SCSIBus *bus, int channel, int id, int lun,
+                                     bool include_unrealized)
+{
+    BusChild *kid;
+    SCSIDevice *retval = NULL;
+
+    rcu_read_lock();
+
+    QTAILQ_FOREACH_RCU(kid, &bus->qbus.children, sibling) {
+        DeviceState *qdev = kid->child;
+        SCSIDevice *dev = SCSI_DEVICE(qdev);
+
+        if (dev->channel == channel && dev->id == id) {
+            if (dev->lun == lun) {
+                retval = dev;
+                goto out;
+            }
+
+            /*
+             * If we don't find exact match (channel/bus/lun),
+             * we will return the first device which matches channel/bus
+             */
+
+            if (!retval) {
+                retval = dev;
+            }
+        }
+    }
+out:
+    /*
+     * This function might run on the IO thread and we might race against
+     * main thread hot-plugging the device.
+     * We assume that as soon as .realized is set to true we can let
+     * the user access the device.
+     */
+
+    if (retval && !include_unrealized &&
+        !atomic_load_acquire(&retval->qdev.realized)) {
+        retval = NULL;
+    }
+
+    rcu_read_unlock();
+    return retval;
+}
+
+SCSIDevice *scsi_device_find(SCSIBus *bus, int channel, int id, int lun)
+{
+    return _scsi_device_find(bus, channel, id, lun, false);
+}
+
 static void scsi_device_realize(SCSIDevice *s, Error **errp)
 {
     SCSIDeviceClass *sc = SCSI_DEVICE_GET_CLASS(s);
@@ -186,7 +236,7 @@ static void scsi_qdev_realize(DeviceState *qdev, Error **errp)
             dev->lun = 0;
         }
         do {
-            d = scsi_device_find(bus, dev->channel, ++id, dev->lun);
+            d = _scsi_device_find(bus, dev->channel, ++id, dev->lun, true);
         } while (d && d->lun == dev->lun && id < bus->info->max_target);
         if (d && d->lun == dev->lun) {
             error_setg(errp, "no free target");
@@ -196,7 +246,7 @@ static void scsi_qdev_realize(DeviceState *qdev, Error **errp)
     } else if (dev->lun == -1) {
         int lun = -1;
         do {
-            d = scsi_device_find(bus, dev->channel, dev->id, ++lun);
+            d = _scsi_device_find(bus, dev->channel, dev->id, ++lun, true);
         } while (d && d->lun == lun && lun < bus->info->max_lun);
         if (d && d->lun == lun) {
             error_setg(errp, "no free lun");
@@ -204,7 +254,7 @@ static void scsi_qdev_realize(DeviceState *qdev, Error **errp)
         }
         dev->lun = lun;
     } else {
-        d = scsi_device_find(bus, dev->channel, dev->id, dev->lun);
+        d = _scsi_device_find(bus, dev->channel, dev->id, dev->lun, true);
         assert(d);
         if (d->lun == dev->lun && dev != d) {
             error_setg(errp, "lun already used by '%s'", d->qdev.id);
@@ -1573,38 +1623,6 @@ static char *scsibus_get_fw_dev_path(DeviceState *dev)
                            qdev_fw_name(dev), d->id, d->lun);
 }
 
-SCSIDevice *scsi_device_find(SCSIBus *bus, int channel, int id, int lun)
-{
-    BusChild *kid;
-    SCSIDevice *target_dev = NULL;
-
-    rcu_read_lock();
-
-    QTAILQ_FOREACH_RCU(kid, &bus->qbus.children, sibling) {
-        DeviceState *qdev = kid->child;
-        SCSIDevice *dev = SCSI_DEVICE(qdev);
-
-        if (dev->channel == channel && dev->id == id) {
-            if (dev->lun == lun) {
-                rcu_read_unlock();
-                return dev;
-            }
-
-            /*
-             * If we don't find exact match (channel/bus/lun),
-             * we will return the first device which matches channel/bus
-             */
-
-            if (!target_dev) {
-                target_dev = dev;
-            }
-        }
-    }
-
-    rcu_read_unlock();
-    return target_dev;
-}
-
 /* SCSI request list.  For simplicity, pv points to the whole device */
 
 static int put_scsi_requests(QEMUFile *f, void *pv, size_t size,
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v4 7/9] scsi/scsi_bus: Add scsi_device_get
  2020-08-31 15:01 [PATCH v4 0/9] Fix scsi devices plug/unplug races w.r.t virtio-scsi iothread Maxim Levitsky
                   ` (5 preceding siblings ...)
  2020-08-31 15:01 ` [PATCH v4 6/9] scsi/scsi-bus: scsi_device_find: don't return unrealized devices Maxim Levitsky
@ 2020-08-31 15:01 ` Maxim Levitsky
  2020-08-31 15:01 ` [PATCH v4 8/9] virtio-scsi: use scsi_device_get Maxim Levitsky
  2020-08-31 15:01 ` [PATCH v4 9/9] scsi/scsi_bus: fix races in REPORT LUNS Maxim Levitsky
  8 siblings, 0 replies; 16+ messages in thread
From: Maxim Levitsky @ 2020-08-31 15:01 UTC (permalink / raw)
  To: qemu-devel
  Cc: Fam Zheng, Daniel P. Berrangé,
	Eduardo Habkost, Michael S. Tsirkin, Stefan Hajnoczi,
	Maxim Levitsky, Paolo Bonzini

Add scsi_device_get which finds the scsi device
and takes a reference to it.

Suggested-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/scsi/scsi-bus.c     | 22 +++++++++++++++++-----
 include/hw/scsi/scsi.h |  1 +
 2 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index 7ceae2c92b..feab20b76d 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -52,7 +52,7 @@ static const TypeInfo scsi_bus_info = {
 static int next_scsi_bus;
 
 static SCSIDevice *_scsi_device_find(SCSIBus *bus, int channel, int id, int lun,
-                                     bool include_unrealized)
+                                     bool include_unrealized, bool take_ref)
 {
     BusChild *kid;
     SCSIDevice *retval = NULL;
@@ -92,13 +92,22 @@ out:
         retval = NULL;
     }
 
+    if (take_ref) {
+        object_ref(OBJECT(retval));
+    }
+
     rcu_read_unlock();
     return retval;
 }
 
 SCSIDevice *scsi_device_find(SCSIBus *bus, int channel, int id, int lun)
 {
-    return _scsi_device_find(bus, channel, id, lun, false);
+    return _scsi_device_find(bus, channel, id, lun, false, false);
+}
+
+SCSIDevice *scsi_device_get(SCSIBus *bus, int channel, int id, int lun)
+{
+    return _scsi_device_find(bus, channel, id, lun, false, true);
 }
 
 static void scsi_device_realize(SCSIDevice *s, Error **errp)
@@ -236,7 +245,8 @@ static void scsi_qdev_realize(DeviceState *qdev, Error **errp)
             dev->lun = 0;
         }
         do {
-            d = _scsi_device_find(bus, dev->channel, ++id, dev->lun, true);
+            d = _scsi_device_find(bus, dev->channel, ++id, dev->lun,
+                                  true, false);
         } while (d && d->lun == dev->lun && id < bus->info->max_target);
         if (d && d->lun == dev->lun) {
             error_setg(errp, "no free target");
@@ -246,7 +256,8 @@ static void scsi_qdev_realize(DeviceState *qdev, Error **errp)
     } else if (dev->lun == -1) {
         int lun = -1;
         do {
-            d = _scsi_device_find(bus, dev->channel, dev->id, ++lun, true);
+            d = _scsi_device_find(bus, dev->channel, dev->id, ++lun,
+                                 true, false);
         } while (d && d->lun == lun && lun < bus->info->max_lun);
         if (d && d->lun == lun) {
             error_setg(errp, "no free lun");
@@ -254,7 +265,8 @@ static void scsi_qdev_realize(DeviceState *qdev, Error **errp)
         }
         dev->lun = lun;
     } else {
-        d = _scsi_device_find(bus, dev->channel, dev->id, dev->lun, true);
+        d = _scsi_device_find(bus, dev->channel, dev->id, dev->lun,
+                              true, false);
         assert(d);
         if (d->lun == dev->lun && dev != d) {
             error_setg(errp, "lun already used by '%s'", d->qdev.id);
diff --git a/include/hw/scsi/scsi.h b/include/hw/scsi/scsi.h
index 2fc23e44ba..e939a81789 100644
--- a/include/hw/scsi/scsi.h
+++ b/include/hw/scsi/scsi.h
@@ -194,6 +194,7 @@ int scsi_device_get_sense(SCSIDevice *dev, uint8_t *buf, int len, bool fixed);
 int scsi_SG_IO_FROM_DEV(BlockBackend *blk, uint8_t *cmd, uint8_t cmd_size,
                         uint8_t *buf, uint8_t buf_size);
 SCSIDevice *scsi_device_find(SCSIBus *bus, int channel, int target, int lun);
+SCSIDevice *scsi_device_get(SCSIBus *bus, int channel, int target, int lun);
 
 /* scsi-generic.c. */
 extern const SCSIReqOps scsi_generic_req_ops;
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v4 8/9] virtio-scsi: use scsi_device_get
  2020-08-31 15:01 [PATCH v4 0/9] Fix scsi devices plug/unplug races w.r.t virtio-scsi iothread Maxim Levitsky
                   ` (6 preceding siblings ...)
  2020-08-31 15:01 ` [PATCH v4 7/9] scsi/scsi_bus: Add scsi_device_get Maxim Levitsky
@ 2020-08-31 15:01 ` Maxim Levitsky
  2020-08-31 15:01 ` [PATCH v4 9/9] scsi/scsi_bus: fix races in REPORT LUNS Maxim Levitsky
  8 siblings, 0 replies; 16+ messages in thread
From: Maxim Levitsky @ 2020-08-31 15:01 UTC (permalink / raw)
  To: qemu-devel
  Cc: Fam Zheng, Daniel P. Berrangé,
	Eduardo Habkost, Michael S. Tsirkin, Stefan Hajnoczi,
	Maxim Levitsky, Stefan Hajnoczi, Paolo Bonzini

This will help us to avoid the scsi device disappearing
after we took a reference to it.

It doesn't by itself forbid case when we try to access
an unrealized device

Suggested-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 hw/scsi/virtio-scsi.c | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index 971afbb217..3db9a8aae9 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -33,7 +33,7 @@ static inline int virtio_scsi_get_lun(uint8_t *lun)
     return ((lun[2] << 8) | lun[3]) & 0x3FFF;
 }
 
-static inline SCSIDevice *virtio_scsi_device_find(VirtIOSCSI *s, uint8_t *lun)
+static inline SCSIDevice *virtio_scsi_device_get(VirtIOSCSI *s, uint8_t *lun)
 {
     if (lun[0] != 1) {
         return NULL;
@@ -41,7 +41,7 @@ static inline SCSIDevice *virtio_scsi_device_find(VirtIOSCSI *s, uint8_t *lun)
     if (lun[2] != 0 && !(lun[2] >= 0x40 && lun[2] < 0x80)) {
         return NULL;
     }
-    return scsi_device_find(&s->bus, 0, lun[1], virtio_scsi_get_lun(lun));
+    return scsi_device_get(&s->bus, 0, lun[1], virtio_scsi_get_lun(lun));
 }
 
 void virtio_scsi_init_req(VirtIOSCSI *s, VirtQueue *vq, VirtIOSCSIReq *req)
@@ -256,7 +256,7 @@ static inline void virtio_scsi_ctx_check(VirtIOSCSI *s, SCSIDevice *d)
  *  case of async cancellation. */
 static int virtio_scsi_do_tmf(VirtIOSCSI *s, VirtIOSCSIReq *req)
 {
-    SCSIDevice *d = virtio_scsi_device_find(s, req->req.tmf.lun);
+    SCSIDevice *d = virtio_scsi_device_get(s, req->req.tmf.lun);
     SCSIRequest *r, *next;
     BusChild *kid;
     int target;
@@ -370,10 +370,10 @@ static int virtio_scsi_do_tmf(VirtIOSCSI *s, VirtIOSCSIReq *req)
 
         rcu_read_lock();
         QTAILQ_FOREACH_RCU(kid, &s->bus.qbus.children, sibling) {
-             d = SCSI_DEVICE(kid->child);
-             if (d->channel == 0 && d->id == target) {
-                qdev_reset_all(&d->qdev);
-             }
+            SCSIDevice *d1 = SCSI_DEVICE(kid->child);
+            if (d1->channel == 0 && d1->id == target) {
+                qdev_reset_all(&d1->qdev);
+            }
         }
         rcu_read_unlock();
 
@@ -386,14 +386,17 @@ static int virtio_scsi_do_tmf(VirtIOSCSI *s, VirtIOSCSIReq *req)
         break;
     }
 
+    object_unref(OBJECT(d));
     return ret;
 
 incorrect_lun:
     req->resp.tmf.response = VIRTIO_SCSI_S_INCORRECT_LUN;
+    object_unref(OBJECT(d));
     return ret;
 
 fail:
     req->resp.tmf.response = VIRTIO_SCSI_S_BAD_TARGET;
+    object_unref(OBJECT(d));
     return ret;
 }
 
@@ -564,7 +567,7 @@ static int virtio_scsi_handle_cmd_req_prepare(VirtIOSCSI *s, VirtIOSCSIReq *req)
         }
     }
 
-    d = virtio_scsi_device_find(s, req->req.cmd.lun);
+    d = virtio_scsi_device_get(s, req->req.cmd.lun);
     if (!d) {
         req->resp.cmd.response = VIRTIO_SCSI_S_BAD_TARGET;
         virtio_scsi_complete_cmd_req(req);
@@ -580,10 +583,12 @@ static int virtio_scsi_handle_cmd_req_prepare(VirtIOSCSI *s, VirtIOSCSIReq *req)
             req->sreq->cmd.xfer > req->qsgl.size)) {
         req->resp.cmd.response = VIRTIO_SCSI_S_OVERRUN;
         virtio_scsi_complete_cmd_req(req);
+        object_unref(OBJECT(d));
         return -ENOBUFS;
     }
     scsi_req_ref(req->sreq);
     blk_io_plug(d->conf.blk);
+    object_unref(OBJECT(d));
     return 0;
 }
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v4 9/9] scsi/scsi_bus: fix races in REPORT LUNS
  2020-08-31 15:01 [PATCH v4 0/9] Fix scsi devices plug/unplug races w.r.t virtio-scsi iothread Maxim Levitsky
                   ` (7 preceding siblings ...)
  2020-08-31 15:01 ` [PATCH v4 8/9] virtio-scsi: use scsi_device_get Maxim Levitsky
@ 2020-08-31 15:01 ` Maxim Levitsky
  2020-09-01  9:15   ` Maxim Levitsky
  2020-09-08 15:27   ` Stefan Hajnoczi
  8 siblings, 2 replies; 16+ messages in thread
From: Maxim Levitsky @ 2020-08-31 15:01 UTC (permalink / raw)
  To: qemu-devel
  Cc: Fam Zheng, Daniel P. Berrangé,
	Eduardo Habkost, Michael S. Tsirkin, Maxim Levitsky,
	Paolo Bonzini

Currently scsi_target_emulate_report_luns iterates
over child devices list twice, and there is guarantee, that
it will not be changed meanwhile.

This reason for two loops is that it needs to know how much memory
to allocate.

Avoid this by iterating once, and allocating the memory for the output
dynamically with reserving enought memory so that in practice it won't
be reallocated often.

Bugzilla for reference: https://bugzilla.redhat.com/show_bug.cgi?id=1866707

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 hw/scsi/scsi-bus.c | 62 ++++++++++++++++++++++------------------------
 1 file changed, 29 insertions(+), 33 deletions(-)

diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index feab20b76d..150dee2e6a 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -438,19 +438,25 @@ struct SCSITargetReq {
 static void store_lun(uint8_t *outbuf, int lun)
 {
     if (lun < 256) {
+        /* Simple logical unit addressing method*/
+        outbuf[0] = 0;
         outbuf[1] = lun;
-        return;
+    } else {
+        /* Flat space addressing method */
+        outbuf[0] = 0x40 | (lun >> 8);
+        outbuf[1] = (lun & 255);
     }
-    outbuf[1] = (lun & 255);
-    outbuf[0] = (lun >> 8) | 0x40;
 }
 
 static bool scsi_target_emulate_report_luns(SCSITargetReq *r)
 {
     BusChild *kid;
-    int i, len, n;
     int channel, id;
-    bool found_lun0;
+    uint8_t tmp[8] = {0};
+    int len = 0;
+
+    /* reserve space for 63 LUNs*/
+    GByteArray *buf = g_byte_array_sized_new(512);
 
     if (r->req.cmd.xfer < 16) {
         return false;
@@ -460,46 +466,36 @@ static bool scsi_target_emulate_report_luns(SCSITargetReq *r)
     }
     channel = r->req.dev->channel;
     id = r->req.dev->id;
-    found_lun0 = false;
-    n = 0;
 
-    rcu_read_lock();
 
-    QTAILQ_FOREACH_RCU(kid, &r->req.bus->qbus.children, sibling) {
-        DeviceState *qdev = kid->child;
-        SCSIDevice *dev = SCSI_DEVICE(qdev);
+    /* add size (will be updated later to correct value */
+    g_byte_array_append(buf, tmp, 8);
+    len += 8;
 
-        if (dev->channel == channel && dev->id == id) {
-            if (dev->lun == 0) {
-                found_lun0 = true;
-            }
-            n += 8;
-        }
-    }
-    if (!found_lun0) {
-        n += 8;
-    }
-
-    scsi_target_alloc_buf(&r->req, n + 8);
+    /* add LUN0 */
+    g_byte_array_append(buf, tmp, 8);
+    len += 8;
 
-    len = MIN(n + 8, r->req.cmd.xfer & ~7);
-    memset(r->buf, 0, len);
-    stl_be_p(&r->buf[0], n);
-    i = found_lun0 ? 8 : 16;
+    rcu_read_lock();
     QTAILQ_FOREACH_RCU(kid, &r->req.bus->qbus.children, sibling) {
         DeviceState *qdev = kid->child;
         SCSIDevice *dev = SCSI_DEVICE(qdev);
 
-        if (dev->channel == channel && dev->id == id) {
-            store_lun(&r->buf[i], dev->lun);
-            i += 8;
+        if (dev->channel == channel && dev->id == id && dev->lun != 0) {
+            store_lun(tmp, dev->lun);
+            g_byte_array_append(buf, tmp, 8);
+            len += 8;
         }
     }
-
     rcu_read_unlock();
 
-    assert(i == n + 8);
-    r->len = len;
+    r->buf_len = len;
+    r->buf = g_byte_array_free(buf, FALSE);
+    r->len = MIN(len, r->req.cmd.xfer & ~7);
+
+    /* store the LUN list length */
+    stl_be_p(&r->buf[0], len - 8);
+
     return true;
 }
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 9/9] scsi/scsi_bus: fix races in REPORT LUNS
  2020-08-31 15:01 ` [PATCH v4 9/9] scsi/scsi_bus: fix races in REPORT LUNS Maxim Levitsky
@ 2020-09-01  9:15   ` Maxim Levitsky
  2020-09-08 15:27   ` Stefan Hajnoczi
  1 sibling, 0 replies; 16+ messages in thread
From: Maxim Levitsky @ 2020-09-01  9:15 UTC (permalink / raw)
  To: qemu-devel
  Cc: Fam Zheng, Paolo Bonzini, Daniel P. Berrangé,
	Eduardo Habkost, Michael S. Tsirkin

On Mon, 2020-08-31 at 18:01 +0300, Maxim Levitsky wrote:
> Currently scsi_target_emulate_report_luns iterates
> over child devices list twice, and there is guarantee, that
> it will not be changed meanwhile.
> 
> This reason for two loops is that it needs to know how much memory
> to allocate.
> 
> Avoid this by iterating once, and allocating the memory for the output
> dynamically with reserving enought memory so that in practice it won't
> be reallocated often.
Just too many spelling/grammar mistakes in the commit message. Sorry about that.

It should be something like that:

Currently scsi_target_emulate_report_luns iterates over the child device list
twice, and there is no guarantee that this list is the same in both iterations.

The reason for iterating twise is that the first iteration calculates 
how much memory to allocate.

However if we use a dynamic array we can avoid iterating twice, and therefore
we avoid this race.

Best regards,
	Maxim Levitsky


> 
> Bugzilla for reference: https://bugzilla.redhat.com/show_bug.cgi?id=1866707
> 
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>  hw/scsi/scsi-bus.c | 62 ++++++++++++++++++++++------------------------
>  1 file changed, 29 insertions(+), 33 deletions(-)
> 
> diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
> index feab20b76d..150dee2e6a 100644
> --- a/hw/scsi/scsi-bus.c
> +++ b/hw/scsi/scsi-bus.c
> @@ -438,19 +438,25 @@ struct SCSITargetReq {
>  static void store_lun(uint8_t *outbuf, int lun)
>  {
>      if (lun < 256) {
> +        /* Simple logical unit addressing method*/
> +        outbuf[0] = 0;
>          outbuf[1] = lun;
> -        return;
> +    } else {
> +        /* Flat space addressing method */
> +        outbuf[0] = 0x40 | (lun >> 8);
> +        outbuf[1] = (lun & 255);
>      }
> -    outbuf[1] = (lun & 255);
> -    outbuf[0] = (lun >> 8) | 0x40;
>  }
>  
>  static bool scsi_target_emulate_report_luns(SCSITargetReq *r)
>  {
>      BusChild *kid;
> -    int i, len, n;
>      int channel, id;
> -    bool found_lun0;
> +    uint8_t tmp[8] = {0};
> +    int len = 0;
> +
> +    /* reserve space for 63 LUNs*/
> +    GByteArray *buf = g_byte_array_sized_new(512);
>  
>      if (r->req.cmd.xfer < 16) {
>          return false;
> @@ -460,46 +466,36 @@ static bool scsi_target_emulate_report_luns(SCSITargetReq *r)
>      }
>      channel = r->req.dev->channel;
>      id = r->req.dev->id;
> -    found_lun0 = false;
> -    n = 0;
>  
> -    rcu_read_lock();
>  
> -    QTAILQ_FOREACH_RCU(kid, &r->req.bus->qbus.children, sibling) {
> -        DeviceState *qdev = kid->child;
> -        SCSIDevice *dev = SCSI_DEVICE(qdev);
> +    /* add size (will be updated later to correct value */
> +    g_byte_array_append(buf, tmp, 8);
> +    len += 8;
>  
> -        if (dev->channel == channel && dev->id == id) {
> -            if (dev->lun == 0) {
> -                found_lun0 = true;
> -            }
> -            n += 8;
> -        }
> -    }
> -    if (!found_lun0) {
> -        n += 8;
> -    }
> -
> -    scsi_target_alloc_buf(&r->req, n + 8);
> +    /* add LUN0 */
> +    g_byte_array_append(buf, tmp, 8);
> +    len += 8;
>  
> -    len = MIN(n + 8, r->req.cmd.xfer & ~7);
> -    memset(r->buf, 0, len);
> -    stl_be_p(&r->buf[0], n);
> -    i = found_lun0 ? 8 : 16;
> +    rcu_read_lock();
>      QTAILQ_FOREACH_RCU(kid, &r->req.bus->qbus.children, sibling) {
>          DeviceState *qdev = kid->child;
>          SCSIDevice *dev = SCSI_DEVICE(qdev);
>  
> -        if (dev->channel == channel && dev->id == id) {
> -            store_lun(&r->buf[i], dev->lun);
> -            i += 8;
> +        if (dev->channel == channel && dev->id == id && dev->lun != 0) {
> +            store_lun(tmp, dev->lun);
> +            g_byte_array_append(buf, tmp, 8);
> +            len += 8;
>          }
>      }
> -
>      rcu_read_unlock();
>  
> -    assert(i == n + 8);
> -    r->len = len;
> +    r->buf_len = len;
> +    r->buf = g_byte_array_free(buf, FALSE);
> +    r->len = MIN(len, r->req.cmd.xfer & ~7);
> +
> +    /* store the LUN list length */
> +    stl_be_p(&r->buf[0], len - 8);
> +
>      return true;
>  }
>  




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 6/9] scsi/scsi-bus: scsi_device_find: don't return unrealized devices
  2020-08-31 15:01 ` [PATCH v4 6/9] scsi/scsi-bus: scsi_device_find: don't return unrealized devices Maxim Levitsky
@ 2020-09-08 15:00   ` Stefan Hajnoczi
  2020-09-09  8:15     ` Maxim Levitsky
  0 siblings, 1 reply; 16+ messages in thread
From: Stefan Hajnoczi @ 2020-09-08 15:00 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Fam Zheng, Daniel P. Berrangé,
	Eduardo Habkost, Michael S. Tsirkin, qemu-devel, Paolo Bonzini

[-- Attachment #1: Type: text/plain, Size: 1409 bytes --]

On Mon, Aug 31, 2020 at 06:01:21PM +0300, Maxim Levitsky wrote:
> The device core first places a device on the bus and then realizes it.
> Make scsi_device_find avoid returing such devices to avoid
> races in drivers that use an iothread (currently virtio-scsi)
> 
> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1812399
> 
> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>  hw/scsi/scsi-bus.c | 88 ++++++++++++++++++++++++++++------------------
>  1 file changed, 53 insertions(+), 35 deletions(-)
> 
> diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
> index 92d412b65c..7ceae2c92b 100644
> --- a/hw/scsi/scsi-bus.c
> +++ b/hw/scsi/scsi-bus.c
> @@ -51,6 +51,56 @@ static const TypeInfo scsi_bus_info = {
>  };
>  static int next_scsi_bus;
>  
> +static SCSIDevice *_scsi_device_find(SCSIBus *bus, int channel, int id, int lun,
> +                                     bool include_unrealized)

Declaring an identifier with a leading underscore with file scope is
undefined behavior according to the C99 standard (7.1.3 Reserved
identifiers). QEMU code usually avoids doing this by calling the
function do_scsi_device_find() or similar.

I'm not aware of any practical problem though, so don't worry about
changing it unless you respin the series:

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 9/9] scsi/scsi_bus: fix races in REPORT LUNS
  2020-08-31 15:01 ` [PATCH v4 9/9] scsi/scsi_bus: fix races in REPORT LUNS Maxim Levitsky
  2020-09-01  9:15   ` Maxim Levitsky
@ 2020-09-08 15:27   ` Stefan Hajnoczi
  2020-09-09  8:20     ` Maxim Levitsky
  1 sibling, 1 reply; 16+ messages in thread
From: Stefan Hajnoczi @ 2020-09-08 15:27 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Fam Zheng, Daniel P. Berrangé,
	Eduardo Habkost, Michael S. Tsirkin, qemu-devel, Paolo Bonzini

[-- Attachment #1: Type: text/plain, Size: 1761 bytes --]

On Mon, Aug 31, 2020 at 06:01:24PM +0300, Maxim Levitsky wrote:
> Currently scsi_target_emulate_report_luns iterates
> over child devices list twice, and there is guarantee, that
> it will not be changed meanwhile.
> 
> This reason for two loops is that it needs to know how much memory
> to allocate.
> 
> Avoid this by iterating once, and allocating the memory for the output
> dynamically with reserving enought memory so that in practice it won't
> be reallocated often.
> 
> Bugzilla for reference: https://bugzilla.redhat.com/show_bug.cgi?id=1866707

"Buglink:" is the tag name documented in
https://wiki.qemu.org/Contribute/SubmitAPatch#Write_a_meaningful_commit_message

>  static bool scsi_target_emulate_report_luns(SCSITargetReq *r)
>  {
>      BusChild *kid;
> -    int i, len, n;
>      int channel, id;
> -    bool found_lun0;
> +    uint8_t tmp[8] = {0};
> +    int len = 0;
> +
> +    /* reserve space for 63 LUNs*/
> +    GByteArray *buf = g_byte_array_sized_new(512);
>  
>      if (r->req.cmd.xfer < 16) {
>          return false;

buf is leaked.

> @@ -460,46 +466,36 @@ static bool scsi_target_emulate_report_luns(SCSITargetReq *r)
>      }
>      channel = r->req.dev->channel;
>      id = r->req.dev->id;
> -    found_lun0 = false;
> -    n = 0;
>  
> -    rcu_read_lock();
>  
> -    QTAILQ_FOREACH_RCU(kid, &r->req.bus->qbus.children, sibling) {
> -        DeviceState *qdev = kid->child;
> -        SCSIDevice *dev = SCSI_DEVICE(qdev);
> +    /* add size (will be updated later to correct value */
> +    g_byte_array_append(buf, tmp, 8);
> +    len += 8;

Can g_byte_array_size() be used instead of keeping a len local variable?

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 6/9] scsi/scsi-bus: scsi_device_find: don't return unrealized devices
  2020-09-08 15:00   ` Stefan Hajnoczi
@ 2020-09-09  8:15     ` Maxim Levitsky
  0 siblings, 0 replies; 16+ messages in thread
From: Maxim Levitsky @ 2020-09-09  8:15 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Fam Zheng, Daniel P. Berrangé,
	Eduardo Habkost, Michael S. Tsirkin, qemu-devel, Paolo Bonzini

On Tue, 2020-09-08 at 16:00 +0100, Stefan Hajnoczi wrote:
> On Mon, Aug 31, 2020 at 06:01:21PM +0300, Maxim Levitsky wrote:
> > The device core first places a device on the bus and then realizes it.
> > Make scsi_device_find avoid returing such devices to avoid
> > races in drivers that use an iothread (currently virtio-scsi)
> > 
> > Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1812399
> > 
> > Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> > Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> > ---
> >  hw/scsi/scsi-bus.c | 88 ++++++++++++++++++++++++++++------------------
> >  1 file changed, 53 insertions(+), 35 deletions(-)
> > 
> > diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
> > index 92d412b65c..7ceae2c92b 100644
> > --- a/hw/scsi/scsi-bus.c
> > +++ b/hw/scsi/scsi-bus.c
> > @@ -51,6 +51,56 @@ static const TypeInfo scsi_bus_info = {
> >  };
> >  static int next_scsi_bus;
> >  
> > +static SCSIDevice *_scsi_device_find(SCSIBus *bus, int channel, int id, int lun,
> > +                                     bool include_unrealized)
> 
> Declaring an identifier with a leading underscore with file scope is
> undefined behavior according to the C99 standard (7.1.3 Reserved
> identifiers). QEMU code usually avoids doing this by calling the
> function do_scsi_device_find() or similar.

I'll fix that, thanks!

Best regards,
	Maxim Levitsky
> 
> I'm not aware of any practical problem though, so don't worry about
> changing it unless you respin the series:
> 
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 9/9] scsi/scsi_bus: fix races in REPORT LUNS
  2020-09-08 15:27   ` Stefan Hajnoczi
@ 2020-09-09  8:20     ` Maxim Levitsky
  2020-09-11 15:12       ` Stefan Hajnoczi
  0 siblings, 1 reply; 16+ messages in thread
From: Maxim Levitsky @ 2020-09-09  8:20 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Fam Zheng, Daniel P. Berrangé,
	Eduardo Habkost, Michael S. Tsirkin, qemu-devel, Paolo Bonzini

On Tue, 2020-09-08 at 16:27 +0100, Stefan Hajnoczi wrote:
> On Mon, Aug 31, 2020 at 06:01:24PM +0300, Maxim Levitsky wrote:
> > Currently scsi_target_emulate_report_luns iterates
> > over child devices list twice, and there is guarantee, that
> > it will not be changed meanwhile.
> > 
> > This reason for two loops is that it needs to know how much memory
> > to allocate.
> > 
> > Avoid this by iterating once, and allocating the memory for the output
> > dynamically with reserving enought memory so that in practice it won't
> > be reallocated often.
> > 
> > Bugzilla for reference: https://bugzilla.redhat.com/show_bug.cgi?id=1866707
> 
> "Buglink:" is the tag name documented in
> https://wiki.qemu.org/Contribute/SubmitAPatch#Write_a_meaningful_commit_message
Noted
> 
> >  static bool scsi_target_emulate_report_luns(SCSITargetReq *r)
> >  {
> >      BusChild *kid;
> > -    int i, len, n;
> >      int channel, id;
> > -    bool found_lun0;
> > +    uint8_t tmp[8] = {0};
> > +    int len = 0;
> > +
> > +    /* reserve space for 63 LUNs*/
> > +    GByteArray *buf = g_byte_array_sized_new(512);
> >  
> >      if (r->req.cmd.xfer < 16) {
> >          return false;
> 
> buf is leaked.
Oops, will fix
> 
> > @@ -460,46 +466,36 @@ static bool scsi_target_emulate_report_luns(SCSITargetReq *r)
> >      }
> >      channel = r->req.dev->channel;
> >      id = r->req.dev->id;
> > -    found_lun0 = false;
> > -    n = 0;
> >  
> > -    rcu_read_lock();
> >  
> > -    QTAILQ_FOREACH_RCU(kid, &r->req.bus->qbus.children, sibling) {
> > -        DeviceState *qdev = kid->child;
> > -        SCSIDevice *dev = SCSI_DEVICE(qdev);
> > +    /* add size (will be updated later to correct value */
> > +    g_byte_array_append(buf, tmp, 8);
> > +    len += 8;
> 
> Can g_byte_array_size() be used instead of keeping a len local variable?
Glib don't seem to have this function, I checked the docs.
Its seems that they want to convert it to GBytes which is basically immutible verion
of GByteArray and it does have g_bytes_get_size.
I decided that a local variable while ugly is still better that this.


I haven't wrote much code that uses Glib, so I might have missed something though.
I had read this reference:
https://developer.gnome.org/glib/stable/glib-Byte-Arrays.html


> 
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

Best regards,
	Maxim Levitsky




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 9/9] scsi/scsi_bus: fix races in REPORT LUNS
  2020-09-09  8:20     ` Maxim Levitsky
@ 2020-09-11 15:12       ` Stefan Hajnoczi
  0 siblings, 0 replies; 16+ messages in thread
From: Stefan Hajnoczi @ 2020-09-11 15:12 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Fam Zheng, Daniel P. Berrangé,
	Eduardo Habkost, Michael S. Tsirkin, qemu-devel, Paolo Bonzini

[-- Attachment #1: Type: text/plain, Size: 1420 bytes --]

On Wed, Sep 09, 2020 at 11:20:24AM +0300, Maxim Levitsky wrote:
> On Tue, 2020-09-08 at 16:27 +0100, Stefan Hajnoczi wrote:
> > On Mon, Aug 31, 2020 at 06:01:24PM +0300, Maxim Levitsky wrote:
> > > @@ -460,46 +466,36 @@ static bool scsi_target_emulate_report_luns(SCSITargetReq *r)
> > >      }
> > >      channel = r->req.dev->channel;
> > >      id = r->req.dev->id;
> > > -    found_lun0 = false;
> > > -    n = 0;
> > >  
> > > -    rcu_read_lock();
> > >  
> > > -    QTAILQ_FOREACH_RCU(kid, &r->req.bus->qbus.children, sibling) {
> > > -        DeviceState *qdev = kid->child;
> > > -        SCSIDevice *dev = SCSI_DEVICE(qdev);
> > > +    /* add size (will be updated later to correct value */
> > > +    g_byte_array_append(buf, tmp, 8);
> > > +    len += 8;
> > 
> > Can g_byte_array_size() be used instead of keeping a len local variable?
> Glib don't seem to have this function, I checked the docs.
> Its seems that they want to convert it to GBytes which is basically immutible verion
> of GByteArray and it does have g_bytes_get_size.
> I decided that a local variable while ugly is still better that this.
> 
> 
> I haven't wrote much code that uses Glib, so I might have missed something though.
> I had read this reference:
> https://developer.gnome.org/glib/stable/glib-Byte-Arrays.html

Oops, you're right. GByteArray != GBytes. The local variable makes sense.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2020-09-11 15:13 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-31 15:01 [PATCH v4 0/9] Fix scsi devices plug/unplug races w.r.t virtio-scsi iothread Maxim Levitsky
2020-08-31 15:01 ` [PATCH v4 1/9] scsi/scsi_bus: switch search direction in scsi_device_find Maxim Levitsky
2020-08-31 15:01 ` [PATCH v4 2/9] rcu: Implement drain_call_rcu Maxim Levitsky
2020-08-31 15:01 ` [PATCH v4 3/9] device_core: use drain_call_rcu in in hmp_device_del/qmp_device_add Maxim Levitsky
2020-08-31 15:01 ` [PATCH v4 4/9] device-core: use RCU for list of childs of a bus Maxim Levitsky
2020-08-31 15:01 ` [PATCH v4 5/9] device-core: use atomic_set on .realized property Maxim Levitsky
2020-08-31 15:01 ` [PATCH v4 6/9] scsi/scsi-bus: scsi_device_find: don't return unrealized devices Maxim Levitsky
2020-09-08 15:00   ` Stefan Hajnoczi
2020-09-09  8:15     ` Maxim Levitsky
2020-08-31 15:01 ` [PATCH v4 7/9] scsi/scsi_bus: Add scsi_device_get Maxim Levitsky
2020-08-31 15:01 ` [PATCH v4 8/9] virtio-scsi: use scsi_device_get Maxim Levitsky
2020-08-31 15:01 ` [PATCH v4 9/9] scsi/scsi_bus: fix races in REPORT LUNS Maxim Levitsky
2020-09-01  9:15   ` Maxim Levitsky
2020-09-08 15:27   ` Stefan Hajnoczi
2020-09-09  8:20     ` Maxim Levitsky
2020-09-11 15:12       ` Stefan Hajnoczi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.