* [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
@ 2016-03-16 10:10 Fam Zheng
  2016-03-16 10:10 ` [Qemu-devel] [PATCH 1/4] block: Use drained section in bdrv_set_aio_context Fam Zheng
                   ` (4 more replies)
  0 siblings, 5 replies; 70+ messages in thread
From: Fam Zheng @ 2016-03-16 10:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, tubo,
	Stefan Hajnoczi, cornelia.huck, pbonzini, borntraeger

These are some ideas that originated from analyzing Christian's crash report on
the virtio-blk dataplane torture test:

https://lists.gnu.org/archive/html/qemu-devel/2016-03/msg02093.html

The ideas were mostly inspired/suggested by Paolo. This series doesn't fix the
bug, but the first and last patches seem to make the crash less frequent.
Thanks also to Cornelia Huck for reviewing the draft version posted in that
thread.


Fam Zheng (4):
  block: Use drained section in bdrv_set_aio_context
  block-backend: Introduce blk_drained_begin/end
  virtio-blk: Use blk_drained_begin/end around dataplane stop
  virtio-blk: Clean up start/stop with mutex and BH

 block.c                         |  4 ++-
 block/block-backend.c           | 14 ++++++++++
 hw/block/dataplane/virtio-blk.c | 58 +++++++++++++++++++++++++++++++----------
 hw/block/virtio-blk.c           |  3 ++-
 include/sysemu/block-backend.h  |  2 ++
 5 files changed, 65 insertions(+), 16 deletions(-)

-- 
2.4.3


* [Qemu-devel] [PATCH 1/4] block: Use drained section in bdrv_set_aio_context
  2016-03-16 10:10 [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop Fam Zheng
@ 2016-03-16 10:10 ` Fam Zheng
  2016-03-16 10:27   ` Paolo Bonzini
  2016-03-16 10:10 ` [Qemu-devel] [PATCH 2/4] block-backend: Introduce blk_drained_begin/end Fam Zheng
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 70+ messages in thread
From: Fam Zheng @ 2016-03-16 10:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, tubo,
	Stefan Hajnoczi, cornelia.huck, pbonzini, borntraeger

An empty begin/end pair is almost the same as a bare bdrv_drain, except that
the aio_poll inside is wrapped by aio_disable_external/aio_enable_external.

This is safer, and is the only way to achieve quiescence in this aio_poll(),
because a bdrv_drained_begin/end pair cannot span the context detach/attach
operations, so it's not possible to do this in the caller.
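
For illustration, the empty drained section amounts to roughly the following
(a sketch of the intent, not the actual block layer code):

    AioContext *ctx = bdrv_get_aio_context(bs);

    /* bdrv_drained_begin(): stop external clients (e.g. ioeventfds) from
     * submitting new requests, then poll until in-flight requests finish. */
    aio_disable_external(ctx);
    bdrv_drain(bs);

    /* bdrv_drained_end(): let external clients submit requests again. */
    aio_enable_external(ctx);

A bare bdrv_drain() polls with external clients still enabled, so new requests
can keep arriving while we are trying to quiesce.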

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 59a18a3..31f4a9f 100644
--- a/block.c
+++ b/block.c
@@ -3747,7 +3747,9 @@ void bdrv_attach_aio_context(BlockDriverState *bs,
 
 void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context)
 {
-    bdrv_drain(bs); /* ensure there are no in-flight requests */
+    /* ensure there are no in-flight requests */
+    bdrv_drained_begin(bs);
+    bdrv_drained_end(bs);
 
     bdrv_detach_aio_context(bs);
 
-- 
2.4.3


* [Qemu-devel] [PATCH 2/4] block-backend: Introduce blk_drained_begin/end
  2016-03-16 10:10 [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop Fam Zheng
  2016-03-16 10:10 ` [Qemu-devel] [PATCH 1/4] block: Use drained section in bdrv_set_aio_context Fam Zheng
@ 2016-03-16 10:10 ` Fam Zheng
  2016-03-16 10:10 ` [Qemu-devel] [PATCH 3/4] virtio-blk: Use blk_drained_begin/end around dataplane stop Fam Zheng
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 70+ messages in thread
From: Fam Zheng @ 2016-03-16 10:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, tubo,
	Stefan Hajnoczi, cornelia.huck, pbonzini, borntraeger

They forward the calls to their bdrv_* counterparts.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/block-backend.c          | 14 ++++++++++++++
 include/sysemu/block-backend.h |  2 ++
 2 files changed, 16 insertions(+)

diff --git a/block/block-backend.c b/block/block-backend.c
index 03e71b4..d686a63 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -891,6 +891,20 @@ void blk_drain(BlockBackend *blk)
     }
 }
 
+void blk_drained_begin(BlockBackend *blk)
+{
+    if (blk->bs) {
+        bdrv_drained_begin(blk->bs);
+    }
+}
+
+void blk_drained_end(BlockBackend *blk)
+{
+    if (blk->bs) {
+        bdrv_drained_end(blk->bs);
+    }
+}
+
 void blk_drain_all(void)
 {
     bdrv_drain_all();
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index 00d69ba..2cd53d0 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -128,6 +128,8 @@ int blk_co_flush(BlockBackend *blk);
 int blk_flush(BlockBackend *blk);
 int blk_flush_all(void);
 void blk_drain(BlockBackend *blk);
+void blk_drained_begin(BlockBackend *blk);
+void blk_drained_end(BlockBackend *blk);
 void blk_drain_all(void);
 void blk_set_on_error(BlockBackend *blk, BlockdevOnError on_read_error,
                       BlockdevOnError on_write_error);
-- 
2.4.3


* [Qemu-devel] [PATCH 3/4] virtio-blk: Use blk_drained_begin/end around dataplane stop
  2016-03-16 10:10 [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop Fam Zheng
  2016-03-16 10:10 ` [Qemu-devel] [PATCH 1/4] block: Use drained section in bdrv_set_aio_context Fam Zheng
  2016-03-16 10:10 ` [Qemu-devel] [PATCH 2/4] block-backend: Introduce blk_drained_begin/end Fam Zheng
@ 2016-03-16 10:10 ` Fam Zheng
  2016-03-16 10:10 ` [Qemu-devel] [PATCH 4/4] virtio-blk: Clean up start/stop with mutex and BH Fam Zheng
  2016-03-16 10:28 ` [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop Paolo Bonzini
  4 siblings, 0 replies; 70+ messages in thread
From: Fam Zheng @ 2016-03-16 10:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, tubo,
	Stefan Hajnoczi, cornelia.huck, pbonzini, borntraeger

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 hw/block/virtio-blk.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index cb710f1..939ba79 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -653,11 +653,12 @@ static void virtio_blk_reset(VirtIODevice *vdev)
      */
     ctx = blk_get_aio_context(s->blk);
     aio_context_acquire(ctx);
-    blk_drain(s->blk);
+    blk_drained_begin(s->blk);
 
     if (s->dataplane) {
         virtio_blk_data_plane_stop(s->dataplane);
     }
+    blk_drained_end(s->blk);
     aio_context_release(ctx);
 
     blk_set_enable_write_cache(s->blk, s->original_wce);
-- 
2.4.3


* [Qemu-devel] [PATCH 4/4] virtio-blk: Clean up start/stop with mutex and BH
  2016-03-16 10:10 [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop Fam Zheng
                   ` (2 preceding siblings ...)
  2016-03-16 10:10 ` [Qemu-devel] [PATCH 3/4] virtio-blk: Use blk_drained_begin/end around dataplane stop Fam Zheng
@ 2016-03-16 10:10 ` Fam Zheng
  2016-03-17 15:00   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
  2016-03-16 10:28 ` [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop Paolo Bonzini
  4 siblings, 1 reply; 70+ messages in thread
From: Fam Zheng @ 2016-03-16 10:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, tubo,
	Stefan Hajnoczi, cornelia.huck, pbonzini, borntraeger

This makes the dataplane start logic simpler to understand.

Start/stop now take a mutex, so we no longer need the "starting" flag. A
bottom half is scheduled in the iothread to actually hook up the request
handlers to the virtqueue.
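
In isolation, the start-side pattern is roughly this (a sketch; the real hunk
is in the diff below):

    /* Under s->start_stop_lock, defer the notifier hookup to the iothread. */
    data = g_new(VirtIOBlockStartData, 1);
    data->vblk = vblk;
    data->bh = aio_bh_new(s->ctx, virtio_blk_data_plane_start_bh_cb, data);
    qemu_bh_schedule(data->bh);

The BH then runs in the iothread's AioContext, re-checks dataplane_started
under the same lock, hooks up the handlers, and frees itself.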

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
---
 hw/block/dataplane/virtio-blk.c | 58 +++++++++++++++++++++++++++++++----------
 1 file changed, 44 insertions(+), 14 deletions(-)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 36f3d2b..9e5c543 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -26,7 +26,6 @@
 #include "qom/object_interfaces.h"
 
 struct VirtIOBlockDataPlane {
-    bool starting;
     bool stopping;
     bool disabled;
 
@@ -49,6 +48,8 @@ struct VirtIOBlockDataPlane {
 
     /* Operation blocker on BDS */
     Error *blocker;
+
+    QemuMutex start_stop_lock;
 };
 
 /* Raise an interrupt to signal guest, if necessary */
@@ -150,6 +151,7 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *conf,
     s = g_new0(VirtIOBlockDataPlane, 1);
     s->vdev = vdev;
     s->conf = conf;
+    qemu_mutex_init(&s->start_stop_lock);
 
     if (conf->iothread) {
         s->iothread = conf->iothread;
@@ -184,19 +186,47 @@ void virtio_blk_data_plane_destroy(VirtIOBlockDataPlane *s)
     g_free(s);
 }
 
+typedef struct {
+    VirtIOBlock *vblk;
+    QEMUBH *bh;
+} VirtIOBlockStartData;
+
+static void virtio_blk_data_plane_start_bh_cb(void *opaque)
+{
+    VirtIOBlockStartData *data = opaque;
+    VirtIOBlockDataPlane *s = data->vblk->dataplane;
+
+    qemu_mutex_lock(&s->start_stop_lock);
+    if (!data->vblk->dataplane_started) {
+        goto out;
+    }
+    /* Kick right away to begin processing requests already in vring */
+    event_notifier_set(virtio_queue_get_host_notifier(s->vq));
+
+    /* Get this show started by hooking up our callbacks */
+    virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, true, true);
+
+out:
+    qemu_bh_delete(data->bh);
+    g_free(data);
+    qemu_mutex_unlock(&s->start_stop_lock);
+}
+
 /* Context: QEMU global mutex held */
 void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 {
     BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(s->vdev)));
     VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
     VirtIOBlock *vblk = VIRTIO_BLK(s->vdev);
+    VirtIOBlockStartData *data;
     int r;
 
-    if (vblk->dataplane_started || s->starting) {
+    qemu_mutex_lock(&s->start_stop_lock);
+    if (vblk->dataplane_started) {
+        qemu_mutex_unlock(&s->start_stop_lock);
         return;
     }
 
-    s->starting = true;
     s->vq = virtio_get_queue(s->vdev, 0);
 
     /* Set up guest notifier (irq) */
@@ -215,27 +245,24 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
         goto fail_host_notifier;
     }
 
-    s->starting = false;
     vblk->dataplane_started = true;
     trace_virtio_blk_data_plane_start(s);
 
     blk_set_aio_context(s->conf->conf.blk, s->ctx);
 
-    /* Kick right away to begin processing requests already in vring */
-    event_notifier_set(virtio_queue_get_host_notifier(s->vq));
-
-    /* Get this show started by hooking up our callbacks */
-    aio_context_acquire(s->ctx);
-    virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, true, true);
-    aio_context_release(s->ctx);
+    data = g_new(VirtIOBlockStartData, 1);
+    data->vblk = vblk;
+    data->bh = aio_bh_new(s->ctx, virtio_blk_data_plane_start_bh_cb, data);
+    qemu_bh_schedule(data->bh);
+    qemu_mutex_unlock(&s->start_stop_lock);
     return;
 
   fail_host_notifier:
     k->set_guest_notifiers(qbus->parent, 1, false);
   fail_guest_notifiers:
     s->disabled = true;
-    s->starting = false;
     vblk->dataplane_started = true;
+    qemu_mutex_unlock(&s->start_stop_lock);
 }
 
 /* Context: QEMU global mutex held */
@@ -245,15 +272,16 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
     VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
     VirtIOBlock *vblk = VIRTIO_BLK(s->vdev);
 
+    qemu_mutex_lock(&s->start_stop_lock);
     if (!vblk->dataplane_started || s->stopping) {
-        return;
+        goto out;
     }
 
     /* Better luck next time. */
     if (s->disabled) {
         s->disabled = false;
         vblk->dataplane_started = false;
-        return;
+        goto out;
     }
     s->stopping = true;
     trace_virtio_blk_data_plane_stop(s);
@@ -275,4 +303,6 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
 
     vblk->dataplane_started = false;
     s->stopping = false;
+out:
+    qemu_mutex_unlock(&s->start_stop_lock);
 }
-- 
2.4.3


* Re: [Qemu-devel] [PATCH 1/4] block: Use drained section in bdrv_set_aio_context
  2016-03-16 10:10 ` [Qemu-devel] [PATCH 1/4] block: Use drained section in bdrv_set_aio_context Fam Zheng
@ 2016-03-16 10:27   ` Paolo Bonzini
  2016-03-16 10:51     ` Fam Zheng
  0 siblings, 1 reply; 70+ messages in thread
From: Paolo Bonzini @ 2016-03-16 10:27 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, tubo,
	Stefan Hajnoczi, cornelia.huck, borntraeger



On 16/03/2016 11:10, Fam Zheng wrote:
> An empty begin/end pair is almost the same as a bare bdrv_drain except
> the aio_poll inside is wrapped by
> aio_disable_external/aio_enable_external.
> 
> This is safer, and is the only way to achieve quiescence in this
> aio_poll(), because bdrv_drained_begin/end pair cannot span across
> context detach/attach options, so it's not possible to do by the caller.

I'm still not sure about this patch.

When starting dataplane, the ioeventfd is registered with iohandler.c so
bdrv_drained_begin/end is not necessary.

Likewise when stopping dataplane bdrv_set_aio_context is called after
the thread has been stopped and thus the ioeventfd is not registered
anymore as an external client.

Paolo


* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-16 10:10 [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop Fam Zheng
                   ` (3 preceding siblings ...)
  2016-03-16 10:10 ` [Qemu-devel] [PATCH 4/4] virtio-blk: Clean up start/stop with mutex and BH Fam Zheng
@ 2016-03-16 10:28 ` Paolo Bonzini
  2016-03-16 10:49   ` Christian Borntraeger
  4 siblings, 1 reply; 70+ messages in thread
From: Paolo Bonzini @ 2016-03-16 10:28 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, tubo,
	Stefan Hajnoczi, cornelia.huck, borntraeger



On 16/03/2016 11:10, Fam Zheng wrote:
> These are some ideas originated from analyzing the Christian's crash report on
> virtio-blk dataplane torture test:
> 
> https://lists.gnu.org/archive/html/qemu-devel/2016-03/msg02093.html
> 
> The ideas are mostly inspired/suggested by Paolo. This doesn't fix the bug, but
> the first and the last patches seem to make the crash less frequent.  Also
> thanks Cornelia Huck for reviewing the draft version posted in that thread.

I see you have fixed the mutex and started check in patch 4, so perhaps
this fixes the bug. :)  Bo or Christian, could you try it out---and if
it works try patches 2 to 4 only?

Thanks,

Paolo


* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-16 10:28 ` [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop Paolo Bonzini
@ 2016-03-16 10:49   ` Christian Borntraeger
  2016-03-16 11:09     ` Paolo Bonzini
  0 siblings, 1 reply; 70+ messages in thread
From: Christian Borntraeger @ 2016-03-16 10:49 UTC (permalink / raw)
  To: Paolo Bonzini, Fam Zheng, qemu-devel
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, tubo,
	Stefan Hajnoczi, cornelia.huck

On 03/16/2016 11:28 AM, Paolo Bonzini wrote:
> 
> 
> On 16/03/2016 11:10, Fam Zheng wrote:
>> These are some ideas originated from analyzing the Christian's crash report on
>> virtio-blk dataplane torture test:
>>
>> https://lists.gnu.org/archive/html/qemu-devel/2016-03/msg02093.html
>>
>> The ideas are mostly inspired/suggested by Paolo. This doesn't fix the bug, but
>> the first and the last patches seem to make the crash less frequent.  Also
>> thanks Cornelia Huck for reviewing the draft version posted in that thread.
> 
> I see you have fixed the mutex and started check in patch 4, so perhaps
> this fixes the bug. :)  Bo or Christian, could you try it out---and if
> it works try patches 2 to 4 only?
> 
> Thanks,
> 
> Paolo
> 
Seems to lockup.

Thread 5 (Thread 0x3ff8b2ff910 (LWP 88956)):
#0  0x000003ff8c97f13e in syscall () at /lib64/libc.so.6
#1  0x00000000803d52fe in futex_wait (ev=0x80a4a104 <rcu_call_ready_event>, val=4294967295) at /home/cborntra/REPOS/qemu/util/qemu-thread-posix.c:292
#2  0x00000000803d558e in qemu_event_wait (ev=0x80a4a104 <rcu_call_ready_event>) at /home/cborntra/REPOS/qemu/util/qemu-thread-posix.c:399
#3  0x00000000803f2c34 in call_rcu_thread (opaque=0x0) at /home/cborntra/REPOS/qemu/util/rcu.c:250
#4  0x000003ff8ca87c2c in start_thread () at /lib64/libpthread.so.0
#5  0x000003ff8c984c7a in thread_start () at /lib64/libc.so.6

Thread 4 (Thread 0x3ff8aaff910 (LWP 88957)):
#0  0x000003ff8c9784d8 in ppoll () at /lib64/libc.so.6
#1  0x00000000802efdca in qemu_poll_ns (fds=0x3ff84002240, nfds=2, timeout=-1) at /home/cborntra/REPOS/qemu/qemu-timer.c:313
#2  0x00000000802f2528 in aio_poll (ctx=0xb9e94050, blocking=true) at /home/cborntra/REPOS/qemu/aio-posix.c:453
#3  0x000000008016392a in iothread_run (opaque=0xb9e93b10) at /home/cborntra/REPOS/qemu/iothread.c:46
#4  0x000003ff8ca87c2c in start_thread () at /lib64/libpthread.so.0
#5  0x000003ff8c984c7a in thread_start () at /lib64/libc.so.6

Thread 3 (Thread 0x3ff888dc910 (LWP 88958)):
#0  0x000003ff8ca90cd4 in __lll_lock_wait () at /lib64/libpthread.so.0
#1  0x000003ff8ca93e74 in __lll_lock_elision () at /lib64/libpthread.so.0
#2  0x00000000803d49ce in qemu_mutex_lock (mutex=0x8061f260 <qemu_global_mutex>) at /home/cborntra/REPOS/qemu/util/qemu-thread-posix.c:64
#3  0x0000000080060ef4 in qemu_mutex_lock_iothread () at /home/cborntra/REPOS/qemu/cpus.c:1226
#4  0x0000000080156af6 in kvm_arch_handle_exit (cs=0xba23b7f0, run=0x3ff8a200000) at /home/cborntra/REPOS/qemu/target-s390x/kvm.c:2024
#5  0x00000000800815de in kvm_cpu_exec (cpu=0xba23b7f0) at /home/cborntra/REPOS/qemu/kvm-all.c:1921
#6  0x000000008006074c in qemu_kvm_cpu_thread_fn (arg=0xba23b7f0) at /home/cborntra/REPOS/qemu/cpus.c:1050
#7  0x000003ff8ca87c2c in start_thread () at /lib64/libpthread.so.0
#8  0x000003ff8c984c7a in thread_start () at /lib64/libc.so.6

Thread 2 (Thread 0x3ff67fff910 (LWP 88959)):
#0  0x000003ff8ca90d04 in __lll_lock_wait () at /lib64/libpthread.so.0
#1  0x000003ff8ca93e74 in __lll_lock_elision () at /lib64/libpthread.so.0
#2  0x00000000803d49ce in qemu_mutex_lock (mutex=0x8061f260 <qemu_global_mutex>) at /home/cborntra/REPOS/qemu/util/qemu-thread-posix.c:64
#3  0x0000000080060ef4 in qemu_mutex_lock_iothread () at /home/cborntra/REPOS/qemu/cpus.c:1226
#4  0x0000000080156af6 in kvm_arch_handle_exit (cs=0xb9f2e970, run=0x3ff88080000) at /home/cborntra/REPOS/qemu/target-s390x/kvm.c:2024
#5  0x00000000800815de in kvm_cpu_exec (cpu=0xb9f2e970) at /home/cborntra/REPOS/qemu/kvm-all.c:1921
#6  0x000000008006074c in qemu_kvm_cpu_thread_fn (arg=0xb9f2e970) at /home/cborntra/REPOS/qemu/cpus.c:1050
#7  0x000003ff8ca87c2c in start_thread () at /lib64/libpthread.so.0
#8  0x000003ff8c984c7a in thread_start () at /lib64/libc.so.6

Thread 1 (Thread 0x3ff8e55bb90 (LWP 88953)):
#0  0x000003ff8ca90cd4 in __lll_lock_wait () at /lib64/libpthread.so.0
#1  0x000003ff8ca93e74 in __lll_lock_elision () at /lib64/libpthread.so.0
#2  0x00000000803d49ce in qemu_mutex_lock (mutex=0xba232df8) at /home/cborntra/REPOS/qemu/util/qemu-thread-posix.c:64
#3  0x00000000800b713e in virtio_blk_data_plane_start (s=0xba232d80) at /home/cborntra/REPOS/qemu/hw/block/dataplane/virtio-blk.c:224
#4  0x00000000800b4ea0 in virtio_blk_handle_output (vdev=0xb9eee7e8, vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/block/virtio-blk.c:590
#5  0x00000000800ef3dc in virtio_queue_notify_vq (vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1095
#6  0x00000000800f1c9c in virtio_queue_host_notifier_read (n=0xba3052c8) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1785
#7  0x00000000800f1e14 in virtio_queue_set_host_notifier_fd_handler (vq=0xba305270, assign=false, set_handler=false) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1817
#8  0x0000000080109c50 in virtio_ccw_set_guest2host_notifier (dev=0xb9eed6a0, n=0, assign=false, set_handler=false) at /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:97
#9  0x0000000080109ef2 in virtio_ccw_stop_ioeventfd (dev=0xb9eed6a0) at /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:154
#10 0x000000008010d5aa in virtio_ccw_set_host_notifier (d=0xb9eed6a0, n=0, assign=true) at /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:1211
#11 0x00000000800b722c in virtio_blk_data_plane_start (s=0xba232d80) at /home/cborntra/REPOS/qemu/hw/block/dataplane/virtio-blk.c:242
#12 0x00000000800b4ea0 in virtio_blk_handle_output (vdev=0xb9eee7e8, vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/block/virtio-blk.c:590
#13 0x00000000800ef3dc in virtio_queue_notify_vq (vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1095
#14 0x00000000800f1c9c in virtio_queue_host_notifier_read (n=0xba3052c8) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1785
#15 0x00000000802f1cd4 in aio_dispatch (ctx=0xb9e81c70) at /home/cborntra/REPOS/qemu/aio-posix.c:327
#16 0x00000000802df31c in aio_ctx_dispatch (source=0xb9e81c70, callback=0x0, user_data=0x0) at /home/cborntra/REPOS/qemu/async.c:232
#17 0x000003ff8d2d1c0a in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
#18 0x00000000802ee9a6 in glib_pollfds_poll () at /home/cborntra/REPOS/qemu/main-loop.c:212
#19 0x00000000802eeae2 in os_host_main_loop_wait (timeout=979000000) at /home/cborntra/REPOS/qemu/main-loop.c:257
#20 0x00000000802eebee in main_loop_wait (nonblocking=0) at /home/cborntra/REPOS/qemu/main-loop.c:505
#21 0x000000008017b4a4 in main_loop () at /home/cborntra/REPOS/qemu/vl.c:1933
#22 0x0000000080183992 in main (argc=72, argv=0x3ffebc7e908, envp=0x3ffebc7eb50) at /home/cborntra/REPOS/qemu/vl.c:4656


* Re: [Qemu-devel] [PATCH 1/4] block: Use drained section in bdrv_set_aio_context
  2016-03-16 10:27   ` Paolo Bonzini
@ 2016-03-16 10:51     ` Fam Zheng
  0 siblings, 0 replies; 70+ messages in thread
From: Fam Zheng @ 2016-03-16 10:51 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, qemu-devel, tubo,
	Stefan Hajnoczi, cornelia.huck, borntraeger

On Wed, 03/16 11:27, Paolo Bonzini wrote:
> 
> 
> On 16/03/2016 11:10, Fam Zheng wrote:
> > An empty begin/end pair is almost the same as a bare bdrv_drain except
> > the aio_poll inside is wrapped by
> > aio_disable_external/aio_enable_external.
> > 
> > This is safer, and is the only way to achieve quiescence in this
> > aio_poll(), because bdrv_drained_begin/end pair cannot span across
> > context detach/attach options, so it's not possible to do by the caller.
> 
> I'm still not sure about this patch.
> 
> When starting dataplane, the ioeventfd is registered with iohandler.c so
> bdrv_drained_begin/end is not necessary.

You are right, and it looks like the k->set_host_notifier() call above
blk_set_aio_context would disable the fd anyway.

> 
> Likewise when stopping dataplane bdrv_set_aio_context is called after
> the thread has been stopped and thus the ioeventfd is not registered
> anymore as an external client.

Right.

Fam


* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-16 10:49   ` Christian Borntraeger
@ 2016-03-16 11:09     ` Paolo Bonzini
  2016-03-16 11:24       ` Christian Borntraeger
  2016-03-16 11:32       ` Cornelia Huck
  0 siblings, 2 replies; 70+ messages in thread
From: Paolo Bonzini @ 2016-03-16 11:09 UTC (permalink / raw)
  To: Christian Borntraeger, Fam Zheng, qemu-devel
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, tubo,
	Stefan Hajnoczi, cornelia.huck



On 16/03/2016 11:49, Christian Borntraeger wrote:
> Seems to lockup.

That's an improvement actually. :)

> Thread 3 (Thread 0x3ff888dc910 (LWP 88958)):
> #0  0x000003ff8ca90cd4 in __lll_lock_wait () at /lib64/libpthread.so.0
> #1  0x000003ff8ca93e74 in __lll_lock_elision () at /lib64/libpthread.so.0

Off-topic, I love how s390 backtraces always have a little bit of
showing off in them! :)

> #3  0x00000000800b713e in virtio_blk_data_plane_start (s=0xba232d80) at /home/cborntra/REPOS/qemu/hw/block/dataplane/virtio-blk.c:224
> #4  0x00000000800b4ea0 in virtio_blk_handle_output (vdev=0xb9eee7e8, vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/block/virtio-blk.c:590
> #5  0x00000000800ef3dc in virtio_queue_notify_vq (vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1095
> #6  0x00000000800f1c9c in virtio_queue_host_notifier_read (n=0xba3052c8) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1785
> #7  0x00000000800f1e14 in virtio_queue_set_host_notifier_fd_handler (vq=0xba305270, assign=false, set_handler=false) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1817
> #8  0x0000000080109c50 in virtio_ccw_set_guest2host_notifier (dev=0xb9eed6a0, n=0, assign=false, set_handler=false) at /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:97
> #9  0x0000000080109ef2 in virtio_ccw_stop_ioeventfd (dev=0xb9eed6a0) at /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:154

One bug is here: virtio_ccw_stop_ioeventfd, in this case, should pass
assign=true to virtio_ccw_set_guest2host_notifier.  (Assuming my
understanding of assign=true is correct; I think it means "I'm going to
set up another host notifier handler").

In dataplane, instead, all calls to
virtio_queue_set_host_notifier_fd_handler and
virtio_queue_aio_set_host_notifier_handler should have assign=true.  The
ioeventfd is just being moved from one aiocontext to another.
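
Concretely, something like this (just a sketch, using the signatures visible
in the backtrace and in Fam's patch; untested, and set_handler is my guess):

    /* virtio_ccw_stop_ioeventfd, when the fd is only being handed over to
     * dataplane: keep the notifier assigned instead of tearing it down. */
    virtio_ccw_set_guest2host_notifier(dev, n, true /* assign */,
                                       false /* set_handler, as today */);

    /* dataplane: the ioeventfd just moves to the iothread's AioContext. */
    virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, true, true);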

Paolo

> #10 0x000000008010d5aa in virtio_ccw_set_host_notifier (d=0xb9eed6a0, n=0, assign=true) at /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:1211
> #11 0x00000000800b722c in virtio_blk_data_plane_start (s=0xba232d80) at /home/cborntra/REPOS/qemu/hw/block/dataplane/virtio-blk.c:242
> #12 0x00000000800b4ea0 in virtio_blk_handle_output (vdev=0xb9eee7e8, vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/block/virtio-blk.c:590
> #13 0x00000000800ef3dc in virtio_queue_notify_vq (vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1095
> #14 0x00000000800f1c9c in virtio_queue_host_notifier_read (n=0xba3052c8) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1785
> #15 0x00000000802f1cd4 in aio_dispatch (ctx=0xb9e81c70) at /home/cborntra/REPOS/qemu/aio-posix.c:327
> #16 0x00000000802df31c in aio_ctx_dispatch (source=0xb9e81c70, callback=0x0, user_data=0x0) at /home/cborntra/REPOS/qemu/async.c:232


* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-16 11:09     ` Paolo Bonzini
@ 2016-03-16 11:24       ` Christian Borntraeger
  2016-03-16 12:55         ` Paolo Bonzini
  2016-03-16 11:32       ` Cornelia Huck
  1 sibling, 1 reply; 70+ messages in thread
From: Christian Borntraeger @ 2016-03-16 11:24 UTC (permalink / raw)
  To: Paolo Bonzini, Fam Zheng, qemu-devel
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, tubo,
	Stefan Hajnoczi, cornelia.huck

On 03/16/2016 12:09 PM, Paolo Bonzini wrote:
> 
> 
> On 16/03/2016 11:49, Christian Borntraeger wrote:
>> Seems to lockup.
> 
> That's an improvement actually. :)
> 
>> Thread 3 (Thread 0x3ff888dc910 (LWP 88958)):
>> #0  0x000003ff8ca90cd4 in __lll_lock_wait () at /lib64/libpthread.so.0
>> #1  0x000003ff8ca93e74 in __lll_lock_elision () at /lib64/libpthread.so.0
> 
> Off-topic, I love how s390 backtraces always have a little bit of
> showing off in them! :)

So I understood your first remark (about _working_ transactional memory), but
I cannot parse your 2nd one. For Conny it is exactly the opposite, so I will
let Conny answer the discussion below ;-)



> 
>> #3  0x00000000800b713e in virtio_blk_data_plane_start (s=0xba232d80) at /home/cborntra/REPOS/qemu/hw/block/dataplane/virtio-blk.c:224
>> #4  0x00000000800b4ea0 in virtio_blk_handle_output (vdev=0xb9eee7e8, vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/block/virtio-blk.c:590
>> #5  0x00000000800ef3dc in virtio_queue_notify_vq (vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1095
>> #6  0x00000000800f1c9c in virtio_queue_host_notifier_read (n=0xba3052c8) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1785
>> #7  0x00000000800f1e14 in virtio_queue_set_host_notifier_fd_handler (vq=0xba305270, assign=false, set_handler=false) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1817
>> #8  0x0000000080109c50 in virtio_ccw_set_guest2host_notifier (dev=0xb9eed6a0, n=0, assign=false, set_handler=false) at /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:97
>> #9  0x0000000080109ef2 in virtio_ccw_stop_ioeventfd (dev=0xb9eed6a0) at /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:154
> 
> One bug is here: virtio_ccw_stop_ioeventfd, in this case, should pass
> assign=true to virtio_ccw_set_guest2host_notifier.  (Assuming my
> understanding of assign=true is correct; I think it means "I'm going to
> set up another host notifier handler").
> 
> In dataplane, instead, all calls to
> virtio_queue_set_host_notifier_fd_handler and
> virtio_queue_aio_set_host_notifier_handler should have assign=true.  The
> ioeventfd is just being moved from one aiocontext to another.
> 
> Paolo
> 
>> #10 0x000000008010d5aa in virtio_ccw_set_host_notifier (d=0xb9eed6a0, n=0, assign=true) at /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:1211
>> #11 0x00000000800b722c in virtio_blk_data_plane_start (s=0xba232d80) at /home/cborntra/REPOS/qemu/hw/block/dataplane/virtio-blk.c:242
>> #12 0x00000000800b4ea0 in virtio_blk_handle_output (vdev=0xb9eee7e8, vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/block/virtio-blk.c:590
>> #13 0x00000000800ef3dc in virtio_queue_notify_vq (vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1095
>> #14 0x00000000800f1c9c in virtio_queue_host_notifier_read (n=0xba3052c8) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1785
>> #15 0x00000000802f1cd4 in aio_dispatch (ctx=0xb9e81c70) at /home/cborntra/REPOS/qemu/aio-posix.c:327
>> #16 0x00000000802df31c in aio_ctx_dispatch (source=0xb9e81c70, callback=0x0, user_data=0x0) at /home/cborntra/REPOS/qemu/async.c:232
> 


* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-16 11:09     ` Paolo Bonzini
  2016-03-16 11:24       ` Christian Borntraeger
@ 2016-03-16 11:32       ` Cornelia Huck
  2016-03-16 11:48         ` Paolo Bonzini
  2016-03-16 11:52         ` Cornelia Huck
  1 sibling, 2 replies; 70+ messages in thread
From: Cornelia Huck @ 2016-03-16 11:32 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Michael S. Tsirkin,
	qemu-devel, Christian Borntraeger, tubo, Stefan Hajnoczi

On Wed, 16 Mar 2016 12:09:02 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 16/03/2016 11:49, Christian Borntraeger wrote:

> > #3  0x00000000800b713e in virtio_blk_data_plane_start (s=0xba232d80) at /home/cborntra/REPOS/qemu/hw/block/dataplane/virtio-blk.c:224
> > #4  0x00000000800b4ea0 in virtio_blk_handle_output (vdev=0xb9eee7e8, vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/block/virtio-blk.c:590
> > #5  0x00000000800ef3dc in virtio_queue_notify_vq (vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1095
> > #6  0x00000000800f1c9c in virtio_queue_host_notifier_read (n=0xba3052c8) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1785
> > #7  0x00000000800f1e14 in virtio_queue_set_host_notifier_fd_handler (vq=0xba305270, assign=false, set_handler=false) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1817
> > #8  0x0000000080109c50 in virtio_ccw_set_guest2host_notifier (dev=0xb9eed6a0, n=0, assign=false, set_handler=false) at /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:97
> > #9  0x0000000080109ef2 in virtio_ccw_stop_ioeventfd (dev=0xb9eed6a0) at /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:154
> 
> One bug is here: virtio_ccw_stop_ioeventfd, in this case, should pass
> assign=true to virtio_ccw_set_guest2host_notifier.  (Assuming my
> understanding of assign=true is correct; I think it means "I'm going to
> set up another host notifier handler").

Hm... I copied these semantics from virtio-pci, and they still seem to
be the same. I wonder why we never saw this on virtio-pci?

> In dataplane, instead, all calls to
> virtio_queue_set_host_notifier_fd_handler and
> virtio_queue_aio_set_host_notifier_handler should have assign=true.  The
> ioeventfd is just being moved from one aiocontext to another.

How can a transport know where they are called from?


* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-16 11:32       ` Cornelia Huck
@ 2016-03-16 11:48         ` Paolo Bonzini
  2016-03-16 11:56           ` Cornelia Huck
  2016-03-16 11:52         ` Cornelia Huck
  1 sibling, 1 reply; 70+ messages in thread
From: Paolo Bonzini @ 2016-03-16 11:48 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Michael S. Tsirkin,
	qemu-devel, Christian Borntraeger, tubo, Stefan Hajnoczi



On 16/03/2016 12:32, Cornelia Huck wrote:
> On Wed, 16 Mar 2016 12:09:02 +0100
> Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
>> On 16/03/2016 11:49, Christian Borntraeger wrote:
> 
>>> #3  0x00000000800b713e in virtio_blk_data_plane_start (s=0xba232d80) at /home/cborntra/REPOS/qemu/hw/block/dataplane/virtio-blk.c:224
>>> #4  0x00000000800b4ea0 in virtio_blk_handle_output (vdev=0xb9eee7e8, vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/block/virtio-blk.c:590
>>> #5  0x00000000800ef3dc in virtio_queue_notify_vq (vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1095
>>> #6  0x00000000800f1c9c in virtio_queue_host_notifier_read (n=0xba3052c8) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1785
>>> #7  0x00000000800f1e14 in virtio_queue_set_host_notifier_fd_handler (vq=0xba305270, assign=false, set_handler=false) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1817
>>> #8  0x0000000080109c50 in virtio_ccw_set_guest2host_notifier (dev=0xb9eed6a0, n=0, assign=false, set_handler=false) at /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:97
>>> #9  0x0000000080109ef2 in virtio_ccw_stop_ioeventfd (dev=0xb9eed6a0) at /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:154
>>
>> One bug is here: virtio_ccw_stop_ioeventfd, in this case, should pass
>> assign=true to virtio_ccw_set_guest2host_notifier.  (Assuming my
>> understanding of assign=true is correct; I think it means "I'm going to
>> set up another host notifier handler").
> 
> Hm... I copied these semantics from virtio-pci, and they still seem to
> be the same. I wonder why we never saw this on virtio-pci?

Just because we never ran the right tests, probably.  The bug is indeed
in virtio-pci as well (I didn't check mmio).

>> In dataplane, instead, all calls to
>> virtio_queue_set_host_notifier_fd_handler and
>> virtio_queue_aio_set_host_notifier_handler should have assign=true.  The
>> ioeventfd is just being moved from one aiocontext to another.
> 
> How can a transport know where they are called from?

That's a bug in dataplane.  The calls should be changed in
hw/block/dataplane/virtio-blk.c

Paolo


* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-16 11:32       ` Cornelia Huck
  2016-03-16 11:48         ` Paolo Bonzini
@ 2016-03-16 11:52         ` Cornelia Huck
  2016-03-16 11:54           ` Paolo Bonzini
  1 sibling, 1 reply; 70+ messages in thread
From: Cornelia Huck @ 2016-03-16 11:52 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Michael S. Tsirkin,
	qemu-devel, Christian Borntraeger, tubo, Stefan Hajnoczi

On Wed, 16 Mar 2016 12:32:13 +0100
Cornelia Huck <cornelia.huck@de.ibm.com> wrote:

> On Wed, 16 Mar 2016 12:09:02 +0100
> Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
> > On 16/03/2016 11:49, Christian Borntraeger wrote:
> 
> > > #3  0x00000000800b713e in virtio_blk_data_plane_start (s=0xba232d80) at /home/cborntra/REPOS/qemu/hw/block/dataplane/virtio-blk.c:224
> > > #4  0x00000000800b4ea0 in virtio_blk_handle_output (vdev=0xb9eee7e8, vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/block/virtio-blk.c:590
> > > #5  0x00000000800ef3dc in virtio_queue_notify_vq (vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1095
> > > #6  0x00000000800f1c9c in virtio_queue_host_notifier_read (n=0xba3052c8) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1785
> > > #7  0x00000000800f1e14 in virtio_queue_set_host_notifier_fd_handler (vq=0xba305270, assign=false, set_handler=false) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1817
> > > #8  0x0000000080109c50 in virtio_ccw_set_guest2host_notifier (dev=0xb9eed6a0, n=0, assign=false, set_handler=false) at /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:97
> > > #9  0x0000000080109ef2 in virtio_ccw_stop_ioeventfd (dev=0xb9eed6a0) at /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:154
> > 
> > One bug is here: virtio_ccw_stop_ioeventfd, in this case, should pass
> > assign=true to virtio_ccw_set_guest2host_notifier.  (Assuming my
> > understanding of assign=true is correct; I think it means "I'm going to
> > set up another host notifier handler").
> 
> Hm... I copied these semantics from virtio-pci, and they still seem to
> be the same. I wonder why we never saw this on virtio-pci?
> 
> > In dataplane, instead, all calls to
> > virtio_queue_set_host_notifier_fd_handler and
> > virtio_queue_aio_set_host_notifier_handler should have assign=true.  The
> > ioeventfd is just being moved from one aiocontext to another.
> 
> How can a transport know where they are called from?

Hm^2... I looked at virtio-scsi dataplane, and I noticed that it
acquires the aio context prior to calling ->set_host_notifiers(). Does
virtio-blk dataplane need to do this as well, or is virtio-scsi
dataplane wrong/different?


* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-16 11:52         ` Cornelia Huck
@ 2016-03-16 11:54           ` Paolo Bonzini
  0 siblings, 0 replies; 70+ messages in thread
From: Paolo Bonzini @ 2016-03-16 11:54 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Michael S. Tsirkin,
	qemu-devel, Christian Borntraeger, tubo, Stefan Hajnoczi



On 16/03/2016 12:52, Cornelia Huck wrote:
> > Hm... I copied these semantics from virtio-pci, and they still seem to
> > be the same. I wonder why we never saw this on virtio-pci?
> > 
> > > In dataplane, instead, all calls to
> > > virtio_queue_set_host_notifier_fd_handler and
> > > virtio_queue_aio_set_host_notifier_handler should have assign=true.  The
> > > ioeventfd is just being moved from one aiocontext to another.
> > 
> > How can a transport know where they are called from?
> 
> Hm^2... I looked at virtio-scsi dataplane, and I noticed that it
> acquires the aio context prior to calling ->set_host_notifiers(). Does
> virtio-blk dataplane need to do this as well, or is virtio-scsi
> dataplane wrong/different?

I cannot really answer, but my plan was to solve this bug and then
ensure that virtio-scsi dataplane does exactly the same thing...  I
would ignore virtio-scsi dataplane for now.

Paolo


* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-16 11:48         ` Paolo Bonzini
@ 2016-03-16 11:56           ` Cornelia Huck
  2016-03-16 11:59             ` Paolo Bonzini
  0 siblings, 1 reply; 70+ messages in thread
From: Cornelia Huck @ 2016-03-16 11:56 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Michael S. Tsirkin,
	qemu-devel, Christian Borntraeger, tubo, Stefan Hajnoczi

On Wed, 16 Mar 2016 12:48:22 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> 
> 
> On 16/03/2016 12:32, Cornelia Huck wrote:
> > On Wed, 16 Mar 2016 12:09:02 +0100
> > Paolo Bonzini <pbonzini@redhat.com> wrote:
> > 
> >> On 16/03/2016 11:49, Christian Borntraeger wrote:
> > 
> >>> #3  0x00000000800b713e in virtio_blk_data_plane_start (s=0xba232d80) at /home/cborntra/REPOS/qemu/hw/block/dataplane/virtio-blk.c:224
> >>> #4  0x00000000800b4ea0 in virtio_blk_handle_output (vdev=0xb9eee7e8, vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/block/virtio-blk.c:590
> >>> #5  0x00000000800ef3dc in virtio_queue_notify_vq (vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1095
> >>> #6  0x00000000800f1c9c in virtio_queue_host_notifier_read (n=0xba3052c8) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1785
> >>> #7  0x00000000800f1e14 in virtio_queue_set_host_notifier_fd_handler (vq=0xba305270, assign=false, set_handler=false) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1817
> >>> #8  0x0000000080109c50 in virtio_ccw_set_guest2host_notifier (dev=0xb9eed6a0, n=0, assign=false, set_handler=false) at /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:97
> >>> #9  0x0000000080109ef2 in virtio_ccw_stop_ioeventfd (dev=0xb9eed6a0) at /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:154
> >>
> >> One bug is here: virtio_ccw_stop_ioeventfd, in this case, should pass
> >> assign=true to virtio_ccw_set_guest2host_notifier.  (Assuming my
> >> understanding of assign=true is correct; I think it means "I'm going to
> >> set up another host notifier handler").
> > 
> > Hm... I copied these semantics from virtio-pci, and they still seem to
> > be the same. I wonder why we never saw this on virtio-pci?
> 
> Just because we never ran the right tests, probably.  The bug is indeed
> in virtio-pci as well (I didn't check mmio).

Somebody should probably do a writeup on the expected behaviour, as
this always ends in mental gymnastics (at least for me :)

> 
> >> In dataplane, instead, all calls to
> >> virtio_queue_set_host_notifier_fd_handler and
> >> virtio_queue_aio_set_host_notifier_handler should have assign=true.  The
> >> ioeventfd is just being moved from one aiocontext to another.
> > 
> > How can a transport know where they are called from?
> 
> That's a bug in dataplane.  The calls should be changed in
> hw/block/dataplane/virtio-blk.c

I don't think the ->set_host_notifiers() api really allows for this.


* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-16 11:56           ` Cornelia Huck
@ 2016-03-16 11:59             ` Paolo Bonzini
  2016-03-16 12:22               ` Cornelia Huck
  0 siblings, 1 reply; 70+ messages in thread
From: Paolo Bonzini @ 2016-03-16 11:59 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Michael S. Tsirkin,
	qemu-devel, Christian Borntraeger, tubo, Stefan Hajnoczi



On 16/03/2016 12:56, Cornelia Huck wrote:
> On Wed, 16 Mar 2016 12:48:22 +0100
> Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
>>
>>
>> On 16/03/2016 12:32, Cornelia Huck wrote:
>>> On Wed, 16 Mar 2016 12:09:02 +0100
>>> Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>
>>>> On 16/03/2016 11:49, Christian Borntraeger wrote:
>>>
>>>>> #3  0x00000000800b713e in virtio_blk_data_plane_start (s=0xba232d80) at /home/cborntra/REPOS/qemu/hw/block/dataplane/virtio-blk.c:224
>>>>> #4  0x00000000800b4ea0 in virtio_blk_handle_output (vdev=0xb9eee7e8, vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/block/virtio-blk.c:590
>>>>> #5  0x00000000800ef3dc in virtio_queue_notify_vq (vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1095
>>>>> #6  0x00000000800f1c9c in virtio_queue_host_notifier_read (n=0xba3052c8) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1785
>>>>> #7  0x00000000800f1e14 in virtio_queue_set_host_notifier_fd_handler (vq=0xba305270, assign=false, set_handler=false) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1817
>>>>> #8  0x0000000080109c50 in virtio_ccw_set_guest2host_notifier (dev=0xb9eed6a0, n=0, assign=false, set_handler=false) at /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:97
>>>>> #9  0x0000000080109ef2 in virtio_ccw_stop_ioeventfd (dev=0xb9eed6a0) at /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:154
>>>>
>>>> One bug is here: virtio_ccw_stop_ioeventfd, in this case, should pass
>>>> assign=true to virtio_ccw_set_guest2host_notifier.  (Assuming my
>>>> understanding of assign=true is correct; I think it means "I'm going to
>>>> set up another host notifier handler").
>>>
>>> Hm... I copied these semantics from virtio-pci, and they still seem to
>>> be the same. I wonder why we never saw this on virtio-pci?
>>
>> Just because we never ran the right tests, probably.  The bug is indeed
>> in virtio-pci as well (I didn't check mmio).
> 
> Somebody should probably do a writeup on the expected behaviour, as
> this always ends in mental gymnastics (at least for me :)

Yeah, it doesn't help that the functions are underdocumented (as in the
"assign" parameter above).

>>
>>>> In dataplane, instead, all calls to
>>>> virtio_queue_set_host_notifier_fd_handler and
>>>> virtio_queue_aio_set_host_notifier_handler should have assign=true.  The
>>>> ioeventfd is just being moved from one aiocontext to another.
>>>
>>> How can a transport know where they are called from?
>>
>> That's a bug in dataplane.  The calls should be changed in
>> hw/block/dataplane/virtio-blk.c
> 
> I don't think the ->set_host_notifiers() api really allows for this.

I think it does, assign is the last argument to k->set_host_notifier().

Paolo


* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-16 11:59             ` Paolo Bonzini
@ 2016-03-16 12:22               ` Cornelia Huck
  2016-03-16 12:32                 ` Paolo Bonzini
  0 siblings, 1 reply; 70+ messages in thread
From: Cornelia Huck @ 2016-03-16 12:22 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Michael S. Tsirkin,
	qemu-devel, Christian Borntraeger, tubo, Stefan Hajnoczi

On Wed, 16 Mar 2016 12:59:37 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 16/03/2016 12:56, Cornelia Huck wrote:
> > On Wed, 16 Mar 2016 12:48:22 +0100
> > Paolo Bonzini <pbonzini@redhat.com> wrote:
> > 
> >>
> >>
> >> On 16/03/2016 12:32, Cornelia Huck wrote:
> >>> On Wed, 16 Mar 2016 12:09:02 +0100
> >>> Paolo Bonzini <pbonzini@redhat.com> wrote:
> >>>
> >>>> On 16/03/2016 11:49, Christian Borntraeger wrote:
> >>>
> >>>>> #3  0x00000000800b713e in virtio_blk_data_plane_start (s=0xba232d80) at /home/cborntra/REPOS/qemu/hw/block/dataplane/virtio-blk.c:224
> >>>>> #4  0x00000000800b4ea0 in virtio_blk_handle_output (vdev=0xb9eee7e8, vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/block/virtio-blk.c:590
> >>>>> #5  0x00000000800ef3dc in virtio_queue_notify_vq (vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1095
> >>>>> #6  0x00000000800f1c9c in virtio_queue_host_notifier_read (n=0xba3052c8) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1785
> >>>>> #7  0x00000000800f1e14 in virtio_queue_set_host_notifier_fd_handler (vq=0xba305270, assign=false, set_handler=false) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1817
> >>>>> #8  0x0000000080109c50 in virtio_ccw_set_guest2host_notifier (dev=0xb9eed6a0, n=0, assign=false, set_handler=false) at /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:97
> >>>>> #9  0x0000000080109ef2 in virtio_ccw_stop_ioeventfd (dev=0xb9eed6a0) at /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:154
> >>>>
> >>>> One bug is here: virtio_ccw_stop_ioeventfd, in this case, should pass
> >>>> assign=true to virtio_ccw_set_guest2host_notifier.  (Assuming my
> >>>> understanding of assign=true is correct; I think it means "I'm going to
> >>>> set up another host notifier handler").
> >>>
> >>> Hm... I copied these semantics from virtio-pci, and they still seem to
> >>> be the same. I wonder why we never saw this on virtio-pci?
> >>
> >> Just because we never ran the right tests, probably.  The bug is indeed
> >> in virtio-pci as well (I didn't check mmio).
> > 
> > Somebody should probably do a writeup on the expected behaviour, as
> > this always ends in mental gymnastics (at least for me :)
> 
> Yeah, it doesn't help that the functions are underdocumented (as in the
> "assign" parameter above).

My understanding is:

- 'assign': set up a new notifier (true) or disable it (false)
- 'set_handler': use our handler (true) or have it handled elsewhere
  (false)
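
Spelled out as a table (my reading only, so take it with a grain of salt):

    /* assign  set_handler  meaning
     * true    true         ioeventfd set up, handled by this handler
     * true    false        ioeventfd set up, handler lives elsewhere
     *                      (e.g. vhost or a dataplane AioContext)
     * false   -            ioeventfd torn down
     */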

> 
> >>
> >>>> In dataplane, instead, all calls to
> >>>> virtio_queue_set_host_notifier_fd_handler and
> >>>> virtio_queue_aio_set_host_notifier_handler should have assign=true.  The
> >>>> ioeventfd is just being moved from one aiocontext to another.
> >>>
> >>> How can a transport know where they are called from?
> >>
> >> That's a bug in dataplane.  The calls should be changed in
> >> hw/block/dataplane/virtio-blk.c
> > 
> > I don't think the ->set_host_notifiers() api really allows for this.
> 
> I think it does, assign is the last argument to k->set_host_notifier().

This depends on whether we want 'assign' to clean up any old notifiers
before setting up new ones. I think we want different behaviour for
dataplane and vhost.


* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-16 12:22               ` Cornelia Huck
@ 2016-03-16 12:32                 ` Paolo Bonzini
  2016-03-16 12:42                   ` Cornelia Huck
  0 siblings, 1 reply; 70+ messages in thread
From: Paolo Bonzini @ 2016-03-16 12:32 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Michael S. Tsirkin,
	qemu-devel, Christian Borntraeger, tubo, Stefan Hajnoczi



On 16/03/2016 13:22, Cornelia Huck wrote:
>> > Yeah, it doesn't help that the functions are underdocumented (as in the
>> > "assign" parameter above).
> My understanding is:
> 
> - 'assign': set up a new notifier (true) or disable it (false)
> - 'set_handler': use our handler (true) or have it handled elsewhere
>   (false)

Right.  So if we're setting up a new notifier in
virtio_queue_aio_set_host_notifier_handler, virtio_pci_stop_ioeventfd should

> > > I don't think the ->set_host_notifiers() api really allows for this.
> > 
> > I think it does, assign is the last argument to k->set_host_notifier().
> 
> This depends on whether we want 'assign' to clean up any old notifiers
> before setting up new ones. I think we want different behaviour for
> dataplane and vhost.

I think dataplane and vhost are the same.

The question is whether ioeventfd=off,vhost=on or
ioeventfd=off,dataplane=on are valid combinations; I think they aren't.

If they aren't, it should be okay to remove the
virtio_queue_host_notifier_read call in
virtio_queue_set_host_notifier_fd_handler and
virtio_queue_aio_set_host_notifier_handler.  That's because a handler
for the notifier will always be set _somewhere_.  It could be the usual
ioeventfd handler, the vhost handler or the dataplane handler, but one
will be there.

Paolo


* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-16 12:32                 ` Paolo Bonzini
@ 2016-03-16 12:42                   ` Cornelia Huck
  2016-03-16 12:49                     ` Paolo Bonzini
  0 siblings, 1 reply; 70+ messages in thread
From: Cornelia Huck @ 2016-03-16 12:42 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Michael S. Tsirkin,
	qemu-devel, Christian Borntraeger, tubo, Stefan Hajnoczi

On Wed, 16 Mar 2016 13:32:59 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 16/03/2016 13:22, Cornelia Huck wrote:
> >> > Yeah, it doesn't help that the functions are underdocumented (as in the
> >> > "assign" parameter above).
> > My understanding is:
> > 
> > - 'assign': set up a new notifier (true) or disable it (false)
> > - 'set_handler': use our handler (true) or have it handled elsewhere
> >   (false)
> 
> Right.  So if we're setting up a new notifier in
> virtio_queue_aio_set_host_notifier_handler, virtio_pci_stop_ioeventfd should

...do what? ;)

> 
> > > > I don't think the ->set_host_notifiers() api really allows for this.
> > > 
> > > I think it does, assign is the last argument to k->set_host_notifier().
> > 
> > This depends on whether we want 'assign' to clean up any old notifiers
> > before setting up new ones. I think we want different behaviour for
> > dataplane and vhost.
> 
> I think dataplane and vhost are the same.
> 
> The question is whether ioeventfd=off,vhost=on or
> ioeventfd=off,dataplane=on are valid combinations; I think they aren't.

We should disallow that even temporarily, then.

(This would imply that we need to drop the _stop_ioeventfd() call in
->set_host_notifier(), correct?)

> If they aren't, it should be okay to remove the
> virtio_queue_host_notifier_read call in
> virtio_queue_set_host_notifier_fd_handler and
> virtio_queue_aio_set_host_notifier_handler.  That's because a handler
> for the notifier will always be set _somewhere_.  It could be the usual
> ioeventfd handler, the vhost handler or the dataplane handler, but one
> will be there.

It should; but we probably need to do a final read when we stop the
ioeventfd.


* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-16 12:42                   ` Cornelia Huck
@ 2016-03-16 12:49                     ` Paolo Bonzini
  2016-03-16 13:04                       ` Cornelia Huck
  0 siblings, 1 reply; 70+ messages in thread
From: Paolo Bonzini @ 2016-03-16 12:49 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Michael S. Tsirkin,
	qemu-devel, Christian Borntraeger, tubo, Stefan Hajnoczi



On 16/03/2016 13:42, Cornelia Huck wrote:
> On Wed, 16 Mar 2016 13:32:59 +0100
> Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
>> On 16/03/2016 13:22, Cornelia Huck wrote:
>>>>> Yeah, it doesn't help that the functions are underdocumented (as in the
>>>>> "assign" parameter above).
>>> My understanding is:
>>>
>>> - 'assign': set up a new notifier (true) or disable it (false)
>>> - 'set_handler': use our handler (true) or have it handled elsewhere
>>>   (false)
>>
>> Right.  So if we're setting up a new notifier in
>> virtio_queue_aio_set_host_notifier_handler, virtio_pci_stop_ioeventfd should

... not call virtio_queue_host_notifier_read.

>>>>> I don't think the ->set_host_notifiers() api really allows for this.
>>>>
>>>> I think it does, assign is the last argument to k->set_host_notifier().
>>>
>>> This depends on whether we want 'assign' to clean up any old notifiers
>>> before setting up new ones. I think we want different behaviour for
>>> dataplane and vhost.
>>
>> I think dataplane and vhost are the same.
>>
>> The question is whether ioeventfd=off,vhost=on or
>> ioeventfd=off,dataplane=on are valid combinations; I think they aren't.
> 
> We should disallow that even temporary, then.
> 
> (This would imply that we need to drop the _stop_ioeventfd() call in
> ->set_host_notifier(), correct?)

No, it would not.  ioeventfd=off,vhost=on would mean: "when vhost is
off, use vCPU thread notification".

When turning on vhost you'd still stop ioeventfd (i.e. stop processing
the virtqueue in QEMU's main iothread), but you don't need to do
anything to the event notifier.  vhost will pick it up and work on the
virtqueue if necessary.  Likewise for dataplane.

>> If they aren't, it should be okay to remove the
>> virtio_queue_host_notifier_read call in
>> virtio_queue_set_host_notifier_fd_handler and
>> virtio_queue_aio_set_host_notifier_handler.  That's because a handler
>> for the notifier will always be set _somewhere_.  It could be the usual
>> ioeventfd handler, the vhost handler or the dataplane handler, but one
>> will be there.
> 
> It should; but we probably need to do a final read when we stop the
> ioeventfd.

I was thinking of handing the final read directly to the next guy who
polls the event notifier instead.  So, when called from vhost or
dataplane, virtio_pci_stop_ioeventfd would use
assign=true/set_handler=false ("a new notifier is going to be set up by
the caller").

The host notifier API unfortunately is full of indirections.  I'm not
sure how many of them are actually necessary.

Paolo



* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-16 11:24       ` Christian Borntraeger
@ 2016-03-16 12:55         ` Paolo Bonzini
  2016-03-16 13:38           ` Christian Borntraeger
  0 siblings, 1 reply; 70+ messages in thread
From: Paolo Bonzini @ 2016-03-16 12:55 UTC (permalink / raw)
  To: Christian Borntraeger, Fam Zheng, qemu-devel
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, tubo,
	Stefan Hajnoczi, cornelia.huck



On 16/03/2016 12:24, Christian Borntraeger wrote:
> On 03/16/2016 12:09 PM, Paolo Bonzini wrote:
>> On 16/03/2016 11:49, Christian Borntraeger wrote:
>>> #3  0x00000000800b713e in virtio_blk_data_plane_start (s=0xba232d80) at /home/cborntra/REPOS/qemu/hw/block/dataplane/virtio-blk.c:224
>>> #4  0x00000000800b4ea0 in virtio_blk_handle_output (vdev=0xb9eee7e8, vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/block/virtio-blk.c:590
>>> #5  0x00000000800ef3dc in virtio_queue_notify_vq (vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1095
>>> #6  0x00000000800f1c9c in virtio_queue_host_notifier_read (n=0xba3052c8) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1785

If you just remove the calls to virtio_queue_host_notifier_read, here
and in virtio_queue_aio_set_host_notifier_fd_handler, does it work
(keeping patches 2-4 in)?
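
For reference, the function in question has roughly this shape (a
from-memory sketch, not a verbatim copy of hw/virtio/virtio.c); the final
read at the bottom is the call under discussion:

    void virtio_queue_set_host_notifier_fd_handler(VirtQueue *vq, bool assign,
                                                   bool set_handler)
    {
        if (assign && set_handler) {
            /* attach our read handler to the queue's host notifier */
        } else {
            /* detach the handler */
        }
        if (!assign) {
            /* catch a guest kick that raced with tearing the handler down */
            virtio_queue_host_notifier_read(&vq->host_notifier);
        }
    }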

Paolo

>>> #7  0x00000000800f1e14 in virtio_queue_set_host_notifier_fd_handler (vq=0xba305270, assign=false, set_handler=false) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1817
>>> #8  0x0000000080109c50 in virtio_ccw_set_guest2host_notifier (dev=0xb9eed6a0, n=0, assign=false, set_handler=false) at /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:97
>>> #9  0x0000000080109ef2 in virtio_ccw_stop_ioeventfd (dev=0xb9eed6a0) at /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:154

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-16 12:49                     ` Paolo Bonzini
@ 2016-03-16 13:04                       ` Cornelia Huck
  2016-03-16 13:10                         ` Paolo Bonzini
  0 siblings, 1 reply; 70+ messages in thread
From: Cornelia Huck @ 2016-03-16 13:04 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Michael S. Tsirkin,
	qemu-devel, Christian Borntraeger, tubo, Stefan Hajnoczi

On Wed, 16 Mar 2016 13:49:10 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 16/03/2016 13:42, Cornelia Huck wrote:
> > On Wed, 16 Mar 2016 13:32:59 +0100
> > Paolo Bonzini <pbonzini@redhat.com> wrote:
> > 
> >> On 16/03/2016 13:22, Cornelia Huck wrote:
> >>>>> Yeah, it doesn't help that the functions are underdocumented (as in the
> >>>>> "assign" parameter above).
> >>> My understanding is:
> >>>
> >>> - 'assign': set up a new notifier (true) or disable it (false)
> >>> - 'set_handler': use our handler (true) or have it handled elsewhere
> >>>   (false)
> >>
> >> Right.  So if we're setting up a new notifier in
> >> virtio_queue_aio_set_host_notifier_handler, virtio_pci_stop_ioeventfd should
> 
> ... not call virtio_queue_host_notifier_read.

This needs to be cascaded into
virtio_queue_set_host_notifier_fd_handler(), I guess.

> 
> >>>>> I don't think the ->set_host_notifiers() api really allows for this.
> >>>>
> >>>> I think it does, assign is the last argument to k->set_host_notifier().
> >>>
> >>> This depends on whether we want 'assign' to clean up any old notifiers
> >>> before setting up new ones. I think we want different behaviour for
> >>> dataplane and vhost.
> >>
> >> I think dataplane and vhost are the same.
> >>
> >> The question is whether ioeventfd=off,vhost=on or
> >> ioeventfd=off,dataplane=on are valid combinations; I think they aren't.
> > 
> > We should disallow that even temporary, then.
> > 
> > (This would imply that we need to drop the _stop_ioeventfd() call in
> > ->set_host_notifier(), correct?)
> 
> No, it would not.  ioeventfd=off,vhost=on would mean: "when vhost is
> off, use vCPU thread notification".

*confused*

Is ioeventfd=off supposed to mean "don't talk to the kernel, do
everything in qemu"?

> 
> When turning on vhost you'd still stop ioeventfd (i.e. stop processing
> the virtqueue in QEMU's main iothread), but you don't need to do
> anything to the event notifier.  vhost will pick it up and work on the
> virtqueue if necessary.  Likewise for dataplane.

So "disassociate the handler and switch over to the new one"?

> 
> >> If they aren't, it should be okay to remove the
> >> virtio_queue_host_notifier_read call in
> >> virtio_queue_set_host_notifier_fd_handler and
> >> virtio_queue_aio_set_host_notifier_handler.  That's because a handler
> >> for the notifier will always be set _somewhere_.  It could be the usual
> >> ioeventfd handler, the vhost handler or the dataplane handler, but one
> >> will be there.
> > 
> > It should; but we probably need to do a final read when we stop the
> > ioeventfd.
> 
> I was thinking of handing the final read directly to the next guy who
> polls the event notifier instead.  So, when called from vhost or
> dataplane, virtio_pci_stop_ioeventfd would use
> assign=true/set_handler=false ("a new notifier is going to be set up by
> the caller").

OK, then we'd need to pass a new parameter for this.

> 
> The host notifier API unfortunately is full of indirections.  I'm not
> sure how many of them are actually necessary.

Oh yes, it's very hard to follow, especially with not-very-well defined
parameters.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-16 13:04                       ` Cornelia Huck
@ 2016-03-16 13:10                         ` Paolo Bonzini
  2016-03-16 13:14                           ` Cornelia Huck
  0 siblings, 1 reply; 70+ messages in thread
From: Paolo Bonzini @ 2016-03-16 13:10 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Michael S. Tsirkin,
	qemu-devel, Christian Borntraeger, tubo, Stefan Hajnoczi



On 16/03/2016 14:04, Cornelia Huck wrote:
> > No, it would not.  ioeventfd=off,vhost=on would mean: "when vhost is
> > off, use vCPU thread notification".
> 
> *confused*
> 
> Is ioeventfd=off supposed to mean "don't talk to the kernel, do
> everything in qemu"?

For KVM, it means do everything in the QEMU vCPU thread (using userspace
vmexits).

>> > When turning on vhost you'd still stop ioeventfd (i.e. stop processing
>> > the virtqueue in QEMU's main iothread), but you don't need to do
>> > anything to the event notifier.  vhost will pick it up and work on the
>> > virtqueue if necessary.  Likewise for dataplane.
> 
> So "disassociate the handler and switch over to the new one"?

Yes, if we prohibit combinations which switch from vCPU thread
notification to vhost or dataplane (such as ioeventfd=off,vhost=on).  If
we always use an eventfd, we always have a handler to switch to.
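
Concretely, prohibiting those combinations could be a realize-time check
along these lines (purely a hypothetical sketch; the flag names are made
up and no such check exists today):

    /* hypothetical: reject switching away from vCPU thread notification */
    if (!ioeventfd_enabled && (vhost_enabled || dataplane_enabled)) {
        error_setg(errp, "vhost and dataplane require ioeventfd=on");
        return;
    }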

> > > > If they aren't, it should be okay to remove the
> > > > virtio_queue_host_notifier_read call in
> > > > virtio_queue_set_host_notifier_fd_handler and
> > > > virtio_queue_aio_set_host_notifier_handler.  That's because a handler
> > > > for the notifier will always be set _somewhere_.  It could be the usual
> > > > ioeventfd handler, the vhost handler or the dataplane handler, but one
> > > > will be there.
> > > 
> > > It should; but we probably need to do a final read when we stop the
> > > ioeventfd.
> > 
> > I was thinking of handing the final read directly to the next guy who
> > polls the event notifier instead.  So, when called from vhost or
> > dataplane, virtio_pci_stop_ioeventfd would use
> > assign=true/set_handler=false ("a new notifier is going to be set up by
> > the caller").
> 
> OK, then we'd need to pass a new parameter for this.

Yes, agreed.

> > The host notifier API unfortunately is full of indirections.  I'm not
> > sure how many of them are actually necessary.
> 
> Oh yes, it's very hard to follow, especially with not-very-well defined
> parameters.

And full of duplicate code.  If copied code were moved to the virtio bus
level, it would be easier to change too.

Paolo

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-16 13:10                         ` Paolo Bonzini
@ 2016-03-16 13:14                           ` Cornelia Huck
  2016-03-16 13:15                             ` Paolo Bonzini
  0 siblings, 1 reply; 70+ messages in thread
From: Cornelia Huck @ 2016-03-16 13:14 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Michael S. Tsirkin,
	qemu-devel, Christian Borntraeger, tubo, Stefan Hajnoczi

On Wed, 16 Mar 2016 14:10:29 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 16/03/2016 14:04, Cornelia Huck wrote:
> > > No, it would not.  ioeventfd=off,vhost=on would mean: "when vhost is
> > > off, use vCPU thread notification".
> > 
> > *confused*
> > 
> > Is ioeventfd=off supposed to mean "don't talk to the kernel, do
> > everything in qemu"?
> 
> For KVM, it means do everything in the QEMU vCPU thread (using userspace
> vmexits).

OK, we're on the same page then.

> 
> >> > When turning on vhost you'd still stop ioeventfd (i.e. stop processing
> >> > the virtqueue in QEMU's main iothread), but you don't need to do
> >> > anything to the event notifier.  vhost will pick it up and work on the
> >> > virtqueue if necessary.  Likewise for dataplane.
> > 
> > So "disassociate the handler and switch over to the new one"?
> 
> Yes, if we prohibit combinations which switch from vCPU thread
> notification to vhost or dataplane (such as ioeventfd=off,vhost=on).  If
> we always use an eventfd, we always have a handler to switch to.
> 
> > > > > If they aren't, it should be okay to remove the
> > > > > virtio_queue_host_notifier_read call in
> > > > > virtio_queue_set_host_notifier_fd_handler and
> > > > > virtio_queue_aio_set_host_notifier_handler.  That's because a handler
> > > > > for the notifier will always be set _somewhere_.  It could be the usual
> > > > > ioeventfd handler, the vhost handler or the dataplane handler, but one
> > > > > will be there.
> > > > 
> > > > It should; but we probably need to do a final read when we stop the
> > > > ioeventfd.
> > > 
> > > I was thinking of handing the final read directly to the next guy who
> > > polls the event notifier instead.  So, when called from vhost or
> > > dataplane, virtio_pci_stop_ioeventfd would use
> > > assign=true/set_handler=false ("a new notifier is going to be set up by
> > > the caller").
> > 
> > OK, then we'd need to pass a new parameter for this.
> 
> Yes, agreed.
> 
> > > The host notifier API unfortunately is full of indirections.  I'm not
> > > sure how many of them are actually necessary.
> > 
> > Oh yes, it's very hard to follow, especially with not-very-well defined
> > parameters.
> 
> And full of duplicate code.  If copied code were moved to the virtio bus
> level, it would be easier to change too.

Yes, pci, ccw and mmio basically all do the same things. The only
difference is actually wiring up the eventfd.

I can attempt to do some refactoring.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-16 13:14                           ` Cornelia Huck
@ 2016-03-16 13:15                             ` Paolo Bonzini
  0 siblings, 0 replies; 70+ messages in thread
From: Paolo Bonzini @ 2016-03-16 13:15 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Michael S. Tsirkin,
	qemu-devel, Christian Borntraeger, tubo, Stefan Hajnoczi



On 16/03/2016 14:14, Cornelia Huck wrote:
> > And full of duplicate code.  If copied code were moved to the virtio bus
> > level, it would be easier to change too.
> 
> Yes, pci, ccw and mmio basically all do the same things. The only
> difference is actually wiring up the eventfd.
> 
> I can attempt to do some refactoring.

That would be great; in the meantime I've asked Christian to try
removing the calls that cause the reentrancy.

Paolo

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-16 12:55         ` Paolo Bonzini
@ 2016-03-16 13:38           ` Christian Borntraeger
  2016-03-16 13:45             ` Paolo Bonzini
  2016-03-17 12:22             ` tu bo
  0 siblings, 2 replies; 70+ messages in thread
From: Christian Borntraeger @ 2016-03-16 13:38 UTC (permalink / raw)
  To: Paolo Bonzini, Fam Zheng, qemu-devel
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, tubo,
	Stefan Hajnoczi, cornelia.huck

On 03/16/2016 01:55 PM, Paolo Bonzini wrote:
> 
> 
> On 16/03/2016 12:24, Christian Borntraeger wrote:
>> On 03/16/2016 12:09 PM, Paolo Bonzini wrote:
>>> On 16/03/2016 11:49, Christian Borntraeger wrote:
>>>> #3  0x00000000800b713e in virtio_blk_data_plane_start (s=0xba232d80) at /home/cborntra/REPOS/qemu/hw/block/dataplane/virtio-blk.c:224
>>>> #4  0x00000000800b4ea0 in virtio_blk_handle_output (vdev=0xb9eee7e8, vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/block/virtio-blk.c:590
>>>> #5  0x00000000800ef3dc in virtio_queue_notify_vq (vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1095
>>>> #6  0x00000000800f1c9c in virtio_queue_host_notifier_read (n=0xba3052c8) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1785
> 
> If you just remove the calls to virtio_queue_host_notifier_read, here
> and in virtio_queue_aio_set_host_notifier_fd_handler, does it work
> (keeping patches 2-4 in)?

With these changes and patches 2-4 it no longer locks up.
I will keep it running for some hours to check whether a crash happens.

Tu Bo, your setup is currently better suited for reproducing. Can you also check?

> 
> Paolo
> 
>>>> #7  0x00000000800f1e14 in virtio_queue_set_host_notifier_fd_handler (vq=0xba305270, assign=false, set_handler=false) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1817
>>>> #8  0x0000000080109c50 in virtio_ccw_set_guest2host_notifier (dev=0xb9eed6a0, n=0, assign=false, set_handler=false) at /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:97
>>>> #9  0x0000000080109ef2 in virtio_ccw_stop_ioeventfd (dev=0xb9eed6a0) at /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:154
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-16 13:38           ` Christian Borntraeger
@ 2016-03-16 13:45             ` Paolo Bonzini
  2016-03-17  0:39               ` Fam Zheng
  2016-03-17 12:22             ` tu bo
  1 sibling, 1 reply; 70+ messages in thread
From: Paolo Bonzini @ 2016-03-16 13:45 UTC (permalink / raw)
  To: Christian Borntraeger, Fam Zheng, qemu-devel
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, tubo,
	Stefan Hajnoczi, cornelia.huck



On 16/03/2016 14:38, Christian Borntraeger wrote:
> > If you just remove the calls to virtio_queue_host_notifier_read, here
> > and in virtio_queue_aio_set_host_notifier_fd_handler, does it work
> > (keeping patches 2-4 in)?
> 
> With these changes and patch 2-4 it does no longer locks up. 
> I keep it running some hour to check if a crash happens.
> 
> Tu Bo, your setup is currently better suited for reproducing. Can you also check?

Great, I'll prepare a patch to virtio then, sketching the solution that
Conny agreed with.

While Fam and I agreed that patch 1 is not required, I'm not sure if the
mutex is necessary in the end.

So if Tu Bo can check without the virtio_queue_host_notifier_read calls,
and both with/without Fam's patches, it would be great.

Paolo

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-16 13:45             ` Paolo Bonzini
@ 2016-03-17  0:39               ` Fam Zheng
  2016-03-17 11:03                 ` tu bo
  0 siblings, 1 reply; 70+ messages in thread
From: Fam Zheng @ 2016-03-17  0:39 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, qemu-block, tubo, Michael S. Tsirkin, qemu-devel,
	Christian Borntraeger, Stefan Hajnoczi, cornelia.huck

On Wed, 03/16 14:45, Paolo Bonzini wrote:
> 
> 
> On 16/03/2016 14:38, Christian Borntraeger wrote:
> > > If you just remove the calls to virtio_queue_host_notifier_read, here
> > > and in virtio_queue_aio_set_host_notifier_fd_handler, does it work
> > > (keeping patches 2-4 in)?
> > 
> > With these changes and patch 2-4 it does no longer locks up. 
> > I keep it running some hour to check if a crash happens.
> > 
> > Tu Bo, your setup is currently better suited for reproducing. Can you also check?
> 
> Great, I'll prepare a patch to virtio then sketching the solution that
> Conny agreed with.
> 
> While Fam and I agreed that patch 1 is not required, I'm not sure if the
> mutex is necessary in the end.

If we can fix this from the virtio_queue_host_notifier_read side, the mutex/BH
are not necessary; but OTOH the mutex does catch such bugs, so maybe it's good
to have it. I'm not sure about the BH.

And in hindsight I realize we don't want patches 2-3 either. Actually the
begin/end pair won't work as expected because of the blk_set_aio_context.

Let's hold off on this series.

> 
> So if Tu Bo can check without the virtio_queue_host_notifier_read calls,
> and both with/without Fam's patches, it would be great.

Tu Bo, only with/without patch 4, if you want to check. Sorry for the noise.

Thanks,
Fam

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-17  0:39               ` Fam Zheng
@ 2016-03-17 11:03                 ` tu bo
  2016-03-21 10:57                   ` Fam Zheng
  0 siblings, 1 reply; 70+ messages in thread
From: tu bo @ 2016-03-17 11:03 UTC (permalink / raw)
  To: Fam Zheng, Paolo Bonzini
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, qemu-devel,
	Christian Borntraeger, Stefan Hajnoczi, cornelia.huck


On 03/17/2016 08:39 AM, Fam Zheng wrote:
> On Wed, 03/16 14:45, Paolo Bonzini wrote:
>>
>>
>> On 16/03/2016 14:38, Christian Borntraeger wrote:
>>>> If you just remove the calls to virtio_queue_host_notifier_read, here
>>>> and in virtio_queue_aio_set_host_notifier_fd_handler, does it work
>>>> (keeping patches 2-4 in)?
>>>
>>> With these changes and patch 2-4 it does no longer locks up.
>>> I keep it running some hour to check if a crash happens.
>>>
>>> Tu Bo, your setup is currently better suited for reproducing. Can you also check?
>>
>> Great, I'll prepare a patch to virtio then sketching the solution that
>> Conny agreed with.
>>
>> While Fam and I agreed that patch 1 is not required, I'm not sure if the
>> mutex is necessary in the end.
>
> If we can fix this from the virtio_queue_host_notifier_read side, the mutex/BH
> are not necessary; but OTOH the mutex does catch such bugs, so maybe it's good
> to have it. I'm not sure about the BH.
>
> And on a hindsight I realize we don't want patches 2-3 too. Actually the
> begin/end pair won't work as expected because of the blk_set_aio_context.
>
> Let's hold on this series.
>
>>
>> So if Tu Bo can check without the virtio_queue_host_notifier_read calls,
>> and both with/without Fam's patches, it would be great.
>
> Tu Bo, only with/withoug patch 4, if you want to check. Sorry for the noise.
>
1. without the virtio_queue_host_notifier_read calls, without patch 4

the crash happens very often:

(gdb) bt
#0  bdrv_co_do_rw (opaque=0x0) at block/io.c:2172
#1  0x000002aa165da37e in coroutine_trampoline (i0=<optimized out>, 
i1=1812051552) at util/coroutine-ucontext.c:79
#2  0x000003ff7dd5150a in __makecontext_ret () from /lib64/libc.so.6


2. without the virtio_queue_host_notifier_read calls, with patch 4

the crash happens very often:

(gdb) bt
#0  bdrv_co_do_rw (opaque=0x0) at block/io.c:2172
#1  0x000002aa39dda43e in coroutine_trampoline (i0=<optimized out>, 
i1=-1677715600) at util/coroutine-ucontext.c:79
#2  0x000003ffab6d150a in __makecontext_ret () from /lib64/libc.so.6


> Thanks,
> Fam
>

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-16 13:38           ` Christian Borntraeger
  2016-03-16 13:45             ` Paolo Bonzini
@ 2016-03-17 12:22             ` tu bo
  2016-03-17 12:39               ` Christian Borntraeger
  1 sibling, 1 reply; 70+ messages in thread
From: tu bo @ 2016-03-17 12:22 UTC (permalink / raw)
  To: Christian Borntraeger, Paolo Bonzini, Fam Zheng, qemu-devel
  Cc: Kevin Wolf, cornelia.huck, Stefan Hajnoczi, qemu-block,
	Michael S. Tsirkin


On 03/16/2016 09:38 PM, Christian Borntraeger wrote:
> On 03/16/2016 01:55 PM, Paolo Bonzini wrote:
>>
>>
>> On 16/03/2016 12:24, Christian Borntraeger wrote:
>>> On 03/16/2016 12:09 PM, Paolo Bonzini wrote:
>>>> On 16/03/2016 11:49, Christian Borntraeger wrote:
>>>>> #3  0x00000000800b713e in virtio_blk_data_plane_start (s=0xba232d80) at /home/cborntra/REPOS/qemu/hw/block/dataplane/virtio-blk.c:224
>>>>> #4  0x00000000800b4ea0 in virtio_blk_handle_output (vdev=0xb9eee7e8, vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/block/virtio-blk.c:590
>>>>> #5  0x00000000800ef3dc in virtio_queue_notify_vq (vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1095
>>>>> #6  0x00000000800f1c9c in virtio_queue_host_notifier_read (n=0xba3052c8) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1785
>>
>> If you just remove the calls to virtio_queue_host_notifier_read, here
>> and in virtio_queue_aio_set_host_notifier_fd_handler, does it work
>> (keeping patches 2-4 in)?
>
> With these changes and patch 2-4 it does no longer locks up.
> I keep it running some hour to check if a crash happens.
>
> Tu Bo, your setup is currently better suited for reproducing. Can you also check?

With the calls to virtio_queue_host_notifier_read removed, and patches
2-4 kept in,

I got the same crash as before:
(gdb) bt
#0  bdrv_co_do_rw (opaque=0x0) at block/io.c:2172
#1  0x000002aa0c65d786 in coroutine_trampoline (i0=<optimized out>, 
i1=-2013204784) at util/coroutine-ucontext.c:79
#2  0x000003ff99ad150a in __makecontext_ret () from /lib64/libc.so.6


>
>>
>> Paolo
>>
>>>>> #7  0x00000000800f1e14 in virtio_queue_set_host_notifier_fd_handler (vq=0xba305270, assign=false, set_handler=false) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1817
>>>>> #8  0x0000000080109c50 in virtio_ccw_set_guest2host_notifier (dev=0xb9eed6a0, n=0, assign=false, set_handler=false) at /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:97
>>>>> #9  0x0000000080109ef2 in virtio_ccw_stop_ioeventfd (dev=0xb9eed6a0) at /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:154
>>
>

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-17 12:22             ` tu bo
@ 2016-03-17 12:39               ` Christian Borntraeger
  2016-03-17 13:02                 ` Cornelia Huck
  2016-03-17 15:02                 ` Paolo Bonzini
  0 siblings, 2 replies; 70+ messages in thread
From: Christian Borntraeger @ 2016-03-17 12:39 UTC (permalink / raw)
  To: tu bo, Paolo Bonzini, Fam Zheng, qemu-devel
  Cc: Kevin Wolf, cornelia.huck, Stefan Hajnoczi, qemu-block,
	Michael S. Tsirkin

On 03/17/2016 01:22 PM, tu bo wrote:
> 
> On 03/16/2016 09:38 PM, Christian Borntraeger wrote:
>> On 03/16/2016 01:55 PM, Paolo Bonzini wrote:
>>>
>>>
>>> On 16/03/2016 12:24, Christian Borntraeger wrote:
>>>> On 03/16/2016 12:09 PM, Paolo Bonzini wrote:
>>>>> On 16/03/2016 11:49, Christian Borntraeger wrote:
>>>>>> #3  0x00000000800b713e in virtio_blk_data_plane_start (s=0xba232d80) at /home/cborntra/REPOS/qemu/hw/block/dataplane/virtio-blk.c:224
>>>>>> #4  0x00000000800b4ea0 in virtio_blk_handle_output (vdev=0xb9eee7e8, vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/block/virtio-blk.c:590
>>>>>> #5  0x00000000800ef3dc in virtio_queue_notify_vq (vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1095
>>>>>> #6  0x00000000800f1c9c in virtio_queue_host_notifier_read (n=0xba3052c8) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1785
>>>
>>> If you just remove the calls to virtio_queue_host_notifier_read, here
>>> and in virtio_queue_aio_set_host_notifier_fd_handler, does it work
>>> (keeping patches 2-4 in)?
>>
>> With these changes and patch 2-4 it does no longer locks up.
>> I keep it running some hour to check if a crash happens.
>>
>> Tu Bo, your setup is currently better suited for reproducing. Can you also check?
> 
> remove the calls to virtio_queue_host_notifier_read, and keeping patches 2-4 in,
> 
> I got same crash as before,
> (gdb) bt
> #0  bdrv_co_do_rw (opaque=0x0) at block/io.c:2172
> #1  0x000002aa0c65d786 in coroutine_trampoline (i0=<optimized out>, i1=-2013204784) at util/coroutine-ucontext.c:79
> #2  0x000003ff99ad150a in __makecontext_ret () from /lib64/libc.so.6
> 

As an interesting side note, I updated my system from F20 to F23 some days ago
(after the initial report), while Tu Bo is still on an F20 system. I was not able
to reproduce the original crash on F23, but going back to F20 made this
problem re-appear.
 
  Stack trace of thread 26429:
                #0  0x00000000802008aa tracked_request_begin (qemu-system-s390x)
                #1  0x0000000080203f3c bdrv_co_do_preadv (qemu-system-s390x)
                #2  0x000000008020567c bdrv_co_do_readv (qemu-system-s390x)
                #3  0x000000008025d0f4 coroutine_trampoline (qemu-system-s390x)
                #4  0x000003ff943d150a __makecontext_ret (libc.so.6)

This is with patches 2-4 plus the removal of virtio_queue_host_notifier_read.

Without removing virtio_queue_host_notifier_read, I get the same mutex lockup (as expected).

Maybe we have two independent issues here and this is some old bug in glibc or
whatever?

> 
>>
>>>
>>> Paolo
>>>
>>>>>> #7  0x00000000800f1e14 in virtio_queue_set_host_notifier_fd_handler (vq=0xba305270, assign=false, set_handler=false) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1817
>>>>>> #8  0x0000000080109c50 in virtio_ccw_set_guest2host_notifier (dev=0xb9eed6a0, n=0, assign=false, set_handler=false) at /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:97
>>>>>> #9  0x0000000080109ef2 in virtio_ccw_stop_ioeventfd (dev=0xb9eed6a0) at /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:154
>>>
>>

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-17 12:39               ` Christian Borntraeger
@ 2016-03-17 13:02                 ` Cornelia Huck
  2016-03-17 15:02                 ` Paolo Bonzini
  1 sibling, 0 replies; 70+ messages in thread
From: Cornelia Huck @ 2016-03-17 13:02 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Michael S. Tsirkin,
	qemu-devel, tu bo, Stefan Hajnoczi, Paolo Bonzini

On Thu, 17 Mar 2016 13:39:18 +0100
Christian Borntraeger <borntraeger@de.ibm.com> wrote:

> On 03/17/2016 01:22 PM, tu bo wrote:
> > 
> > On 03/16/2016 09:38 PM, Christian Borntraeger wrote:
> >> On 03/16/2016 01:55 PM, Paolo Bonzini wrote:
> >>>
> >>>
> >>> On 16/03/2016 12:24, Christian Borntraeger wrote:
> >>>> On 03/16/2016 12:09 PM, Paolo Bonzini wrote:
> >>>>> On 16/03/2016 11:49, Christian Borntraeger wrote:
> >>>>>> #3  0x00000000800b713e in virtio_blk_data_plane_start (s=0xba232d80) at /home/cborntra/REPOS/qemu/hw/block/dataplane/virtio-blk.c:224
> >>>>>> #4  0x00000000800b4ea0 in virtio_blk_handle_output (vdev=0xb9eee7e8, vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/block/virtio-blk.c:590
> >>>>>> #5  0x00000000800ef3dc in virtio_queue_notify_vq (vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1095
> >>>>>> #6  0x00000000800f1c9c in virtio_queue_host_notifier_read (n=0xba3052c8) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1785
> >>>
> >>> If you just remove the calls to virtio_queue_host_notifier_read, here
> >>> and in virtio_queue_aio_set_host_notifier_fd_handler, does it work
> >>> (keeping patches 2-4 in)?
> >>
> >> With these changes and patch 2-4 it does no longer locks up.
> >> I keep it running some hour to check if a crash happens.
> >>
> >> Tu Bo, your setup is currently better suited for reproducing. Can you also check?
> > 
> > remove the calls to virtio_queue_host_notifier_read, and keeping patches 2-4 in,
> > 
> > I got same crash as before,
> > (gdb) bt
> > #0  bdrv_co_do_rw (opaque=0x0) at block/io.c:2172
> > #1  0x000002aa0c65d786 in coroutine_trampoline (i0=<optimized out>, i1=-2013204784) at util/coroutine-ucontext.c:79
> > #2  0x000003ff99ad150a in __makecontext_ret () from /lib64/libc.so.6
> > 
> 
> As an interesting side note, I updated my system from F20 to F23 some days ago
> (after the initial report). While To Bo is still on a F20 system. I was not able
> to reproduce the original crash on f23. but going back to F20 made this
> problem re-appear.
> 
>   Stack trace of thread 26429:
>                 #0  0x00000000802008aa tracked_request_begin (qemu-system-s390x)
>                 #1  0x0000000080203f3c bdrv_co_do_preadv (qemu-system-s390x)
>                 #2  0x000000008020567c bdrv_co_do_readv (qemu-system-s390x)
>                 #3  0x000000008025d0f4 coroutine_trampoline (qemu-system-s390x)
>                 #4  0x000003ff943d150a __makecontext_ret (libc.so.6)
> 
> this is with patch 2-4 plus the removal of virtio_queue_host_notifier_read.
> 
> Without removing virtio_queue_host_notifier_read, I get the same mutex lockup (as expected).
> 
> Maybe we have two independent issues here and this is some old bug in glibc or
> whatever?

Doesn't sound unlikely.

But the notifier_read removal makes sense. Fix this now and continue
searching for the root cause of the F20 breakage? We should try to
understand this even if it has been fixed in the meantime.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH 4/4] virtio-blk: Clean up start/stop with mutex and BH
  2016-03-16 10:10 ` [Qemu-devel] [PATCH 4/4] virtio-blk: Clean up start/stop with mutex and BH Fam Zheng
@ 2016-03-17 15:00   ` Stefan Hajnoczi
  2016-03-17 15:07     ` Paolo Bonzini
  0 siblings, 1 reply; 70+ messages in thread
From: Stefan Hajnoczi @ 2016-03-17 15:00 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, qemu-devel, tubo,
	Stefan Hajnoczi, cornelia.huck, pbonzini, borntraeger


On Wed, Mar 16, 2016 at 06:10:18PM +0800, Fam Zheng wrote:
> +    data = g_new(VirtIOBlockStartData, 1);
> +    data->vblk = vblk;
> +    data->bh = aio_bh_new(s->ctx, virtio_blk_data_plane_start_bh_cb, data);
> +    qemu_bh_schedule(data->bh);
> +    qemu_mutex_unlock(&s->start_stop_lock);
>      return;

This BH usage pattern is dangerous:

1. The BH is created and scheduled.
2. Before the BH executes the device is unrealized.
3. The data->bh pointer is inaccessible so we have a dangling BH that
   will access vblk after vblk has been freed.

In some cases it can be safe but I don't see why the pattern is safe in
this case.  Either the BH needs to hold some sort of reference to keep
vblk alive, or vblk needs to know about pending BHs so they can be
deleted.
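
A sketch of the first option, reusing the names from patch 4 (the
ref/unref placement is only an illustration, not something the series
does):

    static void virtio_blk_data_plane_start_bh_cb(void *opaque)
    {
        VirtIOBlockStartData *data = opaque;

        /* ... the actual dataplane start work goes here ... */

        qemu_bh_delete(data->bh);
        object_unref(OBJECT(data->vblk));  /* drop the ref taken at schedule time */
        g_free(data);
    }

    ...
    data = g_new(VirtIOBlockStartData, 1);
    data->vblk = vblk;
    object_ref(OBJECT(data->vblk));        /* keep vblk alive until the BH runs */
    data->bh = aio_bh_new(s->ctx, virtio_blk_data_plane_start_bh_cb, data);
    qemu_bh_schedule(data->bh);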


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-17 12:39               ` Christian Borntraeger
  2016-03-17 13:02                 ` Cornelia Huck
@ 2016-03-17 15:02                 ` Paolo Bonzini
  2016-03-17 15:07                   ` Christian Borntraeger
  2016-03-17 15:15                   ` Christian Borntraeger
  1 sibling, 2 replies; 70+ messages in thread
From: Paolo Bonzini @ 2016-03-17 15:02 UTC (permalink / raw)
  To: Christian Borntraeger, tu bo, Fam Zheng, qemu-devel
  Cc: Kevin Wolf, cornelia.huck, Stefan Hajnoczi, qemu-block,
	Michael S. Tsirkin



On 17/03/2016 13:39, Christian Borntraeger wrote:
> As an interesting side note, I updated my system from F20 to F23 some days ago
> (after the initial report). While To Bo is still on a F20 system. I was not able
> to reproduce the original crash on f23. but going back to F20 made this
> problem re-appear.
>  
>   Stack trace of thread 26429:
>                 #0  0x00000000802008aa tracked_request_begin (qemu-system-s390x)
>                 #1  0x0000000080203f3c bdrv_co_do_preadv (qemu-system-s390x)
>                 #2  0x000000008020567c bdrv_co_do_readv (qemu-system-s390x)
>                 #3  0x000000008025d0f4 coroutine_trampoline (qemu-system-s390x)
>                 #4  0x000003ff943d150a __makecontext_ret (libc.so.6)
> 
> this is with patch 2-4 plus the removal of virtio_queue_host_notifier_read.
> 
> Without removing virtio_queue_host_notifier_read, I get the same mutex lockup (as expected).
> 
> Maybe we have two independent issues here and this is some old bug in glibc or
> whatever?

I'm happy to try and reproduce on x86 if you give me some instructions
(RHEL7 should be close enough to Fedora 20).

Can you add an assert in virtio_blk_handle_output to catch reentrancy, like

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index a7ec572..96ea896 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -591,6 +591,8 @@ static void virtio_blk_handle_output(VirtIODevice
*vdev, VirtQueue *vq)
         return;
     }

+    int x = atomic_fetch_inc(&s->test);
+    assert(x == 0);
     blk_io_plug(s->blk);

     while ((req = virtio_blk_get_request(s))) {
@@ -602,6 +604,7 @@ static void virtio_blk_handle_output(VirtIODevice
*vdev, VirtQueue *vq)
     }

     blk_io_unplug(s->blk);
+    atomic_dec(&s->test);
 }

 static void virtio_blk_dma_restart_bh(void *opaque)
diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index ae84d92..6472503 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -48,6 +48,7 @@ typedef struct VirtIOBlock {
     BlockBackend *blk;
     VirtQueue *vq;
     void *rq;
+    int test;
     QEMUBH *bh;
     VirtIOBlkConf conf;
     unsigned short sector_mask;

?

Paolo

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-17 15:02                 ` Paolo Bonzini
@ 2016-03-17 15:07                   ` Christian Borntraeger
  2016-03-17 15:15                   ` Christian Borntraeger
  1 sibling, 0 replies; 70+ messages in thread
From: Christian Borntraeger @ 2016-03-17 15:07 UTC (permalink / raw)
  To: Paolo Bonzini, tu bo, Fam Zheng, qemu-devel
  Cc: Kevin Wolf, cornelia.huck, Stefan Hajnoczi, qemu-block,
	Michael S. Tsirkin

On 03/17/2016 04:02 PM, Paolo Bonzini wrote:
> 
> 
> On 17/03/2016 13:39, Christian Borntraeger wrote:
>> As an interesting side note, I updated my system from F20 to F23 some days ago
>> (after the initial report). While To Bo is still on a F20 system. I was not able
>> to reproduce the original crash on f23. but going back to F20 made this
>> problem re-appear.
>>  
>>   Stack trace of thread 26429:
>>                 #0  0x00000000802008aa tracked_request_begin (qemu-system-s390x)
>>                 #1  0x0000000080203f3c bdrv_co_do_preadv (qemu-system-s390x)
>>                 #2  0x000000008020567c bdrv_co_do_readv (qemu-system-s390x)
>>                 #3  0x000000008025d0f4 coroutine_trampoline (qemu-system-s390x)
>>                 #4  0x000003ff943d150a __makecontext_ret (libc.so.6)
>>
>> this is with patch 2-4 plus the removal of virtio_queue_host_notifier_read.
>>
>> Without removing virtio_queue_host_notifier_read, I get the same mutex lockup (as expected).
>>
>> Maybe we have two independent issues here and this is some old bug in glibc or
>> whatever?
> 
> I'm happy to try and reproduce on x86 if you give me some instruction
> (RHEL7 should be close enough to Fedora 20).

Tu Bo has a standard guest that he starts multiple times. I can trigger some
issues by starting 20 guests that only have a kernel (with virtio-blk and the bus
driver compiled in) and a busybox ramdisk that simply calls reboot. Sooner or
later a QEMU crashes.
These guests have several virtio devices attached as well (partition detection
will do minimal reads), along the lines of:

  <qemu:commandline>
    <qemu:arg value='-drive'/>
    <qemu:arg value='driver=null-co,id=null1,if=none,size=100G'/>
    <qemu:arg value='-device'/>
    <qemu:arg value='virtio-blk-ccw,scsi=off,drive=null1,serial=null1,iothread=iothread1'/>
    <qemu:arg value='-drive'/>
    <qemu:arg value='driver=null-co,id=null2,if=none,size=100G'/>
    <qemu:arg value='-device'/>
    <qemu:arg value='virtio-blk-ccw,scsi=off,drive=null2,serial=null2,iothread=iothread1'/>
    <qemu:arg value='-drive'/>
    <qemu:arg value='driver=null-co,id=null3,if=none,size=100G'/>
    <qemu:arg value='-device'/>
    <qemu:arg value='virtio-blk-ccw,scsi=off,drive=null3,serial=null3,iothread=iothread1'/>
    <qemu:arg value='-drive'/>
    <qemu:arg value='driver=null-co,id=null4,if=none,size=100G'/>
    <qemu:arg value='-device'/>
    <qemu:arg value='virtio-blk-ccw,scsi=off,drive=null4,serial=null4,iothread=iothread1'/>

    <qemu:arg value='-drive'/>
    <qemu:arg value='driver=null-co,id=null5,if=none,size=100G'/>
    <qemu:arg value='-device'/>
    <qemu:arg value='virtio-blk-ccw,scsi=off,drive=null5,serial=null5,iothread=iothread1'/>
    <qemu:arg value='-drive'/>
    <qemu:arg value='driver=null-co,id=null6,if=none,size=100G'/>
    <qemu:arg value='-device'/>
    <qemu:arg value='virtio-blk-ccw,scsi=off,drive=null6,serial=null6,iothread=iothread1'/>
    <qemu:arg value='-drive'/>
    <qemu:arg value='driver=null-co,id=null7,if=none,size=100G'/>
    <qemu:arg value='-device'/>
    <qemu:arg value='virtio-blk-ccw,scsi=off,drive=null7,serial=null7,iothread=iothread1'/>
    <qemu:arg value='-drive'/>
    <qemu:arg value='driver=null-co,id=null8,if=none,size=100G'/>
    <qemu:arg value='-device'/>
    <qemu:arg value='virtio-blk-ccw,scsi=off,drive=null8,serial=null8,iothread=iothread1'/>


  </qemu:commandline>





> 
> Can you add an assert in virtio_blk_handle_output to catch reentrancy, like

Will do.


> 
> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index a7ec572..96ea896 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -591,6 +591,8 @@ static void virtio_blk_handle_output(VirtIODevice
> *vdev, VirtQueue *vq)
>          return;
>      }
> 
> +    int x = atomic_fetch_inc(&s->test);
> +    assert(x == 0);
>      blk_io_plug(s->blk);
> 
>      while ((req = virtio_blk_get_request(s))) {
> @@ -602,6 +604,7 @@ static void virtio_blk_handle_output(VirtIODevice
> *vdev, VirtQueue *vq)
>      }
> 
>      blk_io_unplug(s->blk);
> +    atomic_dec(&s->test);
>  }
> 
>  static void virtio_blk_dma_restart_bh(void *opaque)
> diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
> index ae84d92..6472503 100644
> --- a/include/hw/virtio/virtio-blk.h
> +++ b/include/hw/virtio/virtio-blk.h
> @@ -48,6 +48,7 @@ typedef struct VirtIOBlock {
>      BlockBackend *blk;
>      VirtQueue *vq;
>      void *rq;
> +    int test;
>      QEMUBH *bh;
>      VirtIOBlkConf conf;
>      unsigned short sector_mask;
> 
> ?
> 
> Paolo
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH 4/4] virtio-blk: Clean up start/stop with mutex and BH
  2016-03-17 15:00   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
@ 2016-03-17 15:07     ` Paolo Bonzini
  2016-03-22 12:52       ` Fam Zheng
  0 siblings, 1 reply; 70+ messages in thread
From: Paolo Bonzini @ 2016-03-17 15:07 UTC (permalink / raw)
  To: Stefan Hajnoczi, Fam Zheng
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, qemu-devel, tubo,
	Stefan Hajnoczi, cornelia.huck, borntraeger



On 17/03/2016 16:00, Stefan Hajnoczi wrote:
>> > +    data = g_new(VirtIOBlockStartData, 1);
>> > +    data->vblk = vblk;
>> > +    data->bh = aio_bh_new(s->ctx, virtio_blk_data_plane_start_bh_cb, data);
>> > +    qemu_bh_schedule(data->bh);
>> > +    qemu_mutex_unlock(&s->start_stop_lock);
>> >      return;
> This BH usage pattern is dangerous:
> 
> 1. The BH is created and scheduled.
> 2. Before the BH executes the device is unrealized.
> 3. The data->bh pointer is inaccessible so we have a dangling BH that
>    will access vblk after vblk has been freed.
> 
> In some cases it can be safe but I don't see why the pattern is safe in
> this case.  Either the BH needs to hold some sort of reference to keep
> vblk alive, or vblk needs to know about pending BHs so they can be
> deleted.

You're right.  After unrealizing, virtio_blk_data_plane_stop has set
vblk->dataplane_started = false, so that's covered.  However, you still
need an object_ref/object_unref pair.

That said, Christian wasn't testing hotplug/hot-unplug so this shouldn't
matter in his case.  Let's see if we can catch the reentrancy with an
assertion...

Paolo

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-17 15:02                 ` Paolo Bonzini
  2016-03-17 15:07                   ` Christian Borntraeger
@ 2016-03-17 15:15                   ` Christian Borntraeger
  2016-03-17 15:16                     ` Christian Borntraeger
  1 sibling, 1 reply; 70+ messages in thread
From: Christian Borntraeger @ 2016-03-17 15:15 UTC (permalink / raw)
  To: Paolo Bonzini, tu bo, Fam Zheng, qemu-devel
  Cc: Kevin Wolf, cornelia.huck, Stefan Hajnoczi, qemu-block,
	Michael S. Tsirkin

On 03/17/2016 04:02 PM, Paolo Bonzini wrote:
> 
> 
> On 17/03/2016 13:39, Christian Borntraeger wrote:
>> As an interesting side note, I updated my system from F20 to F23 some days ago
>> (after the initial report). While To Bo is still on a F20 system. I was not able
>> to reproduce the original crash on f23. but going back to F20 made this
>> problem re-appear.
>>  
>>   Stack trace of thread 26429:
>>                 #0  0x00000000802008aa tracked_request_begin (qemu-system-s390x)
>>                 #1  0x0000000080203f3c bdrv_co_do_preadv (qemu-system-s390x)
>>                 #2  0x000000008020567c bdrv_co_do_readv (qemu-system-s390x)
>>                 #3  0x000000008025d0f4 coroutine_trampoline (qemu-system-s390x)
>>                 #4  0x000003ff943d150a __makecontext_ret (libc.so.6)
>>
>> this is with patch 2-4 plus the removal of virtio_queue_host_notifier_read.
>>
>> Without removing virtio_queue_host_notifier_read, I get the same mutex lockup (as expected).
>>
>> Maybe we have two independent issues here and this is some old bug in glibc or
>> whatever?
> 
> I'm happy to try and reproduce on x86 if you give me some instruction
> (RHEL7 should be close enough to Fedora 20).
> 
> Can you add an assert in virtio_blk_handle_output to catch reentrancy, like

that was quick (let me know if I should recompile with debugging)

(gdb) thread apply all bt

Thread 5 (Thread 0x3ff7b8ff910 (LWP 236419)):
#0  0x000003ff7cdfcf56 in syscall () from /lib64/libc.so.6
#1  0x000000001022452e in futex_wait (val=<optimized out>, ev=<optimized out>) at /home/cborntra/REPOS/qemu/util/qemu-thread-posix.c:292
#2  qemu_event_wait (ev=ev@entry=0x1082b5c4 <rcu_call_ready_event>) at /home/cborntra/REPOS/qemu/util/qemu-thread-posix.c:399
#3  0x000000001023353a in call_rcu_thread (opaque=<optimized out>) at /home/cborntra/REPOS/qemu/util/rcu.c:250
#4  0x000003ff7cf084c6 in start_thread () from /lib64/libpthread.so.0
#5  0x000003ff7ce02ec2 in thread_start () from /lib64/libc.so.6

Thread 4 (Thread 0x3ff78eca910 (LWP 236426)):
#0  0x000003ff7cdf819a in ioctl () from /lib64/libc.so.6
#1  0x000000001005ddf8 in kvm_vcpu_ioctl (cpu=cpu@entry=0x10c27d40, type=type@entry=44672) at /home/cborntra/REPOS/qemu/kvm-all.c:1984
#2  0x000000001005df1c in kvm_cpu_exec (cpu=cpu@entry=0x10c27d40) at /home/cborntra/REPOS/qemu/kvm-all.c:1834
#3  0x000000001004b1be in qemu_kvm_cpu_thread_fn (arg=0x10c27d40) at /home/cborntra/REPOS/qemu/cpus.c:1050
#4  0x000003ff7cf084c6 in start_thread () from /lib64/libpthread.so.0
#5  0x000003ff7ce02ec2 in thread_start () from /lib64/libc.so.6

Thread 3 (Thread 0x3ff7e8dcbb0 (LWP 236395)):
#0  0x000003ff7cdf66e6 in ppoll () from /lib64/libc.so.6
#1  0x00000000101a5e08 in ppoll (__ss=0x0, __timeout=0x3ffd6afe8a0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
#2  qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=timeout@entry=1034000000) at /home/cborntra/REPOS/qemu/qemu-timer.c:325
#3  0x00000000101a56f2 in os_host_main_loop_wait (timeout=1034000000) at /home/cborntra/REPOS/qemu/main-loop.c:251
#4  main_loop_wait (nonblocking=<optimized out>) at /home/cborntra/REPOS/qemu/main-loop.c:505
#5  0x00000000100136d6 in main_loop () at /home/cborntra/REPOS/qemu/vl.c:1933
#6  main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at /home/cborntra/REPOS/qemu/vl.c:4656

Thread 2 (Thread 0x3ff7b0ff910 (LWP 236421)):
#0  0x000003ff7cdf66e6 in ppoll () from /lib64/libc.so.6
#1  0x00000000101a5e28 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
#2  qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=timeout@entry=-1) at /home/cborntra/REPOS/qemu/qemu-timer.c:313
#3  0x00000000101a727c in aio_poll (ctx=0x10880560, blocking=<optimized out>) at /home/cborntra/REPOS/qemu/aio-posix.c:453
#4  0x00000000100d39f0 in iothread_run (opaque=0x10880020) at /home/cborntra/REPOS/qemu/iothread.c:46
#5  0x000003ff7cf084c6 in start_thread () from /lib64/libpthread.so.0
#6  0x000003ff7ce02ec2 in thread_start () from /lib64/libc.so.6

Thread 1 (Thread 0x3ff57fff910 (LWP 236427)):
#0  0x000003ff7cd3b650 in raise () from /lib64/libc.so.6
#1  0x000003ff7cd3ced8 in abort () from /lib64/libc.so.6
#2  0x000003ff7cd33666 in __assert_fail_base () from /lib64/libc.so.6
#3  0x000003ff7cd336f4 in __assert_fail () from /lib64/libc.so.6
#4  0x000000001007a3c4 in virtio_blk_handle_output (vdev=<optimized out>, vq=<optimized out>) at /home/cborntra/REPOS/qemu/hw/block/virtio-blk.c:595
#5  0x000000001009390e in virtio_queue_notify_vq (vq=0x10d77c70) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1095
#6  0x0000000010095894 in virtio_queue_notify_vq (vq=<optimized out>) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1091
#7  virtio_queue_notify (vdev=<optimized out>, n=n@entry=0) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1101
#8  0x00000000100a17c8 in virtio_ccw_hcall_notify (args=<optimized out>) at /home/cborntra/REPOS/qemu/hw/s390x/s390-virtio-ccw.c:66
#9  0x000000001009c210 in s390_virtio_hypercall (env=env@entry=0x10c75aa0) at /home/cborntra/REPOS/qemu/hw/s390x/s390-virtio-hcall.c:35
#10 0x00000000100cb4e8 in handle_hypercall (run=<optimized out>, cpu=0x10c6d7d0) at /home/cborntra/REPOS/qemu/target-s390x/kvm.c:1283
#11 handle_diag (ipb=<optimized out>, run=0x3ff78680000, cpu=0x10c6d7d0) at /home/cborntra/REPOS/qemu/target-s390x/kvm.c:1352
#12 handle_instruction (run=0x3ff78680000, cpu=0x10c6d7d0) at /home/cborntra/REPOS/qemu/target-s390x/kvm.c:1799
#13 handle_intercept (cpu=0x10c6d7d0) at /home/cborntra/REPOS/qemu/target-s390x/kvm.c:1842
#14 kvm_arch_handle_exit (cs=cs@entry=0x10c6d7d0, run=run@entry=0x3ff78680000) at /home/cborntra/REPOS/qemu/target-s390x/kvm.c:2028
#15 0x000000001005df70 in kvm_cpu_exec (cpu=cpu@entry=0x10c6d7d0) at /home/cborntra/REPOS/qemu/kvm-all.c:1921
#16 0x000000001004b1be in qemu_kvm_cpu_thread_fn (arg=0x10c6d7d0) at /home/cborntra/REPOS/qemu/cpus.c:1050
#17 0x000003ff7cf084c6 in start_thread () from /lib64/libpthread.so.0
#18 0x000003ff7ce02ec2 in thread_start () from /lib64/libc.so.6

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-17 15:15                   ` Christian Borntraeger
@ 2016-03-17 15:16                     ` Christian Borntraeger
  2016-03-17 16:08                       ` Christian Borntraeger
  0 siblings, 1 reply; 70+ messages in thread
From: Christian Borntraeger @ 2016-03-17 15:16 UTC (permalink / raw)
  To: Paolo Bonzini, tu bo, Fam Zheng, qemu-devel
  Cc: Kevin Wolf, cornelia.huck, Stefan Hajnoczi, qemu-block,
	Michael S. Tsirkin

On 03/17/2016 04:15 PM, Christian Borntraeger wrote:
> On 03/17/2016 04:02 PM, Paolo Bonzini wrote:
>>
>>
>> On 17/03/2016 13:39, Christian Borntraeger wrote:
>>> As an interesting side note, I updated my system from F20 to F23 some days ago
>>> (after the initial report). While To Bo is still on a F20 system. I was not able
>>> to reproduce the original crash on f23. but going back to F20 made this
>>> problem re-appear.
>>>  
>>>   Stack trace of thread 26429:
>>>                 #0  0x00000000802008aa tracked_request_begin (qemu-system-s390x)
>>>                 #1  0x0000000080203f3c bdrv_co_do_preadv (qemu-system-s390x)
>>>                 #2  0x000000008020567c bdrv_co_do_readv (qemu-system-s390x)
>>>                 #3  0x000000008025d0f4 coroutine_trampoline (qemu-system-s390x)
>>>                 #4  0x000003ff943d150a __makecontext_ret (libc.so.6)
>>>
>>> this is with patch 2-4 plus the removal of virtio_queue_host_notifier_read.
>>>
>>> Without removing virtio_queue_host_notifier_read, I get the same mutex lockup (as expected).
>>>
>>> Maybe we have two independent issues here and this is some old bug in glibc or
>>> whatever?
>>
>> I'm happy to try and reproduce on x86 if you give me some instruction
>> (RHEL7 should be close enough to Fedora 20).
>>
>> Can you add an assert in virtio_blk_handle_output to catch reentrancy, like
> 
> that was quick (let me know if I should recompile with debugging)
> 
> (gdb) thread apply all bt
> 
> Thread 5 (Thread 0x3ff7b8ff910 (LWP 236419)):
> #0  0x000003ff7cdfcf56 in syscall () from /lib64/libc.so.6
> #1  0x000000001022452e in futex_wait (val=<optimized out>, ev=<optimized out>) at /home/cborntra/REPOS/qemu/util/qemu-thread-posix.c:292
> #2  qemu_event_wait (ev=ev@entry=0x1082b5c4 <rcu_call_ready_event>) at /home/cborntra/REPOS/qemu/util/qemu-thread-posix.c:399
> #3  0x000000001023353a in call_rcu_thread (opaque=<optimized out>) at /home/cborntra/REPOS/qemu/util/rcu.c:250
> #4  0x000003ff7cf084c6 in start_thread () from /lib64/libpthread.so.0
> #5  0x000003ff7ce02ec2 in thread_start () from /lib64/libc.so.6
> 
> Thread 4 (Thread 0x3ff78eca910 (LWP 236426)):
> #0  0x000003ff7cdf819a in ioctl () from /lib64/libc.so.6
> #1  0x000000001005ddf8 in kvm_vcpu_ioctl (cpu=cpu@entry=0x10c27d40, type=type@entry=44672) at /home/cborntra/REPOS/qemu/kvm-all.c:1984
> #2  0x000000001005df1c in kvm_cpu_exec (cpu=cpu@entry=0x10c27d40) at /home/cborntra/REPOS/qemu/kvm-all.c:1834
> #3  0x000000001004b1be in qemu_kvm_cpu_thread_fn (arg=0x10c27d40) at /home/cborntra/REPOS/qemu/cpus.c:1050
> #4  0x000003ff7cf084c6 in start_thread () from /lib64/libpthread.so.0
> #5  0x000003ff7ce02ec2 in thread_start () from /lib64/libc.so.6
> 
> Thread 3 (Thread 0x3ff7e8dcbb0 (LWP 236395)):
> #0  0x000003ff7cdf66e6 in ppoll () from /lib64/libc.so.6
> #1  0x00000000101a5e08 in ppoll (__ss=0x0, __timeout=0x3ffd6afe8a0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
> #2  qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=timeout@entry=1034000000) at /home/cborntra/REPOS/qemu/qemu-timer.c:325
> #3  0x00000000101a56f2 in os_host_main_loop_wait (timeout=1034000000) at /home/cborntra/REPOS/qemu/main-loop.c:251
> #4  main_loop_wait (nonblocking=<optimized out>) at /home/cborntra/REPOS/qemu/main-loop.c:505
> #5  0x00000000100136d6 in main_loop () at /home/cborntra/REPOS/qemu/vl.c:1933
> #6  main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at /home/cborntra/REPOS/qemu/vl.c:4656
> 
> Thread 2 (Thread 0x3ff7b0ff910 (LWP 236421)):
> #0  0x000003ff7cdf66e6 in ppoll () from /lib64/libc.so.6
> #1  0x00000000101a5e28 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
> #2  qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=timeout@entry=-1) at /home/cborntra/REPOS/qemu/qemu-timer.c:313
> #3  0x00000000101a727c in aio_poll (ctx=0x10880560, blocking=<optimized out>) at /home/cborntra/REPOS/qemu/aio-posix.c:453
> #4  0x00000000100d39f0 in iothread_run (opaque=0x10880020) at /home/cborntra/REPOS/qemu/iothread.c:46
> #5  0x000003ff7cf084c6 in start_thread () from /lib64/libpthread.so.0
> #6  0x000003ff7ce02ec2 in thread_start () from /lib64/libc.so.6
> 
> Thread 1 (Thread 0x3ff57fff910 (LWP 236427)):
> #0  0x000003ff7cd3b650 in raise () from /lib64/libc.so.6
> #1  0x000003ff7cd3ced8 in abort () from /lib64/libc.so.6
> #2  0x000003ff7cd33666 in __assert_fail_base () from /lib64/libc.so.6
> #3  0x000003ff7cd336f4 in __assert_fail () from /lib64/libc.so.6
> #4  0x000000001007a3c4 in virtio_blk_handle_output (vdev=<optimized out>, vq=<optimized out>) at /home/cborntra/REPOS/qemu/hw/block/virtio-blk.c:595
> #5  0x000000001009390e in virtio_queue_notify_vq (vq=0x10d77c70) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1095
> #6  0x0000000010095894 in virtio_queue_notify_vq (vq=<optimized out>) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1091
> #7  virtio_queue_notify (vdev=<optimized out>, n=n@entry=0) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1101
> #8  0x00000000100a17c8 in virtio_ccw_hcall_notify (args=<optimized out>) at /home/cborntra/REPOS/qemu/hw/s390x/s390-virtio-ccw.c:66
> #9  0x000000001009c210 in s390_virtio_hypercall (env=env@entry=0x10c75aa0) at /home/cborntra/REPOS/qemu/hw/s390x/s390-virtio-hcall.c:35
> #10 0x00000000100cb4e8 in handle_hypercall (run=<optimized out>, cpu=0x10c6d7d0) at /home/cborntra/REPOS/qemu/target-s390x/kvm.c:1283
> #11 handle_diag (ipb=<optimized out>, run=0x3ff78680000, cpu=0x10c6d7d0) at /home/cborntra/REPOS/qemu/target-s390x/kvm.c:1352

FWIW, this looks like we still have a case without eventfd during reboot or startup.

> #12 handle_instruction (run=0x3ff78680000, cpu=0x10c6d7d0) at /home/cborntra/REPOS/qemu/target-s390x/kvm.c:1799
> #13 handle_intercept (cpu=0x10c6d7d0) at /home/cborntra/REPOS/qemu/target-s390x/kvm.c:1842
> #14 kvm_arch_handle_exit (cs=cs@entry=0x10c6d7d0, run=run@entry=0x3ff78680000) at /home/cborntra/REPOS/qemu/target-s390x/kvm.c:2028
> #15 0x000000001005df70 in kvm_cpu_exec (cpu=cpu@entry=0x10c6d7d0) at /home/cborntra/REPOS/qemu/kvm-all.c:1921
> #16 0x000000001004b1be in qemu_kvm_cpu_thread_fn (arg=0x10c6d7d0) at /home/cborntra/REPOS/qemu/cpus.c:1050
> #17 0x000003ff7cf084c6 in start_thread () from /lib64/libpthread.so.0
> #18 0x000003ff7ce02ec2 in thread_start () from /lib64/libc.so.6
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-17 15:16                     ` Christian Borntraeger
@ 2016-03-17 16:08                       ` Christian Borntraeger
  2016-03-18 15:03                         ` Paolo Bonzini
  0 siblings, 1 reply; 70+ messages in thread
From: Christian Borntraeger @ 2016-03-17 16:08 UTC (permalink / raw)
  To: Paolo Bonzini, tu bo, Fam Zheng, qemu-devel
  Cc: Kevin Wolf, cornelia.huck, Stefan Hajnoczi, qemu-block,
	Michael S. Tsirkin

Good (or bad?) news: the assert also triggers on F23; it just seems to take longer.

trace 1 (compiled on F20)

Thread 5 (Thread 0x3ff84b7f910 (LWP 33030)):
#0  0x000003ff86a817b8 in ppoll () at /lib64/libc.so.6
#1  0x00000000102c712e in qemu_poll_ns (fds=0x3ff80001e70, nfds=3, timeout=-1) at /home/cborntra/REPOS/qemu/qemu-timer.c:313
#2  0x00000000102c96d6 in aio_poll (ctx=0x10a7b050, blocking=true) at /home/cborntra/REPOS/qemu/aio-posix.c:453
#3  0x000000001014fb1e in iothread_run (opaque=0x10a7ab10) at /home/cborntra/REPOS/qemu/iothread.c:46
#4  0x000003ff86b87c2c in start_thread () at /lib64/libpthread.so.0
#5  0x000003ff86a8ec9a in thread_start () at /lib64/libc.so.6

Thread 4 (Thread 0x3ff8537f910 (LWP 33029)):
#0  0x000003ff86a8841e in syscall () at /lib64/libc.so.6
#1  0x000000001039cbe8 in futex_wait (ev=0x10a140ec <rcu_call_ready_event>, val=4294967295) at /home/cborntra/REPOS/qemu/util/qemu-thread-posix.c:292
#2  0x000000001039ce1e in qemu_event_wait (ev=0x10a140ec <rcu_call_ready_event>) at /home/cborntra/REPOS/qemu/util/qemu-thread-posix.c:399
#3  0x00000000103b9678 in call_rcu_thread (opaque=0x0) at /home/cborntra/REPOS/qemu/util/rcu.c:250
#4  0x000003ff86b87c2c in start_thread () at /lib64/libpthread.so.0
#5  0x000003ff86a8ec9a in thread_start () at /lib64/libc.so.6

Thread 3 (Thread 0x3ff66428910 (LWP 33041)):
#0  0x000003ff86a8334a in ioctl () at /lib64/libc.so.6
#1  0x000000001007b97a in kvm_vcpu_ioctl (cpu=0x10b15970, type=44672) at /home/cborntra/REPOS/qemu/kvm-all.c:1984
#2  0x000000001007b2f0 in kvm_cpu_exec (cpu=0x10b15970) at /home/cborntra/REPOS/qemu/kvm-all.c:1834
#3  0x000000001005bd92 in qemu_kvm_cpu_thread_fn (arg=0x10b15970) at /home/cborntra/REPOS/qemu/cpus.c:1050
#4  0x000003ff86b87c2c in start_thread () at /lib64/libpthread.so.0
#5  0x000003ff86a8ec9a in thread_start () at /lib64/libc.so.6

Thread 2 (Thread 0x3ff885dbb90 (LWP 32970)):
#0  0x000003ff86a817b8 in ppoll () at /lib64/libc.so.6
#1  0x00000000102c7244 in qemu_poll_ns (fds=0x10a77a90, nfds=5, timeout=965000000) at /home/cborntra/REPOS/qemu/qemu-timer.c:325
#2  0x00000000102c5ef2 in os_host_main_loop_wait (timeout=965000000) at /home/cborntra/REPOS/qemu/main-loop.c:251
#3  0x00000000102c6006 in main_loop_wait (nonblocking=0) at /home/cborntra/REPOS/qemu/main-loop.c:505
#4  0x00000000101667e4 in main_loop () at /home/cborntra/REPOS/qemu/vl.c:1933
#5  0x000000001016e8ea in main (argc=72, argv=0x3ffe977e978, envp=0x3ffe977ebc0) at /home/cborntra/REPOS/qemu/vl.c:4656

Thread 1 (Thread 0x3ff66c28910 (LWP 33033)):
#0  0x000003ff869be2c0 in raise () at /lib64/libc.so.6
#1  0x000003ff869bfc26 in abort () at /lib64/libc.so.6
#2  0x000003ff869b5bce in __assert_fail_base () at /lib64/libc.so.6
#3  0x000003ff869b5c5c in  () at /lib64/libc.so.6
#4  0x00000000100ab8f2 in virtio_blk_handle_output (vdev=0x10ad57e8, vq=0x10eec270) at /home/cborntra/REPOS/qemu/hw/block/virtio-blk.c:595
#5  0x00000000100e18a4 in virtio_queue_notify_vq (vq=0x10eec270) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1095
#6  0x00000000100e1906 in virtio_queue_notify (vdev=0x10ad57e8, n=0) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1101
#7  0x00000000100f921c in virtio_ccw_hcall_notify (args=0x10e2aad0) at /home/cborntra/REPOS/qemu/hw/s390x/s390-virtio-ccw.c:66
#8  0x00000000100ee518 in s390_virtio_hypercall (env=0x10e2aac0) at /home/cborntra/REPOS/qemu/hw/s390x/s390-virtio-hcall.c:35
#9  0x000000001014192e in handle_hypercall (cpu=0x10e227f0, run=0x3ff84280000) at /home/cborntra/REPOS/qemu/target-s390x/kvm.c:1283
#10 0x0000000010141c36 in handle_diag (cpu=0x10e227f0, run=0x3ff84280000, ipb=83886080) at /home/cborntra/REPOS/qemu/target-s390x/kvm.c:1352
#11 0x00000000101431b6 in handle_instruction (cpu=0x10e227f0, run=0x3ff84280000) at /home/cborntra/REPOS/qemu/target-s390x/kvm.c:1799
#12 0x00000000101433ee in handle_intercept (cpu=0x10e227f0) at /home/cborntra/REPOS/qemu/target-s390x/kvm.c:1842
#13 0x0000000010143c22 in kvm_arch_handle_exit (cs=0x10e227f0, run=0x3ff84280000) at /home/cborntra/REPOS/qemu/target-s390x/kvm.c:2028
#14 0x000000001007b5aa in kvm_cpu_exec (cpu=0x10e227f0) at /home/cborntra/REPOS/qemu/kvm-all.c:1921
#15 0x000000001005bd92 in qemu_kvm_cpu_thread_fn (arg=0x10e227f0) at /home/cborntra/REPOS/qemu/cpus.c:1050
#16 0x000003ff86b87c2c in start_thread () at /lib64/libpthread.so.0
#17 0x000003ff86a8ec9a in thread_start () at /lib64/libc.so.6


trace 2 (compiled on F23)

Thread 5 (Thread 0x3ffb897f910 (LWP 37895)):
#0  0x000003ffba00841e in syscall () at /lib64/libc.so.6
#1  0x00000000803d5306 in futex_wait (ev=0x80a4a104 <rcu_call_ready_event>, val=4294967295) at /home/cborntra/REPOS/qemu/util/qemu-thread-posix.c:292
#2  0x00000000803d5596 in qemu_event_wait (ev=0x80a4a104 <rcu_call_ready_event>) at /home/cborntra/REPOS/qemu/util/qemu-thread-posix.c:399
#3  0x00000000803f2c3c in call_rcu_thread (opaque=0x0) at /home/cborntra/REPOS/qemu/util/rcu.c:250
#4  0x000003ffba107c2c in start_thread () at /lib64/libpthread.so.0
#5  0x000003ffba00ec9a in thread_start () at /lib64/libc.so.6

Thread 4 (Thread 0x3ffb574a910 (LWP 37907)):
#0  0x000003ffba00334a in ioctl () at /lib64/libc.so.6
#1  0x0000000080081a44 in kvm_vcpu_ioctl (cpu=0x80b4b970, type=44672) at /home/cborntra/REPOS/qemu/kvm-all.c:1984
#2  0x000000008008130c in kvm_cpu_exec (cpu=0x80b4b970) at /home/cborntra/REPOS/qemu/kvm-all.c:1834
#3  0x000000008006074c in qemu_kvm_cpu_thread_fn (arg=0x80b4b970) at /home/cborntra/REPOS/qemu/cpus.c:1050
#4  0x000003ffba107c2c in start_thread () at /lib64/libpthread.so.0
#5  0x000003ffba00ec9a in thread_start () at /lib64/libc.so.6

Thread 3 (Thread 0x3ffb5f4a910 (LWP 37906)):
#0  0x000003ffb9f3f5b2 in sigtimedwait () at /lib64/libc.so.6
#1  0x000000008005fcda in qemu_kvm_eat_signals (cpu=0x80e587f0) at /home/cborntra/REPOS/qemu/cpus.c:804
#2  0x00000000800605fc in qemu_kvm_wait_io_event (cpu=0x80e587f0) at /home/cborntra/REPOS/qemu/cpus.c:1019
#3  0x000000008006077a in qemu_kvm_cpu_thread_fn (arg=0x80e587f0) at /home/cborntra/REPOS/qemu/cpus.c:1055
#4  0x000003ffba107c2c in start_thread () at /lib64/libpthread.so.0
#5  0x000003ffba00ec9a in thread_start () at /lib64/libc.so.6

Thread 2 (Thread 0x3ffbbbdbb90 (LWP 37839)):
#0  0x000003ffba110cd4 in __lll_lock_wait () at /lib64/libpthread.so.0
#1  0x000003ffba113ed4 in __lll_lock_elision () at /lib64/libpthread.so.0
#2  0x00000000803d49d6 in qemu_mutex_lock (mutex=0x8061f260 <qemu_global_mutex>) at /home/cborntra/REPOS/qemu/util/qemu-thread-posix.c:64
#3  0x0000000080060ef4 in qemu_mutex_lock_iothread () at /home/cborntra/REPOS/qemu/cpus.c:1226
#4  0x00000000802eeae4 in os_host_main_loop_wait (timeout=1291000000) at /home/cborntra/REPOS/qemu/main-loop.c:254
#5  0x00000000802eebf6 in main_loop_wait (nonblocking=0) at /home/cborntra/REPOS/qemu/main-loop.c:505
#6  0x000000008017b4bc in main_loop () at /home/cborntra/REPOS/qemu/vl.c:1933
#7  0x00000000801839aa in main (argc=72, argv=0x3ffda07e4b8, envp=0x3ffda07e700) at /home/cborntra/REPOS/qemu/vl.c:4656

Thread 1 (Thread 0x3ffb817f910 (LWP 37897)):
#0  0x000003ffb9f3e2c0 in raise () at /lib64/libc.so.6
#1  0x000003ffb9f3fc26 in abort () at /lib64/libc.so.6
#2  0x000003ffb9f35bce in __assert_fail_base () at /lib64/libc.so.6
#3  0x000003ffb9f35c5c in  () at /lib64/libc.so.6
#4  0x00000000800b4ee2 in virtio_blk_handle_output (vdev=0x80e51468, vq=0x80f3e280) at /home/cborntra/REPOS/qemu/hw/block/virtio-blk.c:595
#5  0x00000000800ef434 in virtio_queue_notify_vq (vq=0x80f3e280) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1095
#6  0x00000000800f1cf4 in virtio_queue_host_notifier_read (n=0x80f3e2d8) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1785
#7  0x00000000802f1cdc in aio_dispatch (ctx=0x80ab1050) at /home/cborntra/REPOS/qemu/aio-posix.c:327
#8  0x00000000802f2662 in aio_poll (ctx=0x80ab1050, blocking=false) at /home/cborntra/REPOS/qemu/aio-posix.c:475
#9  0x0000000080163942 in iothread_run (opaque=0x80ab0b10) at /home/cborntra/REPOS/qemu/iothread.c:46
#10 0x000003ffba107c2c in start_thread () at /lib64/libpthread.so.0
#11 0x000003ffba00ec9a in thread_start () at /lib64/libc.so.6
(gdb) 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-17 16:08                       ` Christian Borntraeger
@ 2016-03-18 15:03                         ` Paolo Bonzini
  2016-03-21  9:42                           ` Fam Zheng
  2016-03-21 13:47                           ` TU BO
  0 siblings, 2 replies; 70+ messages in thread
From: Paolo Bonzini @ 2016-03-18 15:03 UTC (permalink / raw)
  To: Christian Borntraeger, tu bo, Fam Zheng, qemu-devel
  Cc: Kevin Wolf, cornelia.huck, Stefan Hajnoczi, qemu-block,
	Michael S. Tsirkin



On 17/03/2016 17:08, Christian Borntraeger wrote:
> Good (or bad?) news is the assert also triggers on F23, it just seems
> to take longer.

I guess good news, because we can rule out the kernel (not that I
believed it was a kernel problem, but the thought is always there in
the background...).

The interaction between ioeventfd and dataplane is too complicated.  I
think if we get rid of the start/stop ioeventfd calls (just set up the
ioeventfd as soon as possible and then only set/clear the handlers)
things would be much simpler.
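
To illustrate why that is safe, here is a standalone demo of plain Linux
eventfd semantics (not QEMU code): kicks that arrive while no handler is
attached are latched and coalesced by the eventfd, so nothing is lost by
keeping it assigned the whole time and only moving the handler around.

/* build with: gcc -o eventfd-demo eventfd-demo.c */
#include <stdio.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/eventfd.h>

int main(void)
{
    int efd = eventfd(0, 0);
    uint64_t one = 1, count = 0;

    /* three "guest kicks" while nobody is listening... */
    write(efd, &one, sizeof(one));
    write(efd, &one, sizeof(one));
    write(efd, &one, sizeof(one));

    /* ...are not dropped: the handler attached later drains them all */
    read(efd, &count, sizeof(count));
    printf("pending kicks coalesced into count=%llu\n",
           (unsigned long long)count);

    close(efd);
    return 0;
}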

I'll see if I can produce something based on Conny's patches, which are
already a start.  Today I had a short day so I couldn't play with the
bug; out of curiosity, does the bug reproduce with her work + patch 4
from this series + the reentrancy assertion?

Paolo

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-18 15:03                         ` Paolo Bonzini
@ 2016-03-21  9:42                           ` Fam Zheng
  2016-03-21 11:10                             ` Christian Borntraeger
  2016-03-21 12:17                             ` Cornelia Huck
  2016-03-21 13:47                           ` TU BO
  1 sibling, 2 replies; 70+ messages in thread
From: Fam Zheng @ 2016-03-21  9:42 UTC (permalink / raw)
  To: Paolo Bonzini, Christian Borntraeger, tu bo
  Cc: cornelia.huck, Stefan Hajnoczi, qemu-devel, qemu-block,
	Michael S. Tsirkin

On Fri, 03/18 16:03, Paolo Bonzini wrote:
>
>
> On 17/03/2016 17:08, Christian Borntraeger wrote:
> > Good (or bad?) news is the assert also triggers on F23, it just seems
> > to take longer.
>
> I guess good news, because we can rule out the kernel (not that I
> believed it was a kernel problem, but the thought is always there in
> the background...).
>
> The interaction between ioeventfd and dataplane is too complicated.  I
> think if we get rid of the start/stop ioeventfd calls (just set up the
> ioeventfd as soon as possible and then only set/clear the handlers)
> things would be much simpler.
>
> I'll see if I can produce something based on Conny's patches, which are
> already a start.  Today I had a short day so I couldn't play with the
> bug; out of curiosity, does the bug reproduce with her work + patch 4
> from this series + the reentrancy assertion?

The other half of the race condition is from ioport write in the vcpu thread. I
hit this by adding an extra assert(is_in_iothread()) in
virtio_blk_handle_request(), at the same place with Paolo's atomic read of
variable "test".

I haven't tried to find where this ioport write is from, but that is indeed an
issue in virtio-pci.
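
For reference, the two assertions in the backtrace below correspond roughly to
this instrumentation (my reconstruction, not the exact diff; the "test" counter
is from Paolo's suggestion and the exact atomic helper may differ):

static int test;   /* reentrancy counter */

static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
{
    int x;

    assert(is_in_iothread());     /* line 597: catches entry from a vcpu thread */
    x = atomic_fetch_inc(&test);
    assert(x == 0);               /* line 598: catches reentrancy */

    /* ... existing request handling ... */

    atomic_fetch_dec(&test);
}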

(gdb) thread apply all bt

<...>

Thread 3 (Thread 0x7f9e8928b700 (LWP 30671)):
#0  0x00007f9e8bac65d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007f9e8bac7cc8 in __GI_abort () at abort.c:90
#2  0x00007f9e8babf546 in __assert_fail_base (fmt=0x7f9e8bc0f128 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x7f9e8e04e9d1 "is_in_iothread()",
    file=file@entry=0x7f9e8e04e8e0 "/home/fam/work/qemu/hw/block/virtio-blk.c", line=line@entry=597,
    function=function@entry=0x7f9e8e04ec30 <__PRETTY_FUNCTION__.37148> "virtio_blk_handle_output") at assert.c:92
#3  0x00007f9e8babf5f2 in __GI___assert_fail (assertion=0x7f9e8e04e9d1 "is_in_iothread()", file=0x7f9e8e04e8e0 "/home/fam/work/qemu/hw/block/virtio-blk.c", line=597,
    function=0x7f9e8e04ec30 <__PRETTY_FUNCTION__.37148> "virtio_blk_handle_output") at assert.c:101
#4  0x00007f9e8dc9f414 in virtio_blk_handle_output (vdev=0x7f9e929d7b68, vq=0x7f9e92a762f0) at /home/fam/work/qemu/hw/block/virtio-blk.c:597
#5  0x00007f9e8dcd6f53 in virtio_queue_notify_vq (vq=0x7f9e92a762f0) at /home/fam/work/qemu/hw/virtio/virtio.c:1095
#6  0x00007f9e8dcd6f91 in virtio_queue_notify (vdev=0x7f9e929d7b68, n=0) at /home/fam/work/qemu/hw/virtio/virtio.c:1101
#7  0x00007f9e8df03d2f in virtio_ioport_write (opaque=0x7f9e929cf840, addr=16, val=0) at /home/fam/work/qemu/hw/virtio/virtio-pci.c:419
#8  0x00007f9e8df041be in virtio_pci_config_write (opaque=0x7f9e929cf840, addr=16, val=0, size=2) at /home/fam/work/qemu/hw/virtio/virtio-pci.c:552
#9  0x00007f9e8dc7c8c9 in memory_region_write_accessor (mr=0x7f9e929d00c0, addr=16, value=0x7f9e8928a988, size=2, shift=0, mask=65535, attrs=...)
    at /home/fam/work/qemu/memory.c:524
#10 0x00007f9e8dc7cad4 in access_with_adjusted_size (addr=16, value=0x7f9e8928a988, size=2, access_size_min=1, access_size_max=4, access=
    0x7f9e8dc7c7e8 <memory_region_write_accessor>, mr=0x7f9e929d00c0, attrs=...) at /home/fam/work/qemu/memory.c:590
#11 0x00007f9e8dc7f71b in memory_region_dispatch_write (mr=0x7f9e929d00c0, addr=16, data=0, size=2, attrs=...) at /home/fam/work/qemu/memory.c:1272
#12 0x00007f9e8dc32815 in address_space_write_continue (as=0x7f9e8e5834a0 <address_space_io>, addr=49232, attrs=..., buf=0x7f9e8daa9000 <Address 0x7f9e8daa9000 out of bounds>,
    len=2, addr1=16, l=2, mr=0x7f9e929d00c0) at /home/fam/work/qemu/exec.c:2607
#13 0x00007f9e8dc329c1 in address_space_write (as=0x7f9e8e5834a0 <address_space_io>, addr=49232, attrs=..., buf=0x7f9e8daa9000 <Address 0x7f9e8daa9000 out of bounds>, len=2)
    at /home/fam/work/qemu/exec.c:2659
#14 0x00007f9e8dc32d78 in address_space_rw (as=0x7f9e8e5834a0 <address_space_io>, addr=49232, attrs=..., buf=0x7f9e8daa9000 <Address 0x7f9e8daa9000 out of bounds>, len=2,
    is_write=true) at /home/fam/work/qemu/exec.c:2762
#15 0x00007f9e8dc79358 in kvm_handle_io (port=49232, attrs=..., data=0x7f9e8daa9000, direction=1, size=2, count=1) at /home/fam/work/qemu/kvm-all.c:1699
#16 0x00007f9e8dc79858 in kvm_cpu_exec (cpu=0x7f9e905a5250) at /home/fam/work/qemu/kvm-all.c:1863
#17 0x00007f9e8dc619a3 in qemu_kvm_cpu_thread_fn (arg=0x7f9e905a5250) at /home/fam/work/qemu/cpus.c:1056
#18 0x00007f9e8be59df5 in start_thread (arg=0x7f9e8928b700) at pthread_create.c:308
#19 0x00007f9e8bb871ad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

<...>

Thread 1 (Thread 0x7f9e8b28f700 (LWP 30667)):
#0  0x00007f9e8bac65d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007f9e8bac7cc8 in __GI_abort () at abort.c:90
#2  0x00007f9e8babf546 in __assert_fail_base (fmt=0x7f9e8bc0f128 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x7f9e8e04e9e2 "x == 0",
    file=file@entry=0x7f9e8e04e8e0 "/home/fam/work/qemu/hw/block/virtio-blk.c", line=line@entry=598,
    function=function@entry=0x7f9e8e04ec30 <__PRETTY_FUNCTION__.37148> "virtio_blk_handle_output") at assert.c:92
#3  0x00007f9e8babf5f2 in __GI___assert_fail (assertion=0x7f9e8e04e9e2 "x == 0", file=0x7f9e8e04e8e0 "/home/fam/work/qemu/hw/block/virtio-blk.c", line=598,
    function=0x7f9e8e04ec30 <__PRETTY_FUNCTION__.37148> "virtio_blk_handle_output") at assert.c:101
#4  0x00007f9e8dc9f43c in virtio_blk_handle_output (vdev=0x7f9e929d7b68, vq=0x7f9e92a762f0) at /home/fam/work/qemu/hw/block/virtio-blk.c:598
#5  0x00007f9e8dcd6f53 in virtio_queue_notify_vq (vq=0x7f9e92a762f0) at /home/fam/work/qemu/hw/virtio/virtio.c:1095
#6  0x00007f9e8dcd8dfd in virtio_queue_host_notifier_read (n=0x7f9e92a76348) at /home/fam/work/qemu/hw/virtio/virtio.c:1785
#7  0x00007f9e8df76b40 in aio_dispatch (ctx=0x7f9e90540a50) at /home/fam/work/qemu/aio-posix.c:327
#8  0x00007f9e8df770f8 in aio_poll (ctx=0x7f9e90540a50, blocking=true) at /home/fam/work/qemu/aio-posix.c:475
#9  0x00007f9e8dd7b5c4 in iothread_run (opaque=0x7f9e905404d0) at /home/fam/work/qemu/iothread.c:55
#10 0x00007f9e8be59df5 in start_thread (arg=0x7f9e8b28f700) at pthread_create.c:308
#11 0x00007f9e8bb871ad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-17 11:03                 ` tu bo
@ 2016-03-21 10:57                   ` Fam Zheng
  2016-03-21 11:15                     ` Cornelia Huck
  2016-03-22  7:10                     ` tu bo
  0 siblings, 2 replies; 70+ messages in thread
From: Fam Zheng @ 2016-03-21 10:57 UTC (permalink / raw)
  To: tu bo
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, qemu-devel,
	Christian Borntraeger, Stefan Hajnoczi, cornelia.huck,
	Paolo Bonzini

On Thu, 03/17 19:03, tu bo wrote:
> 
> On 03/17/2016 08:39 AM, Fam Zheng wrote:
> >On Wed, 03/16 14:45, Paolo Bonzini wrote:
> >>
> >>
> >>On 16/03/2016 14:38, Christian Borntraeger wrote:
> >>>>If you just remove the calls to virtio_queue_host_notifier_read, here
> >>>>and in virtio_queue_aio_set_host_notifier_fd_handler, does it work
> >>>>(keeping patches 2-4 in)?
> >>>
> >>>With these changes and patch 2-4 it does no longer locks up.
> >>>I keep it running some hour to check if a crash happens.
> >>>
> >>>Tu Bo, your setup is currently better suited for reproducing. Can you also check?
> >>
> >>Great, I'll prepare a patch to virtio then sketching the solution that
> >>Conny agreed with.
> >>
> >>While Fam and I agreed that patch 1 is not required, I'm not sure if the
> >>mutex is necessary in the end.
> >
> >If we can fix this from the virtio_queue_host_notifier_read side, the mutex/BH
> >are not necessary; but OTOH the mutex does catch such bugs, so maybe it's good
> >to have it. I'm not sure about the BH.
> >
> >And in hindsight I realize we don't want patches 2-3 too. Actually the
> >begin/end pair won't work as expected because of the blk_set_aio_context.
> >
> >Let's hold on this series.
> >
> >>
> >>So if Tu Bo can check without the virtio_queue_host_notifier_read calls,
> >>and both with/without Fam's patches, it would be great.
> >
> >Tu Bo, only with/without patch 4, if you want to check. Sorry for the noise.
> >
> 1. without the virtio_queue_host_notifier_read calls,  without patch 4
> 
> crash happens very often,
> 
> (gdb) bt
> #0  bdrv_co_do_rw (opaque=0x0) at block/io.c:2172
> #1  0x000002aa165da37e in coroutine_trampoline (i0=<optimized out>,
> i1=1812051552) at util/coroutine-ucontext.c:79
> #2  0x000003ff7dd5150a in __makecontext_ret () from /lib64/libc.so.6
> 
> 
> 2. without the virtio_queue_host_notifier_read calls, with patch 4
> 
> crash happens very often,
> 
> (gdb) bt
> #0  bdrv_co_do_rw (opaque=0x0) at block/io.c:2172
> #1  0x000002aa39dda43e in coroutine_trampoline (i0=<optimized out>,
> i1=-1677715600) at util/coroutine-ucontext.c:79
> #2  0x000003ffab6d150a in __makecontext_ret () from /lib64/libc.so.6
> 
> 

Tu Bo,

Could you help test this patch (on top of master, without patch 4)?

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 08275a9..47f8043 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -1098,7 +1098,14 @@ void virtio_queue_notify_vq(VirtQueue *vq)
 
 void virtio_queue_notify(VirtIODevice *vdev, int n)
 {
-    virtio_queue_notify_vq(&vdev->vq[n]);
+    VirtQueue *vq = &vdev->vq[n];
+    EventNotifier *n;
+    n = virtio_queue_get_host_notifier(vq);
+    if (n) {
+        event_notifier_set(n);
+    } else {
+        virtio_queue_notify_vq(vq);
+    }
 }
 
 uint16_t virtio_queue_vector(VirtIODevice *vdev, int n)

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-21  9:42                           ` Fam Zheng
@ 2016-03-21 11:10                             ` Christian Borntraeger
  2016-03-21 12:17                             ` Cornelia Huck
  1 sibling, 0 replies; 70+ messages in thread
From: Christian Borntraeger @ 2016-03-21 11:10 UTC (permalink / raw)
  To: Fam Zheng, Paolo Bonzini, tu bo
  Cc: cornelia.huck, Stefan Hajnoczi, qemu-devel, qemu-block,
	Michael S. Tsirkin

On 03/21/2016 10:42 AM, Fam Zheng wrote:
> On Fri, 03/18 16:03, Paolo Bonzini wrote:
>>
>>
>> On 17/03/2016 17:08, Christian Borntraeger wrote:
>>> Good (or bad?) news is the assert also triggers on F23, it just seems
>>> to take longer.
>>
>> I guess good news, because we can rule out the kernel (not that I
>> believed it was a kernel problem, but the thought is always there in
>> the background...).
>>
>> The interaction between ioeventfd and dataplane is too complicated.  I
>> think if we get rid of the start/stop ioeventfd calls (just set up the
>> ioeventfd as soon as possible and then only set/clear the handlers)
>> things would be much simpler.
>>
>> I'll see if I can produce something based on Conny's patches, which are
>> already a start.  Today I had a short day so I couldn't play with the
>> bug; out of curiosity, does the bug reproduce with her work + patch 4
>> from this series + the reentrancy assertion?
> 
> The other half of the race condition is from ioport write in the vcpu thread. I
> hit this by adding an extra assert(is_in_iothread()) in
> virtio_blk_handle_request(), at the same place with Paolo's atomic read of
> variable "test".

That's good that you can reproduce on x86.
The ioport write in the vcpu thread is the equivalent of s390_virtio_hypercall on
s390 - a virtio kick that is usually handled by eventfd, but here we have a case
where we take the slow path. So the good thing is that this is not s390 specific,
which might help to find the issue more quickly.



> 
> I haven't tried to find where this ioport write is from, but that is indeed an
> issue in virtio-pci.
> 
> (gdb) thread apply all bt
> 
> <...>
> 
> Thread 3 (Thread 0x7f9e8928b700 (LWP 30671)):
> #0  0x00007f9e8bac65d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> #1  0x00007f9e8bac7cc8 in __GI_abort () at abort.c:90
> #2  0x00007f9e8babf546 in __assert_fail_base (fmt=0x7f9e8bc0f128 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x7f9e8e04e9d1 "is_in_iothread()",
>     file=file@entry=0x7f9e8e04e8e0 "/home/fam/work/qemu/hw/block/virtio-blk.c", line=line@entry=597,
>     function=function@entry=0x7f9e8e04ec30 <__PRETTY_FUNCTION__.37148> "virtio_blk_handle_output") at assert.c:92
> #3  0x00007f9e8babf5f2 in __GI___assert_fail (assertion=0x7f9e8e04e9d1 "is_in_iothread()", file=0x7f9e8e04e8e0 "/home/fam/work/qemu/hw/block/virtio-blk.c", line=597,
>     function=0x7f9e8e04ec30 <__PRETTY_FUNCTION__.37148> "virtio_blk_handle_output") at assert.c:101
> #4  0x00007f9e8dc9f414 in virtio_blk_handle_output (vdev=0x7f9e929d7b68, vq=0x7f9e92a762f0) at /home/fam/work/qemu/hw/block/virtio-blk.c:597
> #5  0x00007f9e8dcd6f53 in virtio_queue_notify_vq (vq=0x7f9e92a762f0) at /home/fam/work/qemu/hw/virtio/virtio.c:1095
> #6  0x00007f9e8dcd6f91 in virtio_queue_notify (vdev=0x7f9e929d7b68, n=0) at /home/fam/work/qemu/hw/virtio/virtio.c:1101
> #7  0x00007f9e8df03d2f in virtio_ioport_write (opaque=0x7f9e929cf840, addr=16, val=0) at /home/fam/work/qemu/hw/virtio/virtio-pci.c:419
> #8  0x00007f9e8df041be in virtio_pci_config_write (opaque=0x7f9e929cf840, addr=16, val=0, size=2) at /home/fam/work/qemu/hw/virtio/virtio-pci.c:552
> #9  0x00007f9e8dc7c8c9 in memory_region_write_accessor (mr=0x7f9e929d00c0, addr=16, value=0x7f9e8928a988, size=2, shift=0, mask=65535, attrs=...)
>     at /home/fam/work/qemu/memory.c:524
> #10 0x00007f9e8dc7cad4 in access_with_adjusted_size (addr=16, value=0x7f9e8928a988, size=2, access_size_min=1, access_size_max=4, access=
>     0x7f9e8dc7c7e8 <memory_region_write_accessor>, mr=0x7f9e929d00c0, attrs=...) at /home/fam/work/qemu/memory.c:590
> #11 0x00007f9e8dc7f71b in memory_region_dispatch_write (mr=0x7f9e929d00c0, addr=16, data=0, size=2, attrs=...) at /home/fam/work/qemu/memory.c:1272
> #12 0x00007f9e8dc32815 in address_space_write_continue (as=0x7f9e8e5834a0 <address_space_io>, addr=49232, attrs=..., buf=0x7f9e8daa9000 <Address 0x7f9e8daa9000 out of bounds>,
>     len=2, addr1=16, l=2, mr=0x7f9e929d00c0) at /home/fam/work/qemu/exec.c:2607
> #13 0x00007f9e8dc329c1 in address_space_write (as=0x7f9e8e5834a0 <address_space_io>, addr=49232, attrs=..., buf=0x7f9e8daa9000 <Address 0x7f9e8daa9000 out of bounds>, len=2)
>     at /home/fam/work/qemu/exec.c:2659
> #14 0x00007f9e8dc32d78 in address_space_rw (as=0x7f9e8e5834a0 <address_space_io>, addr=49232, attrs=..., buf=0x7f9e8daa9000 <Address 0x7f9e8daa9000 out of bounds>, len=2,
>     is_write=true) at /home/fam/work/qemu/exec.c:2762
> #15 0x00007f9e8dc79358 in kvm_handle_io (port=49232, attrs=..., data=0x7f9e8daa9000, direction=1, size=2, count=1) at /home/fam/work/qemu/kvm-all.c:1699
> #16 0x00007f9e8dc79858 in kvm_cpu_exec (cpu=0x7f9e905a5250) at /home/fam/work/qemu/kvm-all.c:1863
> #17 0x00007f9e8dc619a3 in qemu_kvm_cpu_thread_fn (arg=0x7f9e905a5250) at /home/fam/work/qemu/cpus.c:1056
> #18 0x00007f9e8be59df5 in start_thread (arg=0x7f9e8928b700) at pthread_create.c:308
> #19 0x00007f9e8bb871ad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
> 
> <...>
> 
> Thread 1 (Thread 0x7f9e8b28f700 (LWP 30667)):
> #0  0x00007f9e8bac65d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> #1  0x00007f9e8bac7cc8 in __GI_abort () at abort.c:90
> #2  0x00007f9e8babf546 in __assert_fail_base (fmt=0x7f9e8bc0f128 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x7f9e8e04e9e2 "x == 0",
>     file=file@entry=0x7f9e8e04e8e0 "/home/fam/work/qemu/hw/block/virtio-blk.c", line=line@entry=598,
>     function=function@entry=0x7f9e8e04ec30 <__PRETTY_FUNCTION__.37148> "virtio_blk_handle_output") at assert.c:92
> #3  0x00007f9e8babf5f2 in __GI___assert_fail (assertion=0x7f9e8e04e9e2 "x == 0", file=0x7f9e8e04e8e0 "/home/fam/work/qemu/hw/block/virtio-blk.c", line=598,
>     function=0x7f9e8e04ec30 <__PRETTY_FUNCTION__.37148> "virtio_blk_handle_output") at assert.c:101
> #4  0x00007f9e8dc9f43c in virtio_blk_handle_output (vdev=0x7f9e929d7b68, vq=0x7f9e92a762f0) at /home/fam/work/qemu/hw/block/virtio-blk.c:598
> #5  0x00007f9e8dcd6f53 in virtio_queue_notify_vq (vq=0x7f9e92a762f0) at /home/fam/work/qemu/hw/virtio/virtio.c:1095
> #6  0x00007f9e8dcd8dfd in virtio_queue_host_notifier_read (n=0x7f9e92a76348) at /home/fam/work/qemu/hw/virtio/virtio.c:1785
> #7  0x00007f9e8df76b40 in aio_dispatch (ctx=0x7f9e90540a50) at /home/fam/work/qemu/aio-posix.c:327
> #8  0x00007f9e8df770f8 in aio_poll (ctx=0x7f9e90540a50, blocking=true) at /home/fam/work/qemu/aio-posix.c:475
> #9  0x00007f9e8dd7b5c4 in iothread_run (opaque=0x7f9e905404d0) at /home/fam/work/qemu/iothread.c:55
> #10 0x00007f9e8be59df5 in start_thread (arg=0x7f9e8b28f700) at pthread_create.c:308
> #11 0x00007f9e8bb871ad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-21 10:57                   ` Fam Zheng
@ 2016-03-21 11:15                     ` Cornelia Huck
  2016-03-21 12:45                       ` Fam Zheng
  2016-03-22  7:10                     ` tu bo
  1 sibling, 1 reply; 70+ messages in thread
From: Cornelia Huck @ 2016-03-21 11:15 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, qemu-devel, tu bo,
	Stefan Hajnoczi, Paolo Bonzini, Christian Borntraeger

On Mon, 21 Mar 2016 18:57:18 +0800
Fam Zheng <famz@redhat.com> wrote:

> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 08275a9..47f8043 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -1098,7 +1098,14 @@ void virtio_queue_notify_vq(VirtQueue *vq)
> 
>  void virtio_queue_notify(VirtIODevice *vdev, int n)
>  {
> -    virtio_queue_notify_vq(&vdev->vq[n]);
> +    VirtQueue *vq = &vdev->vq[n];
> +    EventNotifier *n;
> +    n = virtio_queue_get_host_notifier(vq);
> +    if (n) {

Isn't that always true, even if the notifier has not been setup?

> +        event_notifier_set(n);
> +    } else {
> +        virtio_queue_notify_vq(vq);
> +    }
>  }
> 
>  uint16_t virtio_queue_vector(VirtIODevice *vdev, int n)
> 
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-21  9:42                           ` Fam Zheng
  2016-03-21 11:10                             ` Christian Borntraeger
@ 2016-03-21 12:17                             ` Cornelia Huck
  1 sibling, 0 replies; 70+ messages in thread
From: Cornelia Huck @ 2016-03-21 12:17 UTC (permalink / raw)
  To: Fam Zheng
  Cc: qemu-block, Michael S. Tsirkin, qemu-devel, tu bo,
	Stefan Hajnoczi, Paolo Bonzini, Christian Borntraeger

On Mon, 21 Mar 2016 17:42:06 +0800
Fam Zheng <famz@redhat.com> wrote:

> On Fri, 03/18 16:03, Paolo Bonzini wrote:
> >
> >
> > On 17/03/2016 17:08, Christian Borntraeger wrote:
> > > Good (or bad?) news is the assert also triggers on F23, it just seems
> > > to take longer.
> >
> > I guess good news, because we can rule out the kernel (not that I
> > believed it was a kernel problem, but the thought is always there in
> > the background...).
> >
> > The interaction between ioeventfd and dataplane is too complicated.  I
> > think if we get rid of the start/stop ioeventfd calls (just set up the
> > ioeventfd as soon as possible and then only set/clear the handlers)
> > things would be much simpler.
> >
> > I'll see if I can produce something based on Conny's patches, which are
> > already a start.  Today I had a short day so I couldn't play with the
> > bug; out of curiosity, does the bug reproduce with her work + patch 4
> > from this series + the reentrancy assertion?
> 
> The other half of the race condition is from ioport write in the vcpu thread. I
> hit this by adding an extra assert(is_in_iothread()) in
> virtio_blk_handle_request(), at the same place with Paolo's atomic read of
> variable "test".
> 
> I haven't tried to find where this ioport write is from, but that is indeed an
> issue in virtio-pci.

Might this be a slow-path notification from before the ioeventfd got
registered on the io bus (IOW before qemu got control)? We have the
first notification that triggers dataplane setup (including registering
the ioeventfd). The second notification was already making its way to
qemu, and only gets handled when ->started had already been set.

Now I'm totally disregarding any possible locking between threads etc.
(perhaps somebody on cc: knows better?), but would the following logic
in handle_output make sense?

if (dataplane) {
   if (!started) {
      dataplane_start();
   }
   if (!disabled) {
      return;
   }
}
/* slow path processing */
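
Spelled out a bit more for virtio-blk, purely as a sketch (dataplane_started
and dataplane_disabled are illustrative flags, not existing fields):

static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
{
    VirtIOBlock *s = VIRTIO_BLK(vdev);

    if (s->dataplane) {
        if (!s->dataplane_started) {
            /* first, slow-path kick: starting dataplane also registers
             * the ioeventfd, so later kicks arrive in the iothread */
            virtio_blk_data_plane_start(s->dataplane);
        }
        if (!s->dataplane_disabled) {
            /* the iothread owns the queue now */
            return;
        }
    }
    /* slow path processing */
}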

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-21 11:15                     ` Cornelia Huck
@ 2016-03-21 12:45                       ` Fam Zheng
  2016-03-21 13:02                         ` Cornelia Huck
  0 siblings, 1 reply; 70+ messages in thread
From: Fam Zheng @ 2016-03-21 12:45 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, qemu-devel, tu bo,
	Stefan Hajnoczi, Paolo Bonzini, Christian Borntraeger

On Mon, 03/21 12:15, Cornelia Huck wrote:
> On Mon, 21 Mar 2016 18:57:18 +0800
> Fam Zheng <famz@redhat.com> wrote:
> 
> > diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> > index 08275a9..47f8043 100644
> > --- a/hw/virtio/virtio.c
> > +++ b/hw/virtio/virtio.c
> > @@ -1098,7 +1098,14 @@ void virtio_queue_notify_vq(VirtQueue *vq)
> > 
> >  void virtio_queue_notify(VirtIODevice *vdev, int n)
> >  {
> > -    virtio_queue_notify_vq(&vdev->vq[n]);
> > +    VirtQueue *vq = &vdev->vq[n];
> > +    EventNotifier *n;
> > +    n = virtio_queue_get_host_notifier(vq);
> > +    if (n) {
> 
> Isn't that always true, even if the notifier has not been setup?

You are right, this doesn't make a correct fix. But we can still do a quick
test with this as the else branch should never be used with ioeventfd=on. Am I
right?

Fam

> 
> > +        event_notifier_set(n);
> > +    } else {
> > +        virtio_queue_notify_vq(vq);
> > +    }
> >  }
> > 
> >  uint16_t virtio_queue_vector(VirtIODevice *vdev, int n)
> > 
> > 
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-21 12:45                       ` Fam Zheng
@ 2016-03-21 13:02                         ` Cornelia Huck
  2016-03-21 23:45                           ` Fam Zheng
  0 siblings, 1 reply; 70+ messages in thread
From: Cornelia Huck @ 2016-03-21 13:02 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, qemu-devel, tu bo,
	Stefan Hajnoczi, Paolo Bonzini, Christian Borntraeger

On Mon, 21 Mar 2016 20:45:27 +0800
Fam Zheng <famz@redhat.com> wrote:

> On Mon, 03/21 12:15, Cornelia Huck wrote:
> > On Mon, 21 Mar 2016 18:57:18 +0800
> > Fam Zheng <famz@redhat.com> wrote:
> > 
> > > diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> > > index 08275a9..47f8043 100644
> > > --- a/hw/virtio/virtio.c
> > > +++ b/hw/virtio/virtio.c
> > > @@ -1098,7 +1098,14 @@ void virtio_queue_notify_vq(VirtQueue *vq)
> > > 
> > >  void virtio_queue_notify(VirtIODevice *vdev, int n)
> > >  {
> > > -    virtio_queue_notify_vq(&vdev->vq[n]);
> > > +    VirtQueue *vq = &vdev->vq[n];
> > > +    EventNotifier *n;
> > > +    n = virtio_queue_get_host_notifier(vq);
> > > +    if (n) {
> > 
> > Isn't that always true, even if the notifier has not been setup?
> 
> You are right, this doesn't make a correct fix. But we can still do a quick
> test with this as the else branch should never be used with ioeventfd=on. Am I
> right?
> 
> Fam

Won't we come through here for the very first kick, when we haven't
registered the ioeventfd with the kernel yet?

> 
> > 
> > > +        event_notifier_set(n);
> > > +    } else {
> > > +        virtio_queue_notify_vq(vq);
> > > +    }
> > >  }
> > > 
> > >  uint16_t virtio_queue_vector(VirtIODevice *vdev, int n)
> > > 
> > > 
> > 
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-18 15:03                         ` Paolo Bonzini
  2016-03-21  9:42                           ` Fam Zheng
@ 2016-03-21 13:47                           ` TU BO
  2016-03-21 13:54                             ` Paolo Bonzini
  1 sibling, 1 reply; 70+ messages in thread
From: TU BO @ 2016-03-21 13:47 UTC (permalink / raw)
  To: Paolo Bonzini, Christian Borntraeger, Fam Zheng, qemu-devel
  Cc: Kevin Wolf, cornelia.huck, qemu-block, Stefan Hajnoczi,
	Michael S. Tsirkin

On 16/3/18 11:03 PM, Paolo Bonzini wrote:
>
> On 17/03/2016 17:08, Christian Borntraeger wrote:
>> Good (or bad?) news is the assert also triggers on F23, it just seems
>> to take longer.
> I guess good news, because we can rule out the kernel (not that I
> believed it was a kernel problem, but the thought is always there in
> the background...).
>
> The interaction between ioeventfd and dataplane is too complicated.  I
> think if we get rid of the start/stop ioeventfd calls (just set up the
> ioeventfd as soon as possible and then only set/clear the handlers)
> things would be much simpler.
>
> I'll see if I can produce something based on Conny's patches, which are
> already a start.  Today I had a short day so I couldn't play with the
> bug; out of curiosity, does the bug reproduce with her work + patch 4
> from this series + the reentrancy assertion?
I did NOT see crash with qemu master + "[PATCH RFC 0/6] virtio: refactor 
host notifiers" from Conny + patch 4 + assertion.  thx
>
> Paolo
>

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-21 13:47                           ` TU BO
@ 2016-03-21 13:54                             ` Paolo Bonzini
  2016-03-21 14:19                               ` Cornelia Huck
  0 siblings, 1 reply; 70+ messages in thread
From: Paolo Bonzini @ 2016-03-21 13:54 UTC (permalink / raw)
  To: TU BO, Christian Borntraeger, Fam Zheng, qemu-devel
  Cc: Kevin Wolf, cornelia.huck, qemu-block, Stefan Hajnoczi,
	Michael S. Tsirkin



On 21/03/2016 14:47, TU BO wrote:
>> I'll see if I can produce something based on Conny's patches, which are
>> already a start.  Today I had a short day so I couldn't play with the
>> bug; out of curiosity, does the bug reproduce with her work + patch 4
>> from this series + the reentrancy assertion?
> I did NOT see crash with qemu master + "[PATCH RFC 0/6] virtio: refactor
> host notifiers" from Conny + patch 4 + assertion.  thx

That's unexpected, but I guess it only says that I didn't review her
patches well enough. :)

Paolo

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-21 13:54                             ` Paolo Bonzini
@ 2016-03-21 14:19                               ` Cornelia Huck
  2016-03-22  0:31                                 ` Fam Zheng
  0 siblings, 1 reply; 70+ messages in thread
From: Cornelia Huck @ 2016-03-21 14:19 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Christian Borntraeger,
	Michael S. Tsirkin, qemu-devel, TU BO, Stefan Hajnoczi

On Mon, 21 Mar 2016 14:54:04 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 21/03/2016 14:47, TU BO wrote:
> >> I'll see if I can produce something based on Conny's patches, which are
> >> already a start.  Today I had a short day so I couldn't play with the
> >> bug; out of curiosity, does the bug reproduce with her work + patch 4
> >> from this series + the reentrancy assertion?
> > I did NOT see crash with qemu master + "[PATCH RFC 0/6] virtio: refactor
> > host notifiers" from Conny + patch 4 + assertion.  thx
> 
> That's unexpected, but I guess it only says that I didn't review her
> patches well enough. :)

I'm also a bit surprised, the only thing that should really be
different is passing the 'assign' argument in stop_ioeventfd(). Any
other fixes are purely accidental :)

Would be interesting to see how this setup fares with virtio-pci.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-21 13:02                         ` Cornelia Huck
@ 2016-03-21 23:45                           ` Fam Zheng
  2016-03-22  8:06                             ` Cornelia Huck
  0 siblings, 1 reply; 70+ messages in thread
From: Fam Zheng @ 2016-03-21 23:45 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, qemu-devel, tu bo,
	Stefan Hajnoczi, Paolo Bonzini, Christian Borntraeger

On Mon, 03/21 14:02, Cornelia Huck wrote:
> On Mon, 21 Mar 2016 20:45:27 +0800
> Fam Zheng <famz@redhat.com> wrote:
> 
> > On Mon, 03/21 12:15, Cornelia Huck wrote:
> > > On Mon, 21 Mar 2016 18:57:18 +0800
> > > Fam Zheng <famz@redhat.com> wrote:
> > > 
> > > > diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> > > > index 08275a9..47f8043 100644
> > > > --- a/hw/virtio/virtio.c
> > > > +++ b/hw/virtio/virtio.c
> > > > @@ -1098,7 +1098,14 @@ void virtio_queue_notify_vq(VirtQueue *vq)
> > > > 
> > > >  void virtio_queue_notify(VirtIODevice *vdev, int n)
> > > >  {
> > > > -    virtio_queue_notify_vq(&vdev->vq[n]);
> > > > +    VirtQueue *vq = &vdev->vq[n];
> > > > +    EventNotifier *n;
> > > > +    n = virtio_queue_get_host_notifier(vq);
> > > > +    if (n) {
> > > 
> > > Isn't that always true, even if the notifier has not been setup?
> > 
> > You are right, this doesn't make a correct fix. But we can still do a quick
> > test with this as the else branch should never be used with ioeventfd=on. Am I
> > right?
> > 
> > Fam
> 
> Won't we come through here for the very first kick, when we haven't
> registered the ioeventfd with the kernel yet?
> 

The ioeventfd in virtio-ccw is registered in the main loop when
VIRTIO_CONFIG_S_DRIVER_OK is set, so I think the first kick is okay.
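
i.e., in terms of ordering (sketch only; start_ioeventfd() stands in for
whatever virtio-ccw actually calls there):

/* CCW_CMD_WRITE_STATUS is processed in the main loop, under the big lock */
if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
    /* the eventfd is armed before a well-behaved driver may kick */
    start_ioeventfd(dev);
}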

Fam

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-21 14:19                               ` Cornelia Huck
@ 2016-03-22  0:31                                 ` Fam Zheng
  0 siblings, 0 replies; 70+ messages in thread
From: Fam Zheng @ 2016-03-22  0:31 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Kevin Wolf, qemu-block, TU BO, Michael S. Tsirkin, qemu-devel,
	Christian Borntraeger, Stefan Hajnoczi, Paolo Bonzini

On Mon, 03/21 15:19, Cornelia Huck wrote:
> On Mon, 21 Mar 2016 14:54:04 +0100
> Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
> > On 21/03/2016 14:47, TU BO wrote:
> > >> I'll see if I can produce something based on Conny's patches, which are
> > >> already a start.  Today I had a short day so I couldn't play with the
> > >> bug; out of curiosity, does the bug reproduce with her work + patch 4
> > >> from this series + the reentrancy assertion?
> > > I did NOT see crash with qemu master + "[PATCH RFC 0/6] virtio: refactor
> > > host notifiers" from Conny + patch 4 + assertion.  thx
> > 
> > That's unexpected, but I guess it only says that I didn't review her
> > patches well enough. :)
> 
> I'm also a bit surprised, the only thing that should really be
> different is passing the 'assign' argument in stop_ioeventfd(). Any
> other fixes are purely accidental :)
> 
> Would be interesting to see how this setup fares with virtio-pci.
> 

Seems to fix the assertion I'm hitting too.

Fam

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-21 10:57                   ` Fam Zheng
  2016-03-21 11:15                     ` Cornelia Huck
@ 2016-03-22  7:10                     ` tu bo
  2016-03-22  7:18                       ` Fam Zheng
  1 sibling, 1 reply; 70+ messages in thread
From: tu bo @ 2016-03-22  7:10 UTC (permalink / raw)
  To: qemu-devel

Hi Fam:

On 03/21/2016 06:57 PM, Fam Zheng wrote:
> On Thu, 03/17 19:03, tu bo wrote:
>>
>> On 03/17/2016 08:39 AM, Fam Zheng wrote:
>>> On Wed, 03/16 14:45, Paolo Bonzini wrote:
>>>>
>>>>
>>>> On 16/03/2016 14:38, Christian Borntraeger wrote:
>>>>>> If you just remove the calls to virtio_queue_host_notifier_read, here
>>>>>> and in virtio_queue_aio_set_host_notifier_fd_handler, does it work
>>>>>> (keeping patches 2-4 in)?
>>>>>
>>>>> With these changes and patch 2-4 it does no longer locks up.
>>>>> I keep it running some hour to check if a crash happens.
>>>>>
>>>>> Tu Bo, your setup is currently better suited for reproducing. Can you also check?
>>>>
>>>> Great, I'll prepare a patch to virtio then sketching the solution that
>>>> Conny agreed with.
>>>>
>>>> While Fam and I agreed that patch 1 is not required, I'm not sure if the
>>>> mutex is necessary in the end.
>>>
>>> If we can fix this from the virtio_queue_host_notifier_read side, the mutex/BH
>>> are not necessary; but OTOH the mutex does catch such bugs, so maybe it's good
>>> to have it. I'm not sure about the BH.
>>>
> >>>And in hindsight I realize we don't want patches 2-3 too. Actually the
>>> begin/end pair won't work as expected because of the blk_set_aio_context.
>>>
>>> Let's hold on this series.
>>>
>>>>
>>>> So if Tu Bo can check without the virtio_queue_host_notifier_read calls,
>>>> and both with/without Fam's patches, it would be great.
>>>
> >>>Tu Bo, only with/without patch 4, if you want to check. Sorry for the noise.
>>>
>> 1. without the virtio_queue_host_notifier_read calls,  without patch 4
>>
>> crash happens very often,
>>
>> (gdb) bt
>> #0  bdrv_co_do_rw (opaque=0x0) at block/io.c:2172
>> #1  0x000002aa165da37e in coroutine_trampoline (i0=<optimized out>,
>> i1=1812051552) at util/coroutine-ucontext.c:79
>> #2  0x000003ff7dd5150a in __makecontext_ret () from /lib64/libc.so.6
>>
>>
>> 2. without the virtio_queue_host_notifier_read calls, with patch 4
>>
>> crash happens very often,
>>
>> (gdb) bt
>> #0  bdrv_co_do_rw (opaque=0x0) at block/io.c:2172
>> #1  0x000002aa39dda43e in coroutine_trampoline (i0=<optimized out>,
>> i1=-1677715600) at util/coroutine-ucontext.c:79
>> #2  0x000003ffab6d150a in __makecontext_ret () from /lib64/libc.so.6
>>
>>
>
> Tu Bo,
>
> Could you help test this patch (on top of master, without patch 4)?
>
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 08275a9..47f8043 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -1098,7 +1098,14 @@ void virtio_queue_notify_vq(VirtQueue *vq)
>
>   void virtio_queue_notify(VirtIODevice *vdev, int n)
>   {
> -    virtio_queue_notify_vq(&vdev->vq[n]);
> +    VirtQueue *vq = &vdev->vq[n];
> +    EventNotifier *n;
> +    n = virtio_queue_get_host_notifier(vq);
> +    if (n) {
> +        event_notifier_set(n);
> +    } else {
> +        virtio_queue_notify_vq(vq);
> +    }
>   }
>
>   uint16_t virtio_queue_vector(VirtIODevice *vdev, int n)
>
>

I got a build error as below,

/BUILD/qemu-2.5.50/hw/virtio/virtio.c: In function 'virtio_queue_notify':
/BUILD/qemu-2.5.50/hw/virtio/virtio.c:1102:20: error: 'n' redeclared as 
different kind of symbol
      EventNotifier *n;
                     ^
/BUILD/qemu-2.5.50/hw/virtio/virtio.c:1099:50: note: previous definition 
of 'n' was here
  void virtio_queue_notify(VirtIODevice *vdev, int n)


Then I made some changes to your patch as below,
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 08275a9..a10da39 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -1098,7 +1098,14 @@ void virtio_queue_notify_vq(VirtQueue *vq)

  void virtio_queue_notify(VirtIODevice *vdev, int n)
  {
-    virtio_queue_notify_vq(&vdev->vq[n]);
+    VirtQueue *vq = &vdev->vq[n];
+    EventNotifier *en;
+    en = virtio_queue_get_host_notifier(vq);
+    if (en) {
+        event_notifier_set(en);
+    } else {
+        virtio_queue_notify_vq(vq);
+    }
  }

  uint16_t virtio_queue_vector(VirtIODevice *vdev, int n)

With qemu master + the modified patch above (without patch 4, without Conny's 
patches), I did NOT get a crash so far.  thanks



>

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-22  7:10                     ` tu bo
@ 2016-03-22  7:18                       ` Fam Zheng
  2016-03-22  9:07                         ` Cornelia Huck
  0 siblings, 1 reply; 70+ messages in thread
From: Fam Zheng @ 2016-03-22  7:18 UTC (permalink / raw)
  To: tu bo; +Cc: qemu-devel

On Tue, 03/22 15:10, tu bo wrote:
> Hi Fam:
> 
> On 03/21/2016 06:57 PM, Fam Zheng wrote:
> >On Thu, 03/17 19:03, tu bo wrote:
> >>
> >>On 03/17/2016 08:39 AM, Fam Zheng wrote:
> >>>On Wed, 03/16 14:45, Paolo Bonzini wrote:
> >>>>
> >>>>
> >>>>On 16/03/2016 14:38, Christian Borntraeger wrote:
> >>>>>>If you just remove the calls to virtio_queue_host_notifier_read, here
> >>>>>>and in virtio_queue_aio_set_host_notifier_fd_handler, does it work
> >>>>>>(keeping patches 2-4 in)?
> >>>>>
> >>>>>With these changes and patch 2-4 it does no longer locks up.
> >>>>>I keep it running some hour to check if a crash happens.
> >>>>>
> >>>>>Tu Bo, your setup is currently better suited for reproducing. Can you also check?
> >>>>
> >>>>Great, I'll prepare a patch to virtio then sketching the solution that
> >>>>Conny agreed with.
> >>>>
> >>>>While Fam and I agreed that patch 1 is not required, I'm not sure if the
> >>>>mutex is necessary in the end.
> >>>
> >>>If we can fix this from the virtio_queue_host_notifier_read side, the mutex/BH
> >>>are not necessary; but OTOH the mutex does catch such bugs, so maybe it's good
> >>>to have it. I'm not sure about the BH.
> >>>
> >>>And in hindsight I realize we don't want patches 2-3 too. Actually the
> >>>begin/end pair won't work as expected because of the blk_set_aio_context.
> >>>
> >>>Let's hold on this series.
> >>>
> >>>>
> >>>>So if Tu Bo can check without the virtio_queue_host_notifier_read calls,
> >>>>and both with/without Fam's patches, it would be great.
> >>>
> >>>Tu Bo, only with/without patch 4, if you want to check. Sorry for the noise.
> >>>
> >>1. without the virtio_queue_host_notifier_read calls,  without patch 4
> >>
> >>crash happens very often,
> >>
> >>(gdb) bt
> >>#0  bdrv_co_do_rw (opaque=0x0) at block/io.c:2172
> >>#1  0x000002aa165da37e in coroutine_trampoline (i0=<optimized out>,
> >>i1=1812051552) at util/coroutine-ucontext.c:79
> >>#2  0x000003ff7dd5150a in __makecontext_ret () from /lib64/libc.so.6
> >>
> >>
> >>2. without the virtio_queue_host_notifier_read calls, with patch 4
> >>
> >>crash happens very often,
> >>
> >>(gdb) bt
> >>#0  bdrv_co_do_rw (opaque=0x0) at block/io.c:2172
> >>#1  0x000002aa39dda43e in coroutine_trampoline (i0=<optimized out>,
> >>i1=-1677715600) at util/coroutine-ucontext.c:79
> >>#2  0x000003ffab6d150a in __makecontext_ret () from /lib64/libc.so.6
> >>
> >>
> >
> >Tu Bo,
> >
> >Could you help test this patch (on top of master, without patch 4)?
> >
> >diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> >index 08275a9..47f8043 100644
> >--- a/hw/virtio/virtio.c
> >+++ b/hw/virtio/virtio.c
> >@@ -1098,7 +1098,14 @@ void virtio_queue_notify_vq(VirtQueue *vq)
> >
> >  void virtio_queue_notify(VirtIODevice *vdev, int n)
> >  {
> >-    virtio_queue_notify_vq(&vdev->vq[n]);
> >+    VirtQueue *vq = &vdev->vq[n];
> >+    EventNotifier *n;
> >+    n = virtio_queue_get_host_notifier(vq);
> >+    if (n) {
> >+        event_notifier_set(n);
> >+    } else {
> >+        virtio_queue_notify_vq(vq);
> >+    }
> >  }
> >
> >  uint16_t virtio_queue_vector(VirtIODevice *vdev, int n)
> >
> >
> 
> I got a build error as below,
> 
> /BUILD/qemu-2.5.50/hw/virtio/virtio.c: In function 'virtio_queue_notify':
> /BUILD/qemu-2.5.50/hw/virtio/virtio.c:1102:20: error: 'n' redeclared
> as different kind of symbol
>      EventNotifier *n;
>                     ^
> /BUILD/qemu-2.5.50/hw/virtio/virtio.c:1099:50: note: previous
> definition of 'n' was here
>  void virtio_queue_notify(VirtIODevice *vdev, int n)
> 
> 
> > Then I made some changes to your patch as below,
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 08275a9..a10da39 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -1098,7 +1098,14 @@ void virtio_queue_notify_vq(VirtQueue *vq)
> 
>  void virtio_queue_notify(VirtIODevice *vdev, int n)
>  {
> -    virtio_queue_notify_vq(&vdev->vq[n]);
> +    VirtQueue *vq = &vdev->vq[n];
> +    EventNotifier *en;
> +    en = virtio_queue_get_host_notifier(vq);
> +    if (en) {
> +        event_notifier_set(en);
> +    } else {
> +        virtio_queue_notify_vq(vq);
> +    }
>  }
> 
>  uint16_t virtio_queue_vector(VirtIODevice *vdev, int n)
> 
> > With qemu master + the modified patch above (without patch 4, without
> > Conny's patches), I did NOT get a crash so far.  thanks

Yes, it was a mistake.  Thanks for the testing!

Fam

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-21 23:45                           ` Fam Zheng
@ 2016-03-22  8:06                             ` Cornelia Huck
  0 siblings, 0 replies; 70+ messages in thread
From: Cornelia Huck @ 2016-03-22  8:06 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, qemu-devel, tu bo,
	Stefan Hajnoczi, Paolo Bonzini, Christian Borntraeger

On Tue, 22 Mar 2016 07:45:19 +0800
Fam Zheng <famz@redhat.com> wrote:

> On Mon, 03/21 14:02, Cornelia Huck wrote:
> > On Mon, 21 Mar 2016 20:45:27 +0800
> > Fam Zheng <famz@redhat.com> wrote:
> > 
> > > On Mon, 03/21 12:15, Cornelia Huck wrote:
> > > > On Mon, 21 Mar 2016 18:57:18 +0800
> > > > Fam Zheng <famz@redhat.com> wrote:
> > > > 
> > > > > diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> > > > > index 08275a9..47f8043 100644
> > > > > --- a/hw/virtio/virtio.c
> > > > > +++ b/hw/virtio/virtio.c
> > > > > @@ -1098,7 +1098,14 @@ void virtio_queue_notify_vq(VirtQueue *vq)
> > > > > 
> > > > >  void virtio_queue_notify(VirtIODevice *vdev, int n)
> > > > >  {
> > > > > -    virtio_queue_notify_vq(&vdev->vq[n]);
> > > > > +    VirtQueue *vq = &vdev->vq[n];
> > > > > +    EventNotifier *n;
> > > > > +    n = virtio_queue_get_host_notifier(vq);
> > > > > +    if (n) {
> > > > 
> > > > Isn't that always true, even if the notifier has not been setup?
> > > 
> > > You are right, this doesn't make a correct fix. But we can still do a quick
> > > test with this as the else branch should never be used with ioeventfd=on. Am I
> > > right?
> > > 
> > > Fam
> > 
> > Won't we come through here for the very first kick, when we haven't
> > registered the ioeventfd with the kernel yet?
> > 
> 
> The ioeventfd in virtio-ccw is registered in the main loop when
> VIRTIO_CONFIG_S_DRIVER_OK is set, so I think the first kick is okay.

You're right, for well-behaved drivers this will be done from
set_status, so for testing that's fine.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-22  7:18                       ` Fam Zheng
@ 2016-03-22  9:07                         ` Cornelia Huck
  2016-03-22  9:46                           ` Paolo Bonzini
  0 siblings, 1 reply; 70+ messages in thread
From: Cornelia Huck @ 2016-03-22  9:07 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, qemu-block, Christian Borntraeger,
	Michael S. Tsirkin, qemu-devel, tu bo, Stefan Hajnoczi,
	Paolo Bonzini

(re-adding cc:s)

On Tue, 22 Mar 2016 15:18:05 +0800
Fam Zheng <famz@redhat.com> wrote:

> On Tue, 03/22 15:10, tu bo wrote:
> > Hi Fam:
> > 
> > On 03/21/2016 06:57 PM, Fam Zheng wrote:
> > >On Thu, 03/17 19:03, tu bo wrote:
> > >>
> > >>On 03/17/2016 08:39 AM, Fam Zheng wrote:
> > >>>On Wed, 03/16 14:45, Paolo Bonzini wrote:
> > >>>>
> > >>>>
> > >>>>On 16/03/2016 14:38, Christian Borntraeger wrote:
> > >>>>>>If you just remove the calls to virtio_queue_host_notifier_read, here
> > >>>>>>and in virtio_queue_aio_set_host_notifier_fd_handler, does it work
> > >>>>>>(keeping patches 2-4 in)?
> > >>>>>
> > >>>>>With these changes and patch 2-4 it does no longer locks up.
> > >>>>>I keep it running some hour to check if a crash happens.
> > >>>>>
> > >>>>>Tu Bo, your setup is currently better suited for reproducing. Can you also check?
> > >>>>
> > >>>>Great, I'll prepare a patch to virtio then sketching the solution that
> > >>>>Conny agreed with.
> > >>>>
> > >>>>While Fam and I agreed that patch 1 is not required, I'm not sure if the
> > >>>>mutex is necessary in the end.
> > >>>
> > >>>If we can fix this from the virtio_queue_host_notifier_read side, the mutex/BH
> > >>>are not necessary; but OTOH the mutex does catch such bugs, so maybe it's good
> > >>>to have it. I'm not sure about the BH.
> > >>>
> > >>>And in hindsight I realize we don't want patches 2-3 too. Actually the
> > >>>begin/end pair won't work as expected because of the blk_set_aio_context.
> > >>>
> > >>>Let's hold on this series.
> > >>>
> > >>>>
> > >>>>So if Tu Bo can check without the virtio_queue_host_notifier_read calls,
> > >>>>and both with/without Fam's patches, it would be great.
> > >>>
> > >>>Tu Bo, only with/without patch 4, if you want to check. Sorry for the noise.
> > >>>
> > >>1. without the virtio_queue_host_notifier_read calls,  without patch 4
> > >>
> > >>crash happens very often,
> > >>
> > >>(gdb) bt
> > >>#0  bdrv_co_do_rw (opaque=0x0) at block/io.c:2172
> > >>#1  0x000002aa165da37e in coroutine_trampoline (i0=<optimized out>,
> > >>i1=1812051552) at util/coroutine-ucontext.c:79
> > >>#2  0x000003ff7dd5150a in __makecontext_ret () from /lib64/libc.so.6
> > >>
> > >>
> > >>2. without the virtio_queue_host_notifier_read calls, with patch 4
> > >>
> > >>crash happens very often,
> > >>
> > >>(gdb) bt
> > >>#0  bdrv_co_do_rw (opaque=0x0) at block/io.c:2172
> > >>#1  0x000002aa39dda43e in coroutine_trampoline (i0=<optimized out>,
> > >>i1=-1677715600) at util/coroutine-ucontext.c:79
> > >>#2  0x000003ffab6d150a in __makecontext_ret () from /lib64/libc.so.6
> > >>
> > >>
> > >
> > >Tu Bo,
> > >
> > >Could you help test this patch (on top of master, without patch 4)?
> > >
> > >diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> > >index 08275a9..47f8043 100644
> > >--- a/hw/virtio/virtio.c
> > >+++ b/hw/virtio/virtio.c
> > >@@ -1098,7 +1098,14 @@ void virtio_queue_notify_vq(VirtQueue *vq)
> > >
> > >  void virtio_queue_notify(VirtIODevice *vdev, int n)
> > >  {
> > >-    virtio_queue_notify_vq(&vdev->vq[n]);
> > >+    VirtQueue *vq = &vdev->vq[n];
> > >+    EventNotifier *n;
> > >+    n = virtio_queue_get_host_notifier(vq);
> > >+    if (n) {
> > >+        event_notifier_set(n);
> > >+    } else {
> > >+        virtio_queue_notify_vq(vq);
> > >+    }
> > >  }
> > >
> > >  uint16_t virtio_queue_vector(VirtIODevice *vdev, int n)
> > >
> > >
> > 
> > I got a build error as below,
> > 
> > /BUILD/qemu-2.5.50/hw/virtio/virtio.c: In function 'virtio_queue_notify':
> > /BUILD/qemu-2.5.50/hw/virtio/virtio.c:1102:20: error: 'n' redeclared
> > as different kind of symbol
> >      EventNotifier *n;
> >                     ^
> > /BUILD/qemu-2.5.50/hw/virtio/virtio.c:1099:50: note: previous
> > definition of 'n' was here
> >  void virtio_queue_notify(VirtIODevice *vdev, int n)
> > 
> > 
> > Then I made some changes to your patch as below,
> > diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> > index 08275a9..a10da39 100644
> > --- a/hw/virtio/virtio.c
> > +++ b/hw/virtio/virtio.c
> > @@ -1098,7 +1098,14 @@ void virtio_queue_notify_vq(VirtQueue *vq)
> > 
> >  void virtio_queue_notify(VirtIODevice *vdev, int n)
> >  {
> > -    virtio_queue_notify_vq(&vdev->vq[n]);
> > +    VirtQueue *vq = &vdev->vq[n];
> > +    EventNotifier *en;
> > +    en = virtio_queue_get_host_notifier(vq);
> > +    if (en) {
> > +        event_notifier_set(en);
> > +    } else {
> > +        virtio_queue_notify_vq(vq);
> > +    }
> >  }
> > 
> >  uint16_t virtio_queue_vector(VirtIODevice *vdev, int n)
> > 
> > With qemu master + the modified patch above (without patch 4, without
> > Conny's patches), I did NOT get a crash so far.  thanks
> 
> Yes, it was a mistake.  Thanks for the testing!

I'm wondering what we learn from this. Bypassing notify_vq() removes
the race, but that's not the solution here.

So far, we had the best results with my refactoring + the mutex/bh
change. Two problems:

- We don't really understand yet why my refactoring helps, but passing
the assign value is a good candidate (and it's agreed that this fixes a
bug, I think.)
- There's some problem with the bh, if I understood Stefan correctly.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-22  9:07                         ` Cornelia Huck
@ 2016-03-22  9:46                           ` Paolo Bonzini
  2016-03-22 11:59                             ` Cornelia Huck
  0 siblings, 1 reply; 70+ messages in thread
From: Paolo Bonzini @ 2016-03-22  9:46 UTC (permalink / raw)
  To: Cornelia Huck, Fam Zheng
  Cc: Kevin Wolf, qemu-block, tu bo, Michael S. Tsirkin, qemu-devel,
	Christian Borntraeger, Stefan Hajnoczi



On 22/03/2016 10:07, Cornelia Huck wrote:
> So far, we had the best results with my refactoring + the mutex/bh
> change. Two problems:
> 
> - We don't really understand yet why my refactoring helps, but passing
> the assign value is a good candidate (and it's agreed that this fixes a
> bug, I think.)
> - There's some problem with the bh, if I understood Stefan correctly.

They can be fixed with just an extra object_ref/object_unref.

I didn't understand that Tu Bo also needed the BH fix, and with that
information it makes sense.  Passing the assign value ensures that
ioeventfd remains always assigned.  With the CPU threads out of the
picture, the BH becomes enough to make everything thread-safe.
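
For illustration, a minimal sketch of where that extra reference pair would
go in the BH path of patch 4 (VirtIOBlockStartData and its fields are taken
from the hunk quoted later in the thread; the placement of the ref/unref is
an assumption, and whether the unref may run outside the BQL is exactly the
question raised below):

    static void virtio_blk_data_plane_start_bh_cb(void *opaque)
    {
        VirtIOBlockStartData *data = opaque;

        /* ... the deferred start work using data->vblk goes here ... */

        qemu_bh_delete(data->bh);
        object_unref(OBJECT(data->vblk));  /* drop the reference taken below */
        g_free(data);
    }

    /* in virtio_blk_data_plane_start(), before scheduling the BH: */
    data = g_new(VirtIOBlockStartData, 1);
    data->vblk = vblk;
    data->bh = aio_bh_new(s->ctx, virtio_blk_data_plane_start_bh_cb, data);
    object_ref(OBJECT(vblk));              /* keep vblk alive until the BH runs */
    qemu_bh_schedule(data->bh);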

Paolo

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-22  9:46                           ` Paolo Bonzini
@ 2016-03-22 11:59                             ` Cornelia Huck
  2016-03-22 12:11                               ` Paolo Bonzini
  0 siblings, 1 reply; 70+ messages in thread
From: Cornelia Huck @ 2016-03-22 11:59 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Michael S. Tsirkin,
	qemu-devel, Christian Borntraeger, tu bo, Stefan Hajnoczi

On Tue, 22 Mar 2016 10:46:58 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 22/03/2016 10:07, Cornelia Huck wrote:
> > So far, we had the best results with my refactoring + the mutex/bh
> > change. Two problems:
> > 
> > - We don't really understand yet why my refactoring helps, but passing
> > the assign value is a good candidate (and it's agreed that this fixes a
> > bug, I think.)
> > - There's some problem with the bh, if I understood Stefan correctly.
> 
> They can be fixed with just an extra object_ref/object_unref.
> 
> I didn't understand that Tu Bo also needed the BH fix, and with that
> information it makes sense.  Passing the assign value ensures that
> ioeventfd remains always assigned.  With the CPU threads out of the
> picture, the BH becomes enough to make everything thread-safe.

Yes, this makes sense.

Might we still have a hole somewhere in dataplane teardown? Probably
not, from reading the code, even if it runs in cpu thread context.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-22 11:59                             ` Cornelia Huck
@ 2016-03-22 12:11                               ` Paolo Bonzini
  2016-03-22 12:54                                 ` Cornelia Huck
  0 siblings, 1 reply; 70+ messages in thread
From: Paolo Bonzini @ 2016-03-22 12:11 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Michael S. Tsirkin,
	qemu-devel, Christian Borntraeger, tu bo, Stefan Hajnoczi



On 22/03/2016 12:59, Cornelia Huck wrote:
>> > They can be fixed with just an extra object_ref/object_unref.
>> > 
>> > I didn't understand that Tu Bo also needed the BH fix, and with that
>> > information it makes sense.  Passing the assign value ensures that
>> > ioeventfd remains always assigned.  With the CPU threads out of the
>> > picture, the BH becomes enough to make everything thread-safe.
> Yes, this makes sense.
> 
> Might we still have a hole somewhere in dataplane teardown? Probably
> not, from reading the code, even if it runs in cpu thread context.

The bug arises when the main thread sets started = true, a CPU thread
comes along while the ioeventfd is reset, and as soon as the BQL is
released by the main thread the CPU thread thinks it is a dataplane
thread.  This does horrible things to non-reentrant code.  For stop we
should be safe because the CPU thread is the one that sets started = false.

IOW, we should be safe as long as the ioeventfd is never unassigned
(your fix) _and_ we ensure serialization between threads that touch
dataplane_started (Fam's fix).
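
As a standalone sketch of the second invariant (plain C with pthreads, not
the actual QEMU code; every name here is illustrative), serializing all
transitions of the started/starting flags behind one lock is what keeps a
CPU thread and the iothread from racing on them:

    #include <pthread.h>
    #include <stdbool.h>

    struct dataplane_state {
        pthread_mutex_t start_stop_lock;
        bool starting;
        bool started;
    };

    /* Only one thread can win the start transition; a concurrent caller  */
    /* sees starting/started already set and backs off instead of         */
    /* re-entering the setup path.                                        */
    static bool dataplane_try_start(struct dataplane_state *s)
    {
        pthread_mutex_lock(&s->start_stop_lock);
        if (s->started || s->starting) {
            pthread_mutex_unlock(&s->start_stop_lock);
            return false;
        }
        s->starting = true;
        pthread_mutex_unlock(&s->start_stop_lock);

        /* ... hook up notifiers / ioeventfd outside the lock ... */

        pthread_mutex_lock(&s->start_stop_lock);
        s->starting = false;
        s->started = true;
        pthread_mutex_unlock(&s->start_stop_lock);
        return true;
    }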

Paolo

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH 4/4] virtio-blk: Clean up start/stop with mutex and BH
  2016-03-17 15:07     ` Paolo Bonzini
@ 2016-03-22 12:52       ` Fam Zheng
  2016-03-22 18:05         ` Paolo Bonzini
  0 siblings, 1 reply; 70+ messages in thread
From: Fam Zheng @ 2016-03-22 12:52 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, Stefan Hajnoczi,
	qemu-devel, tubo, Stefan Hajnoczi, cornelia.huck, borntraeger

On Thu, 03/17 16:07, Paolo Bonzini wrote:
> 
> 
> On 17/03/2016 16:00, Stefan Hajnoczi wrote:
> >> > +    data = g_new(VirtIOBlockStartData, 1);
> >> > +    data->vblk = vblk;
> >> > +    data->bh = aio_bh_new(s->ctx, virtio_blk_data_plane_start_bh_cb, data);
> >> > +    qemu_bh_schedule(data->bh);
> >> > +    qemu_mutex_unlock(&s->start_stop_lock);
> >> >      return;
> > This BH usage pattern is dangerous:
> > 
> > 1. The BH is created and scheduled.
> > 2. Before the BH executes the device is unrealized.
> > 3. The data->bh pointer is inaccessible so we have a dangling BH that
> >    will access vblk after vblk has been freed.
> > 
> > In some cases it can be safe but I don't see why the pattern is safe in
> > this case.  Either the BH needs to hold some sort of reference to keep
> > vblk alive, or vblk needs to know about pending BHs so they can be
> > deleted.
> 
> You're right.  After unrealizing, virtio_blk_data_plane_stop has set
> vblk->dataplane_started = false, so that's covered.  However, you still
> need an object_ref/object_unref pair.

Is it safe to call object_unref outside BQL?

Fam

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
  2016-03-22 12:11                               ` Paolo Bonzini
@ 2016-03-22 12:54                                 ` Cornelia Huck
  0 siblings, 0 replies; 70+ messages in thread
From: Cornelia Huck @ 2016-03-22 12:54 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Michael S. Tsirkin,
	qemu-devel, Christian Borntraeger, tu bo, Stefan Hajnoczi

On Tue, 22 Mar 2016 13:11:05 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 22/03/2016 12:59, Cornelia Huck wrote:
> >> > They can be fixed with just an extra object_ref/object_unref.
> >> > 
> >> > I didn't understand that Tu Bo also needed the BH fix, and with that
> >> > information it makes sense.  Passing the assign value ensures that
> >> > ioeventfd remains always assigned.  With the CPU threads out of the
> >> > picture, the BH becomes enough to make everything thread-safe.
> > Yes, this makes sense.
> > 
> > Might we still have a hole somewhere in dataplane teardown? Probably
> > not, from reading the code, even if it runs in cpu thread context.
> 
> The bug arises when the main thread sets started = true, a CPU thread
> comes along while the ioeventfd is reset, and as soon as the BQL is
> released by the main thread the CPU thread thinks it is a dataplane
> thread.  This does horrible things to non-reentrant code.  For stop we
> should be safe because the CPU thread is the one that sets started = false.
> 
> IOW, we should be safe as long as the ioeventfd is never unassigned
> (your fix) _and_ we ensure serialization between threads that touch
> dataplane_started (Fam's fix).

We should really add something like that explanation to the changelog
so that future generations may understand what's going on here :)

So, what do we do for 2.6? A respin of Fam's fix + my refactoring (with
some interface doc added)? I'd still like some reviews and maybe a test
on virtio-mmio.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH 4/4] virtio-blk: Clean up start/stop with mutex and BH
  2016-03-22 12:52       ` Fam Zheng
@ 2016-03-22 18:05         ` Paolo Bonzini
  2016-03-23  8:10           ` Cornelia Huck
  0 siblings, 1 reply; 70+ messages in thread
From: Paolo Bonzini @ 2016-03-22 18:05 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, Stefan Hajnoczi,
	qemu-devel, tubo, Stefan Hajnoczi, cornelia.huck, borntraeger



On 22/03/2016 13:52, Fam Zheng wrote:
>> You're right.  After unrealizing, virtio_blk_data_plane_stop has set
>> vblk->dataplane_started = false, so that's covered.  However, you still
>> need an object_ref/object_unref pair.
> 
> Is it safe to call object_unref outside BQL?

Hmm, no.

However, perhaps we can fix the code without a bottom half, using the 
assertion in virtio_blk_data_plane_start to ensure that there is no 
unwanted reentrancy.

Conny's patches are also enough to mask the bug for me, so my tests
do not say much.  But in any case the following patch works here too
instead of Fam's 4/4; it is a mess with some other experiments mixed in,
but I'm sending it as is because that's what I tested and it's
dinner time now.

Even if it fails for you or Tu Bo, perhaps the backtraces say
something.

Thanks,

Paolo

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 1b2d5fa..5f72671 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -26,8 +26,7 @@
 #include "qom/object_interfaces.h"
 
 struct VirtIOBlockDataPlane {
-    bool starting;
-    bool stopping;
+    int starting;
     bool disabled;
 
     VirtIOBlkConf *conf;
@@ -192,11 +191,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
     VirtIOBlock *vblk = VIRTIO_BLK(s->vdev);
     int r;
 
-    if (vblk->dataplane_started || s->starting) {
-        return;
-    }
-
-    s->starting = true;
+    assert(atomic_fetch_inc(&s->starting) == 0);
     s->vq = virtio_get_queue(s->vdev, 0);
 
     /* Set up guest notifier (irq) */
@@ -215,27 +210,28 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
         goto fail_host_notifier;
     }
 
-    s->starting = false;
-    vblk->dataplane_started = true;
     trace_virtio_blk_data_plane_start(s);
 
     blk_set_aio_context(s->conf->conf.blk, s->ctx);
 
-    /* Kick right away to begin processing requests already in vring */
-    event_notifier_set(virtio_queue_get_host_notifier(s->vq));
+    vblk->dataplane_started = true;
 
-    /* Get this show started by hooking up our callbacks */
+    /* Get this show started by hooking up our callbacks.  */
     aio_context_acquire(s->ctx);
     virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, true, true);
     aio_context_release(s->ctx);
+    atomic_dec(&s->starting);
+
+    /* Kick right away to begin processing requests already in vring */
+    event_notifier_set(virtio_queue_get_host_notifier(s->vq));
     return;
 
   fail_host_notifier:
     k->set_guest_notifiers(qbus->parent, 1, false);
   fail_guest_notifiers:
     s->disabled = true;
-    s->starting = false;
     vblk->dataplane_started = true;
+    atomic_dec(&s->starting);
 }
 
 /* Context: QEMU global mutex held */
@@ -245,7 +241,7 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
     VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
     VirtIOBlock *vblk = VIRTIO_BLK(s->vdev);
 
-    if (!vblk->dataplane_started || s->stopping) {
+    if (!vblk->dataplane_started) {
         return;
     }
 
@@ -255,7 +251,7 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
         vblk->dataplane_started = false;
         return;
     }
-    s->stopping = true;
+
     trace_virtio_blk_data_plane_stop(s);
 
     aio_context_acquire(s->ctx);
@@ -274,5 +270,4 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
     k->set_guest_notifiers(qbus->parent, 1, false);
 
     vblk->dataplane_started = false;
-    s->stopping = false;
 }

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH 4/4] virtio-blk: Clean up start/stop with mutex and BH
  2016-03-22 18:05         ` Paolo Bonzini
@ 2016-03-23  8:10           ` Cornelia Huck
  2016-03-23  9:08             ` Paolo Bonzini
  0 siblings, 1 reply; 70+ messages in thread
From: Cornelia Huck @ 2016-03-23  8:10 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Michael S. Tsirkin,
	Stefan Hajnoczi, qemu-devel, tubo, Stefan Hajnoczi, borntraeger

On Tue, 22 Mar 2016 19:05:09 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 22/03/2016 13:52, Fam Zheng wrote:
> >> You're right.  After unrealizing, virtio_blk_data_plane_stop has set
> >> vblk->dataplane_started = false, so that's covered.  However, you still
> >> need an object_ref/object_unref pair.
> > 
> > Is it safe to call object_unref outside BQL?
> 
> Hmm, no.
> 
> However, perhaps we can fix the code without a bottom half, using the 
> assertion in virtio_blk_data_plane_start to ensure that there is no 
> unwanted reentrancy.
> 
> Conny's patches are also enough to mask the bug for me, so my tests
> do not say much.  But in any case the following patch works here too
> instead of Fam's 4/4; it is a mess including some other experiments,
> but I'm including it as is because that's what I tested and it's
> dinner time now.
> 
> Even if it fails for you or Tu Bo, perhaps the backtraces say
> something.
> 
> Thanks,
> 
> Paolo
> 
> diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
> index 1b2d5fa..5f72671 100644
> --- a/hw/block/dataplane/virtio-blk.c
> +++ b/hw/block/dataplane/virtio-blk.c
> @@ -26,8 +26,7 @@
>  #include "qom/object_interfaces.h"
> 
>  struct VirtIOBlockDataPlane {
> -    bool starting;
> -    bool stopping;
> +    int starting;
>      bool disabled;
> 
>      VirtIOBlkConf *conf;
> @@ -192,11 +191,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
>      VirtIOBlock *vblk = VIRTIO_BLK(s->vdev);
>      int r;
> 
> -    if (vblk->dataplane_started || s->starting) {
> -        return;
> -    }
> -
> -    s->starting = true;
> +    assert(atomic_fetch_inc(&s->starting) == 0);
>      s->vq = virtio_get_queue(s->vdev, 0);
> 
>      /* Set up guest notifier (irq) */
> @@ -215,27 +210,28 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
>          goto fail_host_notifier;
>      }
> 
> -    s->starting = false;
> -    vblk->dataplane_started = true;
>      trace_virtio_blk_data_plane_start(s);
> 
>      blk_set_aio_context(s->conf->conf.blk, s->ctx);
> 
> -    /* Kick right away to begin processing requests already in vring */
> -    event_notifier_set(virtio_queue_get_host_notifier(s->vq));
> +    vblk->dataplane_started = true;
> 
> -    /* Get this show started by hooking up our callbacks */
> +    /* Get this show started by hooking up our callbacks.  */
>      aio_context_acquire(s->ctx);
>      virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, true, true);
>      aio_context_release(s->ctx);
> +    atomic_dec(&s->starting);
> +
> +    /* Kick right away to begin processing requests already in vring */
> +    event_notifier_set(virtio_queue_get_host_notifier(s->vq));

I'm wondering whether moving this event_notifier_set() masks something?
IOW, may we run into trouble if the event notifier is set from some
other path before the callbacks are set up properly?

>      return;
> 
>    fail_host_notifier:
>      k->set_guest_notifiers(qbus->parent, 1, false);
>    fail_guest_notifiers:
>      s->disabled = true;
> -    s->starting = false;
>      vblk->dataplane_started = true;
> +    atomic_dec(&s->starting);
>  }
> 
>  /* Context: QEMU global mutex held */
> @@ -245,7 +241,7 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
>      VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
>      VirtIOBlock *vblk = VIRTIO_BLK(s->vdev);
> 
> -    if (!vblk->dataplane_started || s->stopping) {
> +    if (!vblk->dataplane_started) {

No fear of reentrancy here?

>          return;
>      }
> 
> @@ -255,7 +251,7 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
>          vblk->dataplane_started = false;
>          return;
>      }
> -    s->stopping = true;
> +
>      trace_virtio_blk_data_plane_stop(s);
> 
>      aio_context_acquire(s->ctx);
> @@ -274,5 +270,4 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
>      k->set_guest_notifiers(qbus->parent, 1, false);
> 
>      vblk->dataplane_started = false;
> -    s->stopping = false;
>  }
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH 4/4] virtio-blk: Clean up start/stop with mutex and BH
  2016-03-23  8:10           ` Cornelia Huck
@ 2016-03-23  9:08             ` Paolo Bonzini
  2016-03-23  9:12               ` Christian Borntraeger
  0 siblings, 1 reply; 70+ messages in thread
From: Paolo Bonzini @ 2016-03-23  9:08 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Michael S. Tsirkin,
	Stefan Hajnoczi, qemu-devel, tubo, Stefan Hajnoczi, borntraeger



On 23/03/2016 09:10, Cornelia Huck wrote:
>> -    /* Kick right away to begin processing requests already in vring */
>> -    event_notifier_set(virtio_queue_get_host_notifier(s->vq));
>> +    vblk->dataplane_started = true;
>>
>> -    /* Get this show started by hooking up our callbacks */
>> +    /* Get this show started by hooking up our callbacks.  */
>>      aio_context_acquire(s->ctx);
>>      virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, true, true);
>>      aio_context_release(s->ctx);
>> +    atomic_dec(&s->starting);
>> +
>> +    /* Kick right away to begin processing requests already in vring */
>> +    event_notifier_set(virtio_queue_get_host_notifier(s->vq));
> 
> I'm wondering whether moving this event_notifier_set() masks something?
> IOW, may we run into trouble if the event notifier is set from some
> other path before the callbacks are set up properly?

The reentrancy check should catch that...  But:

1) the patch really makes no difference, your fix is enough for me

2) vblk->dataplane_started becomes true before the callbacks are set;
that should be enough.

3) this matches what I tested, but it would of course be better if the
assertions on s->starting suffice

>> -    if (!vblk->dataplane_started || s->stopping) {
>> +    if (!vblk->dataplane_started) {
> 
> No fear of reentrancy here?

No, because this is only invoked from reset, hence only from the CPU
thread and only under the BQL.

On start, reentrancy happens between iothread (running outside BQL) and
a CPU thread (running within BQL).

Paolo

>>          return;
>>      }
>>
>> @@ -255,7 +251,7 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
>>          vblk->dataplane_started = false;
>>          return;
>>      }
>> -    s->stopping = true;
>> +
>>      trace_virtio_blk_data_plane_stop(s);
>>
>>      aio_context_acquire(s->ctx);
>> @@ -274,5 +270,4 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
>>      k->set_guest_notifiers(qbus->parent, 1, false);
>>
>>      vblk->dataplane_started = false;
>> -    s->stopping = false;
>>  }
>>
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH 4/4] virtio-blk: Clean up start/stop with mutex and BH
  2016-03-23  9:08             ` Paolo Bonzini
@ 2016-03-23  9:12               ` Christian Borntraeger
  2016-03-24  8:19                 ` tu bo
  0 siblings, 1 reply; 70+ messages in thread
From: Christian Borntraeger @ 2016-03-23  9:12 UTC (permalink / raw)
  To: Paolo Bonzini, Cornelia Huck
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Michael S. Tsirkin,
	Stefan Hajnoczi, qemu-devel, tubo, Stefan Hajnoczi

On 03/23/2016 10:08 AM, Paolo Bonzini wrote:
> 
> 
> On 23/03/2016 09:10, Cornelia Huck wrote:
>>> -    /* Kick right away to begin processing requests already in vring */
>>> -    event_notifier_set(virtio_queue_get_host_notifier(s->vq));
>>> +    vblk->dataplane_started = true;
>>>
>>> -    /* Get this show started by hooking up our callbacks */
>>> +    /* Get this show started by hooking up our callbacks.  */
>>>      aio_context_acquire(s->ctx);
>>>      virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, true, true);
>>>      aio_context_release(s->ctx);
>>> +    atomic_dec(&s->starting);
>>> +
>>> +    /* Kick right away to begin processing requests already in vring */
>>> +    event_notifier_set(virtio_queue_get_host_notifier(s->vq));
>>
>> I'm wondering whether moving this event_notifier_set() masks something?
>> IOW, may we run into trouble if the event notifier is set from some
>> other path before the callbacks are set up properly?
> 
> The reentrancy check should catch that...  But:
> 
> 1) the patch really makes no difference, your fix is enough for me


Tu Bo, can you test with master + Cornelia's 6 refactoring patches and nothing on top?


 
> 2) vblk->dataplane_started becomes true before the callbacks are set;
> that should be enough.
> 
> 3) this matches what I tested, but it would of course be better if the
> assertions on s->starting suffice
> 
>>> -    if (!vblk->dataplane_started || s->stopping) {
>>> +    if (!vblk->dataplane_started) {
>>
>> No fear of reentrancy here?
> 
> No, because this is only invoked from reset, hence only from the CPU
> thread and only under the BQL.
> 
> On start, reentrancy happens between iothread (running outside BQL) and
> a CPU thread (running within BQL).
> 
> Paolo
> 
>>>          return;
>>>      }
>>>
>>> @@ -255,7 +251,7 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
>>>          vblk->dataplane_started = false;
>>>          return;
>>>      }
>>> -    s->stopping = true;
>>> +
>>>      trace_virtio_blk_data_plane_stop(s);
>>>
>>>      aio_context_acquire(s->ctx);
>>> @@ -274,5 +270,4 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
>>>      k->set_guest_notifiers(qbus->parent, 1, false);
>>>
>>>      vblk->dataplane_started = false;
>>> -    s->stopping = false;
>>>  }
>>>
>>
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH 4/4] virtio-blk: Clean up start/stop with mutex and BH
  2016-03-23  9:12               ` Christian Borntraeger
@ 2016-03-24  8:19                 ` tu bo
  2016-03-24  8:32                   ` Cornelia Huck
  0 siblings, 1 reply; 70+ messages in thread
From: tu bo @ 2016-03-24  8:19 UTC (permalink / raw)
  To: Christian Borntraeger, Paolo Bonzini, Cornelia Huck
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Michael S. Tsirkin,
	Stefan Hajnoczi, qemu-devel, Stefan Hajnoczi

Hi Christian:

On 03/23/2016 05:12 PM, Christian Borntraeger wrote:
> On 03/23/2016 10:08 AM, Paolo Bonzini wrote:
>>
>>
>> On 23/03/2016 09:10, Cornelia Huck wrote:
>>>> -    /* Kick right away to begin processing requests already in vring */
>>>> -    event_notifier_set(virtio_queue_get_host_notifier(s->vq));
>>>> +    vblk->dataplane_started = true;
>>>>
>>>> -    /* Get this show started by hooking up our callbacks */
>>>> +    /* Get this show started by hooking up our callbacks.  */
>>>>       aio_context_acquire(s->ctx);
>>>>       virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, true, true);
>>>>       aio_context_release(s->ctx);
>>>> +    atomic_dec(&s->starting);
>>>> +
>>>> +    /* Kick right away to begin processing requests already in vring */
>>>> +    event_notifier_set(virtio_queue_get_host_notifier(s->vq));
>>>
>>> I'm wondering whether moving this event_notifier_set() masks something?
>>> IOW, may we run into trouble if the event notifier is set from some
>>> other path before the callbacks are set up properly?
>>
>> The reentrancy check should catch that...  But:
>>
>> 1) the patch really makes no difference, your fix is enough for me
>
>
> Tu Bo, can you test with master + Cornelia's 6 refactoring patches and nothing on top?

With qemu master + Cornelia's 6 refactoring patches and nothing on top, 
I did NOT see a crash so far.

>
>
>
>> 2) vblk->dataplane_started becomes true before the callbacks are set;
>> that should be enough.
>>
>> 3) this matches what I tested, but it would of course be better if the
>> assertions on s->starting suffice
>>
>>>> -    if (!vblk->dataplane_started || s->stopping) {
>>>> +    if (!vblk->dataplane_started) {
>>>
>>> No fear of reentrancy here?
>>
>> No, because this is only invoked from reset, hence only from the CPU
>> thread and only under the BQL.
>>
>> On start, reentrancy happens between iothread (running outside BQL) and
>> a CPU thread (running within BQL).
>>
>> Paolo
>>
>>>>           return;
>>>>       }
>>>>
>>>> @@ -255,7 +251,7 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
>>>>           vblk->dataplane_started = false;
>>>>           return;
>>>>       }
>>>> -    s->stopping = true;
>>>> +
>>>>       trace_virtio_blk_data_plane_stop(s);
>>>>
>>>>       aio_context_acquire(s->ctx);
>>>> @@ -274,5 +270,4 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
>>>>       k->set_guest_notifiers(qbus->parent, 1, false);
>>>>
>>>>       vblk->dataplane_started = false;
>>>> -    s->stopping = false;
>>>>   }
>>>>
>>>
>>
>

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH 4/4] virtio-blk: Clean up start/stop with mutex and BH
  2016-03-24  8:19                 ` tu bo
@ 2016-03-24  8:32                   ` Cornelia Huck
  2016-03-24  8:47                     ` Cornelia Huck
  0 siblings, 1 reply; 70+ messages in thread
From: Cornelia Huck @ 2016-03-24  8:32 UTC (permalink / raw)
  To: tu bo
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Michael S. Tsirkin,
	Stefan Hajnoczi, qemu-devel, Christian Borntraeger,
	Stefan Hajnoczi, Paolo Bonzini

On Thu, 24 Mar 2016 16:19:41 +0800
tu bo <tubo@linux.vnet.ibm.com> wrote:

> Hi Christian:
> 
> On 03/23/2016 05:12 PM, Christian Borntraeger wrote:
> > On 03/23/2016 10:08 AM, Paolo Bonzini wrote:
> >>
> >>
> >> On 23/03/2016 09:10, Cornelia Huck wrote:
> >>>> -    /* Kick right away to begin processing requests already in vring */
> >>>> -    event_notifier_set(virtio_queue_get_host_notifier(s->vq));
> >>>> +    vblk->dataplane_started = true;
> >>>>
> >>>> -    /* Get this show started by hooking up our callbacks */
> >>>> +    /* Get this show started by hooking up our callbacks.  */
> >>>>       aio_context_acquire(s->ctx);
> >>>>       virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, true, true);
> >>>>       aio_context_release(s->ctx);
> >>>> +    atomic_dec(&s->starting);
> >>>> +
> >>>> +    /* Kick right away to begin processing requests already in vring */
> >>>> +    event_notifier_set(virtio_queue_get_host_notifier(s->vq));
> >>>
> >>> I'm wondering whether moving this event_notifier_set() masks something?
> >>> IOW, may we run into trouble if the event notifier is set from some
> >>> other path before the callbacks are set up properly?
> >>
> >> The reentrancy check should catch that...  But:
> >>
> >> 1) the patch really makes no difference, your fix is enough for me
> >
> >
> > Tu Bo, can you test with master + Cornelia's 6 refactoring patches and nothing on top?
> 
> With qemu master + Cornelia's 6 refactoring patches and nothing on top, 
> I did NOT see a crash so far.

Cool, thanks for testing!

I'll re-send my patches with some added interface doc in patch 1. Stay
tuned.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH 4/4] virtio-blk: Clean up start/stop with mutex and BH
  2016-03-24  8:32                   ` Cornelia Huck
@ 2016-03-24  8:47                     ` Cornelia Huck
  2016-03-24  9:31                       ` Cornelia Huck
  0 siblings, 1 reply; 70+ messages in thread
From: Cornelia Huck @ 2016-03-24  8:47 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Christian Borntraeger,
	Stefan Hajnoczi, Michael S. Tsirkin, tu bo, Stefan Hajnoczi,
	Paolo Bonzini

On Thu, 24 Mar 2016 09:32:30 +0100
Cornelia Huck <cornelia.huck@de.ibm.com> wrote:

> I'll re-send my patches with some added interface doc in patch 1. Stay
> tuned.

Grr. Unfortunately, this fails for _me_ now (-EEXIST after reboot).
Debugging.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH 4/4] virtio-blk: Clean up start/stop with mutex and BH
  2016-03-24  8:47                     ` Cornelia Huck
@ 2016-03-24  9:31                       ` Cornelia Huck
  0 siblings, 0 replies; 70+ messages in thread
From: Cornelia Huck @ 2016-03-24  9:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Christian Borntraeger,
	Stefan Hajnoczi, Michael S. Tsirkin, tu bo, Stefan Hajnoczi,
	Paolo Bonzini

On Thu, 24 Mar 2016 09:47:56 +0100
Cornelia Huck <cornelia.huck@de.ibm.com> wrote:

> On Thu, 24 Mar 2016 09:32:30 +0100
> Cornelia Huck <cornelia.huck@de.ibm.com> wrote:
> 
> > I'll re-send my patches with some added interface doc in patch 1. Stay
> > tuned.
> 
> Grr. Unfortunately, this fails for _me_ now (-EEXIST after reboot).
> Debugging.

Good news is that I think I understand what happens. Bad news is that
we can scratch all of the previous testing :(

My patchset had a typo (check for !disabled instead of disabled). After
I fixed that, the second assignment of the ioeventfd started failing
(that's what changed when I started passing assign in stop_ioeventfd)
with -EEXIST as the previous ioeventfd is of course still assigned.

What we actually want is to keep the ioeventfd assigned, not add a new
one. But we still want adding a new ioeventfd to fail in case of
collisions.

I think we need to track whether we have already assigned an ioeventfd and
not re-register in that case. I'll try to cook something up.
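
Something along these lines, as a standalone sketch (plain C; the
ioeventfd_assigned flag and both helpers are made-up illustrations, not the
actual transport code):

    #include <stdbool.h>

    typedef struct {
        bool ioeventfd_assigned;   /* has a notifier already been registered? */
    } IoeventfdState;

    /* A second start request while already assigned becomes a no-op       */
    /* instead of a failing re-registration (the -EEXIST seen above).      */
    static int ioeventfd_start(IoeventfdState *s)
    {
        if (s->ioeventfd_assigned) {
            return 0;
        }
        /* ... register the notifier with the transport here ... */
        s->ioeventfd_assigned = true;
        return 0;
    }

    /* With assign=true the notifier is deliberately kept registered       */
    /* across the stop path; only a real teardown clears the flag.         */
    static void ioeventfd_stop(IoeventfdState *s, bool assign)
    {
        if (assign) {
            return;
        }
        /* ... deregister the notifier here ... */
        s->ioeventfd_assigned = false;
    }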

^ permalink raw reply	[flat|nested] 70+ messages in thread

end of thread, other threads:[~2016-03-24  9:31 UTC | newest]

Thread overview: 70+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-03-16 10:10 [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop Fam Zheng
2016-03-16 10:10 ` [Qemu-devel] [PATCH 1/4] block: Use drained section in bdrv_set_aio_context Fam Zheng
2016-03-16 10:27   ` Paolo Bonzini
2016-03-16 10:51     ` Fam Zheng
2016-03-16 10:10 ` [Qemu-devel] [PATCH 2/4] block-backend: Introduce blk_drained_begin/end Fam Zheng
2016-03-16 10:10 ` [Qemu-devel] [PATCH 3/4] virtio-blk: Use blk_drained_begin/end around dataplane stop Fam Zheng
2016-03-16 10:10 ` [Qemu-devel] [PATCH 4/4] virtio-blk: Clean up start/stop with mutex and BH Fam Zheng
2016-03-17 15:00   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2016-03-17 15:07     ` Paolo Bonzini
2016-03-22 12:52       ` Fam Zheng
2016-03-22 18:05         ` Paolo Bonzini
2016-03-23  8:10           ` Cornelia Huck
2016-03-23  9:08             ` Paolo Bonzini
2016-03-23  9:12               ` Christian Borntraeger
2016-03-24  8:19                 ` tu bo
2016-03-24  8:32                   ` Cornelia Huck
2016-03-24  8:47                     ` Cornelia Huck
2016-03-24  9:31                       ` Cornelia Huck
2016-03-16 10:28 ` [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop Paolo Bonzini
2016-03-16 10:49   ` Christian Borntraeger
2016-03-16 11:09     ` Paolo Bonzini
2016-03-16 11:24       ` Christian Borntraeger
2016-03-16 12:55         ` Paolo Bonzini
2016-03-16 13:38           ` Christian Borntraeger
2016-03-16 13:45             ` Paolo Bonzini
2016-03-17  0:39               ` Fam Zheng
2016-03-17 11:03                 ` tu bo
2016-03-21 10:57                   ` Fam Zheng
2016-03-21 11:15                     ` Cornelia Huck
2016-03-21 12:45                       ` Fam Zheng
2016-03-21 13:02                         ` Cornelia Huck
2016-03-21 23:45                           ` Fam Zheng
2016-03-22  8:06                             ` Cornelia Huck
2016-03-22  7:10                     ` tu bo
2016-03-22  7:18                       ` Fam Zheng
2016-03-22  9:07                         ` Cornelia Huck
2016-03-22  9:46                           ` Paolo Bonzini
2016-03-22 11:59                             ` Cornelia Huck
2016-03-22 12:11                               ` Paolo Bonzini
2016-03-22 12:54                                 ` Cornelia Huck
2016-03-17 12:22             ` tu bo
2016-03-17 12:39               ` Christian Borntraeger
2016-03-17 13:02                 ` Cornelia Huck
2016-03-17 15:02                 ` Paolo Bonzini
2016-03-17 15:07                   ` Christian Borntraeger
2016-03-17 15:15                   ` Christian Borntraeger
2016-03-17 15:16                     ` Christian Borntraeger
2016-03-17 16:08                       ` Christian Borntraeger
2016-03-18 15:03                         ` Paolo Bonzini
2016-03-21  9:42                           ` Fam Zheng
2016-03-21 11:10                             ` Christian Borntraeger
2016-03-21 12:17                             ` Cornelia Huck
2016-03-21 13:47                           ` TU BO
2016-03-21 13:54                             ` Paolo Bonzini
2016-03-21 14:19                               ` Cornelia Huck
2016-03-22  0:31                                 ` Fam Zheng
2016-03-16 11:32       ` Cornelia Huck
2016-03-16 11:48         ` Paolo Bonzini
2016-03-16 11:56           ` Cornelia Huck
2016-03-16 11:59             ` Paolo Bonzini
2016-03-16 12:22               ` Cornelia Huck
2016-03-16 12:32                 ` Paolo Bonzini
2016-03-16 12:42                   ` Cornelia Huck
2016-03-16 12:49                     ` Paolo Bonzini
2016-03-16 13:04                       ` Cornelia Huck
2016-03-16 13:10                         ` Paolo Bonzini
2016-03-16 13:14                           ` Cornelia Huck
2016-03-16 13:15                             ` Paolo Bonzini
2016-03-16 11:52         ` Cornelia Huck
2016-03-16 11:54           ` Paolo Bonzini
