* [Qemu-devel] [PATCH 1/1] Revert "linux-aio: Cancel BH if not needed"
@ 2016-06-24 13:40 Roman Pen
  2016-06-24 14:46 ` Kevin Wolf
  2016-06-27 16:01 ` Stefan Hajnoczi
  0 siblings, 2 replies; 5+ messages in thread
From: Roman Pen @ 2016-06-24 13:40 UTC (permalink / raw)
  Cc: Roman Pen, Kevin Wolf, Paolo Bonzini, Stefan Hajnoczi, qemu-devel

This reverts commit ccb9dc10129954d0bcd7814298ed445e684d5a2a,
which causes multiqueue (MQ) I/O through virtio_blk to get stuck.

I can reproduce the hang very easily on Stefan's recent v4 patch set
using num-queues=4:

  "[PATCH v4 0/7] virtio-blk: multiqueue support"
  https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg05999.html

Some debug output from guest:
-----------------------------

[root@andbd-vm ~]# cat /sys/block/vda/inflight
     106       98
[root@andbd-vm ~]# cat /sys/block/vda/mq/*/tags
nr_tags=128, reserved_tags=0, bits_per_word=5
nr_free=89, nr_reserved=0
active_queues=0
nr_tags=128, reserved_tags=0, bits_per_word=5
nr_free=83, nr_reserved=0
active_queues=0
nr_tags=128, reserved_tags=0, bits_per_word=5
nr_free=31, nr_reserved=0
active_queues=0
nr_tags=128, reserved_tags=0, bits_per_word=5
nr_free=105, nr_reserved=0
active_queues=0
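
As a sanity check on the numbers above, the busy tags per hardware queue
(nr_tags - nr_free) can be summed and compared with the inflight
counters. A minimal sketch in C, with the values copied from the output
above; the struct and function names are illustrative only and do not
come from the kernel or QEMU:

```c
#include <stddef.h>

/* Per-hardware-queue counters as printed by /sys/block/vda/mq/<n>/tags. */
struct hctx_tags {
    unsigned nr_tags;
    unsigned nr_free;
};

/* Values copied from the guest output above (4 queues). */
static const struct hctx_tags guest_tags[] = {
    { 128, 89 }, { 128, 83 }, { 128, 31 }, { 128, 105 },
};

/* Sum of allocated (busy) tags over all queues. */
static unsigned busy_tags(const struct hctx_tags *t, size_t n)
{
    unsigned busy = 0;

    for (size_t i = 0; i < n; i++) {
        busy += t[i].nr_tags - t[i].nr_free;
    }
    return busy;
}
```

busy_tags(guest_tags, 4) gives 39 + 45 + 97 + 23 = 204, which equals the
sum of the two inflight counters (106 + 98), so every allocated tag
corresponds to a request the guest is still waiting for.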

Fio configuration:
------------------

[global]
description=Emulation of Storage Server Access Pattern
bssplit=512/20:1k/16:2k/9:4k/12:8k/19:16k/10:32k/8:64k/4
fadvise_hint=0
rw=randrw:2
direct=1

ioengine=libaio
iodepth=64
iodepth_batch_submit=64
iodepth_batch_complete=64
numjobs=8
gtod_reduce=1
group_reporting=1

time_based=1
runtime=30

[job]
filename=/dev/vda

VM configuration:
-----------------

-object iothread,id=t0 \
-drive if=none,id=d0,file=/dev/nullb0,format=raw,snapshot=off,cache=none,aio=native \
-device virtio-blk-pci,drive=d0,iothread=t0,num-queues=4,disable-modern=off,disable-legacy=on \

Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: qemu-devel@nongnu.org
---
 block/linux-aio.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/block/linux-aio.c b/block/linux-aio.c
index e468960..fe7cece 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -149,8 +149,6 @@ static void qemu_laio_completion_bh(void *opaque)
     if (!s->io_q.plugged && !QSIMPLEQ_EMPTY(&s->io_q.pending)) {
         ioq_submit(s);
     }
-
-    qemu_bh_cancel(s->completion_bh);
 }
 
 static void qemu_laio_completion_cb(EventNotifier *e)
@@ -158,7 +156,7 @@ static void qemu_laio_completion_cb(EventNotifier *e)
     LinuxAioState *s = container_of(e, LinuxAioState, e);
 
     if (event_notifier_test_and_clear(&s->e)) {
-        qemu_laio_completion_bh(s);
+        qemu_bh_schedule(s->completion_bh);
     }
 }
 
-- 
2.8.2


* Re: [Qemu-devel] [PATCH 1/1] Revert "linux-aio: Cancel BH if not needed"
  2016-06-24 13:40 [Qemu-devel] [PATCH 1/1] Revert "linux-aio: Cancel BH if not needed" Roman Pen
@ 2016-06-24 14:46 ` Kevin Wolf
  2016-06-27 15:09   ` Stefan Hajnoczi
  2016-06-28  8:41   ` Stefan Hajnoczi
  2016-06-27 16:01 ` Stefan Hajnoczi
  1 sibling, 2 replies; 5+ messages in thread
From: Kevin Wolf @ 2016-06-24 14:46 UTC (permalink / raw)
  To: Roman Pen; +Cc: Paolo Bonzini, Stefan Hajnoczi, qemu-devel

On 24.06.2016 at 15:40, Roman Pen wrote:
> This reverts commit ccb9dc10129954d0bcd7814298ed445e684d5a2a,
> which causes MQ stuck while doing IO thru virtio_blk.

It would be good to have a theory why this happens.

> diff --git a/block/linux-aio.c b/block/linux-aio.c
> index e468960..fe7cece 100644
> --- a/block/linux-aio.c
> +++ b/block/linux-aio.c
> @@ -149,8 +149,6 @@ static void qemu_laio_completion_bh(void *opaque)
>      if (!s->io_q.plugged && !QSIMPLEQ_EMPTY(&s->io_q.pending)) {
>          ioq_submit(s);
>      }
> -
> -    qemu_bh_cancel(s->completion_bh);
>  }

Maybe if a nested event loop cancels the BH, it's missing on the next
loop iteration. Before my patch, the nested callback happened to leave
an additional BH around which the outer one actually needs.

I find this a bit ugly, but if we're okay with this mechanism we could
add a counter for the nesting level and only cancel on the top level.

If you find it as ugly as I do, a cleaner solution would be to schedule
the BH inside the loop.
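
The nesting-counter idea could look roughly like the sketch below. It is
a toy model in plain C, not QEMU code: struct toy_laio, toy_completion()
and all field names are made up for illustration, and a boolean stands in
for the scheduled state that qemu_bh_schedule()/qemu_bh_cancel() would
manage. A nested event loop is modelled as a direct recursive call.

```c
#include <stdbool.h>

/* Toy stand-in for LinuxAioState; every field here is hypothetical. */
struct toy_laio {
    bool bh_scheduled;  /* models "completion BH is scheduled" */
    int nesting;        /* how many completion handlers are on the stack */
    int pending;        /* completed events not yet processed */
    int processed;      /* events handed to their callbacks */
    int cancels;        /* how often the BH was cancelled */
};

/*
 * Models the completion handler. max_nest limits how deep callbacks
 * re-enter the handler through a nested event loop.
 */
static void toy_completion(struct toy_laio *s, int max_nest)
{
    s->nesting++;
    s->bh_scheduled = true;          /* keep the BH around for nested loops */

    while (s->pending > 0) {
        s->pending--;
        s->processed++;
        if (max_nest > 0) {
            /* The callback runs a nested event loop, which runs the BH. */
            toy_completion(s, max_nest - 1);
        }
    }

    s->nesting--;
    if (s->nesting == 0) {
        s->bh_scheduled = false;     /* only the outermost level cancels */
        s->cancels++;
    }
}
```

With three pending events and two levels of nesting, the model processes
all three events and cancels the BH exactly once, when the outermost
handler returns. Inner levels leave the BH scheduled for their callers.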

> @@ -158,7 +156,7 @@ static void qemu_laio_completion_cb(EventNotifier *e)
>      LinuxAioState *s = container_of(e, LinuxAioState, e);
>  
>      if (event_notifier_test_and_clear(&s->e)) {
> -        qemu_laio_completion_bh(s);
> +        qemu_bh_schedule(s->completion_bh);
>      }
>  }

I can't see how this hunk would make a difference. Can you confirm that
just the first hunk is enough to fix the problem?

Kevin


* Re: [Qemu-devel] [PATCH 1/1] Revert "linux-aio: Cancel BH if not needed"
  2016-06-24 14:46 ` Kevin Wolf
@ 2016-06-27 15:09   ` Stefan Hajnoczi
  2016-06-28  8:41   ` Stefan Hajnoczi
  1 sibling, 0 replies; 5+ messages in thread
From: Stefan Hajnoczi @ 2016-06-27 15:09 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Roman Pen, Paolo Bonzini, qemu-devel, Stefan Hajnoczi

On Fri, Jun 24, 2016 at 3:46 PM, Kevin Wolf <kwolf@redhat.com> wrote:
> On 24.06.2016 at 15:40, Roman Pen wrote:
>> This reverts commit ccb9dc10129954d0bcd7814298ed445e684d5a2a,
>> which causes MQ stuck while doing IO thru virtio_blk.
>
> It would be good to have a theory why this happens.

It's worth taking the batch notify BH out of the equation in
virtio_blk_data_plane_notify():

-    set_bit(virtio_get_queue_index(vq), s->batch_notify_vqs);
-    qemu_bh_schedule(s->bh);
+    if (virtio_should_notify(s->vdev, vq)) {
+        event_notifier_set(virtio_queue_get_guest_notifier(vq));
+    }

I wonder if that makes any difference?

I don't have a concrete theory why batch notify interferes with
Kevin's patch though.
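
For readers unfamiliar with the two notification paths in the snippet
above, here is a toy model of the difference. It is a sketch under
assumptions, not the dataplane code: struct toy_notifier and the toy_*
functions are hypothetical, a plain integer replaces the batch_notify_vqs
bitmap, a counter replaces the guest notifier, and the
virtio_should_notify() check is left out.

```c
#include <stdint.h>

#define TOY_NUM_QUEUES 4

struct toy_notifier {
    uint32_t batch_vqs;              /* one bit per queue awaiting a notify */
    int bh_scheduled;                /* models qemu_bh_schedule(s->bh) */
    int guest_irqs[TOY_NUM_QUEUES];  /* notifications seen by the guest */
};

/* Batched path: remember the queue and defer the notification to a BH. */
static void toy_notify_batched(struct toy_notifier *s, unsigned vq)
{
    s->batch_vqs |= UINT32_C(1) << vq;
    s->bh_scheduled = 1;
}

/* The BH: deliver one notification per queue whose bit is set. */
static void toy_notify_bh(struct toy_notifier *s)
{
    uint32_t bits = s->batch_vqs;

    s->batch_vqs = 0;
    s->bh_scheduled = 0;
    for (unsigned vq = 0; vq < TOY_NUM_QUEUES; vq++) {
        if (bits & (UINT32_C(1) << vq)) {
            s->guest_irqs[vq]++;
        }
    }
}

/* Direct path: notify immediately, once per call. */
static void toy_notify_direct(struct toy_notifier *s, unsigned vq)
{
    s->guest_irqs[vq]++;
}
```

In this model, three completions on the same queue produce a single
guest notification on the batched path, and only after the BH has run,
whereas the direct path produces three notifications right away. The
batched path therefore depends on its BH running, which is why removing
it narrows down where a lost wakeup could come from.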

Stefan


* Re: [Qemu-devel] [PATCH 1/1] Revert "linux-aio: Cancel BH if not needed"
  2016-06-24 13:40 [Qemu-devel] [PATCH 1/1] Revert "linux-aio: Cancel BH if not needed" Roman Pen
  2016-06-24 14:46 ` Kevin Wolf
@ 2016-06-27 16:01 ` Stefan Hajnoczi
  1 sibling, 0 replies; 5+ messages in thread
From: Stefan Hajnoczi @ 2016-06-27 16:01 UTC (permalink / raw)
  To: Roman Pen; +Cc: Kevin Wolf, Paolo Bonzini, Stefan Hajnoczi, qemu-devel

On Fri, Jun 24, 2016 at 2:40 PM, Roman Pen
<roman.penyaev@profitbricks.com> wrote:
> diff --git a/block/linux-aio.c b/block/linux-aio.c
> index e468960..fe7cece 100644
> --- a/block/linux-aio.c
> +++ b/block/linux-aio.c
> @@ -149,8 +149,6 @@ static void qemu_laio_completion_bh(void *opaque)
>      if (!s->io_q.plugged && !QSIMPLEQ_EMPTY(&s->io_q.pending)) {
>          ioq_submit(s);
>      }
> -
> -    qemu_bh_cancel(s->completion_bh);

This was the cause.  I've found the root cause and will send a patch.

Stefan


* Re: [Qemu-devel] [PATCH 1/1] Revert "linux-aio: Cancel BH if not needed"
  2016-06-24 14:46 ` Kevin Wolf
  2016-06-27 15:09   ` Stefan Hajnoczi
@ 2016-06-28  8:41   ` Stefan Hajnoczi
  1 sibling, 0 replies; 5+ messages in thread
From: Stefan Hajnoczi @ 2016-06-28  8:41 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Roman Pen, Paolo Bonzini, qemu-devel, Stefan Hajnoczi

On Fri, Jun 24, 2016 at 3:46 PM, Kevin Wolf <kwolf@redhat.com> wrote:
>> diff --git a/block/linux-aio.c b/block/linux-aio.c
>> index e468960..fe7cece 100644
>> --- a/block/linux-aio.c
>> +++ b/block/linux-aio.c
>> @@ -149,8 +149,6 @@ static void qemu_laio_completion_bh(void *opaque)
>>      if (!s->io_q.plugged && !QSIMPLEQ_EMPTY(&s->io_q.pending)) {
>>          ioq_submit(s);
>>      }
>> -
>> -    qemu_bh_cancel(s->completion_bh);
>>  }
>
> Maybe if a nested event loop cancels the BH, it's missing on the next
> loop iteration. Before my patch, the nested callback happened to leave
> an additional BH around which the outer one actually needs.

The scenario you described is:

qemu_laio_completion_bh()
 -> cb1()
     -> aio_poll()
         -> qemu_laio_completion_bh()
         <- qemu_laio_completion_bh() (cancel BH)
     <- aio_poll()
 <- cb1()
 -> cb2()
     -> aio_poll()
        (hang!)

This hang seems impossible because the qemu_laio_completion_bh() loop
processes all pending events.  Therefore cb1() consumes all pending
events and cb2() will not poll.

If new I/O was submitted during cb1() and cb2() waits for it, then the
eventfd will become readable upon completion and cb2() does not hang
in that case either.

If, instead of the original scenario, cb1() nests deeper, then the BH
is still scheduled and events will be processed without a hang.

In summary, the job of scheduling the BH is not to force all nested
callbacks to call qemu_laio_completion_bh().  Only the first nested
callback needs the BH so that all pending events will be processed.
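
The argument can be replayed on a toy model. The sketch below is not
QEMU code: struct toy_aio and toy_completion_bh() are hypothetical, a
nested aio_poll() is modelled as a direct recursive call, and it assumes
the handler schedules the BH on entry and cancels it unconditionally on
exit, as with the commit under discussion. It records the nesting depth
at which each event was handled.

```c
#include <stdbool.h>

#define TOY_MAX_EVENTS 8

struct toy_aio {
    int pending;                        /* completed events not yet reaped */
    int next;                           /* index of the next event to reap */
    int handled_depth[TOY_MAX_EVENTS];  /* depth at which each event ran */
    bool bh_scheduled;
};

static void toy_completion_bh(struct toy_aio *s, int depth)
{
    s->bh_scheduled = true;             /* so that a nested loop runs the BH */

    while (s->pending > 0 && s->next < TOY_MAX_EVENTS) {
        int ev = s->next++;

        s->pending--;
        s->handled_depth[ev] = depth;

        if (depth == 0 && s->bh_scheduled) {
            /* cb1() enters a nested aio_poll(), which runs the BH. */
            toy_completion_bh(s, depth + 1);
        }
    }

    s->bh_scheduled = false;            /* unconditional cancel */
}
```

Starting with three pending events, the first is handled at depth 0 and
the other two at depth 1, inside the nested call made from the first
callback. When control returns to the outer loop nothing is pending, so
no later callback has to poll for an event that was already consumed,
even though the nested call cancelled the BH.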

Stefan
