All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] Help on possible hang in drive-mirror / query-block-jobs
@ 2014-07-10 15:11 Daniel P. Berrange
  2014-07-10 15:21 ` Paolo Bonzini
  0 siblings, 1 reply; 4+ messages in thread
From: Daniel P. Berrange @ 2014-07-10 15:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, Fam Zheng, Stefan Hajnoczi

I've spent the last week debugging an issue that is hitting OpenStack
with drive-mirror/block job usage.

The OpenStack automated test system runs a massive integration test
suite against every commit to any part of OpenStack. Amongst the many
things tested is a live snapshot feature which relies on QMEU's
drive-mirror functionality. When this was enabled in the test suite,
we started seeing failures in the automated test harness.

Specifically we are seeing that a monitor command for 'query-block-jobs'
never replies to libvirt. After 3 minutes of waiting the test harness
times out and kills the VM. When working normally the entire test will
complete in just a couple of seconds, so we don't think the 3 minute
timeout is hitting a false positive.

For each commit going into openstack the tests trigger about 200
uses of drive-mirror, and we have about a 50% change of failing on
any single commit. So the rate failure of this problem is only
around 1 in 400 uses of drive-mirror, but that's still enough to
totally doom our automated testing rig, since a failing test blocks
the commit from merging into GIT master.

Unfortunately it is a total PITA to debug this. No one has ever
managed to reproduce the problem outside of our automated test
system environment, even when running the same tests locally and
we can't log into the test systems to get GDB traces or install
custom QEMU builds.

The best I can do is to collect debug logs from libvirtd, and get
stdio from QEMU. The QEMU stderr/stdout shows nothing at all. The
libvirtd log shows the following sequence of monitor interactions
with QEMU:

1. Libvirt tries to cancel any potential stale jobs:

msg={"execute":"block-job-cancel","arguments":{"device":"drive-virtio-disk0"},"id":"libvirt-9"}
reply={"id": "libvirt-9", "error": {"class": "DeviceNotActive", "desc": "No active block job on device 'drive-virtio-disk0'"}}

msg={"execute":"block-job-cancel","arguments":{"device":"drive-virtio-disk0"},"id":"libvirt-10"}
reply={"id": "libvirt-10", "error": {"class": "DeviceNotActive", "desc": "No active block job on device 'drive-virtio-disk0'"}}

2. Libvirt starts the drive mirror operation

msg={"execute":"drive-mirror","arguments":{"device":"drive-virtio-disk0","target":"/opt/stack/data/nova/instances/snapshots/tmp79q8xw/9739fea13dfe4e4794f85b57d6572d81.delta","speed":0,"sync":"top","mode":"existing","format":"qcow2"},"id":"libvirt-11"}
reply={"return": {}, "id": "libvirt-11"}

3. Libvirt checks progress of the block jobs
   until offset == len

msg={"execute":"query-block-jobs","id":"libvirt-12"}
reply={"return": [{"io-status": "ok", "device": "drive-virtio-disk0", "busy": true, "len": 25165824, "offset": 0, "paused": false, "speed": 0, "type": "mirror"}], "id": "libvirt-12"}
event={"timestamp": {"seconds": 1404903950, "microseconds": 273185}, "event": "BLOCK_JOB_READY", "data": {"device": "drive-virtio-disk0", "len": 25165824, "offset": 25165824, "speed": 0, "type": "mirror"}}

msg={"execute":"query-block-jobs","id":"libvirt-13"}
reply={"return": [{"io-status": "ok", "device": "drive-virtio-disk0", "busy": false, "len": 25165824, "offset": 25165824, "paused": false, "speed": 0, "type": "mirror"}], "id": "libvirt-13"}

4. Libvirt cleans up the block job:

msg={"execute":"block-job-cancel","arguments":{"device":"drive-virtio-disk0"},"id":"libvirt-14"}
reply={"return": {}, "id": "libvirt-14"}

5. Libvirt waits for cleanup to complete:

msg={"execute":"query-block-jobs","id":"libvirt-15"}
reply={"return": [{"io-status": "ok", "device": "drive-virtio-disk0", "busy": true, "len": 25165824, "offset": 25165824, "paused": false, "speed": 0, "type": "mirror"}], "id": "libvirt-15"}

msg={"execute":"query-block-jobs","id":"libvirt-16"}
reply={"return": [{"io-status": "ok", "device": "drive-virtio-disk0", "busy": true, "len": 25165824, "offset": 25165824, "paused": false, "speed": 0, "type": "mirror"}], "id": "libvirt-16"}

msg={"execute":"query-block-jobs","id":"libvirt-17"}
<...hang...>

So we can see this last 'query-block-jobs' command hangs. I've looked at
the code for handling this monitor command and struggling to come up with
any ideas of why this might hang.  My best idea was the bdrv_iterate()
call it does might be happening at the same time as another thread modifies
the list, but debugging on a local QEMU shows no changes to the list at
all due to drive-mirror/block jobs, so that doesn't seem to be the cause.

Can anyone conceive of a reason why this monitor command might hang ?

Our test systems run Ubuntu 14.04 which has QEMU version 2.0.0.

I see a couple of changes in GIT master to blockjob.c sinc v2.0.0 but
nothing that claims to be solving a hang.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Qemu-devel] Help on possible hang in drive-mirror / query-block-jobs
  2014-07-10 15:11 [Qemu-devel] Help on possible hang in drive-mirror / query-block-jobs Daniel P. Berrange
@ 2014-07-10 15:21 ` Paolo Bonzini
  2014-07-10 15:53   ` Daniel P. Berrange
  0 siblings, 1 reply; 4+ messages in thread
From: Paolo Bonzini @ 2014-07-10 15:21 UTC (permalink / raw)
  To: Daniel P. Berrange, qemu-devel; +Cc: Kevin Wolf, Fam Zheng, Stefan Hajnoczi

Il 10/07/2014 17:11, Daniel P. Berrange ha scritto:
> I've spent the last week debugging an issue that is hitting OpenStack
> with drive-mirror/block job usage.
>
> Specifically we are seeing that a monitor command for 'query-block-jobs'
> never replies to libvirt. After 3 minutes of waiting the test harness
> times out and kills the VM. When working normally the entire test will
> complete in just a couple of seconds, so we don't think the 3 minute
> timeout is hitting a false positive.
>
> [...] The rate failure of this problem is only
> around 1 in 400 uses of drive-mirror. [...] No one has ever
> managed to reproduce the problem outside of our automated test
> system environment, even when running the same tests locally and
> we can't log into the test systems to get GDB traces or install
> custom QEMU builds.
>
> The best I can do is to collect debug logs from libvirtd, and get
> stdio from QEMU. The QEMU stderr/stdout shows nothing at all. The
> libvirtd log shows the following sequence of monitor interactions
> with QEMU:
>
> [...]
>
> 5. Libvirt waits for cleanup to complete:
>
> msg={"execute":"query-block-jobs","id":"libvirt-15"}
> reply={"return": [{"io-status": "ok", "device": "drive-virtio-disk0", "busy": true, "len": 25165824, "offset": 25165824, "paused": false, "speed": 0, "type": "mirror"}], "id": "libvirt-15"}
>
> msg={"execute":"query-block-jobs","id":"libvirt-16"}
> reply={"return": [{"io-status": "ok", "device": "drive-virtio-disk0", "busy": true, "len": 25165824, "offset": 25165824, "paused": false, "speed": 0, "type": "mirror"}], "id": "libvirt-16"}
>
> msg={"execute":"query-block-jobs","id":"libvirt-17"}
> <...hang...>
>
> So we can see this last 'query-block-jobs' command hangs. I've looked at
> the code for handling this monitor command and struggling to come up with
> any ideas of why this might hang.  My best idea was the bdrv_iterate()
> call it does might be happening at the same time as another thread modifies
> the list, but debugging on a local QEMU shows no changes to the list at
> all due to drive-mirror/block jobs, so that doesn't seem to be the cause.

No, all these modifications are anyway done with the big QEMU lock.  All 
multitasking happens in coroutines, and the preemption points only 
happen when you have AIO.

Can you install a custom QEMU?  How many megabytes of stdout can your 
test rig tolerate?  Any chance you can collect other files (traces)?

Paolo

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Qemu-devel] Help on possible hang in drive-mirror / query-block-jobs
  2014-07-10 15:21 ` Paolo Bonzini
@ 2014-07-10 15:53   ` Daniel P. Berrange
  2014-07-10 16:02     ` Paolo Bonzini
  0 siblings, 1 reply; 4+ messages in thread
From: Daniel P. Berrange @ 2014-07-10 15:53 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Kevin Wolf, Fam Zheng, qemu-devel, Stefan Hajnoczi

On Thu, Jul 10, 2014 at 05:21:36PM +0200, Paolo Bonzini wrote:
> Il 10/07/2014 17:11, Daniel P. Berrange ha scritto:
> >I've spent the last week debugging an issue that is hitting OpenStack
> >with drive-mirror/block job usage.
> >
> >Specifically we are seeing that a monitor command for 'query-block-jobs'
> >never replies to libvirt. After 3 minutes of waiting the test harness
> >times out and kills the VM. When working normally the entire test will
> >complete in just a couple of seconds, so we don't think the 3 minute
> >timeout is hitting a false positive.
> >
> >[...] The rate failure of this problem is only
> >around 1 in 400 uses of drive-mirror. [...] No one has ever
> >managed to reproduce the problem outside of our automated test
> >system environment, even when running the same tests locally and
> >we can't log into the test systems to get GDB traces or install
> >custom QEMU builds.
> >
> >The best I can do is to collect debug logs from libvirtd, and get
> >stdio from QEMU. The QEMU stderr/stdout shows nothing at all. The
> >libvirtd log shows the following sequence of monitor interactions
> >with QEMU:
> >
> >[...]
> >
> >5. Libvirt waits for cleanup to complete:
> >
> >msg={"execute":"query-block-jobs","id":"libvirt-15"}
> >reply={"return": [{"io-status": "ok", "device": "drive-virtio-disk0", "busy": true, "len": 25165824, "offset": 25165824, "paused": false, "speed": 0, "type": "mirror"}], "id": "libvirt-15"}
> >
> >msg={"execute":"query-block-jobs","id":"libvirt-16"}
> >reply={"return": [{"io-status": "ok", "device": "drive-virtio-disk0", "busy": true, "len": 25165824, "offset": 25165824, "paused": false, "speed": 0, "type": "mirror"}], "id": "libvirt-16"}
> >
> >msg={"execute":"query-block-jobs","id":"libvirt-17"}
> ><...hang...>
> >
> >So we can see this last 'query-block-jobs' command hangs. I've looked at
> >the code for handling this monitor command and struggling to come up with
> >any ideas of why this might hang.  My best idea was the bdrv_iterate()
> >call it does might be happening at the same time as another thread modifies
> >the list, but debugging on a local QEMU shows no changes to the list at
> >all due to drive-mirror/block jobs, so that doesn't seem to be the cause.
> 
> No, all these modifications are anyway done with the big QEMU lock.  All
> multitasking happens in coroutines, and the preemption points only happen
> when you have AIO.
> 
> Can you install a custom QEMU?  How many megabytes of stdout can your test
> rig tolerate?  Any chance you can collect other files (traces)?

I can possibly come up with some gross hack to wget a qemu binary from
an external host at the start of the test. Can generate 100MB of test
logs without it being an issue.

What sort of info would be helpful to collect. I can't easily get
arbitrary files out of the test hosts, but if there's a way to get the
debug info into QEMU's  stdout/stderr logs then we're collecting those
already.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Qemu-devel] Help on possible hang in drive-mirror / query-block-jobs
  2014-07-10 15:53   ` Daniel P. Berrange
@ 2014-07-10 16:02     ` Paolo Bonzini
  0 siblings, 0 replies; 4+ messages in thread
From: Paolo Bonzini @ 2014-07-10 16:02 UTC (permalink / raw)
  To: Daniel P. Berrange; +Cc: Kevin Wolf, Fam Zheng, qemu-devel, Stefan Hajnoczi

Il 10/07/2014 17:53, Daniel P. Berrange ha scritto:
>> > Can you install a custom QEMU?  How many megabytes of stdout can your test
>> > rig tolerate?  Any chance you can collect other files (traces)?
> I can possibly come up with some gross hack to wget a qemu binary from
> an external host at the start of the test. Can generate 100MB of test
> logs without it being an issue.
>
> What sort of info would be helpful to collect. I can't easily get
> arbitrary files out of the test hosts, but if there's a way to get the
> debug info into QEMU's  stdout/stderr logs then we're collecting those
> already.

If you can compile a custom QEMU, then yes it's possible.

Paolo

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2014-07-10 16:02 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-07-10 15:11 [Qemu-devel] Help on possible hang in drive-mirror / query-block-jobs Daniel P. Berrange
2014-07-10 15:21 ` Paolo Bonzini
2014-07-10 15:53   ` Daniel P. Berrange
2014-07-10 16:02     ` Paolo Bonzini

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.