* [Qemu-devel] New iotest repros failures on virtio external snapshot with iothread
@ 2017-03-30  2:01 Ed Swierk
  2017-03-30  2:16 ` Eric Blake
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Ed Swierk @ 2017-03-30  2:01 UTC (permalink / raw)
  To: Fam Zheng, Kevin Wolf, Paolo Bonzini, John Snow, qemu-devel, qemu-block

Parts of qemu's block code have changed a lot in recent months but are
not well exercised by current tests.

Subtle bugs have crept in causing assertion failures, hangs and other
crashes in a variety of situations: immediately on start, on first
guest activity, on external snapshot create or commit, on qmp quit
command.

Reproducing these bugs has proved tricky, as each may occur only with
a specific combination of qemu version, block device type (virtio-blk
or virtio-scsi) and iothread enabled or not. In some cases the bug
occurs only after several external snapshot operations. And in some
cases the bug only manifests when a guest is accessing the block
device simultaneously.

I've written an iotest (number 176, for now) that attempts to cover
many of these configurations. Currently it only exercises the external
snapshot create and commit lifted from iotest 118. The new iotest does
this repeatedly in each of 16 combinations:
- no guest / guest
- virtio-blk / virtio-scsi
- no iothread / iothread
- single / repeated external snapshot create+commit
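
As a rough sketch (not the code in the branches linked below), the 16
cases are simply the product of the four two-valued axes listed above.
The constant names and tuple layout here are assumptions made for
illustration:

```python
import itertools

# Axis values copied from the list above; names are hypothetical.
GUEST = ("no guest", "guest")
DEVICE = ("virtio-blk", "virtio-scsi")
IOTHREAD = ("no iothread", "iothread")
SNAPSHOT = ("single", "repeated")


def test_matrix():
    """Return every (guest, device, iothread, snapshot) combination."""
    return list(itertools.product(GUEST, DEVICE, IOTHREAD, SNAPSHOT))
```

Iterating over test_matrix() gives one tuple per test case, which keeps
the per-case setup code free of nested loops.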

I made some minor changes to the test infrastructure so the new iotest
can deal gracefully with qemu hanging--the test script itself
shouldn't hang. And in all failure modes the test needs to expose
enough console output and other information to diagnose the problem.
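
One way to keep a harness from hanging along with the process under
test is to bound the wait and kill the child on timeout, while still
collecting whatever it printed. The helper below is a minimal sketch
under that assumption; run_with_timeout and its status strings are
hypothetical and are not taken from the iotests infrastructure:

```python
import subprocess


def run_with_timeout(argv, timeout_s):
    """Run argv and return (status, output).

    status is "ok", "failed" (non-zero exit) or "hang" (timed out).
    """
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    try:
        out, _ = proc.communicate(timeout=timeout_s)
        status = "ok" if proc.returncode == 0 else "failed"
    except subprocess.TimeoutExpired:
        # The child is stuck: kill it so the caller never blocks forever,
        # then collect the output produced before the hang.
        proc.kill()
        out, _ = proc.communicate()
        status = "hang"
    return status, out.decode(errors="replace")
```

Returning the captured output in every case, including the hang case,
is what lets a failing run be diagnosed afterwards.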

The main departure from existing iotests is running a real guest. I
used buildroot to generate a small (~4 MB) Linux kernel with built-in
initrd containing a busybox-based userland. After the iotest launches
qemu, the guest loops writing to the block device, while the test
performs snapshot operations.
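
For readers unfamiliar with the snapshot operations, one create+commit
cycle can be expressed as a short sequence of QMP commands. This is a
sketch only: the device name and overlay path passed in are
placeholders, and whether the test issues exactly this sequence is an
assumption.

```python
import json


def snapshot_cycle(device, overlay_path):
    """Return the QMP commands for one external snapshot create+commit."""
    return [
        # Create an external snapshot: new writes go to the overlay file.
        {"execute": "blockdev-snapshot-sync",
         "arguments": {"device": device,
                       "snapshot-file": overlay_path,
                       "format": "qcow2"}},
        # Commit the overlay back into its backing file.
        {"execute": "block-commit",
         "arguments": {"device": device}},
        # Committing the active layer signals BLOCK_JOB_READY; the job
        # then has to be completed explicitly.
        {"execute": "block-job-complete",
         "arguments": {"device": device}},
    ]


def to_wire(commands):
    """Serialize commands as newline-separated JSON, one per line."""
    return "\n".join(json.dumps(cmd) for cmd in commands)
```

A caller would wait for the matching event or reply between commands
rather than sending all three at once.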

I ran the new iotest on 3 qemu versions: 2.7.1, stable-2.8-staging and
2.9.0-rc2. The latter two fail several test cases, all
iothread-enabled. Only 2.7.1 passes all the cases.

Here is the code for the new iotest (I didn't dare email patches with
a 4 MB blob):
https://github.com/skyportsystems/qemu-1/commits/eswierk-iotests-2.7
https://github.com/skyportsystems/qemu-1/commits/eswierk-iotests-2.8
https://github.com/skyportsystems/qemu-1/commits/eswierk-iotests-2.9

And here is the buildroot I used to generate the guest Linux kernel+initrd:
https://github.com/skyportsystems/buildroot-1/commits/qemu-iotests

Please check out the code and try the new test--particularly anyone
who can also help figure out these failures. (Note that since half the
test cases use an iothread, /dev/kvm must be readable and writable.)

* stable-2.8-staging
- guest, virtio-blk, iothread, single snapshot create+commit: hang on
quit (intermittent)
- guest, virtio-blk, iothread, repeated snapshot create+commit: hang
after 1 iteration
- guest, virtio-scsi, iothread, single snapshot create+commit: hang on
quit (intermittent)
- guest, virtio-scsi, iothread, repeated snapshot create+commit: hang
after 1 iteration

* 2.9.0-rc2
- guest, virtio-blk, iothread, single snapshot create+commit:
"include/block/aio.h:457: aio_enable_external: Assertion
`ctx->external_disable_cnt > 0' failed." after snapshot create
- guest, virtio-blk, iothread, repeated snapshot create+commit: same as above
- guest, virtio-scsi, iothread, single snapshot create+commit: same as above
- guest, virtio-scsi, iothread, repeated snapshot create+commit: same as above
- no guest, virtio-blk, iothread, repeated snapshot create+commit: same as above
- no guest, virtio-scsi, iothread, single snapshot create+commit: same as above
- no guest, virtio-scsi, iothread, repeated snapshot create+commit:
same as above
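
The assertion text suggests a nesting counter: each enable must be
preceded by a matching disable on the same context. The toy model below
only illustrates that invariant. It is not QEMU's code, the class and
method names are made up, and it does not say where the imbalance
arises in 2.9.0-rc2.

```python
class AioContextModel:
    """Toy model of a disable/enable nesting counter (hypothetical)."""

    def __init__(self):
        self.external_disable_cnt = 0

    def disable_external(self):
        self.external_disable_cnt += 1

    def enable_external(self):
        # Same condition as in the assertion message quoted above.
        assert self.external_disable_cnt > 0
        self.external_disable_cnt -= 1
```

In this model, calling enable_external() on a context that was never
disabled (for example, a different context object than the one that was
disabled) trips the assertion immediately.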

--Ed

* Re: [Qemu-devel] New iotest repros failures on virtio external snapshot with iothread
  2017-03-30  2:01 [Qemu-devel] New iotest repros failures on virtio external snapshot with iothread Ed Swierk
@ 2017-03-30  2:16 ` Eric Blake
  2017-03-30 23:43   ` Laszlo Ersek
  2017-04-07 13:58   ` Max Reitz
  2017-03-30 23:06 ` John Snow
  2017-04-03 16:57 ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
  2 siblings, 2 replies; 9+ messages in thread
From: Eric Blake @ 2017-03-30  2:16 UTC (permalink / raw)
  To: Ed Swierk, Fam Zheng, Kevin Wolf, Paolo Bonzini, John Snow,
	qemu-devel, qemu-block

On 03/29/2017 09:01 PM, Ed Swierk via Qemu-devel wrote:
> Parts of qemu's block code have changed a lot in recent months but are
> not well exercised by current tests.
> 
> Subtle bugs have crept in causing assertion failures, hangs and other
> crashes in a variety of situations: immediately on start, on first
> guest activity, on external snapshot create or commit, on qmp quit
> command.
> 
> Reproducing these bugs has proved tricky, as each may occur only with
> a specific combination of qemu version, block device type (virtio-blk
> or virtio-scsi) and iothread enabled or not. In some cases the bug
> occurs only after several external snapshot operations. And in some
> cases the bug only manifests when a guest is accessing the block
> device simultaneously.
> 
> I've written an iotest (number 176, for now) that attempts to cover

At least one other thread has already proposed a test 176.  It's
somewhat straightforward to renumber things, but I'm wondering if there
is some even-more-efficient way of reserving test numbers, perhaps
through the wiki, since we are finding that test numbers get reserved
several weeks before actually getting merged into the tree.

> many of these configurations. Currently it only exercises the external
> snapshot create and commit lifted from iotest 118. The new iotest does
> this repeatedly in each of 16 combinations:
> - no guest / guest
> - virtio-blk / virtio-scsi
> - no iothread / iothread
> - single / repeated external snapshot create+commit
> 
> I made some minor changes to the test infrastructure so the new iotest
> can deal gracefully with qemu hanging--the test script itself
> shouldn't hang. And in all failure modes the test needs to expose
> enough console output and other information to diagnose the problem.

Some of those changes sound like they are worth posting to the list
as-is, separate from the actual new test.

> 
> The main departure from existing iotests is running a real guest. I
> used buildroot to generate a small (~4 MB) Linux kernel with built-in
> initrd containing a busybox-based userland. After the iotest launches
> qemu, the guest loops writing to the block device, while the test
> performs snapshot operations.
> 
> I ran the new iotest on 3 qemu versions: 2.7.1, stable-2.8-staging and
> 2.9.0-rc2. The latter two fail several test cases, all
> iothread-enabled. Only 2.7.1 passes all the cases.
> 
> Here is the code for the new iotest (I didn't dare email patches with
> a 4 MB blob):
> https://github.com/skyportsystems/qemu-1/commits/eswierk-iotests-2.7
> https://github.com/skyportsystems/qemu-1/commits/eswierk-iotests-2.8
> https://github.com/skyportsystems/qemu-1/commits/eswierk-iotests-2.9
> 
> And here is the buildroot I used to generate the guest Linux kernel+initrd:
> https://github.com/skyportsystems/buildroot-1/commits/qemu-iotests
> 
> Please check out the code and try the new test--particularly anyone
> who can also help figure out these failures. (Note that since half the
> test cases use an iothread, /dev/kvm must be readable and writable.)
> 
> * stable-2.8-staging
> - guest, virtio-blk, iothread, single snapshot create+commit: hang on
> quit (intermittent)
> - guest, virtio-blk, iothread, repeated snapshot create+commit: hang
> after 1 iteration
> - guest, virtio-scsi, iothread, single snapshot create+commit: hang on
> quit (intermittent)
> - guest, virtio-scsi, iothread, repeated snapshot create+commit: hang
> after 1 iteration
> 
> * 2.9.0-rc2
> - guest, virtio-blk, iothread, single snapshot create+commit:
> "include/block/aio.h:457: aio_enable_external: Assertion
> `ctx->external_disable_cnt > 0' failed." after snapshot create

It would be nice if we could get to the root cause and squash that one
before 2.9.

> - guest, virtio-blk, iothread, repeated snapshot create+commit: same as above
> - guest, virtio-scsi, iothread, single snapshot create+commit: same as above
> - guest, virtio-scsi, iothread, repeated snapshot create+commit: same as above
> - no guest, virtio-blk, iothread, repeated snapshot create+commit: same as above
> - no guest, virtio-scsi, iothread, single snapshot create+commit: same as above
> - no guest, virtio-scsi, iothread, repeated snapshot create+commit:
> same as above
> 
> --Ed
> 
> 

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


* Re: [Qemu-devel] New iotest repros failures on virtio external snapshot with iothread
  2017-03-30  2:01 [Qemu-devel] New iotest repros failures on virtio external snapshot with iothread Ed Swierk
  2017-03-30  2:16 ` Eric Blake
@ 2017-03-30 23:06 ` John Snow
  2017-03-30 23:26   ` Ed Swierk
  2017-04-03 16:57 ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
  2 siblings, 1 reply; 9+ messages in thread
From: John Snow @ 2017-03-30 23:06 UTC (permalink / raw)
  To: Ed Swierk, Fam Zheng, Kevin Wolf, Paolo Bonzini, qemu-devel, qemu-block



On 03/29/2017 10:01 PM, Ed Swierk via Qemu-devel wrote:
> Parts of qemu's block code have changed a lot in recent months but are
> not well exercised by current tests.
> 
> Subtle bugs have crept in causing assertion failures, hangs and other
> crashes in a variety of situations: immediately on start, on first
> guest activity, on external snapshot create or commit, on qmp quit
> command.
> 
> Reproducing these bugs has proved tricky, as each may occur only with
> a specific combination of qemu version, block device type (virtio-blk
> or virtio-scsi) and iothread enabled or not. In some cases the bug
> occurs only after several external snapshot operations. And in some
> cases the bug only manifests when a guest is accessing the block
> device simultaneously.
> 
> I've written an iotest (number 176, for now) that attempts to cover
> many of these configurations. Currently it only exercises the external
> snapshot create and commit lifted from iotest 118. The new iotest does
> this repeatedly in each of 16 combinations:
> - no guest / guest
> - virtio-blk / virtio-scsi
> - no iothread / iothread
> - single / repeated external snapshot create+commit
> 
> I made some minor changes to the test infrastructure so the new iotest
> can deal gracefully with qemu hanging--the test script itself
> shouldn't hang. And in all failure modes the test needs to expose
> enough console output and other information to diagnose the problem.
> 
> The main departure from existing iotests is running a real guest. I
> used buildroot to generate a small (~4 MB) Linux kernel with built-in
> initrd containing a busybox-based userland. After the iotest launches
> qemu, the guest loops writing to the block device, while the test
> performs snapshot operations.
> 
> I ran the new iotest on 3 qemu versions: 2.7.1, stable-2.8-staging and
> 2.9.0-rc2. The latter two fail several test cases, all
> iothread-enabled. Only 2.7.1 passes all the cases.
> 
> Here is the code for the new iotest (I didn't dare email patches with
> a 4 MB blob):
> https://github.com/skyportsystems/qemu-1/commits/eswierk-iotests-2.7
> https://github.com/skyportsystems/qemu-1/commits/eswierk-iotests-2.8
> https://github.com/skyportsystems/qemu-1/commits/eswierk-iotests-2.9
> 
> And here is the buildroot I used to generate the guest Linux kernel+initrd:
> https://github.com/skyportsystems/buildroot-1/commits/qemu-iotests
> 
> Please check out the code and try the new test--particularly anyone
> who can also help figure out these failures. (Note that since half the
> test cases use an iothread, /dev/kvm must be readable and writable.)
> 
> * stable-2.8-staging
> - guest, virtio-blk, iothread, single snapshot create+commit: hang on
> quit (intermittent)
> - guest, virtio-blk, iothread, repeated snapshot create+commit: hang
> after 1 iteration
> - guest, virtio-scsi, iothread, single snapshot create+commit: hang on
> quit (intermittent)
> - guest, virtio-scsi, iothread, repeated snapshot create+commit: hang
> after 1 iteration
> 
> * 2.9.0-rc2
> - guest, virtio-blk, iothread, single snapshot create+commit:
> "include/block/aio.h:457: aio_enable_external: Assertion
> `ctx->external_disable_cnt > 0' failed." after snapshot create
> - guest, virtio-blk, iothread, repeated snapshot create+commit: same as above
> - guest, virtio-scsi, iothread, single snapshot create+commit: same as above
> - guest, virtio-scsi, iothread, repeated snapshot create+commit: same as above
> - no guest, virtio-blk, iothread, repeated snapshot create+commit: same as above
> - no guest, virtio-scsi, iothread, single snapshot create+commit: same as above
> - no guest, virtio-scsi, iothread, repeated snapshot create+commit:
> same as above
> 

Do you mean to say that all of these 2.9.0-rc2 cases produce the same
aio.h assertion?

> --Ed
> 

* Re: [Qemu-devel] New iotest repros failures on virtio external snapshot with iothread
  2017-03-30 23:06 ` John Snow
@ 2017-03-30 23:26   ` Ed Swierk
  0 siblings, 0 replies; 9+ messages in thread
From: Ed Swierk @ 2017-03-30 23:26 UTC (permalink / raw)
  To: John Snow; +Cc: Fam Zheng, Kevin Wolf, Paolo Bonzini, qemu-devel, qemu-block

On Thu, Mar 30, 2017 at 4:06 PM, John Snow <jsnow@redhat.com> wrote:
> On 03/29/2017 10:01 PM, Ed Swierk via Qemu-devel wrote:
>> * 2.9.0-rc2
>> - guest, virtio-blk, iothread, single snapshot create+commit:
>> "include/block/aio.h:457: aio_enable_external: Assertion
>> `ctx->external_disable_cnt > 0' failed." after snapshot create
>> - guest, virtio-blk, iothread, repeated snapshot create+commit: same as above
>> - guest, virtio-scsi, iothread, single snapshot create+commit: same as above
>> - guest, virtio-scsi, iothread, repeated snapshot create+commit: same as above
>> - no guest, virtio-blk, iothread, repeated snapshot create+commit: same as above
>> - no guest, virtio-scsi, iothread, single snapshot create+commit: same as above
>> - no guest, virtio-scsi, iothread, repeated snapshot create+commit:
>> same as above
>
> Do you mean to say that all of these 2.9.0-rc2 cases produce the same
> aio.h assertion?

Correct.

--Ed

* Re: [Qemu-devel] New iotest repros failures on virtio external snapshot with iothread
  2017-03-30  2:16 ` Eric Blake
@ 2017-03-30 23:43   ` Laszlo Ersek
  2017-04-07 13:58   ` Max Reitz
  1 sibling, 0 replies; 9+ messages in thread
From: Laszlo Ersek @ 2017-03-30 23:43 UTC (permalink / raw)
  To: Eric Blake, Ed Swierk, Fam Zheng, Kevin Wolf, Paolo Bonzini,
	John Snow, qemu-devel, qemu-block

On 03/30/17 04:16, Eric Blake wrote:
> On 03/29/2017 09:01 PM, Ed Swierk via Qemu-devel wrote:
>> Parts of qemu's block code have changed a lot in recent months but are
>> not well exercised by current tests.
>>
>> Subtle bugs have crept in causing assertion failures, hangs and other
>> crashes in a variety of situations: immediately on start, on first
>> guest activity, on external snapshot create or commit, on qmp quit
>> command.
>>
>> Reproducing these bugs has proved tricky, as each may occur only with
>> a specific combination of qemu version, block device type (virtio-blk
>> or virtio-scsi) and iothread enabled or not. In some cases the bug
>> occurs only after several external snapshot operations. And in some
>> cases the bug only manifests when a guest is accessing the block
>> device simultaneously.
>>
>> I've written an iotest (number 176, for now) that attempts to cover
> 
> At least one other thread has already proposed a test 176.  It's
> somewhat straightforward to renumber things, but I'm wondering if there
> is some even-more-efficient way of reserving test numbers, perhaps
> through the wiki, since we are finding that test numbers get reserved
> several weeks before actually getting merged into the tree.

UEFI / edk2 solves this problem elegantly by naming everything with
globally unique identifiers, so if you need a new thing, just run
"uuidgen". No coordination required.

In practice it would result in subjects like

[Qemu-devel] [PATCH for-2.9] iotests: Fix test
3dec30b6-f69b-4eb0-8f89-87063433c830

I shall now retreat to my cave.

Laszlo
;)

> 
>> many of these configurations. Currently it only exercises the external
>> snapshot create and commit lifted from iotest 118. The new iotest does
>> this repeatedly in each of 16 combinations:
>> - no guest / guest
>> - virtio-blk / virtio-scsi
>> - no iothread / iothread
>> - single / repeated external snapshot create+commit
>>
>> I made some minor changes to the test infrastructure so the new iotest
>> can deal gracefully with qemu hanging--the test script itself
>> shouldn't hang. And in all failure modes the test needs to expose
>> enough console output and other information to diagnose the problem.
> 
> Some of those changes sound like they are worth posting to the list
> as-is, separate from the actual new test.
> 
>>
>> The main departure from existing iotests is running a real guest. I
>> used buildroot to generate a small (~4 MB) Linux kernel with built-in
>> initrd containing a busybox-based userland. After the iotest launches
>> qemu, the guest loops writing to the block device, while the test
>> performs snapshot operations.
>>
>> I ran the new iotest on 3 qemu versions: 2.7.1, stable-2.8-staging and
>> 2.9.0-rc2. The latter two fail several test cases, all
>> iothread-enabled. Only 2.7.1 passes all the cases.
>>
>> Here is the code for the new iotest (I didn't dare email patches with
>> a 4 MB blob):
>> https://github.com/skyportsystems/qemu-1/commits/eswierk-iotests-2.7
>> https://github.com/skyportsystems/qemu-1/commits/eswierk-iotests-2.8
>> https://github.com/skyportsystems/qemu-1/commits/eswierk-iotests-2.9
>>
>> And here is the buildroot I used to generate the guest Linux kernel+initrd:
>> https://github.com/skyportsystems/buildroot-1/commits/qemu-iotests
>>
>> Please check out the code and try the new test--particularly anyone
>> who can also help figure out these failures. (Note that since half the
>> test cases use an iothread, /dev/kvm must be readable and writable.)
>>
>> * stable-2.8-staging
>> - guest, virtio-blk, iothread, single snapshot create+commit: hang on
>> quit (intermittent)
>> - guest, virtio-blk, iothread, repeated snapshot create+commit: hang
>> after 1 iteration
>> - guest, virtio-scsi, iothread, single snapshot create+commit: hang on
>> quit (intermittent)
>> - guest, virtio-scsi, iothread, repeated snapshot create+commit: hang
>> after 1 iteration
>>
>> * 2.9.0-rc2
>> - guest, virtio-blk, iothread, single snapshot create+commit:
>> "include/block/aio.h:457: aio_enable_external: Assertion
>> `ctx->external_disable_cnt > 0' failed." after snapshot create
> 
> It would be nice if we could get to the root cause and squash that one
> before 2.9.
> 
>> - guest, virtio-blk, iothread, repeated snapshot create+commit: same as above
>> - guest, virtio-scsi, iothread, single snapshot create+commit: same as above
>> - guest, virtio-scsi, iothread, repeated snapshot create+commit: same as above
>> - no guest, virtio-blk, iothread, repeated snapshot create+commit: same as above
>> - no guest, virtio-scsi, iothread, single snapshot create+commit: same as above
>> - no guest, virtio-scsi, iothread, repeated snapshot create+commit:
>> same as above
>>
>> --Ed
>>
>>
> 

* Re: [Qemu-devel] [Qemu-block] New iotest repros failures on virtio external snapshot with iothread
  2017-03-30  2:01 [Qemu-devel] New iotest repros failures on virtio external snapshot with iothread Ed Swierk
  2017-03-30  2:16 ` Eric Blake
  2017-03-30 23:06 ` John Snow
@ 2017-04-03 16:57 ` Stefan Hajnoczi
  2017-04-04 14:13   ` Alex Bennée
  2 siblings, 1 reply; 9+ messages in thread
From: Stefan Hajnoczi @ 2017-04-03 16:57 UTC (permalink / raw)
  To: Ed Swierk
  Cc: Fam Zheng, Kevin Wolf, Paolo Bonzini, John Snow, qemu-devel,
	qemu-block, Alex Bennée

On Wed, Mar 29, 2017 at 07:01:38PM -0700, Ed Swierk wrote:
> Parts of qemu's block code have changed a lot in recent months but are
> not well exercised by current tests.
> 
> Subtle bugs have crept in causing assertion failures, hangs and other
> crashes in a variety of situations: immediately on start, on first
> guest activity, on external snapshot create or commit, on qmp quit
> command.
> 
> Reproducing these bugs has proved tricky, as each may occur only with
> a specific combination of qemu version, block device type (virtio-blk
> or virtio-scsi) and iothread enabled or not. In some cases the bug
> occurs only after several external snapshot operations. And in some
> cases the bug only manifests when a guest is accessing the block
> device simultaneously.
> 
> I've written an iotest (number 176, for now) that attempts to cover
> many of these configurations. Currently it only exercises the external
> snapshot create and commit lifted from iotest 118. The new iotest does
> this repeatedly in each of 16 combinations:
> - no guest / guest
> - virtio-blk / virtio-scsi
> - no iothread / iothread
> - single / repeated external snapshot create+commit

Thanks Ed!  This has a lot of potential.  I see four different
issues that can be discussed separately:

1. Urgent 2.9 bug fix for `ctx->external_disable_cnt > 0' failed
assertion.  I believe you've already started a separate email thread
about it called "Assertion failure taking external snapshot with virtio
drive + iothread".

2. QEMU 2.8 stable hang.  Less urgent but worth understanding, perhaps
via git-bisect against QEMU 2.9.

3. Minor iotest enhancements.  Please send a separate patch series.

4. How to automate tests with real Linux guests?  This is a complex
topic and probably what we should discuss in this email thread.

The buildroot + busybox approach is good for a small set of sanity
tests.  There was a similar attempt here:
https://github.com/stsquad/qemu-jeos

Building from source becomes a challenge when other people want to add
software to test other areas of QEMU.  The process also requires
attention to maintain the image over time (e.g. as host build
environments change).

There are image builder tools like virt-builder and mkosi for building
bootable virtual machine images based on standard Linux distros:
http://libguestfs.org/virt-builder.1.html
https://github.com/systemd/mkosi

This eliminates the build-from-source hassles and gives us a full Linux
guest environment.  Booting is very fast with mkosi so the advantage to
custom building a minimal image is negligible.

My suggestion is:

Let's pick an image builder tool like virt-builder and keep a single
build script per guest architecture (e.g.  build-test-os-x86_64.sh).
All tests for that architecture run against the same disk image.

It's easy to add additional software to the disk image by modifying the
build script.

A Makefile ensures that the image file gets rebuilt if the build script
has changed.
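
The rebuild rule in the last point amounts to a timestamp comparison
between the build script and the image it produces. A minimal sketch of
that check, assuming one script and one image per architecture (the
function name and paths are hypothetical):

```python
import os


def needs_rebuild(script_path, image_path):
    """Return True if the image is missing or older than its build script."""
    if not os.path.exists(image_path):
        return True
    return os.path.getmtime(image_path) < os.path.getmtime(script_path)
```

This is the same dependency test make applies to a target whose only
prerequisite is the build script.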

> 
> I made some minor changes to the test infrastructure so the new iotest
> can deal gracefully with qemu hanging--the test script itself
> shouldn't hang. And in all failure modes the test needs to expose
> enough console output and other information to diagnose the problem.
> 
> The main departure from existing iotests is running a real guest. I
> used buildroot to generate a small (~4 MB) Linux kernel with built-in
> initrd containing a busybox-based userland. After the iotest launches
> qemu, the guest loops writing to the block device, while the test
> performs snapshot operations.
> 
> I ran the new iotest on 3 qemu versions: 2.7.1, stable-2.8-staging and
> 2.9.0-rc2. The latter two fail several test cases, all
> iothread-enabled. Only 2.7.1 passes all the cases.
> 
> Here is the code for the new iotest (I didn't dare email patches with
> a 4 MB blob):
> https://github.com/skyportsystems/qemu-1/commits/eswierk-iotests-2.7
> https://github.com/skyportsystems/qemu-1/commits/eswierk-iotests-2.8
> https://github.com/skyportsystems/qemu-1/commits/eswierk-iotests-2.9
> 
> And here is the buildroot I used to generate the guest Linux kernel+initrd:
> https://github.com/skyportsystems/buildroot-1/commits/qemu-iotests
> 
> Please check out the code and try the new test--particularly anyone
> who can also help figure out these failures. (Note that since half the
> test cases use an iothread, /dev/kvm must be readable and writable.)
> 
> * stable-2.8-staging
> - guest, virtio-blk, iothread, single snapshot create+commit: hang on
> quit (intermittent)
> - guest, virtio-blk, iothread, repeated snapshot create+commit: hang
> after 1 iteration
> - guest, virtio-scsi, iothread, single snapshot create+commit: hang on
> quit (intermittent)
> - guest, virtio-scsi, iothread, repeated snapshot create+commit: hang
> after 1 iteration
> 
> * 2.9.0-rc2
> - guest, virtio-blk, iothread, single snapshot create+commit:
> "include/block/aio.h:457: aio_enable_external: Assertion
> `ctx->external_disable_cnt > 0' failed." after snapshot create
> - guest, virtio-blk, iothread, repeated snapshot create+commit: same as above
> - guest, virtio-scsi, iothread, single snapshot create+commit: same as above
> - guest, virtio-scsi, iothread, repeated snapshot create+commit: same as above
> - no guest, virtio-blk, iothread, repeated snapshot create+commit: same as above
> - no guest, virtio-scsi, iothread, single snapshot create+commit: same as above
> - no guest, virtio-scsi, iothread, repeated snapshot create+commit:
> same as above
> 
> --Ed
> 

* Re: [Qemu-devel] [Qemu-block] New iotest repros failures on virtio external snapshot with iothread
  2017-04-03 16:57 ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
@ 2017-04-04 14:13   ` Alex Bennée
  2017-04-06  9:10     ` Stefan Hajnoczi
  0 siblings, 1 reply; 9+ messages in thread
From: Alex Bennée @ 2017-04-04 14:13 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Ed Swierk, Fam Zheng, Kevin Wolf, Paolo Bonzini, John Snow,
	qemu-devel, qemu-block


Stefan Hajnoczi <stefanha@gmail.com> writes:

> On Wed, Mar 29, 2017 at 07:01:38PM -0700, Ed Swierk wrote:
>> Parts of qemu's block code have changed a lot in recent months but are
>> not well exercised by current tests.
<snip>
>
> 4. How to automate tests with real Linux guests?  This is a complex
> topic and probably what we should discuss in this email thread.
>
> The buildroot + busybox approach is good for a small set of sanity
> tests.  There was a similar attempt here:
> https://github.com/stsquad/qemu-jeos
>
> Building from source becomes a challenge when other people want to add
> software to test other areas of QEMU.  The process also requires
> attention to maintain the image over time (e.g. as host build
> environments change).
>
> There are image builder tools like virt-builder and mkosi for building
> bootable virtual machine images based on standard Linux distros:
> http://libguestfs.org/virt-builder.1.html
> https://github.com/systemd/mkosi
>
> This eliminates the build-from-source hassles and gives us a full Linux
> guest environment.  Booting is very fast with mkosi so the advantage to
> custom building a minimal image is negligible.

Does it entirely? If you're building an ARM guest on x86, how do you ensure
the cross-compilers are correct for the kernel and userspace?

> My suggestion is:
>
> Let's pick an image builder tool like virt-builder and keep a single
> build script per guest architecture (e.g.  build-test-os-x86_64.sh).
> All tests for that architecture run against the same disk image.
>
> It's easy to add additional software to the disk image by modifying the
> build script.
>
> A Makefile ensures that the image file gets rebuilt if the build script
> has changed.

I have experimented with building LTP for foreign guests inside docker
images. I expect the docker build image could be extended to build a full
kernel and file-systems in a known environment, possibly using
virt-builder to do it.

>
>>
>> I made some minor changes to the test infrastructure so the new iotest
>> can deal gracefully with qemu hanging--the test script itself
>> shouldn't hang. And in all failure modes the test needs to expose
>> enough console output and other information to diagnose the problem.
>>
>> The main departure from existing iotests is running a real guest. I
>> used buildroot to generate a small (~4 MB) Linux kernel with built-in
>> initrd containing a busybox-based userland. After the iotest launches
>> qemu, the guest loops writing to the block device, while the test
>> performs snapshot operations.
>>
>> I ran the new iotest on 3 qemu versions: 2.7.1, stable-2.8-staging and
>> 2.9.0-rc2. The latter two fail several test cases, all
>> iothread-enabled. Only 2.7.1 passes all the cases.
>>
>> Here is the code for the new iotest (I didn't dare email patches with
>> a 4 MB blob):
>> https://github.com/skyportsystems/qemu-1/commits/eswierk-iotests-2.7
>> https://github.com/skyportsystems/qemu-1/commits/eswierk-iotests-2.8
>> https://github.com/skyportsystems/qemu-1/commits/eswierk-iotests-2.9
>>
>> And here is the buildroot I used to generate the guest Linux kernel+initrd:
>> https://github.com/skyportsystems/buildroot-1/commits/qemu-iotests
>>
>> Please check out the code and try the new test--particularly anyone
>> who can also help figure out these failures. (Note that since half the
>> test cases use an iothread, /dev/kvm must be readable and writable.)
>>
>> * stable-2.8-staging
>> - guest, virtio-blk, iothread, single snapshot create+commit: hang on
>> quit (intermittent)
>> - guest, virtio-blk, iothread, repeated snapshot create+commit: hang
>> after 1 iteration
>> - guest, virtio-scsi, iothread, single snapshot create+commit: hang on
>> quit (intermittent)
>> - guest, virtio-scsi, iothread, repeated snapshot create+commit: hang
>> after 1 iteration
>>
>> * 2.9.0-rc2
>> - guest, virtio-blk, iothread, single snapshot create+commit:
>> "include/block/aio.h:457: aio_enable_external: Assertion
>> `ctx->external_disable_cnt > 0' failed." after snapshot create
>> - guest, virtio-blk, iothread, repeated snapshot create+commit: same as above
>> - guest, virtio-scsi, iothread, single snapshot create+commit: same as above
>> - guest, virtio-scsi, iothread, repeated snapshot create+commit: same as above
>> - no guest, virtio-blk, iothread, repeated snapshot create+commit: same as above
>> - no guest, virtio-scsi, iothread, single snapshot create+commit: same as above
>> - no guest, virtio-scsi, iothread, repeated snapshot create+commit:
>> same as above
>>
>> --Ed
>>


--
Alex Bennée

* Re: [Qemu-devel] [Qemu-block] New iotest repros failures on virtio external snapshot with iothread
  2017-04-04 14:13   ` Alex Bennée
@ 2017-04-06  9:10     ` Stefan Hajnoczi
  0 siblings, 0 replies; 9+ messages in thread
From: Stefan Hajnoczi @ 2017-04-06  9:10 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Ed Swierk, Fam Zheng, Kevin Wolf, Paolo Bonzini, John Snow,
	qemu-devel, qemu-block

On Tue, Apr 04, 2017 at 03:13:44PM +0100, Alex Bennée wrote:
> 
> Stefan Hajnoczi <stefanha@gmail.com> writes:
> 
> > On Wed, Mar 29, 2017 at 07:01:38PM -0700, Ed Swierk wrote:
> >> Parts of qemu's block code have changed a lot in recent months but are
> >> not well exercised by current tests.
> <snip>
> >
> > 4. How to automate tests with real Linux guests?  This is a complex
> > topic and probably what we should discuss in this email thread.
> >
> > The buildroot + busybox approach is good for a small set of sanity
> > tests.  There was a similar attempt here:
> > https://github.com/stsquad/qemu-jeos
> >
> > Building from source becomes a challenge when other people want to add
> > software to test other areas of QEMU.  The process also requires
> > attention to maintain the image over time (e.g. as host build
> > environments change).
> >
> > There are image builder tools like virt-builder and mkosi for building
> > bootable virtual machine images based on standard Linux distros:
> > http://libguestfs.org/virt-builder.1.html
> > https://github.com/systemd/mkosi
> >
> > This eliminates the build-from-source hassles and gives us a full Linux
> > guest environment.  Booting is very fast with mkosi so the advantage to
> > custom building a minimal image is negligible.
> 
> Does it entirely? If you're building an ARM guest on x86, how do you ensure
> the cross-compilers are correct for the kernel and userspace?

virt-builder and mkosi install binary distros like Debian or Fedora.
They do not compile from source.

virt-builder supports cross-arch image building so it's a good starting
point for QEMU guest images.

> > My suggestion is:
> >
> > Let's pick an image builder tool like virt-builder and keep a single
> > build script per guest architecture (e.g.  build-test-os-x86_64.sh).
> > All tests for that architecture run against the same disk image.
> >
> > It's easy to add additional software to the disk image by modifying the
> > build script.
> >
> > A Makefile ensures that the image file gets rebuilt if the build script
> > has changed.
> 
> I have experimented building LTP for foreign guests inside docker
> images. I expect the docker build image could be extended to build full
> kernel and file-systems in a known environment, possibly using
> virt-builder to do it.

My concern with building from source is that extending and maintaining
the infrastructure does not scale.  These efforts fizzle out like
qemu-jeos because no one really has time to maintain them.  Few people
want to extend them because they are complex and brittle.

Image builder tools skip the complexity of build-from-source.  You start
with something like a Dockerfile or virt-builder command-line.  It's
much easier for people to contribute and there is much less that can go
wrong at image build time.

Stefan
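
The rebuild rule proposed in this thread (rebuild the image whenever the
build script has changed) amounts to a timestamp comparison, which is
what a Makefile dependency does. A minimal sketch of that check in
Python, assuming a hypothetical image name; needs_rebuild is an
illustrative helper, not part of QEMU or virt-builder, and only the
script name build-test-os-x86_64.sh comes from the thread:

```python
import os


def needs_rebuild(image_path, script_path):
    """Return True if the image is missing or older than its build script.

    This mirrors a Makefile rule where the image depends on the script.
    """
    try:
        image_mtime = os.path.getmtime(image_path)
    except FileNotFoundError:
        # No image yet, so it has to be built.
        return True
    return os.path.getmtime(script_path) > image_mtime


if __name__ == "__main__":
    script = "build-test-os-x86_64.sh"  # script name suggested in the thread
    image = "test-os-x86_64.img"        # hypothetical image file name
    if os.path.exists(script) and needs_rebuild(image, script):
        print("image is stale; rerun", script)
```

A real Makefile rule would also rerun the script itself; the sketch only
shows the staleness test.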

* Re: [Qemu-devel] New iotest repros failures on virtio external snapshot with iothread
  2017-03-30  2:16 ` Eric Blake
  2017-03-30 23:43   ` Laszlo Ersek
@ 2017-04-07 13:58   ` Max Reitz
  1 sibling, 0 replies; 9+ messages in thread
From: Max Reitz @ 2017-04-07 13:58 UTC (permalink / raw)
  To: Eric Blake, Ed Swierk, Fam Zheng, Kevin Wolf, Paolo Bonzini,
	John Snow, qemu-devel, qemu-block

On 30.03.2017 04:16, Eric Blake wrote:
> On 03/29/2017 09:01 PM, Ed Swierk via Qemu-devel wrote:
>> Parts of qemu's block code have changed a lot in recent months but are
>> not well exercised by current tests.
>>
>> Subtle bugs have crept in causing assertion failures, hangs and other
>> crashes in a variety of situations: immediately on start, on first
>> guest activity, on external snapshot create or commit, on qmp quit
>> command.
>>
>> Reproducing these bugs has proved tricky, as each may occur only with
>> a specific combination of qemu version, block device type (virtio-blk
>> or virtio-scsi) and iothread enabled or not. In some cases the bug
>> occurs only after several external snapshot operations. And in some
>> cases the bug only manifests when a guest is accessing the block
>> device simultaneously.
>>
>> I've written an iotest (number 176, for now) that attempts to cover
> 
> At least one other thread has already proposed a test 176.  It's
> somewhat straightforward to renumber things, but I'm wondering if there
> is some even-more-efficient way of reserving test numbers, perhaps
> through the wiki, since we are finding that test numbers get reserved
> several weeks before actually getting merged into the tree.

As a maintainer, I don't mind collisions; I can and will rename tests.
For all I care, you can even name all of your tests 001.

Max


Thread overview: 9+ messages
2017-03-30  2:01 [Qemu-devel] New iotest repros failures on virtio external snapshot with iothread Ed Swierk
2017-03-30  2:16 ` Eric Blake
2017-03-30 23:43   ` Laszlo Ersek
2017-04-07 13:58   ` Max Reitz
2017-03-30 23:06 ` John Snow
2017-03-30 23:26   ` Ed Swierk
2017-04-03 16:57 ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2017-04-04 14:13   ` Alex Bennée
2017-04-06  9:10     ` Stefan Hajnoczi
