* [Qemu-devel] virtio-blk-test failure
@ 2015-02-23 22:22 John Snow
  2015-02-23 22:35 ` Marc Marí
From: John Snow @ 2015-02-23 22:22 UTC
  To: Marc Marí; +Cc: qemu-devel

I've been seeing this failure pop up very occasionally and I can usually 
get the test to pass again by just re-running, but every now and again:

GTESTER check-qtest-x86_64
blkdebug: Suspended request 'A'
blkdebug: Resuming request 'A'
main-loop: WARNING: I/O thread spun for 1000 iterations
main-loop: WARNING: I/O thread spun for 1000 iterations
**
ERROR:/home/bos/jhuston/src/qemu/tests/libqos/virtio.c:91:qvirtio_wait_queue_isr: 
assertion failed: (g_get_monotonic_time() - start_time <= timeout_us)
GTester: last random seed: R02S3ba253e130ac76bbcb0bade0a2d54b2f
[vmxnet3][WR][vmxnet3_peer_has_vnet_hdr]: Peer has no virtio extension. 
Task offloads will be emulated.
make: *** [check-qtest-x86_64] Error 1


I wrote a loop that runs virtio-blk-test over and over again and saw it
fail after 137 runs.

It looks like the culprit is /virtio/blk/pci/msix; if you run only that
test, it can take anywhere from 20 to 250 runs before you see it fail.
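
A repeat-until-failure loop of this kind could be sketched roughly as
follows. This is only an illustration, not the loop that was actually
used: first_failing_run is a hypothetical helper, and the command it
runs is whatever invokes the test in your build tree.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: run `cmd` through the shell up to max_runs times.
 * Returns the 1-based number of the first run whose status is nonzero,
 * or 0 if every run succeeded. */
static int first_failing_run(const char *cmd, int max_runs)
{
    for (int run = 1; run <= max_runs; run++) {
        int status = system(cmd);
        if (status != 0) {
            fprintf(stderr, "run %d failed (raw status %d)\n", run, status);
            return run;
        }
    }
    return 0;
}
```

Calling first_failing_run(cmd, 1000) with cmd set to the test invocation
(for example something along the lines of
"tests/virtio-blk-test -p /virtio/blk/pci/msix" with QTEST_QEMU_BINARY
exported, which is an assumption about the local setup) returns the run
number at which the test first failed.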

I only did a little bit of debugging, but the QMP command that 
immediately precedes the wait_config_isr call here appears to execute 
successfully.

Any hunches, Marc?

* Re: [Qemu-devel] virtio-blk-test failure
  2015-02-23 22:22 [Qemu-devel] virtio-blk-test failure John Snow
@ 2015-02-23 22:35 ` Marc Marí
  2015-02-23 22:37   ` John Snow
From: Marc Marí @ 2015-02-23 22:35 UTC
  To: John Snow; +Cc: qemu-devel

On Mon, 23 Feb 2015 17:22:57 -0500
John Snow <jsnow@redhat.com> wrote:
> I've been seeing this failure pop up very occasionally and I can
> usually get the test to pass again by just re-running, but every now
> and again:
> 
> GTESTER check-qtest-x86_64
> blkdebug: Suspended request 'A'
> blkdebug: Resuming request 'A'
> main-loop: WARNING: I/O thread spun for 1000 iterations
> main-loop: WARNING: I/O thread spun for 1000 iterations
> **
> ERROR:/home/bos/jhuston/src/qemu/tests/libqos/virtio.c:91:qvirtio_wait_queue_isr: 
> assertion failed: (g_get_monotonic_time() - start_time <= timeout_us)
> GTester: last random seed: R02S3ba253e130ac76bbcb0bade0a2d54b2f
> [vmxnet3][WR][vmxnet3_peer_has_vnet_hdr]: Peer has no virtio
> extension. Task offloads will be emulated.
> make: *** [check-qtest-x86_64] Error 1
> 
> 
> I wrote a loop that runs virtio-blk-test over and over again and saw
> it fail after 137 runs.
> 
> It looks like the culprit is /virtio/blk/pci/msix; if you run only
> that test, it can take anywhere from 20 to 250 runs before you see it
> fail.
> 
> I only did a little bit of debugging, but the QMP command that 
> immediately precedes the wait_config_isr call here appears to execute 
> successfully.
> 
> Any hunches, Marc?

This is very similar to the failure that caused the virtio MMIO patch
to be pulled back. The cause is the same, although nobody reported it:

tests/libqos/virtio-pci.c:162

    data = readl(dev->config_msix_addr);
    writel(dev->config_msix_addr, 0);
    return data == dev->config_msix_data;

If I remember correctly, this write acks the interrupt. But it acks
unconditionally, without first checking what was read, so there might be
a race condition there.
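
To make the suspected race concrete, here is a minimal sketch, assuming
the device can deliver its MSI-X write between the poll's read and its
write-back. It does not use the libqos accessors: a plain memory word
stands in for the MSI-X data address, and race_hook, poll_always_ack,
poll_ack_if_seen and SKETCH_MSIX_DATA are all hypothetical names. The
second function shows one possible way to ack only what was observed,
which is not necessarily what the eventual patch does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MSIX_DATA 0x12345678u   /* arbitrary illustrative value */

/* Hypothetical hook: stands in for the device delivering its MSI-X
 * write after the poll has read the word but before it writes back. */
typedef void (*race_hook)(volatile uint32_t *word);

static void deliver_interrupt(volatile uint32_t *word)
{
    *word = SKETCH_MSIX_DATA;
}

/* Same shape as the quoted code: read, clear unconditionally, compare. */
static bool poll_always_ack(volatile uint32_t *word, uint32_t expected,
                            race_hook hook)
{
    uint32_t data = *word;
    if (hook) {
        hook(word);             /* interrupt lands here... */
    }
    *word = 0;                  /* ...and is wiped without being seen */
    return data == expected;
}

/* Variant that clears the word only after the expected value was read. */
static bool poll_ack_if_seen(volatile uint32_t *word, uint32_t expected,
                             race_hook hook)
{
    uint32_t data = *word;
    if (hook) {
        hook(word);             /* interrupt lands here... */
    }
    if (data != expected) {
        return false;           /* ...and survives until the next poll */
    }
    *word = 0;
    return true;
}
```

With the word starting at 0 and deliver_interrupt as the hook,
poll_always_ack returns false and leaves the word at 0, so every later
poll also fails until the timeout assertion fires. poll_ack_if_seen
returns false on that same call but leaves the value in place, so the
next poll sees it.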

Tomorrow I'll send a patch.

Thanks
Marc

* Re: [Qemu-devel] virtio-blk-test failure
  2015-02-23 22:35 ` Marc Marí
@ 2015-02-23 22:37   ` John Snow
From: John Snow @ 2015-02-23 22:37 UTC
  To: Marc Marí; +Cc: qemu-devel



On 02/23/2015 05:35 PM, Marc Marí wrote:
> On Mon, 23 Feb 2015 17:22:57 -0500
> John Snow <jsnow@redhat.com> wrote:
>> I've been seeing this failure pop up very occasionally and I can
>> usually get the test to pass again by just re-running, but every now
>> and again:
>>
>> GTESTER check-qtest-x86_64
>> blkdebug: Suspended request 'A'
>> blkdebug: Resuming request 'A'
>> main-loop: WARNING: I/O thread spun for 1000 iterations
>> main-loop: WARNING: I/O thread spun for 1000 iterations
>> **
>> ERROR:/home/bos/jhuston/src/qemu/tests/libqos/virtio.c:91:qvirtio_wait_queue_isr:
>> assertion failed: (g_get_monotonic_time() - start_time <= timeout_us)
>> GTester: last random seed: R02S3ba253e130ac76bbcb0bade0a2d54b2f
>> [vmxnet3][WR][vmxnet3_peer_has_vnet_hdr]: Peer has no virtio
>> extension. Task offloads will be emulated.
>> make: *** [check-qtest-x86_64] Error 1
>>
>>
>> I wrote a loop that runs virtio-blk-test over and over again and saw
>> it fail after 137 runs.
>>
>> It looks like the culprit is /virtio/blk/pci/msix; if you run only
>> that test, it can take anywhere from 20 to 250 runs before you see it
>> fail.
>>
>> I only did a little bit of debugging, but the QMP command that
>> immediately precedes the wait_config_isr call here appears to execute
>> successfully.
>>
>> Any hunches, Marc?
>
> This is very similar to the failure that caused the virtio MMIO patch
> to be pulled back. The cause is the same, although nobody reported it:
>
> tests/libqos/virtio-pci.c:162
>
>      data = readl(dev->config_msix_addr);
>      writel(dev->config_msix_addr, 0);
>      return data == dev->config_msix_data;
>
> If I remember correctly, this write acks the interrupt. But it acks
> unconditionally, without first checking what was read, so there might
> be a race condition there.
>
> Tomorrow I'll send a patch.
>
> Thanks
> Marc
>

Awesome, CC me on it and I will run tests, thanks!

--js
