* [Qemu-devel] thread-pool.c race condition?
@ 2015-04-02 16:26 Stefan Hajnoczi
2015-04-02 16:43 ` Paolo Bonzini
0 siblings, 1 reply; 6+ messages in thread
From: Stefan Hajnoczi @ 2015-04-02 16:26 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: John Snow, qemu-devel
John Snow has reported that qemu-io can hang when the host is under
heavy load. He made the following observations in gdb:
1. The program is sitting in aio_poll() (called by bdrv_prwv_co())
waiting for request completion.
2. The thread pool has a ThreadPoolElement with ->state == THREAD_DONE.
The ThreadPoolElement should have been reaped by
thread_pool_completion_bh() and its callback invoked. For some reason
this didn't happen and the program is blocked in poll(2) waiting.
This suggests a race condition in thread-pool.c or qemu_bh_schedule()
(used to complete ThreadPoolElement from a QEMU event loop).
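
To make the expected behaviour concrete, here is a minimal single-threaded sketch of what "reaping" a finished element means. It is not the thread-pool.c code: DemoElement, demo_completion_bh and demo_count_cb are hypothetical names, and the real implementation has locking and cancellation paths that are left out here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request states, loosely modelled on THREAD_DONE etc. */
enum demo_state { DEMO_QUEUED, DEMO_ACTIVE, DEMO_DONE };

/* Hypothetical stand-in for ThreadPoolElement. */
typedef struct DemoElement {
    enum demo_state state;
    int ret;                              /* result handed to the callback */
    void (*cb)(void *opaque, int ret);    /* completion callback */
    void *opaque;
    struct DemoElement *next;
} DemoElement;

/* Example callback: counts how many completions were delivered. */
static void demo_count_cb(void *opaque, int ret)
{
    (void)ret;
    (*(int *)opaque)++;
}

/*
 * Stand-in for the completion bottom half: unlink every element whose
 * state is DEMO_DONE and invoke its callback. Elements in any other
 * state stay on the list. Returns the number of elements reaped.
 */
static int demo_completion_bh(DemoElement **head)
{
    int reaped = 0;
    DemoElement **pp = head;

    while (*pp != NULL) {
        DemoElement *e = *pp;

        if (e->state != DEMO_DONE) {
            pp = &e->next;
            continue;
        }
        *pp = e->next;                    /* unlink before calling back */
        if (e->cb != NULL) {
            e->cb(e->opaque, e->ret);
        }
        reaped++;
    }
    return reaped;
}
```

In this model, an element that is still on the list in the DEMO_DONE state means the bottom half has not run since the worker finished, which matches what John saw in gdb.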
I don't have a good theory why this happens yet. Just wanted to share
in case someone else hits this problem.
Stefan
* Re: [Qemu-devel] thread-pool.c race condition?
From: Paolo Bonzini @ 2015-04-02 16:43 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: Laszlo Ersek, John Snow, qemu-devel
On 02/04/2015 18:26, Stefan Hajnoczi wrote:
> John Snow has reported that qemu-io can hang when the host is under
> heavy load. He made the following observations in gdb:
>
> 1. The program is sitting in aio_poll() (called by bdrv_prwv_co())
> waiting for request completion.
>
> 2. The thread pool has a ThreadPoolElement with ->state == THREAD_DONE.
>
> The ThreadPoolElement should have been reaped by
> thread_pool_completion_bh() and its callback invoked. For some reason
> this didn't happen and the program is blocked in poll(2) waiting.
>
> This suggests a race condition in thread-pool.c or qemu_bh_schedule()
> (used to complete ThreadPoolElement from a QEMU event loop).
>
> I don't have a good theory why this happens yet. Just wanted to share
> in case someone else hits this problem.
Laszlo hit something very similar fairly easily with virtio-scsi (but
not virtio-blk!) on aarch64 hosts. Any attempt to debug it (ranging
from compilation with -O0 to tracing) made it disappear. A reliable
reproducer with qemu-io would be a dream...
Paolo
* Re: [Qemu-devel] thread-pool.c race condition?
From: Stefan Hajnoczi @ 2015-04-02 16:44 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Laszlo Ersek, John Snow, qemu-devel
On Thu, Apr 2, 2015 at 5:43 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 02/04/2015 18:26, Stefan Hajnoczi wrote:
>> John Snow has reported that qemu-io can hang when the host is under
>> heavy load. He made the following observations in gdb:
>>
>> 1. The program is sitting in aio_poll() (called by bdrv_prwv_co())
>> waiting for request completion.
>>
>> 2. The thread pool has a ThreadPoolElement with ->state == THREAD_DONE.
>>
>> The ThreadPoolElement should have been reaped by
>> thread_pool_completion_bh() and its callback invoked. For some reason
>> this didn't happen and the program is blocked in poll(2) waiting.
>>
>> This suggests a race condition in thread-pool.c or qemu_bh_schedule()
>> (used to complete ThreadPoolElement from a QEMU event loop).
>>
>> I don't have a good theory why this happens yet. Just wanted to share
>> in case someone else hits this problem.
>
> Laszlo hit something very similar fairly easily with virtio-scsi (but
> not virtio-blk!) on aarch64 hosts. Any attempt to debug it (ranging
> from compilation with -O0 to tracing) made it disappear. A reliable
> reproducer with qemu-io would be a dream...
John said he hasn't seen it recently so we might have a problem reproducing it.
Stefan
* Re: [Qemu-devel] thread-pool.c race condition?
From: John Snow @ 2015-04-02 16:46 UTC (permalink / raw)
To: Paolo Bonzini, Stefan Hajnoczi; +Cc: Laszlo Ersek, qemu-devel
On 04/02/2015 12:43 PM, Paolo Bonzini wrote:
>
>
> On 02/04/2015 18:26, Stefan Hajnoczi wrote:
>> John Snow has reported that qemu-io can hang when the host is under
>> heavy load. He made the following observations in gdb:
>>
>> 1. The program is sitting in aio_poll() (called by bdrv_prwv_co())
>> waiting for request completion.
>>
>> 2. The thread pool has a ThreadPoolElement with ->state == THREAD_DONE.
>>
>> The ThreadPoolElement should have been reaped by
>> thread_pool_completion_bh() and its callback invoked. For some reason
>> this didn't happen and the program is blocked in poll(2) waiting.
>>
>> This suggests a race condition in thread-pool.c or qemu_bh_schedule()
>> (used to complete ThreadPoolElement from a QEMU event loop).
>>
>> I don't have a good theory why this happens yet. Just wanted to share
>> in case someone else hits this problem.
>
> Laszlo hit something very similar fairly easily with virtio-scsi (but
> not virtio-blk!) on aarch64 hosts. Any attempt to debug it (ranging
> from compilation with -O0 to tracing) made it disappear. A reliable
> reproducer with qemu-io would be a dream...
>
> Paolo
>
Unfortunately for you, I hit it by running qemu-iotests on my laptop
overnight and I suspect it's triggered by my screensavers hogging CPU
when I am AFK...
I hit it pretty reliably (100% of the time I tried to run tests while
AFK -- three independent screensavers running on three monitors) two
weeks ago, but haven't seen it recently.
I'll keep you posted...
* Re: [Qemu-devel] thread-pool.c race condition?
From: Stefan Hajnoczi @ 2015-04-02 16:47 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Laszlo Ersek, John Snow, qemu-devel
On Thu, Apr 2, 2015 at 5:43 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 02/04/2015 18:26, Stefan Hajnoczi wrote:
>> John Snow has reported that qemu-io can hang when the host is under
>> heavy load. He made the following observations in gdb:
>>
>> 1. The program is sitting in aio_poll() (called by bdrv_prwv_co())
>> waiting for request completion.
>>
>> 2. The thread pool has a ThreadPoolElement with ->state == THREAD_DONE.
>>
>> The ThreadPoolElement should have been reaped by
>> thread_pool_completion_bh() and its callback invoked. For some reason
>> this didn't happen and the program is blocked in poll(2) waiting.
>>
>> This suggests a race condition in thread-pool.c or qemu_bh_schedule()
>> (used to complete ThreadPoolElement from a QEMU event loop).
>>
>> I don't have a good theory why this happens yet. Just wanted to share
>> in case someone else hits this problem.
>
> Laszlo hit something very similar fairly easily with virtio-scsi (but
> not virtio-blk!) on aarch64 hosts. Any attempt to debug it (ranging
> from compilation with -O0 to tracing) made it disappear. A reliable
> reproducer with qemu-io would be a dream...
My initial speculation was that the qemu_bh_schedule():

    if (bh->scheduled)
        return;

Check is causing us to skip BH invocations.
When I look at the code the lack of explicit barriers or atomic
operations for bh->scheduled itself is a little suspicious.
But now I'm focussing more on thread-pool.c since that has its own
threading constraints.
Stefan
* Re: [Qemu-devel] thread-pool.c race condition?
From: Paolo Bonzini @ 2015-04-02 17:00 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: Laszlo Ersek, John Snow, qemu-devel
On 02/04/2015 18:47, Stefan Hajnoczi wrote:
> My initial speculation was that the qemu_bh_schedule():
>
>     if (bh->scheduled)
>         return;
>
> Check is causing us to skip BH invocations.
>
> When I look at the code the lack of explicit barriers or atomic
> operations for bh->scheduled itself is a little suspicious.
You may have been onto something. If bh->scheduled is already 1, we do not
execute a memory barrier to order "any writes needed by the callback
[...] before the locations are read in the aio_bh_poll" (quoting from
the comment).
In particular, req->state might be seen as THREAD_ACTIVE. This would
explain the failure on aarch64, but not on x86_64.
So, this is probably worth testing:
diff --git a/async.c b/async.c
index 2be88cc..c5d9939 100644
--- a/async.c
+++ b/async.c
@@ -122,19 +122,17 @@ void qemu_bh_schedule(QEMUBH *bh)
 {
     AioContext *ctx;
-    if (bh->scheduled)
-        return;
     ctx = bh->ctx;
     bh->idle = 0;
-    /* Make sure that:
+    /* The memory barrier implicit in atomic_xchg makes sure that:
      * 1. idle & any writes needed by the callback are done before the
      *    locations are read in the aio_bh_poll.
      * 2. ctx is loaded before scheduled is set and the callback has a chance
      *    to execute.
      */
-    smp_mb();
-    bh->scheduled = 1;
-    aio_notify(ctx);
+    if (atomic_xchg(&bh->scheduled, 1) == 0) {
+        aio_notify(ctx);
+    }
 }
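
For anyone who wants to experiment outside the QEMU tree, the same idea can be sketched with C11 atomics. This is only an illustration under stated assumptions, not the QEMU code: DemoBH, demo_bh_init, demo_bh_schedule and demo_bh_poll are hypothetical names, atomic_exchange from <stdatomic.h> stands in for atomic_xchg, and a counter stands in for aio_notify().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for QEMUBH; only the fields needed here. */
typedef struct {
    atomic_int scheduled;      /* 1 while a run of the callback is pending */
    atomic_int notify_count;   /* counts stand-in aio_notify() calls */
} DemoBH;

static void demo_bh_init(DemoBH *bh)
{
    atomic_init(&bh->scheduled, 0);
    atomic_init(&bh->notify_count, 0);
}

/*
 * Producer side. The exchange is executed unconditionally, so the
 * sequentially consistent barrier it implies also happens when the flag
 * was already 1. Only the 0 -> 1 transition sends a notification.
 * Returns true if this call sent the notification.
 */
static bool demo_bh_schedule(DemoBH *bh)
{
    if (atomic_exchange(&bh->scheduled, 1) == 0) {
        atomic_fetch_add(&bh->notify_count, 1);   /* stand-in for aio_notify(ctx) */
        return true;
    }
    return false;
}

/*
 * Consumer side. Clears the flag and reports whether the callback
 * should run now.
 */
static bool demo_bh_poll(DemoBH *bh)
{
    return atomic_exchange(&bh->scheduled, 0) == 1;
}
```

Compared with a plain "if (scheduled) return;" check, the producer here never returns without having executed a full barrier, which is the property the patch above is trying to restore.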
> But now I'm focussing more on thread-pool.c since that has its own
> threading constraints.
Making thread-pool.c less clever didn't make the bug go away for Laszlo.
Paolo