* [Qemu-devel] qemu 1.6.1
@ 2013-10-23 7:39 Michael W. Bombardieri
2013-10-23 9:00 ` Paolo Bonzini
0 siblings, 1 reply; 14+ messages in thread
From: Michael W. Bombardieri @ 2013-10-23 7:39 UTC (permalink / raw)
To: qemu-devel; +Cc: mb
Hi,
My newly built qemu/win32 binary (v1.6.1) crashes in qemu-system-i386 and qemu-system-x86_64 when
booting from an install CD.
C:\Program Files\qemu>qemu-system-x86_64 -boot d -vnc 0.0.0.0:20 -cdrom NetBSD-6.1.2-amd64.iso
Assertion failed: qemu_in_coroutine(), file qemu-coroutine-lock.c, line 99
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
I noticed that qemu-system-sparc still booted OpenBSD/sparc 5.3 install CD correctly.
No further info at this stage.
Any ideas?
- Michael
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Qemu-devel] qemu 1.6.1
2013-10-23 7:39 [Qemu-devel] qemu 1.6.1 Michael W. Bombardieri
@ 2013-10-23 9:00 ` Paolo Bonzini
2013-10-23 20:26 ` Stefan Weil
0 siblings, 1 reply; 14+ messages in thread
From: Paolo Bonzini @ 2013-10-23 9:00 UTC (permalink / raw)
To: Michael W. Bombardieri; +Cc: qemu-devel
Il 23/10/2013 08:39, Michael W. Bombardieri ha scritto:
> Hi,
>
> My newly built qemu/win32 binary (v1.6.1) crashes in qemu-system-i386 and qemu-system-x86_64 when
> booting from an install CD.
>
> C:\Program Files\qemu>qemu-system-x86_64 -boot d -vnc 0.0.0.0:20 -cdrom NetBSD-6.1.2-amd64.iso
> Assertion failed: qemu_in_coroutine(), file qemu-coroutine-lock.c, line 99
>
> This application has requested the Runtime to terminate it in an unusual way.
> Please contact the application's support team for more information.
>
> I noticed that qemu-system-sparc still booted OpenBSD/sparc 5.3 install CD correctly.
> No further info at this stage.
> Any ideas?
It's a known problem that not everyone can reproduce. Please compile
with --disable-coroutine-pool on the configure command line.
Paolo
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Qemu-devel] qemu 1.6.1
2013-10-23 9:00 ` Paolo Bonzini
@ 2013-10-23 20:26 ` Stefan Weil
2013-10-24 10:38 ` Paolo Bonzini
0 siblings, 1 reply; 14+ messages in thread
From: Stefan Weil @ 2013-10-23 20:26 UTC (permalink / raw)
To: Paolo Bonzini, Michael W. Bombardieri; +Cc: qemu-devel
Am 23.10.2013 11:00, schrieb Paolo Bonzini:
> Il 23/10/2013 08:39, Michael W. Bombardieri ha scritto:
>> Hi,
>>
>> My newly built qemu/win32 binary (v1.6.1) crashes in qemu-system-i386 and qemu-system-x86_64 when
>> booting from an install CD.
>>
>> C:\Program Files\qemu>qemu-system-x86_64 -boot d -vnc 0.0.0.0:20 -cdrom NetBSD-6.1.2-amd64.iso
>> Assertion failed: qemu_in_coroutine(), file qemu-coroutine-lock.c, line 99
>>
>> This application has requested the Runtime to terminate it in an unusual way.
>> Please contact the application's support team for more information.
>>
>> I noticed that qemu-system-sparc still booted OpenBSD/sparc 5.3 install CD correctly.
>> No further info at this stage.
>> Any ideas?
> It's a known problem that not everyone can reproduce. Please compile
> with --disable-coroutine-pool on the configure command line.
>
> Paolo
This patch also helps (at least for me, tested native and on Linux / Wine):
http://repo.or.cz/w/qemu/ar7.git/commit/c777d5d62a729fd8b19847aaa0aad3d7a1f73f47
It looks like a compiler problem related to thread local storage
(variable "current").
I recently got several bug reports from a Windows user and included
patches to fix them in
my personal tree http://repo.or.cz/w/qemu/ar7.git. The binaries on
qemu.weilnetz.de
are based on that tree.
Stefan
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Qemu-devel] qemu 1.6.1
2013-10-23 20:26 ` Stefan Weil
@ 2013-10-24 10:38 ` Paolo Bonzini
2013-10-24 16:37 ` Stefan Weil
0 siblings, 1 reply; 14+ messages in thread
From: Paolo Bonzini @ 2013-10-24 10:38 UTC (permalink / raw)
To: Stefan Weil; +Cc: Michael W. Bombardieri, qemu-devel
Il 23/10/2013 21:26, Stefan Weil ha scritto:
> Am 23.10.2013 11:00, schrieb Paolo Bonzini:
>> Il 23/10/2013 08:39, Michael W. Bombardieri ha scritto:
>>> Hi,
>>>
>>> My newly built qemu/win32 binary (v1.6.1) crashes in qemu-system-i386 and qemu-system-x86_64 when
>>> booting from an install CD.
>>>
>>> C:\Program Files\qemu>qemu-system-x86_64 -boot d -vnc 0.0.0.0:20 -cdrom NetBSD-6.1.2-amd64.iso
>>> Assertion failed: qemu_in_coroutine(), file qemu-coroutine-lock.c, line 99
>>>
>>> This application has requested the Runtime to terminate it in an unusual way.
>>> Please contact the application's support team for more information.
>>>
>>> I noticed that qemu-system-sparc still booted OpenBSD/sparc 5.3 install CD correctly.
>>> No further info at this stage.
>>> Any ideas?
>> It's a known problem that not everyone can reproduce. Please compile
>> with --disable-coroutine-pool on the configure command line.
>>
>> Paolo
>
> This patch also helps (at least for me, tested native and on Linux / Wine):
> http://repo.or.cz/w/qemu/ar7.git/commit/c777d5d62a729fd8b19847aaa0aad3d7a1f73f47
>
> It looks like a compiler problem related to thread local storage
> (variable "current").
Ugh.
> I recently got several bug reports from a Windows user and included
> patches to fix them in
> my personal tree http://repo.or.cz/w/qemu/ar7.git. The binaries on
> qemu.weilnetz.de
> are based on that tree.
Does something like
CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, current);
also work? Then we can just remove from_.
Paolo
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Qemu-devel] qemu 1.6.1
2013-10-24 10:38 ` Paolo Bonzini
@ 2013-10-24 16:37 ` Stefan Weil
2013-10-24 21:47 ` Paolo Bonzini
0 siblings, 1 reply; 14+ messages in thread
From: Stefan Weil @ 2013-10-24 16:37 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Michael W. Bombardieri, qemu-devel, Stefan Hajnoczi
Am 24.10.2013 12:38, schrieb Paolo Bonzini:
> Il 23/10/2013 21:26, Stefan Weil ha scritto:
>> Am 23.10.2013 11:00, schrieb Paolo Bonzini:
>>> Il 23/10/2013 08:39, Michael W. Bombardieri ha scritto:
>>>> Hi,
>>>>
>>>> My newly built qemu/win32 binary (v1.6.1) crashes in qemu-system-i386 and qemu-system-x86_64 when
>>>> booting from an install CD.
>>>>
>>>> C:\Program Files\qemu>qemu-system-x86_64 -boot d -vnc 0.0.0.0:20 -cdrom NetBSD-6.1.2-amd64.iso
>>>> Assertion failed: qemu_in_coroutine(), file qemu-coroutine-lock.c, line 99
>>>>
>>>> This application has requested the Runtime to terminate it in an unusual way.
>>>> Please contact the application's support team for more information.
>>>>
>>>> I noticed that qemu-system-sparc still booted OpenBSD/sparc 5.3 install CD correctly.
>>>> No further info at this stage.
>>>> Any ideas?
>>> It's a known problem that not everyone can reproduce. Please compile
>>> with --disable-coroutine-pool on the configure command line.
>>>
>>> Paolo
>> This patch also helps (at least for me, tested native and on Linux / Wine):
>> http://repo.or.cz/w/qemu/ar7.git/commit/c777d5d62a729fd8b19847aaa0aad3d7a1f73f47
>>
>> It looks like a compiler problem related to thread local storage
>> (variable "current").
> Ugh.
>
>> I recently got several bug reports from a Windows user and included
>> patches to fix them in
>> my personal tree http://repo.or.cz/w/qemu/ar7.git. The binaries on
>> qemu.weilnetz.de
>> are based on that tree.
> Does something like
>
> CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, current);
>
> also work? Then we can just remove from_.
>
> Paolo
Yes, that works, too. It also fixes the problem with the assertion
(tested with Wine).
No, we cannot remove from_, because the same interface is also used
for Linux and other hosts which don't have a 'current' variable.
Or we would have to call qemu_coroutine_self() to get the current
coroutine.
Stefan
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Qemu-devel] qemu 1.6.1
2013-10-24 16:37 ` Stefan Weil
@ 2013-10-24 21:47 ` Paolo Bonzini
2013-10-26 9:51 ` Stefan Weil
0 siblings, 1 reply; 14+ messages in thread
From: Paolo Bonzini @ 2013-10-24 21:47 UTC (permalink / raw)
To: Stefan Weil; +Cc: Michael W. Bombardieri, qemu-devel, Stefan Hajnoczi
Il 24/10/2013 17:37, Stefan Weil ha scritto:
> Yes, that works, too. It also fixes the problem with the assertion
> (tested with Wine).
>
> No, we cannot remove from_, because the same interface is also used
> for Linux and other hosts which don't have a 'current' variable.
> Or we would have to call qemu_coroutine_self() to get the current
> coroutine.
Yes, I was thinking of using qemu_coroutine_self().
By the way, can you post the two assembly language outputs for just
- CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, from_);
+ CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, current);
which AIUI works and is enough to fix the bug?
Paolo
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Qemu-devel] qemu 1.6.1
2013-10-24 21:47 ` Paolo Bonzini
@ 2013-10-26 9:51 ` Stefan Weil
2013-10-27 6:54 ` Paolo Bonzini
2014-06-23 14:39 ` [Qemu-devel] broken win32 coroutines (was Re: qemu 1.6.1) Paolo Bonzini
0 siblings, 2 replies; 14+ messages in thread
From: Stefan Weil @ 2013-10-26 9:51 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Michael W. Bombardieri, qemu-devel, Stefan Hajnoczi
Am 24.10.2013 23:47, schrieb Paolo Bonzini:
> Il 24/10/2013 17:37, Stefan Weil ha scritto:
>> Yes, that works, too. It also fixes the problem with the assertion
>> (tested with Wine).
>>
>> No, we cannot remove from_, because the same interface is also used
>> for Linux and other hosts which don't have a 'current' variable.
>> Or we would have to call qemu_coroutine_self() to get the current
>> coroutine.
> Yes, I was thinking of using qemu_coroutine_self().
>
> By the way, can you post the two assembly language outputs for just
>
> - CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, from_);
> + CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, current);
>
> which AIUI works and is enough to fix the bug?
>
> Paolo
See disassembled code below. I removed compiler option -fstack-protector-all
to simplify the assembler code and tested that the result was not affected
by this removal.
The C and assembler code from the test is also available at
http://qemu.weilnetz.de/test/coroutine-win32/.
Stefan
unpatched QEMU, crash with assertion
00448670 <_qemu_coroutine_switch>:
448670: 53 push %ebx
448671: 83 ec 18 sub $0x18,%esp
448674: c7 04 24 a8 62 6d 00 movl $0x6d62a8,(%esp)
44867b: 8b 5c 24 24 mov 0x24(%esp),%ebx
44867f: e8 ec 9e 27 00 call 6c2570
<___emutls_get_address>
448684: 89 18 mov %ebx,(%eax)
448686: 8b 44 24 28 mov 0x28(%esp),%eax
44868a: 89 43 24 mov %eax,0x24(%ebx)
44868d: 8b 43 20 mov 0x20(%ebx),%eax
448690: 89 04 24 mov %eax,(%esp)
448693: ff 15 c0 5f 83 00 call *0x835fc0
448699: 83 ec 04 sub $0x4,%esp
44869c: 8b 44 24 20 mov 0x20(%esp),%eax
4486a0: 8b 40 24 mov 0x24(%eax),%eax
4486a3: 83 c4 18 add $0x18,%esp
4486a6: 5b pop %ebx
4486a7: c3 ret
patched, works
00448620 <_qemu_coroutine_switch>:
448620: 83 ec 1c sub $0x1c,%esp
448623: c7 04 24 a8 62 6d 00 movl $0x6d62a8,(%esp)
44862a: 89 5c 24 14 mov %ebx,0x14(%esp)
44862e: 8b 5c 24 24 mov 0x24(%esp),%ebx
448632: 89 74 24 18 mov %esi,0x18(%esp)
448636: e8 25 9f 27 00 call 6c2560
<___emutls_get_address>
44863b: 8b 30 mov (%eax),%esi
44863d: 89 18 mov %ebx,(%eax)
44863f: 8b 44 24 28 mov 0x28(%esp),%eax
448643: 89 43 24 mov %eax,0x24(%ebx)
448646: 8b 43 20 mov 0x20(%ebx),%eax
448649: 89 04 24 mov %eax,(%esp)
44864c: ff 15 c0 5f 83 00 call *0x835fc0
448652: 8b 46 24 mov 0x24(%esi),%eax
448655: 83 ec 04 sub $0x4,%esp
448658: 8b 5c 24 14 mov 0x14(%esp),%ebx
44865c: 8b 74 24 18 mov 0x18(%esp),%esi
448660: 83 c4 1c add $0x1c,%esp
448663: c3 ret
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Qemu-devel] qemu 1.6.1
2013-10-26 9:51 ` Stefan Weil
@ 2013-10-27 6:54 ` Paolo Bonzini
2013-10-27 10:44 ` Stefan Weil
2013-10-27 15:38 ` Stefan Weil
2014-06-23 14:39 ` [Qemu-devel] broken win32 coroutines (was Re: qemu 1.6.1) Paolo Bonzini
1 sibling, 2 replies; 14+ messages in thread
From: Paolo Bonzini @ 2013-10-27 6:54 UTC (permalink / raw)
To: Stefan Weil; +Cc: Michael W. Bombardieri, qemu-devel, Stefan Hajnoczi
Il 26/10/2013 11:51, Stefan Weil ha scritto:
> Am 24.10.2013 23:47, schrieb Paolo Bonzini:
>> Il 24/10/2013 17:37, Stefan Weil ha scritto:
>>> Yes, that works, too. It also fixes the problem with the assertion
>>> (tested with Wine).
>>>
>>> No, we cannot remove from_, because the same interface is also used
>>> for Linux and other hosts which don't have a 'current' variable.
>>> Or we would have to call qemu_coroutine_self() to get the current
>>> coroutine.
>> Yes, I was thinking of using qemu_coroutine_self().
>>
>> By the way, can you post the two assembly language outputs for just
>>
>> - CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, from_);
>> + CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, current);
>>
>> which AIUI works and is enough to fix the bug?
>>
>> Paolo
>
> See disassembled code below. I removed compiler option -fstack-protector-all
> to simplify the assembler code and tested that the result was not affected
> by this removal.
>
> The C and assembler code from the test is also available at
> http://qemu.weilnetz.de/test/coroutine-win32/.
Here is the code with annotations
broken works
-------------------------------------------------------------------------
push %ebx
sub $0x18,%esp sub $0x1c,%esp
mov %ebx,0x14(%esp)
mov %esi,0x18(%esp)
movl $0x6d62a8,(%esp) movl $0x6d62a8,(%esp)
mov 0x24(%esp),%ebx mov 0x24(%esp),%ebx ebx = to;
call ___emutls_get_address call ___emutls_get_address eax = ¤t;
mov (%eax),%esi esi = current;
mov %ebx,(%eax) mov %ebx,(%eax) current = to;
mov 0x28(%esp),%eax mov 0x28(%esp),%eax eax = action
mov %eax,0x24(%ebx) mov %eax,0x24(%ebx) to->action = action
mov 0x20(%ebx),%eax mov 0x20(%ebx),%eax eax = to->fiber
mov %eax,(%esp) mov %eax,(%esp) "push" to->fiber
call *0x835fc0 call *0x835fc0 SwitchToFiber(to->fiber)
sub $0x4,%esp sub $0x4,%esp undo PASCAL calling convention
** mov 0x20(%esp),%eax eax = from
mov 0x24(%eax),%eax mov 0x24(%esi),%eax eax = from->action
mov 0x14(%esp),%ebx
mov 0x18(%esp),%esi
add $0x18,%esp add $0x1c,%esp
pop %ebx
ret ret
I think the problem is that 0x20(%esp) gets somehow corrupted at the
instruction I highlighted with **.
The simplest fix then would be to add a barrier() before and after
SwitchToFiber.
Paolo
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Qemu-devel] qemu 1.6.1
2013-10-27 6:54 ` Paolo Bonzini
@ 2013-10-27 10:44 ` Stefan Weil
2013-10-27 15:38 ` Stefan Weil
1 sibling, 0 replies; 14+ messages in thread
From: Stefan Weil @ 2013-10-27 10:44 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Michael W. Bombardieri, qemu-devel, Stefan Hajnoczi
Am 27.10.2013 07:54, schrieb Paolo Bonzini:
> Here is the code with annotations
>
> broken works
> -------------------------------------------------------------------------
> push %ebx
> sub $0x18,%esp sub $0x1c,%esp
> mov %ebx,0x14(%esp)
> mov %esi,0x18(%esp)
>
> movl $0x6d62a8,(%esp) movl $0x6d62a8,(%esp)
> mov 0x24(%esp),%ebx mov 0x24(%esp),%ebx ebx = to;
> call ___emutls_get_address call ___emutls_get_address eax = ¤t;
>
> mov (%eax),%esi esi = current;
>
> mov %ebx,(%eax) mov %ebx,(%eax) current = to;
>
> mov 0x28(%esp),%eax mov 0x28(%esp),%eax eax = action
> mov %eax,0x24(%ebx) mov %eax,0x24(%ebx) to->action = action
> mov 0x20(%ebx),%eax mov 0x20(%ebx),%eax eax = to->fiber
> mov %eax,(%esp) mov %eax,(%esp) "push" to->fiber
> call *0x835fc0 call *0x835fc0 SwitchToFiber(to->fiber)
> sub $0x4,%esp sub $0x4,%esp undo PASCAL calling convention
>
> ** mov 0x20(%esp),%eax eax = from
> mov 0x24(%eax),%eax mov 0x24(%esi),%eax eax = from->action
>
> mov 0x14(%esp),%ebx
> mov 0x18(%esp),%esi
> add $0x18,%esp add $0x1c,%esp
> pop %ebx
> ret ret
>
>
> I think the problem is that 0x20(%esp) gets somehow corrupted at the
> instruction I highlighted with **.
>
> The simplest fix then would be to add a barrier() before and after
> SwitchToFiber.
>
> Paolo
I tried adding two barrier() statements around SwitchToFiber().
That change did not result in different assembler code (=> unchanged
behaviour, QEMU still raises an assertion).
Stefan
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Qemu-devel] qemu 1.6.1
2013-10-27 6:54 ` Paolo Bonzini
2013-10-27 10:44 ` Stefan Weil
@ 2013-10-27 15:38 ` Stefan Weil
1 sibling, 0 replies; 14+ messages in thread
From: Stefan Weil @ 2013-10-27 15:38 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Michael W. Bombardieri, qemu-devel, Stefan Hajnoczi
Am 27.10.2013 07:54, schrieb Paolo Bonzini:
>
> I think the problem is that 0x20(%esp) gets somehow corrupted at the
> instruction I highlighted with **.
>
> The simplest fix then would be to add a barrier() before and after
> SwitchToFiber.
>
> Paolo
I added some debugging output (see code at
http://qemu.weilnetz.de/test/coroutine-win32/). The result with pointer
values replaced by function names is shown below.
There are two threads (WinMain, qemu_tcg_cpu_thread_fn) and one
coroutine (bdrv_rw_co_entry).
Stefan
qemu_coroutine_self:WinMain
qemu_in_coroutine:WinMain,null
qemu_coroutine_new:bdrv_rw_co_entry
qemu_coroutine_create,co=bdrv_rw_co_entry
qemu_coroutine_enter(bdrv_rw_co_entry,...),self=WinMain
coroutine_swap(WinMain,bdrv_rw_co_entry)
qemu_coroutine_yield,self=bdrv_rw_co_entry,to=WinMain
coroutine_swap(bdrv_rw_co_entry,WinMain)
qemu_coroutine_enter(bdrv_rw_co_entry,...),self=WinMain
coroutine_swap(WinMain,bdrv_rw_co_entry)
qemu_in_coroutine:bdrv_rw_co_entry,WinMain
# active coroutine bdrv_rw_co_entry is deleted
coroutine_delete(bdrv_rw_co_entry)
qemu_coroutine_self:qemu_tcg_cpu_thread_fn
qemu_in_coroutine:qemu_tcg_cpu_thread_fn,null
qemu_coroutine_create,co=bdrv_rw_co_entry
qemu_coroutine_enter(bdrv_rw_co_entry,...),self=qemu_tcg_cpu_thread_fn
coroutine_swap(qemu_tcg_cpu_thread_fn,bdrv_rw_co_entry)
qemu_coroutine_yield,self=bdrv_rw_co_entry,to=qemu_tcg_cpu_thread_fn
coroutine_swap(bdrv_rw_co_entry,qemu_tcg_cpu_thread_fn)
qemu_coroutine_enter(bdrv_rw_co_entry,...),self=qemu_tcg_cpu_thread_fn
coroutine_swap(qemu_tcg_cpu_thread_fn,bdrv_rw_co_entry)
qemu_in_coroutine:bdrv_rw_co_entry,qemu_tcg_cpu_thread_fn
qemu_in_coroutine:bdrv_rw_co_entry,qemu_tcg_cpu_thread_fn
# active coroutine bdrv_rw_co_entry is deleted
coroutine_delete(bdrv_rw_co_entry)
# now we are still in deleted coroutine bdrv_rw_co_entry
qemu_in_coroutine:bdrv_rw_co_entry,null
qemu_coroutine_create,co=bdrv_rw_co_entry
qemu_coroutine_enter(bdrv_rw_co_entry,...),self=bdrv_rw_co_entry
coroutine_swap(bdrv_rw_co_entry,bdrv_rw_co_entry)
qemu_coroutine_yield,self=bdrv_rw_co_entry,to=bdrv_rw_co_entry
coroutine_swap(bdrv_rw_co_entry,bdrv_rw_co_entry)
qemu_in_coroutine:bdrv_rw_co_entry,null
^ permalink raw reply [flat|nested] 14+ messages in thread
* [Qemu-devel] broken win32 coroutines (was Re: qemu 1.6.1)
2013-10-26 9:51 ` Stefan Weil
2013-10-27 6:54 ` Paolo Bonzini
@ 2014-06-23 14:39 ` Paolo Bonzini
2014-06-24 1:41 ` Michael W. Bombardieri
1 sibling, 1 reply; 14+ messages in thread
From: Paolo Bonzini @ 2014-06-23 14:39 UTC (permalink / raw)
To: Stefan Weil; +Cc: Michael W. Bombardieri, qemu-devel, Stefan Hajnoczi
Il 26/10/2013 11:51, Stefan Weil ha scritto:
> unpatched QEMU, crash with assertion
>
> 00448670 <_qemu_coroutine_switch>:
> 448670: 53 push %ebx
> 448671: 83 ec 18 sub $0x18,%esp
> 448674: c7 04 24 a8 62 6d 00 movl $0x6d62a8,(%esp)
> 44867b: 8b 5c 24 24 mov 0x24(%esp),%ebx
> 44867f: e8 ec 9e 27 00 call 6c2570
> <___emutls_get_address>
> 448684: 89 18 mov %ebx,(%eax)
> 448686: 8b 44 24 28 mov 0x28(%esp),%eax
> 44868a: 89 43 24 mov %eax,0x24(%ebx)
> 44868d: 8b 43 20 mov 0x20(%ebx),%eax
> 448690: 89 04 24 mov %eax,(%esp)
> 448693: ff 15 c0 5f 83 00 call *0x835fc0
> 448699: 83 ec 04 sub $0x4,%esp
> 44869c: 8b 44 24 20 mov 0x20(%esp),%eax
> 4486a0: 8b 40 24 mov 0x24(%eax),%eax
> 4486a3: 83 c4 18 add $0x18,%esp
> 4486a6: 5b pop %ebx
> 4486a7: c3 ret
>
>
> patched, works
>
> 00448620 <_qemu_coroutine_switch>:
> 448620: 83 ec 1c sub $0x1c,%esp
> 448623: c7 04 24 a8 62 6d 00 movl $0x6d62a8,(%esp)
> 44862a: 89 5c 24 14 mov %ebx,0x14(%esp)
> 44862e: 8b 5c 24 24 mov 0x24(%esp),%ebx
> 448632: 89 74 24 18 mov %esi,0x18(%esp)
> 448636: e8 25 9f 27 00 call 6c2560
> <___emutls_get_address>
> 44863b: 8b 30 mov (%eax),%esi
> 44863d: 89 18 mov %ebx,(%eax)
> 44863f: 8b 44 24 28 mov 0x28(%esp),%eax
> 448643: 89 43 24 mov %eax,0x24(%ebx)
> 448646: 8b 43 20 mov 0x20(%ebx),%eax
> 448649: 89 04 24 mov %eax,(%esp)
> 44864c: ff 15 c0 5f 83 00 call *0x835fc0
> 448652: 8b 46 24 mov 0x24(%esi),%eax
> 448655: 83 ec 04 sub $0x4,%esp
> 448658: 8b 5c 24 14 mov 0x14(%esp),%ebx
> 44865c: 8b 74 24 18 mov 0x18(%esp),%esi
> 448660: 83 c4 1c add $0x1c,%esp
> 448663: c3 ret
The only difference here is basically that "from" is being saved in
%esi across the calls to __emutls_get_address and SwitchToFiber. But
as Peter found out, the fix really happens because now the compiler
will not inline qemu_coroutine_switch anymore.
Peter provided another dump, this time from Win64. It is the
coroutine_trampoline with inlined qemu_coroutine_switch:
0: push %rdi
1: push %rsi
2: push %rbx
3: sub $0x30,%rsp
7: mov %rcx,%rbx
a: lea ...(%rip),%rcx
11: mov ...(%rip),%rax
18: mov %rax,0x28(%rsp)
1d: xor %eax,%eax
1f: callq ___emutls_get_address
24: mov ...(%rip),%rsi
2b: mov %rax,%rdi
2e: nop2
30: mov 0x8(%rbx),%rcx # ecx = co->entry_arg
34: callq *(%rbx) # co->entry(co->entry_arg);
36: mov 0x10(%rbx),%rax # load co->caller
3a: mov %rax,(%rdi) # "current" = co->caller; (wrong current!)
3d: movl $0x2,0x48(%rax) # co->caller->action = COROUTINE_TERMINATE;
44: mov 0x40(%rax),%rcx # load co->caller->fiber
48: callq *%rsi
4a: jmp 30 <coroutine_trampoline+0x30>
4c: nopl 0x0(%rax)
Some offsets are incomplete because this is from a .o, so not
linked, but it's already enough to see that ___emutls_get_address
(from "current = to_;" in qemu_coroutine_switch) is being hoisted
outside the loop, which becomes:
Coroutine *co = co_;
Coroutine **p_current = ¤t;
while (true) {
co->entry(co->entry_arg);
CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, co);
CoroutineWin32 *to = DO_UPCAST(CoroutineWin32, base, co->caller);
*p_current = to_;
to->action = action;
SwitchToFiber(to->fiber);
return from->action;
}
This is wrong if co->entry yields out of the coroutine which is then
restarted on another thread. This can happen quite often. Typically.
a coroutine is started when a VCPU thread does bdrv_aio_readv:
VCPU thread
main VCPU thread coroutine I/O coroutine
bdrv_aio_readv ----->
start I/O operation
thread_pool_submit_co
<------------ yields
back to emulation
Then I/O finishes and the thread-pool.c event notifier triggers in
the I/O thread. event_notifier_ready calls thread_pool_co_cb, and
the I/O coroutine now restarts *in another thread*:
iothread
main iothread coroutine I/O coroutine (formerly in VCPU thread)
event_notifier_ready
thread_pool_co_cb -----> current = I/O coroutine;
call AIO callback
But on Win32, because of the bug, the "current" being set here the
current coroutine of the VCPU thread, not the iothread.
noinline is a good-enough workaround, and quite unlikely to break in
the future.
Paolo
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Qemu-devel] broken win32 coroutines (was Re: qemu 1.6.1)
2014-06-23 14:39 ` [Qemu-devel] broken win32 coroutines (was Re: qemu 1.6.1) Paolo Bonzini
@ 2014-06-24 1:41 ` Michael W. Bombardieri
2014-06-24 5:22 ` Paolo Bonzini
0 siblings, 1 reply; 14+ messages in thread
From: Michael W. Bombardieri @ 2014-06-24 1:41 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Stefan Weil, Prashant Satish, qemu-devel, Stefan Hajnoczi
Hi,
Thanks for adding me to this thread.
I am not familiar with the qemu source code but I am aware
of the coroutine crash and I can test a patch if you have
one.
qemu 1.5.1 was the last version I was able to build and use
on win32. Later versions build without error but exhibit the
coroutine crash.
qemu binaries built with the coroutine feature disabled are
to slow to use.
- Michael
On Mon, Jun 23, 2014 at 04:39:30PM +0200, Paolo Bonzini wrote:
> Il 26/10/2013 11:51, Stefan Weil ha scritto:
> > unpatched QEMU, crash with assertion
> >
> > 00448670 <_qemu_coroutine_switch>:
> > 448670: 53 push %ebx
> > 448671: 83 ec 18 sub $0x18,%esp
> > 448674: c7 04 24 a8 62 6d 00 movl $0x6d62a8,(%esp)
> > 44867b: 8b 5c 24 24 mov 0x24(%esp),%ebx
> > 44867f: e8 ec 9e 27 00 call 6c2570
> > <___emutls_get_address>
> > 448684: 89 18 mov %ebx,(%eax)
> > 448686: 8b 44 24 28 mov 0x28(%esp),%eax
> > 44868a: 89 43 24 mov %eax,0x24(%ebx)
> > 44868d: 8b 43 20 mov 0x20(%ebx),%eax
> > 448690: 89 04 24 mov %eax,(%esp)
> > 448693: ff 15 c0 5f 83 00 call *0x835fc0
> > 448699: 83 ec 04 sub $0x4,%esp
> > 44869c: 8b 44 24 20 mov 0x20(%esp),%eax
> > 4486a0: 8b 40 24 mov 0x24(%eax),%eax
> > 4486a3: 83 c4 18 add $0x18,%esp
> > 4486a6: 5b pop %ebx
> > 4486a7: c3 ret
> >
> >
> > patched, works
> >
> > 00448620 <_qemu_coroutine_switch>:
> > 448620: 83 ec 1c sub $0x1c,%esp
> > 448623: c7 04 24 a8 62 6d 00 movl $0x6d62a8,(%esp)
> > 44862a: 89 5c 24 14 mov %ebx,0x14(%esp)
> > 44862e: 8b 5c 24 24 mov 0x24(%esp),%ebx
> > 448632: 89 74 24 18 mov %esi,0x18(%esp)
> > 448636: e8 25 9f 27 00 call 6c2560
> > <___emutls_get_address>
> > 44863b: 8b 30 mov (%eax),%esi
> > 44863d: 89 18 mov %ebx,(%eax)
> > 44863f: 8b 44 24 28 mov 0x28(%esp),%eax
> > 448643: 89 43 24 mov %eax,0x24(%ebx)
> > 448646: 8b 43 20 mov 0x20(%ebx),%eax
> > 448649: 89 04 24 mov %eax,(%esp)
> > 44864c: ff 15 c0 5f 83 00 call *0x835fc0
> > 448652: 8b 46 24 mov 0x24(%esi),%eax
> > 448655: 83 ec 04 sub $0x4,%esp
> > 448658: 8b 5c 24 14 mov 0x14(%esp),%ebx
> > 44865c: 8b 74 24 18 mov 0x18(%esp),%esi
> > 448660: 83 c4 1c add $0x1c,%esp
> > 448663: c3 ret
>
> The only difference here is basically that "from" is being saved in
> %esi across the calls to __emutls_get_address and SwitchToFiber. But
> as Peter found out, the fix really happens because now the compiler
> will not inline qemu_coroutine_switch anymore.
>
> Peter provided another dump, this time from Win64. It is the
> coroutine_trampoline with inlined qemu_coroutine_switch:
>
> 0: push %rdi
> 1: push %rsi
> 2: push %rbx
> 3: sub $0x30,%rsp
> 7: mov %rcx,%rbx
> a: lea ...(%rip),%rcx
> 11: mov ...(%rip),%rax
> 18: mov %rax,0x28(%rsp)
> 1d: xor %eax,%eax
> 1f: callq ___emutls_get_address
> 24: mov ...(%rip),%rsi
> 2b: mov %rax,%rdi
> 2e: nop2
> 30: mov 0x8(%rbx),%rcx # ecx = co->entry_arg
> 34: callq *(%rbx) # co->entry(co->entry_arg);
> 36: mov 0x10(%rbx),%rax # load co->caller
> 3a: mov %rax,(%rdi) # "current" = co->caller; (wrong current!)
> 3d: movl $0x2,0x48(%rax) # co->caller->action = COROUTINE_TERMINATE;
> 44: mov 0x40(%rax),%rcx # load co->caller->fiber
> 48: callq *%rsi
> 4a: jmp 30 <coroutine_trampoline+0x30>
> 4c: nopl 0x0(%rax)
>
> Some offsets are incomplete because this is from a .o, so not
> linked, but it's already enough to see that ___emutls_get_address
> (from "current = to_;" in qemu_coroutine_switch) is being hoisted
> outside the loop, which becomes:
>
> Coroutine *co = co_;
> Coroutine **p_current = ¤t;
>
> while (true) {
> co->entry(co->entry_arg);
>
> CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, co);
> CoroutineWin32 *to = DO_UPCAST(CoroutineWin32, base, co->caller);
>
> *p_current = to_;
>
> to->action = action;
> SwitchToFiber(to->fiber);
> return from->action;
> }
>
> This is wrong if co->entry yields out of the coroutine which is then
> restarted on another thread. This can happen quite often. Typically.
> a coroutine is started when a VCPU thread does bdrv_aio_readv:
>
> VCPU thread
>
> main VCPU thread coroutine I/O coroutine
> bdrv_aio_readv ----->
> start I/O operation
> thread_pool_submit_co
> <------------ yields
> back to emulation
>
> Then I/O finishes and the thread-pool.c event notifier triggers in
> the I/O thread. event_notifier_ready calls thread_pool_co_cb, and
> the I/O coroutine now restarts *in another thread*:
>
> iothread
>
> main iothread coroutine I/O coroutine (formerly in VCPU thread)
> event_notifier_ready
> thread_pool_co_cb -----> current = I/O coroutine;
> call AIO callback
>
> But on Win32, because of the bug, the "current" being set here the
> current coroutine of the VCPU thread, not the iothread.
>
> noinline is a good-enough workaround, and quite unlikely to break in
> the future.
>
> Paolo
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Qemu-devel] broken win32 coroutines (was Re: qemu 1.6.1)
2014-06-24 1:41 ` Michael W. Bombardieri
@ 2014-06-24 5:22 ` Paolo Bonzini
2014-06-25 6:48 ` Michael W. Bombardieri
0 siblings, 1 reply; 14+ messages in thread
From: Paolo Bonzini @ 2014-06-24 5:22 UTC (permalink / raw)
To: Michael W. Bombardieri
Cc: Stefan Weil, Prashant Satish, qemu-devel, Stefan Hajnoczi
Il 24/06/2014 03:41, Michael W. Bombardieri ha scritto:
> Hi,
>
> Thanks for adding me to this thread.
> I am not familiar with the qemu source code but I am aware
> of the coroutine crash and I can test a patch if you have
> one.
>
> qemu 1.5.1 was the last version I was able to build and use
> on win32. Later versions build without error but exhibit the
> coroutine crash.
> qemu binaries built with the coroutine feature disabled are
> to slow to use.
Peter posted one here:
http://article.gmane.org/gmane.comp.emulators.qemu/282189/raw
Paolo
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Qemu-devel] broken win32 coroutines (was Re: qemu 1.6.1)
2014-06-24 5:22 ` Paolo Bonzini
@ 2014-06-25 6:48 ` Michael W. Bombardieri
0 siblings, 0 replies; 14+ messages in thread
From: Michael W. Bombardieri @ 2014-06-25 6:48 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Stefan Weil, Prashant Satish, qemu-devel, Stefan Hajnoczi
I applied the patch manually on qemu releases 1.7.1 and 2.0.0.
Configuration
* Host system is Lenovo ThinkPad T520 (windows 7)
* qemu built with --disable-sdl and --disable-gtk
* GCC version is 4.6.1 (MinGW win32)
Test (same steps/results for both 1.7.1 and 2.0.0)
* Created qcow2 disk image using qemu-img
* Started qemu-system-i386 with new disk image and OpenBSD/i386 5.5 install ISO
* Connected VM display using VNC
* VM booted successfully and install script started
* Install process too slow; gave up after >60 minutes
Following the same testing steps against a qemu 1.5.1 binary,
the OpenBSD 5.5 installation finished in 3 minutes on my system.
I haven't tried to profile the code.
- Michael
On Tue, Jun 24, 2014 at 07:22:59AM +0200, Paolo Bonzini wrote:
> Il 24/06/2014 03:41, Michael W. Bombardieri ha scritto:
> >Hi,
> >
> >Thanks for adding me to this thread.
> >I am not familiar with the qemu source code but I am aware
> >of the coroutine crash and I can test a patch if you have
> >one.
> >
> >qemu 1.5.1 was the last version I was able to build and use
> >on win32. Later versions build without error but exhibit the
> >coroutine crash.
> >qemu binaries built with the coroutine feature disabled are
> >to slow to use.
>
> Peter posted one here:
>
> http://article.gmane.org/gmane.comp.emulators.qemu/282189/raw
>
> Paolo
>
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2014-06-25 7:16 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-10-23 7:39 [Qemu-devel] qemu 1.6.1 Michael W. Bombardieri
2013-10-23 9:00 ` Paolo Bonzini
2013-10-23 20:26 ` Stefan Weil
2013-10-24 10:38 ` Paolo Bonzini
2013-10-24 16:37 ` Stefan Weil
2013-10-24 21:47 ` Paolo Bonzini
2013-10-26 9:51 ` Stefan Weil
2013-10-27 6:54 ` Paolo Bonzini
2013-10-27 10:44 ` Stefan Weil
2013-10-27 15:38 ` Stefan Weil
2014-06-23 14:39 ` [Qemu-devel] broken win32 coroutines (was Re: qemu 1.6.1) Paolo Bonzini
2014-06-24 1:41 ` Michael W. Bombardieri
2014-06-24 5:22 ` Paolo Bonzini
2014-06-25 6:48 ` Michael W. Bombardieri
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.