* [Qemu-devel] qemu 1.6.1 @ 2013-10-23 7:39 Michael W. Bombardieri 2013-10-23 9:00 ` Paolo Bonzini 0 siblings, 1 reply; 14+ messages in thread From: Michael W. Bombardieri @ 2013-10-23 7:39 UTC (permalink / raw) To: qemu-devel; +Cc: mb Hi, My newly built qemu/win32 binary (v1.6.1) crashes in qemu-system-i386 and qemu-system-x86_64 when booting from an install CD. C:\Program Files\qemu>qemu-system-x86_64 -boot d -vnc 0.0.0.0:20 -cdrom NetBSD-6.1.2-amd64.iso Assertion failed: qemu_in_coroutine(), file qemu-coroutine-lock.c, line 99 This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information. I noticed that qemu-system-sparc still booted OpenBSD/sparc 5.3 install CD correctly. No further info at this stage. Any ideas? - Michael ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Qemu-devel] qemu 1.6.1 2013-10-23 7:39 [Qemu-devel] qemu 1.6.1 Michael W. Bombardieri @ 2013-10-23 9:00 ` Paolo Bonzini 2013-10-23 20:26 ` Stefan Weil 0 siblings, 1 reply; 14+ messages in thread From: Paolo Bonzini @ 2013-10-23 9:00 UTC (permalink / raw) To: Michael W. Bombardieri; +Cc: qemu-devel Il 23/10/2013 08:39, Michael W. Bombardieri ha scritto: > Hi, > > My newly built qemu/win32 binary (v1.6.1) crashes in qemu-system-i386 and qemu-system-x86_64 when > booting from an install CD. > > C:\Program Files\qemu>qemu-system-x86_64 -boot d -vnc 0.0.0.0:20 -cdrom NetBSD-6.1.2-amd64.iso > Assertion failed: qemu_in_coroutine(), file qemu-coroutine-lock.c, line 99 > > This application has requested the Runtime to terminate it in an unusual way. > Please contact the application's support team for more information. > > I noticed that qemu-system-sparc still booted OpenBSD/sparc 5.3 install CD correctly. > No further info at this stage. > Any ideas? It's a known problem that not everyone can reproduce. Please compile with --disable-coroutine-pool on the configure command line. Paolo ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Qemu-devel] qemu 1.6.1 2013-10-23 9:00 ` Paolo Bonzini @ 2013-10-23 20:26 ` Stefan Weil 2013-10-24 10:38 ` Paolo Bonzini 0 siblings, 1 reply; 14+ messages in thread From: Stefan Weil @ 2013-10-23 20:26 UTC (permalink / raw) To: Paolo Bonzini, Michael W. Bombardieri; +Cc: qemu-devel Am 23.10.2013 11:00, schrieb Paolo Bonzini: > Il 23/10/2013 08:39, Michael W. Bombardieri ha scritto: >> Hi, >> >> My newly built qemu/win32 binary (v1.6.1) crashes in qemu-system-i386 and qemu-system-x86_64 when >> booting from an install CD. >> >> C:\Program Files\qemu>qemu-system-x86_64 -boot d -vnc 0.0.0.0:20 -cdrom NetBSD-6.1.2-amd64.iso >> Assertion failed: qemu_in_coroutine(), file qemu-coroutine-lock.c, line 99 >> >> This application has requested the Runtime to terminate it in an unusual way. >> Please contact the application's support team for more information. >> >> I noticed that qemu-system-sparc still booted OpenBSD/sparc 5.3 install CD correctly. >> No further info at this stage. >> Any ideas? > It's a known problem that not everyone can reproduce. Please compile > with --disable-coroutine-pool on the configure command line. > > Paolo This patch also helps (at least for me, tested native and on Linux / Wine): http://repo.or.cz/w/qemu/ar7.git/commit/c777d5d62a729fd8b19847aaa0aad3d7a1f73f47 It looks like a compiler problem related to thread local storage (variable "current"). I recently got several bug reports from a Windows user and included patches to fix them in my personal tree http://repo.or.cz/w/qemu/ar7.git. The binaries on qemu.weilnetz.de are based on that tree. Stefan ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Qemu-devel] qemu 1.6.1 2013-10-23 20:26 ` Stefan Weil @ 2013-10-24 10:38 ` Paolo Bonzini 2013-10-24 16:37 ` Stefan Weil 0 siblings, 1 reply; 14+ messages in thread From: Paolo Bonzini @ 2013-10-24 10:38 UTC (permalink / raw) To: Stefan Weil; +Cc: Michael W. Bombardieri, qemu-devel Il 23/10/2013 21:26, Stefan Weil ha scritto: > Am 23.10.2013 11:00, schrieb Paolo Bonzini: >> Il 23/10/2013 08:39, Michael W. Bombardieri ha scritto: >>> Hi, >>> >>> My newly built qemu/win32 binary (v1.6.1) crashes in qemu-system-i386 and qemu-system-x86_64 when >>> booting from an install CD. >>> >>> C:\Program Files\qemu>qemu-system-x86_64 -boot d -vnc 0.0.0.0:20 -cdrom NetBSD-6.1.2-amd64.iso >>> Assertion failed: qemu_in_coroutine(), file qemu-coroutine-lock.c, line 99 >>> >>> This application has requested the Runtime to terminate it in an unusual way. >>> Please contact the application's support team for more information. >>> >>> I noticed that qemu-system-sparc still booted OpenBSD/sparc 5.3 install CD correctly. >>> No further info at this stage. >>> Any ideas? >> It's a known problem that not everyone can reproduce. Please compile >> with --disable-coroutine-pool on the configure command line. >> >> Paolo > > This patch also helps (at least for me, tested native and on Linux / Wine): > http://repo.or.cz/w/qemu/ar7.git/commit/c777d5d62a729fd8b19847aaa0aad3d7a1f73f47 > > It looks like a compiler problem related to thread local storage > (variable "current"). Ugh. > I recently got several bug reports from a Windows user and included > patches to fix them in > my personal tree http://repo.or.cz/w/qemu/ar7.git. The binaries on > qemu.weilnetz.de > are based on that tree. Does something like CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, current); also work? Then we can just remove from_. Paolo ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Qemu-devel] qemu 1.6.1 2013-10-24 10:38 ` Paolo Bonzini @ 2013-10-24 16:37 ` Stefan Weil 2013-10-24 21:47 ` Paolo Bonzini 0 siblings, 1 reply; 14+ messages in thread From: Stefan Weil @ 2013-10-24 16:37 UTC (permalink / raw) To: Paolo Bonzini; +Cc: Michael W. Bombardieri, qemu-devel, Stefan Hajnoczi Am 24.10.2013 12:38, schrieb Paolo Bonzini: > Il 23/10/2013 21:26, Stefan Weil ha scritto: >> Am 23.10.2013 11:00, schrieb Paolo Bonzini: >>> Il 23/10/2013 08:39, Michael W. Bombardieri ha scritto: >>>> Hi, >>>> >>>> My newly built qemu/win32 binary (v1.6.1) crashes in qemu-system-i386 and qemu-system-x86_64 when >>>> booting from an install CD. >>>> >>>> C:\Program Files\qemu>qemu-system-x86_64 -boot d -vnc 0.0.0.0:20 -cdrom NetBSD-6.1.2-amd64.iso >>>> Assertion failed: qemu_in_coroutine(), file qemu-coroutine-lock.c, line 99 >>>> >>>> This application has requested the Runtime to terminate it in an unusual way. >>>> Please contact the application's support team for more information. >>>> >>>> I noticed that qemu-system-sparc still booted OpenBSD/sparc 5.3 install CD correctly. >>>> No further info at this stage. >>>> Any ideas? >>> It's a known problem that not everyone can reproduce. Please compile >>> with --disable-coroutine-pool on the configure command line. >>> >>> Paolo >> This patch also helps (at least for me, tested native and on Linux / Wine): >> http://repo.or.cz/w/qemu/ar7.git/commit/c777d5d62a729fd8b19847aaa0aad3d7a1f73f47 >> >> It looks like a compiler problem related to thread local storage >> (variable "current"). > Ugh. > >> I recently got several bug reports from a Windows user and included >> patches to fix them in >> my personal tree http://repo.or.cz/w/qemu/ar7.git. The binaries on >> qemu.weilnetz.de >> are based on that tree. > Does something like > > CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, current); > > also work? Then we can just remove from_. > > Paolo Yes, that works, too. It also fixes the problem with the assertion (tested with Wine). No, we cannot remove from_, because the same interface is also used for Linux and other hosts which don't have a 'current' variable. Or we would have to call qemu_coroutine_self() to get the current coroutine. Stefan ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Qemu-devel] qemu 1.6.1 2013-10-24 16:37 ` Stefan Weil @ 2013-10-24 21:47 ` Paolo Bonzini 2013-10-26 9:51 ` Stefan Weil 0 siblings, 1 reply; 14+ messages in thread From: Paolo Bonzini @ 2013-10-24 21:47 UTC (permalink / raw) To: Stefan Weil; +Cc: Michael W. Bombardieri, qemu-devel, Stefan Hajnoczi Il 24/10/2013 17:37, Stefan Weil ha scritto: > Yes, that works, too. It also fixes the problem with the assertion > (tested with Wine). > > No, we cannot remove from_, because the same interface is also used > for Linux and other hosts which don't have a 'current' variable. > Or we would have to call qemu_coroutine_self() to get the current > coroutine. Yes, I was thinking of using qemu_coroutine_self(). By the way, can you post the two assembly language outputs for just - CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, from_); + CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, current); which AIUI works and is enough to fix the bug? Paolo ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Qemu-devel] qemu 1.6.1 2013-10-24 21:47 ` Paolo Bonzini @ 2013-10-26 9:51 ` Stefan Weil 2013-10-27 6:54 ` Paolo Bonzini 2014-06-23 14:39 ` [Qemu-devel] broken win32 coroutines (was Re: qemu 1.6.1) Paolo Bonzini 0 siblings, 2 replies; 14+ messages in thread From: Stefan Weil @ 2013-10-26 9:51 UTC (permalink / raw) To: Paolo Bonzini; +Cc: Michael W. Bombardieri, qemu-devel, Stefan Hajnoczi Am 24.10.2013 23:47, schrieb Paolo Bonzini: > Il 24/10/2013 17:37, Stefan Weil ha scritto: >> Yes, that works, too. It also fixes the problem with the assertion >> (tested with Wine). >> >> No, we cannot remove from_, because the same interface is also used >> for Linux and other hosts which don't have a 'current' variable. >> Or we would have to call qemu_coroutine_self() to get the current >> coroutine. > Yes, I was thinking of using qemu_coroutine_self(). > > By the way, can you post the two assembly language outputs for just > > - CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, from_); > + CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, current); > > which AIUI works and is enough to fix the bug? > > Paolo See disassembled code below. I removed compiler option -fstack-protector-all to simplify the assembler code and tested that the result was not affected by this removal. The C and assembler code from the test is also available at http://qemu.weilnetz.de/test/coroutine-win32/. Stefan unpatched QEMU, crash with assertion 00448670 <_qemu_coroutine_switch>: 448670: 53 push %ebx 448671: 83 ec 18 sub $0x18,%esp 448674: c7 04 24 a8 62 6d 00 movl $0x6d62a8,(%esp) 44867b: 8b 5c 24 24 mov 0x24(%esp),%ebx 44867f: e8 ec 9e 27 00 call 6c2570 <___emutls_get_address> 448684: 89 18 mov %ebx,(%eax) 448686: 8b 44 24 28 mov 0x28(%esp),%eax 44868a: 89 43 24 mov %eax,0x24(%ebx) 44868d: 8b 43 20 mov 0x20(%ebx),%eax 448690: 89 04 24 mov %eax,(%esp) 448693: ff 15 c0 5f 83 00 call *0x835fc0 448699: 83 ec 04 sub $0x4,%esp 44869c: 8b 44 24 20 mov 0x20(%esp),%eax 4486a0: 8b 40 24 mov 0x24(%eax),%eax 4486a3: 83 c4 18 add $0x18,%esp 4486a6: 5b pop %ebx 4486a7: c3 ret patched, works 00448620 <_qemu_coroutine_switch>: 448620: 83 ec 1c sub $0x1c,%esp 448623: c7 04 24 a8 62 6d 00 movl $0x6d62a8,(%esp) 44862a: 89 5c 24 14 mov %ebx,0x14(%esp) 44862e: 8b 5c 24 24 mov 0x24(%esp),%ebx 448632: 89 74 24 18 mov %esi,0x18(%esp) 448636: e8 25 9f 27 00 call 6c2560 <___emutls_get_address> 44863b: 8b 30 mov (%eax),%esi 44863d: 89 18 mov %ebx,(%eax) 44863f: 8b 44 24 28 mov 0x28(%esp),%eax 448643: 89 43 24 mov %eax,0x24(%ebx) 448646: 8b 43 20 mov 0x20(%ebx),%eax 448649: 89 04 24 mov %eax,(%esp) 44864c: ff 15 c0 5f 83 00 call *0x835fc0 448652: 8b 46 24 mov 0x24(%esi),%eax 448655: 83 ec 04 sub $0x4,%esp 448658: 8b 5c 24 14 mov 0x14(%esp),%ebx 44865c: 8b 74 24 18 mov 0x18(%esp),%esi 448660: 83 c4 1c add $0x1c,%esp 448663: c3 ret ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Qemu-devel] qemu 1.6.1 2013-10-26 9:51 ` Stefan Weil @ 2013-10-27 6:54 ` Paolo Bonzini 2013-10-27 10:44 ` Stefan Weil 2013-10-27 15:38 ` Stefan Weil 2014-06-23 14:39 ` [Qemu-devel] broken win32 coroutines (was Re: qemu 1.6.1) Paolo Bonzini 1 sibling, 2 replies; 14+ messages in thread From: Paolo Bonzini @ 2013-10-27 6:54 UTC (permalink / raw) To: Stefan Weil; +Cc: Michael W. Bombardieri, qemu-devel, Stefan Hajnoczi Il 26/10/2013 11:51, Stefan Weil ha scritto: > Am 24.10.2013 23:47, schrieb Paolo Bonzini: >> Il 24/10/2013 17:37, Stefan Weil ha scritto: >>> Yes, that works, too. It also fixes the problem with the assertion >>> (tested with Wine). >>> >>> No, we cannot remove from_, because the same interface is also used >>> for Linux and other hosts which don't have a 'current' variable. >>> Or we would have to call qemu_coroutine_self() to get the current >>> coroutine. >> Yes, I was thinking of using qemu_coroutine_self(). >> >> By the way, can you post the two assembly language outputs for just >> >> - CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, from_); >> + CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, current); >> >> which AIUI works and is enough to fix the bug? >> >> Paolo > > See disassembled code below. I removed compiler option -fstack-protector-all > to simplify the assembler code and tested that the result was not affected > by this removal. > > The C and assembler code from the test is also available at > http://qemu.weilnetz.de/test/coroutine-win32/. Here is the code with annotations broken works ------------------------------------------------------------------------- push %ebx sub $0x18,%esp sub $0x1c,%esp mov %ebx,0x14(%esp) mov %esi,0x18(%esp) movl $0x6d62a8,(%esp) movl $0x6d62a8,(%esp) mov 0x24(%esp),%ebx mov 0x24(%esp),%ebx ebx = to; call ___emutls_get_address call ___emutls_get_address eax = ¤t; mov (%eax),%esi esi = current; mov %ebx,(%eax) mov %ebx,(%eax) current = to; mov 0x28(%esp),%eax mov 0x28(%esp),%eax eax = action mov %eax,0x24(%ebx) mov %eax,0x24(%ebx) to->action = action mov 0x20(%ebx),%eax mov 0x20(%ebx),%eax eax = to->fiber mov %eax,(%esp) mov %eax,(%esp) "push" to->fiber call *0x835fc0 call *0x835fc0 SwitchToFiber(to->fiber) sub $0x4,%esp sub $0x4,%esp undo PASCAL calling convention ** mov 0x20(%esp),%eax eax = from mov 0x24(%eax),%eax mov 0x24(%esi),%eax eax = from->action mov 0x14(%esp),%ebx mov 0x18(%esp),%esi add $0x18,%esp add $0x1c,%esp pop %ebx ret ret I think the problem is that 0x20(%esp) gets somehow corrupted at the instruction I highlighted with **. The simplest fix then would be to add a barrier() before and after SwitchToFiber. Paolo ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Qemu-devel] qemu 1.6.1 2013-10-27 6:54 ` Paolo Bonzini @ 2013-10-27 10:44 ` Stefan Weil 2013-10-27 15:38 ` Stefan Weil 1 sibling, 0 replies; 14+ messages in thread From: Stefan Weil @ 2013-10-27 10:44 UTC (permalink / raw) To: Paolo Bonzini; +Cc: Michael W. Bombardieri, qemu-devel, Stefan Hajnoczi Am 27.10.2013 07:54, schrieb Paolo Bonzini: > Here is the code with annotations > > broken works > ------------------------------------------------------------------------- > push %ebx > sub $0x18,%esp sub $0x1c,%esp > mov %ebx,0x14(%esp) > mov %esi,0x18(%esp) > > movl $0x6d62a8,(%esp) movl $0x6d62a8,(%esp) > mov 0x24(%esp),%ebx mov 0x24(%esp),%ebx ebx = to; > call ___emutls_get_address call ___emutls_get_address eax = ¤t; > > mov (%eax),%esi esi = current; > > mov %ebx,(%eax) mov %ebx,(%eax) current = to; > > mov 0x28(%esp),%eax mov 0x28(%esp),%eax eax = action > mov %eax,0x24(%ebx) mov %eax,0x24(%ebx) to->action = action > mov 0x20(%ebx),%eax mov 0x20(%ebx),%eax eax = to->fiber > mov %eax,(%esp) mov %eax,(%esp) "push" to->fiber > call *0x835fc0 call *0x835fc0 SwitchToFiber(to->fiber) > sub $0x4,%esp sub $0x4,%esp undo PASCAL calling convention > > ** mov 0x20(%esp),%eax eax = from > mov 0x24(%eax),%eax mov 0x24(%esi),%eax eax = from->action > > mov 0x14(%esp),%ebx > mov 0x18(%esp),%esi > add $0x18,%esp add $0x1c,%esp > pop %ebx > ret ret > > > I think the problem is that 0x20(%esp) gets somehow corrupted at the > instruction I highlighted with **. > > The simplest fix then would be to add a barrier() before and after > SwitchToFiber. > > Paolo I tried adding two barrier() statements around SwitchToFiber(). That change did not result in different assembler code (=> unchanged behaviour, QEMU still raises an assertion). Stefan ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Qemu-devel] qemu 1.6.1 2013-10-27 6:54 ` Paolo Bonzini 2013-10-27 10:44 ` Stefan Weil @ 2013-10-27 15:38 ` Stefan Weil 1 sibling, 0 replies; 14+ messages in thread From: Stefan Weil @ 2013-10-27 15:38 UTC (permalink / raw) To: Paolo Bonzini; +Cc: Michael W. Bombardieri, qemu-devel, Stefan Hajnoczi Am 27.10.2013 07:54, schrieb Paolo Bonzini: > > I think the problem is that 0x20(%esp) gets somehow corrupted at the > instruction I highlighted with **. > > The simplest fix then would be to add a barrier() before and after > SwitchToFiber. > > Paolo I added some debugging output (see code at http://qemu.weilnetz.de/test/coroutine-win32/). The result with pointer values replaced by function names is shown below. There are two threads (WinMain, qemu_tcg_cpu_thread_fn) and one coroutine (bdrv_rw_co_entry). Stefan qemu_coroutine_self:WinMain qemu_in_coroutine:WinMain,null qemu_coroutine_new:bdrv_rw_co_entry qemu_coroutine_create,co=bdrv_rw_co_entry qemu_coroutine_enter(bdrv_rw_co_entry,...),self=WinMain coroutine_swap(WinMain,bdrv_rw_co_entry) qemu_coroutine_yield,self=bdrv_rw_co_entry,to=WinMain coroutine_swap(bdrv_rw_co_entry,WinMain) qemu_coroutine_enter(bdrv_rw_co_entry,...),self=WinMain coroutine_swap(WinMain,bdrv_rw_co_entry) qemu_in_coroutine:bdrv_rw_co_entry,WinMain # active coroutine bdrv_rw_co_entry is deleted coroutine_delete(bdrv_rw_co_entry) qemu_coroutine_self:qemu_tcg_cpu_thread_fn qemu_in_coroutine:qemu_tcg_cpu_thread_fn,null qemu_coroutine_create,co=bdrv_rw_co_entry qemu_coroutine_enter(bdrv_rw_co_entry,...),self=qemu_tcg_cpu_thread_fn coroutine_swap(qemu_tcg_cpu_thread_fn,bdrv_rw_co_entry) qemu_coroutine_yield,self=bdrv_rw_co_entry,to=qemu_tcg_cpu_thread_fn coroutine_swap(bdrv_rw_co_entry,qemu_tcg_cpu_thread_fn) qemu_coroutine_enter(bdrv_rw_co_entry,...),self=qemu_tcg_cpu_thread_fn coroutine_swap(qemu_tcg_cpu_thread_fn,bdrv_rw_co_entry) qemu_in_coroutine:bdrv_rw_co_entry,qemu_tcg_cpu_thread_fn qemu_in_coroutine:bdrv_rw_co_entry,qemu_tcg_cpu_thread_fn # active coroutine bdrv_rw_co_entry is deleted coroutine_delete(bdrv_rw_co_entry) # now we are still in deleted coroutine bdrv_rw_co_entry qemu_in_coroutine:bdrv_rw_co_entry,null qemu_coroutine_create,co=bdrv_rw_co_entry qemu_coroutine_enter(bdrv_rw_co_entry,...),self=bdrv_rw_co_entry coroutine_swap(bdrv_rw_co_entry,bdrv_rw_co_entry) qemu_coroutine_yield,self=bdrv_rw_co_entry,to=bdrv_rw_co_entry coroutine_swap(bdrv_rw_co_entry,bdrv_rw_co_entry) qemu_in_coroutine:bdrv_rw_co_entry,null ^ permalink raw reply [flat|nested] 14+ messages in thread
* [Qemu-devel] broken win32 coroutines (was Re: qemu 1.6.1) 2013-10-26 9:51 ` Stefan Weil 2013-10-27 6:54 ` Paolo Bonzini @ 2014-06-23 14:39 ` Paolo Bonzini 2014-06-24 1:41 ` Michael W. Bombardieri 1 sibling, 1 reply; 14+ messages in thread From: Paolo Bonzini @ 2014-06-23 14:39 UTC (permalink / raw) To: Stefan Weil; +Cc: Michael W. Bombardieri, qemu-devel, Stefan Hajnoczi Il 26/10/2013 11:51, Stefan Weil ha scritto: > unpatched QEMU, crash with assertion > > 00448670 <_qemu_coroutine_switch>: > 448670: 53 push %ebx > 448671: 83 ec 18 sub $0x18,%esp > 448674: c7 04 24 a8 62 6d 00 movl $0x6d62a8,(%esp) > 44867b: 8b 5c 24 24 mov 0x24(%esp),%ebx > 44867f: e8 ec 9e 27 00 call 6c2570 > <___emutls_get_address> > 448684: 89 18 mov %ebx,(%eax) > 448686: 8b 44 24 28 mov 0x28(%esp),%eax > 44868a: 89 43 24 mov %eax,0x24(%ebx) > 44868d: 8b 43 20 mov 0x20(%ebx),%eax > 448690: 89 04 24 mov %eax,(%esp) > 448693: ff 15 c0 5f 83 00 call *0x835fc0 > 448699: 83 ec 04 sub $0x4,%esp > 44869c: 8b 44 24 20 mov 0x20(%esp),%eax > 4486a0: 8b 40 24 mov 0x24(%eax),%eax > 4486a3: 83 c4 18 add $0x18,%esp > 4486a6: 5b pop %ebx > 4486a7: c3 ret > > > patched, works > > 00448620 <_qemu_coroutine_switch>: > 448620: 83 ec 1c sub $0x1c,%esp > 448623: c7 04 24 a8 62 6d 00 movl $0x6d62a8,(%esp) > 44862a: 89 5c 24 14 mov %ebx,0x14(%esp) > 44862e: 8b 5c 24 24 mov 0x24(%esp),%ebx > 448632: 89 74 24 18 mov %esi,0x18(%esp) > 448636: e8 25 9f 27 00 call 6c2560 > <___emutls_get_address> > 44863b: 8b 30 mov (%eax),%esi > 44863d: 89 18 mov %ebx,(%eax) > 44863f: 8b 44 24 28 mov 0x28(%esp),%eax > 448643: 89 43 24 mov %eax,0x24(%ebx) > 448646: 8b 43 20 mov 0x20(%ebx),%eax > 448649: 89 04 24 mov %eax,(%esp) > 44864c: ff 15 c0 5f 83 00 call *0x835fc0 > 448652: 8b 46 24 mov 0x24(%esi),%eax > 448655: 83 ec 04 sub $0x4,%esp > 448658: 8b 5c 24 14 mov 0x14(%esp),%ebx > 44865c: 8b 74 24 18 mov 0x18(%esp),%esi > 448660: 83 c4 1c add $0x1c,%esp > 448663: c3 ret The only difference here is basically that "from" is being saved in %esi across the calls to __emutls_get_address and SwitchToFiber. But as Peter found out, the fix really happens because now the compiler will not inline qemu_coroutine_switch anymore. Peter provided another dump, this time from Win64. It is the coroutine_trampoline with inlined qemu_coroutine_switch: 0: push %rdi 1: push %rsi 2: push %rbx 3: sub $0x30,%rsp 7: mov %rcx,%rbx a: lea ...(%rip),%rcx 11: mov ...(%rip),%rax 18: mov %rax,0x28(%rsp) 1d: xor %eax,%eax 1f: callq ___emutls_get_address 24: mov ...(%rip),%rsi 2b: mov %rax,%rdi 2e: nop2 30: mov 0x8(%rbx),%rcx # ecx = co->entry_arg 34: callq *(%rbx) # co->entry(co->entry_arg); 36: mov 0x10(%rbx),%rax # load co->caller 3a: mov %rax,(%rdi) # "current" = co->caller; (wrong current!) 3d: movl $0x2,0x48(%rax) # co->caller->action = COROUTINE_TERMINATE; 44: mov 0x40(%rax),%rcx # load co->caller->fiber 48: callq *%rsi 4a: jmp 30 <coroutine_trampoline+0x30> 4c: nopl 0x0(%rax) Some offsets are incomplete because this is from a .o, so not linked, but it's already enough to see that ___emutls_get_address (from "current = to_;" in qemu_coroutine_switch) is being hoisted outside the loop, which becomes: Coroutine *co = co_; Coroutine **p_current = ¤t; while (true) { co->entry(co->entry_arg); CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, co); CoroutineWin32 *to = DO_UPCAST(CoroutineWin32, base, co->caller); *p_current = to_; to->action = action; SwitchToFiber(to->fiber); return from->action; } This is wrong if co->entry yields out of the coroutine which is then restarted on another thread. This can happen quite often. Typically. a coroutine is started when a VCPU thread does bdrv_aio_readv: VCPU thread main VCPU thread coroutine I/O coroutine bdrv_aio_readv -----> start I/O operation thread_pool_submit_co <------------ yields back to emulation Then I/O finishes and the thread-pool.c event notifier triggers in the I/O thread. event_notifier_ready calls thread_pool_co_cb, and the I/O coroutine now restarts *in another thread*: iothread main iothread coroutine I/O coroutine (formerly in VCPU thread) event_notifier_ready thread_pool_co_cb -----> current = I/O coroutine; call AIO callback But on Win32, because of the bug, the "current" being set here the current coroutine of the VCPU thread, not the iothread. noinline is a good-enough workaround, and quite unlikely to break in the future. Paolo ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Qemu-devel] broken win32 coroutines (was Re: qemu 1.6.1) 2014-06-23 14:39 ` [Qemu-devel] broken win32 coroutines (was Re: qemu 1.6.1) Paolo Bonzini @ 2014-06-24 1:41 ` Michael W. Bombardieri 2014-06-24 5:22 ` Paolo Bonzini 0 siblings, 1 reply; 14+ messages in thread From: Michael W. Bombardieri @ 2014-06-24 1:41 UTC (permalink / raw) To: Paolo Bonzini; +Cc: Stefan Weil, Prashant Satish, qemu-devel, Stefan Hajnoczi Hi, Thanks for adding me to this thread. I am not familiar with the qemu source code but I am aware of the coroutine crash and I can test a patch if you have one. qemu 1.5.1 was the last version I was able to build and use on win32. Later versions build without error but exhibit the coroutine crash. qemu binaries built with the coroutine feature disabled are to slow to use. - Michael On Mon, Jun 23, 2014 at 04:39:30PM +0200, Paolo Bonzini wrote: > Il 26/10/2013 11:51, Stefan Weil ha scritto: > > unpatched QEMU, crash with assertion > > > > 00448670 <_qemu_coroutine_switch>: > > 448670: 53 push %ebx > > 448671: 83 ec 18 sub $0x18,%esp > > 448674: c7 04 24 a8 62 6d 00 movl $0x6d62a8,(%esp) > > 44867b: 8b 5c 24 24 mov 0x24(%esp),%ebx > > 44867f: e8 ec 9e 27 00 call 6c2570 > > <___emutls_get_address> > > 448684: 89 18 mov %ebx,(%eax) > > 448686: 8b 44 24 28 mov 0x28(%esp),%eax > > 44868a: 89 43 24 mov %eax,0x24(%ebx) > > 44868d: 8b 43 20 mov 0x20(%ebx),%eax > > 448690: 89 04 24 mov %eax,(%esp) > > 448693: ff 15 c0 5f 83 00 call *0x835fc0 > > 448699: 83 ec 04 sub $0x4,%esp > > 44869c: 8b 44 24 20 mov 0x20(%esp),%eax > > 4486a0: 8b 40 24 mov 0x24(%eax),%eax > > 4486a3: 83 c4 18 add $0x18,%esp > > 4486a6: 5b pop %ebx > > 4486a7: c3 ret > > > > > > patched, works > > > > 00448620 <_qemu_coroutine_switch>: > > 448620: 83 ec 1c sub $0x1c,%esp > > 448623: c7 04 24 a8 62 6d 00 movl $0x6d62a8,(%esp) > > 44862a: 89 5c 24 14 mov %ebx,0x14(%esp) > > 44862e: 8b 5c 24 24 mov 0x24(%esp),%ebx > > 448632: 89 74 24 18 mov %esi,0x18(%esp) > > 448636: e8 25 9f 27 00 call 6c2560 > > <___emutls_get_address> > > 44863b: 8b 30 mov (%eax),%esi > > 44863d: 89 18 mov %ebx,(%eax) > > 44863f: 8b 44 24 28 mov 0x28(%esp),%eax > > 448643: 89 43 24 mov %eax,0x24(%ebx) > > 448646: 8b 43 20 mov 0x20(%ebx),%eax > > 448649: 89 04 24 mov %eax,(%esp) > > 44864c: ff 15 c0 5f 83 00 call *0x835fc0 > > 448652: 8b 46 24 mov 0x24(%esi),%eax > > 448655: 83 ec 04 sub $0x4,%esp > > 448658: 8b 5c 24 14 mov 0x14(%esp),%ebx > > 44865c: 8b 74 24 18 mov 0x18(%esp),%esi > > 448660: 83 c4 1c add $0x1c,%esp > > 448663: c3 ret > > The only difference here is basically that "from" is being saved in > %esi across the calls to __emutls_get_address and SwitchToFiber. But > as Peter found out, the fix really happens because now the compiler > will not inline qemu_coroutine_switch anymore. > > Peter provided another dump, this time from Win64. It is the > coroutine_trampoline with inlined qemu_coroutine_switch: > > 0: push %rdi > 1: push %rsi > 2: push %rbx > 3: sub $0x30,%rsp > 7: mov %rcx,%rbx > a: lea ...(%rip),%rcx > 11: mov ...(%rip),%rax > 18: mov %rax,0x28(%rsp) > 1d: xor %eax,%eax > 1f: callq ___emutls_get_address > 24: mov ...(%rip),%rsi > 2b: mov %rax,%rdi > 2e: nop2 > 30: mov 0x8(%rbx),%rcx # ecx = co->entry_arg > 34: callq *(%rbx) # co->entry(co->entry_arg); > 36: mov 0x10(%rbx),%rax # load co->caller > 3a: mov %rax,(%rdi) # "current" = co->caller; (wrong current!) > 3d: movl $0x2,0x48(%rax) # co->caller->action = COROUTINE_TERMINATE; > 44: mov 0x40(%rax),%rcx # load co->caller->fiber > 48: callq *%rsi > 4a: jmp 30 <coroutine_trampoline+0x30> > 4c: nopl 0x0(%rax) > > Some offsets are incomplete because this is from a .o, so not > linked, but it's already enough to see that ___emutls_get_address > (from "current = to_;" in qemu_coroutine_switch) is being hoisted > outside the loop, which becomes: > > Coroutine *co = co_; > Coroutine **p_current = ¤t; > > while (true) { > co->entry(co->entry_arg); > > CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, co); > CoroutineWin32 *to = DO_UPCAST(CoroutineWin32, base, co->caller); > > *p_current = to_; > > to->action = action; > SwitchToFiber(to->fiber); > return from->action; > } > > This is wrong if co->entry yields out of the coroutine which is then > restarted on another thread. This can happen quite often. Typically. > a coroutine is started when a VCPU thread does bdrv_aio_readv: > > VCPU thread > > main VCPU thread coroutine I/O coroutine > bdrv_aio_readv -----> > start I/O operation > thread_pool_submit_co > <------------ yields > back to emulation > > Then I/O finishes and the thread-pool.c event notifier triggers in > the I/O thread. event_notifier_ready calls thread_pool_co_cb, and > the I/O coroutine now restarts *in another thread*: > > iothread > > main iothread coroutine I/O coroutine (formerly in VCPU thread) > event_notifier_ready > thread_pool_co_cb -----> current = I/O coroutine; > call AIO callback > > But on Win32, because of the bug, the "current" being set here the > current coroutine of the VCPU thread, not the iothread. > > noinline is a good-enough workaround, and quite unlikely to break in > the future. > > Paolo ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Qemu-devel] broken win32 coroutines (was Re: qemu 1.6.1) 2014-06-24 1:41 ` Michael W. Bombardieri @ 2014-06-24 5:22 ` Paolo Bonzini 2014-06-25 6:48 ` Michael W. Bombardieri 0 siblings, 1 reply; 14+ messages in thread From: Paolo Bonzini @ 2014-06-24 5:22 UTC (permalink / raw) To: Michael W. Bombardieri Cc: Stefan Weil, Prashant Satish, qemu-devel, Stefan Hajnoczi Il 24/06/2014 03:41, Michael W. Bombardieri ha scritto: > Hi, > > Thanks for adding me to this thread. > I am not familiar with the qemu source code but I am aware > of the coroutine crash and I can test a patch if you have > one. > > qemu 1.5.1 was the last version I was able to build and use > on win32. Later versions build without error but exhibit the > coroutine crash. > qemu binaries built with the coroutine feature disabled are > to slow to use. Peter posted one here: http://article.gmane.org/gmane.comp.emulators.qemu/282189/raw Paolo ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Qemu-devel] broken win32 coroutines (was Re: qemu 1.6.1) 2014-06-24 5:22 ` Paolo Bonzini @ 2014-06-25 6:48 ` Michael W. Bombardieri 0 siblings, 0 replies; 14+ messages in thread From: Michael W. Bombardieri @ 2014-06-25 6:48 UTC (permalink / raw) To: Paolo Bonzini; +Cc: Stefan Weil, Prashant Satish, qemu-devel, Stefan Hajnoczi I applied the patch manually on qemu releases 1.7.1 and 2.0.0. Configuration * Host system is Lenovo ThinkPad T520 (windows 7) * qemu built with --disable-sdl and --disable-gtk * GCC version is 4.6.1 (MinGW win32) Test (same steps/results for both 1.7.1 and 2.0.0) * Created qcow2 disk image using qemu-img * Started qemu-system-i386 with new disk image and OpenBSD/i386 5.5 install ISO * Connected VM display using VNC * VM booted successfully and install script started * Install process too slow; gave up after >60 minutes Following the same testing steps against a qemu 1.5.1 binary, the OpenBSD 5.5 installation finished in 3 minutes on my system. I haven't tried to profile the code. - Michael On Tue, Jun 24, 2014 at 07:22:59AM +0200, Paolo Bonzini wrote: > Il 24/06/2014 03:41, Michael W. Bombardieri ha scritto: > >Hi, > > > >Thanks for adding me to this thread. > >I am not familiar with the qemu source code but I am aware > >of the coroutine crash and I can test a patch if you have > >one. > > > >qemu 1.5.1 was the last version I was able to build and use > >on win32. Later versions build without error but exhibit the > >coroutine crash. > >qemu binaries built with the coroutine feature disabled are > >to slow to use. > > Peter posted one here: > > http://article.gmane.org/gmane.comp.emulators.qemu/282189/raw > > Paolo > ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2014-06-25 7:16 UTC | newest] Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2013-10-23 7:39 [Qemu-devel] qemu 1.6.1 Michael W. Bombardieri 2013-10-23 9:00 ` Paolo Bonzini 2013-10-23 20:26 ` Stefan Weil 2013-10-24 10:38 ` Paolo Bonzini 2013-10-24 16:37 ` Stefan Weil 2013-10-24 21:47 ` Paolo Bonzini 2013-10-26 9:51 ` Stefan Weil 2013-10-27 6:54 ` Paolo Bonzini 2013-10-27 10:44 ` Stefan Weil 2013-10-27 15:38 ` Stefan Weil 2014-06-23 14:39 ` [Qemu-devel] broken win32 coroutines (was Re: qemu 1.6.1) Paolo Bonzini 2014-06-24 1:41 ` Michael W. Bombardieri 2014-06-24 5:22 ` Paolo Bonzini 2014-06-25 6:48 ` Michael W. Bombardieri
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.