* [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
@ 2007-03-24 17:50 Axel Zeuner
2007-03-24 20:15 ` Anthony Liguori
2007-03-25 13:40 ` [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works Avi Kivity
0 siblings, 2 replies; 17+ messages in thread
From: Axel Zeuner @ 2007-03-24 17:50 UTC
To: qemu-devel
[-- Attachment #1: Type: text/plain, Size: 3980 bytes --]
Hi,
there were a lot of discussions about compiling qemu with gcc4 or higher. The
conclusion of those discussions was, as I understood it, that compiling qemu
with gcc4 requires changing the code generation engine of most of the
supported targets. These changes require a lot of work and time.
How about splitting the current static code generation process further?
Today gcc produces object code and dyngen adapts it for the purposes of qemu,
i.e. produces the generation function, patches in parameters, ...:
gcc -c -o op.o op.c ; dyngen -o op.h ... op.o
The op_XXX functions generated by gcc may not contain more than one exit,
this exit must be at the end, and no unintended jumps to external
functions may occur.
It is possible to split the transformation into the following steps:
Generate assembly output from the C-Sources: gcc -S -o op-0.s op.c.
Convert the assembly output: cvtasm op-0.s op.s.
Assemble the converted assembler sources: as -o op.o op.s.
Use dyngen as before: dyngen -o op.h ... op.o.
Nothing will change if cvtasm only copies the input to the output, i.e. this
additional pass will not break existing code.
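
As a minimal sketch of that pass-through case (not the attached cvtasm.c; the
function name cvt_asm_identity is hypothetical), a converter that changes
nothing is just a stream copy:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical pass-through converter: copies the assembly text from
 * `in` to `out` without touching it.
 * Returns 0 on success, -1 on a read or write error. */
static int cvt_asm_identity(FILE *in, FILE *out)
{
    char buf[4096];
    size_t n;

    assert(in != NULL && out != NULL);
    while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
        if (fwrite(buf, 1, n, out) != n)
            return -1;
    }
    return ferror(in) ? -1 : 0;
}
```

With such a stage in place, op.s is byte-identical to op-0.s, so the assembler
and dyngen see exactly what they saw before.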
A full-featured converter (cvtasm) has a lot of dependencies: it has to
support all hosts (M) (with all assembler dialects M') and all targets N,
i.e. in the worst case one would end up with M' x N variants of it, or M x N
if one supports only one assembler dialect per host. Clearly, the number of
variants is one of the biggest disadvantages of such an approach.
Now I will focus on x86_64 host and x86_64-softmmu target.
cvtasm has to do the following tasks in this case:
0) convert repXXX; ret to ret only. (Not done yet, x86_64 only, but does no
harm).
1) append a ret instruction to every function whose last instruction is not
a return.
2) in all functions with more than one return, add a label before the last
return.
3) replace all returns not at the end of a function with an unconditional jump
to the generated end label. Avoid touching op_exit_tb here.
4) check all jump instructions if they contain jumps to external labels,
replace jumps to external labels with calls to the labels.
Tasks 0-2 are easy; task 3 may be, and task 4 definitely is, target/host
dependent, because some jumps to external labels, i.e. outside of the
function, are intentional, for instance in op_goto_tb.
Please correct me, if I am wrong or something is not mentioned above.
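
Tasks 1-3 can be sketched roughly as follows. This is an illustration only,
not the attached cvtasm.c: the names is_ret_line, append3 and rewrite_returns
are hypothetical, the input is assumed to be the instruction and label lines
of a single function (no directives, each line ending in a newline), the
return instruction is assumed to be spelled exactly "ret", and the
op_exit_tb exception and the rep; ret case are left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* True if the line holds a bare "ret" (leading blanks allowed). */
static int is_ret_line(const char *line)
{
    while (*line == ' ' || *line == '\t')
        ++line;
    if (strncmp(line, "ret", 3) != 0)
        return 0;
    line += 3;
    while (*line == ' ' || *line == '\t' || *line == '\r' || *line == '\n')
        ++line;
    return *line == '\0';
}

/* Appends a, b and c to out; returns -1 if the buffer is too small. */
static int append3(char *out, size_t outsize, size_t *used,
                   const char *a, const char *b, const char *c)
{
    int k;

    if (*used >= outsize)
        return -1;
    k = snprintf(out + *used, outsize - *used, "%s%s%s", a, b, c);
    if (k < 0 || (size_t)k >= outsize - *used)
        return -1;
    *used += (size_t)k;
    return 0;
}

/* Rewrites one function body so that it has a single ret at the end.
 * Returns the number of bytes written, or -1 if out is too small. */
static int rewrite_returns(const char *const *lines, size_t n,
                           const char *end_label, char *out, size_t outsize)
{
    size_t i, used = 0;
    int jumps = 0;
    int ends_with_ret = n > 0 && is_ret_line(lines[n - 1]);
    size_t body = ends_with_ret ? n - 1 : n; /* lines before the final ret */

    assert(out != NULL && outsize > 0);
    out[0] = '\0';
    for (i = 0; i < body; ++i) {
        if (is_ret_line(lines[i])) {
            /* task 3: an early return becomes a jump to the end label */
            if (append3(out, outsize, &used, "\tjmp ", end_label, "\n") < 0)
                return -1;
            ++jumps;
        } else if (append3(out, outsize, &used, lines[i], "", "") < 0) {
            return -1;
        }
    }
    /* task 2: the end label is only needed if something jumps to it */
    if (jumps > 0 && append3(out, outsize, &used, end_label, ":\n", "") < 0)
        return -1;
    /* task 1: exactly one ret, as the last instruction */
    if (append3(out, outsize, &used, "\tret\n", "", "") < 0)
        return -1;
    return (int)used;
}
```

Emitting the label only when a jump was generated keeps functions that
already have a single trailing ret unchanged.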
The attached cvtasm.c allows compiling op.c/op.s/op.o without any disabled
optimisations in Makefile.target (patches for Makefile and Makefile.target are
attached). The program itself definitely needs a rewrite, is not failsafe and
produces too much output on stdout.
The macro OP_GOTO_TB from exec-all.h in the general case contains two nice
variables and label definitions to force a reference from a variable into the
op_goto_tbXXX functions. Unfortunately gcc4 detects that these variables and
labels are unused and suppresses their generation; as a result dyngen does not
generate two lines in op.h:
case INDEX_op_goto_tb0:
...
label_offsets[0] = 8 + (gen_code_ptr - gen_code_buf); // <--
...
case INDEX_op_goto_tb1:
...
label_offsets[1] = 8 + (gen_code_ptr - gen_code_buf); // <--
...
and qemu produces a SIGSEGV on the first jump from one buffer to the next.
I was not able to force gcc4 to generate the two variables, therefore I had to
replace the general macro with a host dependent one for x86_64 similar to x86
but using the indirect branch method.
After the replacement qemu worked when compiled with gcc4.
I made my checks with the following compilers using Debian testing amd64: gcc
version 3.4.6 (Debian 3.4.6-5) and gcc version 4.1.2 20061115 (prerelease)
(Debian 4.1.1-21).
Please note: These patches work only for x86_64 hosts and x86_64 targets. They
will break all other architectures. I did not check i386-softmmu. It works
for me.
I apologise for the size of the attachments.
Kind regards
Axel
[-- Attachment #2: exec-all.h.diff.zip --]
[-- Type: application/x-zip, Size: 694 bytes --]
[-- Attachment #3: Makefile.diff.zip --]
[-- Type: application/x-zip, Size: 631 bytes --]
[-- Attachment #4: Makefile.target.diff.zip --]
[-- Type: application/x-zip, Size: 888 bytes --]
[-- Attachment #5: cvtasm.c.zip --]
[-- Type: application/x-zip, Size: 4784 bytes --]
* Re: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
From: Anthony Liguori @ 2007-03-24 20:15 UTC
To: qemu-devel
[-- Attachment #1: Type: text/plain, Size: 5357 bytes --]
Axel Zeuner wrote:
> Hi,
>
Hi Axel,
By adding some GCC4 fixes on top of your patch, I was able to get qemu
for i386 (on i386) to compile and run. So far, I've only tested a win2k
guest.
The big problem (which pbrook helped me with) was GCC4 freaking out over
some stq's. Splitting up the 64bit ops into 32bit ops seemed to address
most of the problems.
The tricky thing I still can't figure out is how to get ASM_SOFTMMU
working. The problem is the GLUE(st, SUFFIX) function. First, GCC cannot
deal with the register pressure. The problem I can't seem to fix, though,
is that GCC sticks %1 in %esi because we're only using an "r"
constraint, not a "q" constraint. This results in the generation of
%sib, which is an invalid register. However, refactoring the code to not
require a "q" constraint doesn't seem to help either.
The attached patch is what I have so far. Some help from people more
familiar with gcc asm foo would be appreciated!
Regards,
Anthony Liguori
[-- Attachment #2: as-postprocesss.diff --]
[-- Type: text/x-patch, Size: 26257 bytes --]
diff -r 83ff8e3c6392 Makefile
--- a/Makefile Thu Mar 22 12:36:53 2007 +0000
+++ b/Makefile Sat Mar 24 15:08:16 2007 -0500
@@ -28,7 +28,7 @@ LIBS+=$(AIOLIBS)
all: $(TOOLS) $(DOCS) recurse-all
-subdir-%: dyngen$(EXESUF)
+subdir-%: dyngen$(EXESUF) cvtasm$(EXESUF)
$(MAKE) -C $(subst subdir-,,$@) all
recurse-all: $(patsubst %,subdir-%, $(TARGET_DIRS))
@@ -39,10 +39,14 @@ dyngen$(EXESUF): dyngen.c
dyngen$(EXESUF): dyngen.c
$(HOST_CC) $(CFLAGS) $(CPPFLAGS) $(BASE_CFLAGS) -o $@ $^
+cvtasm$(EXESUF): cvtasm.c
+ $(HOST_CC) $(CFLAGS) $(CPPFLAGS) $(BASE_CFLAGS) -o $@ $^
+
clean:
# avoid old build problems by removing potentially incorrect old files
rm -f config.mak config.h op-i386.h opc-i386.h gen-op-i386.h op-arm.h opc-arm.h gen-op-arm.h
rm -f *.o *.a $(TOOLS) dyngen$(EXESUF) TAGS *.pod *~ */*~
+ rm -rf cvtasm$(EXESUF)
$(MAKE) -C tests clean
for d in $(TARGET_DIRS); do \
$(MAKE) -C $$d $@ || exit 1 ; \
diff -r 83ff8e3c6392 Makefile.target
--- a/Makefile.target Thu Mar 22 12:36:53 2007 +0000
+++ b/Makefile.target Sat Mar 24 15:08:16 2007 -0500
@@ -27,6 +27,7 @@ LIBS=
LIBS=
HELPER_CFLAGS=$(CFLAGS)
DYNGEN=../dyngen$(EXESUF)
+CVTASM=../cvtasm$(EXESUF)
# user emulator name
TARGET_ARCH2=$(TARGET_ARCH)
ifeq ($(TARGET_ARCH),arm)
@@ -78,11 +79,11 @@ cc-option = $(shell if $(CC) $(OP_CFLAGS
cc-option = $(shell if $(CC) $(OP_CFLAGS) $(1) -S -o /dev/null -xc /dev/null \
> /dev/null 2>&1; then echo "$(1)"; else echo "$(2)"; fi ;)
-OP_CFLAGS+=$(call cc-option, -fno-reorder-blocks, "")
-OP_CFLAGS+=$(call cc-option, -fno-gcse, "")
-OP_CFLAGS+=$(call cc-option, -fno-tree-ch, "")
-OP_CFLAGS+=$(call cc-option, -fno-optimize-sibling-calls, "")
-OP_CFLAGS+=$(call cc-option, -fno-crossjumping, "")
+#OP_CFLAGS+=$(call cc-option, -fno-reorder-blocks, "")
+#OP_CFLAGS+=$(call cc-option, -fno-gcse, "")
+#OP_CFLAGS+=$(call cc-option, -fno-tree-ch, "")
+#OP_CFLAGS+=$(call cc-option, -fno-optimize-sibling-calls, "")
+#OP_CFLAGS+=$(call cc-option, -fno-crossjumping, "")
OP_CFLAGS+=$(call cc-option, -fno-align-labels, "")
OP_CFLAGS+=$(call cc-option, -fno-align-jumps, "")
OP_CFLAGS+=$(call cc-option, -fno-align-functions, $(call cc-option, -malign-functions=0, ""))
@@ -512,7 +513,12 @@ gen-op.h: op.o $(DYNGEN)
gen-op.h: op.o $(DYNGEN)
$(DYNGEN) -g -o $@ $<
-op.o: op.c
+op.s: op.c $(CVTASM)
+# $(CC) $(OP_CFLAGS) $(CPPFLAGS) -E -o op-0.i $<
+ $(CC) $(OP_CFLAGS) $(CPPFLAGS) -fverbose-asm -S -o op-0.s $<
+ $(CVTASM) op-0.s op.s
+
+op.o: op.s
$(CC) $(OP_CFLAGS) $(CPPFLAGS) -c -o $@ $<
# HELPER_CFLAGS is used for all the code compiled with static register
@@ -581,7 +587,7 @@ endif
$(CC) $(CPPFLAGS) -c -o $@ $<
clean:
- rm -f *.o *.a *~ $(PROGS) gen-op.h opc.h op.h nwfpe/*.o slirp/*.o fpu/*.o
+ rm -f *.o op-0.s op.s *.a *~ $(PROGS) gen-op.h opc.h op.h nwfpe/*.o slirp/*.o fpu/*.o
install: all
ifneq ($(PROGS),)
diff -r 83ff8e3c6392 cpu-all.h
--- a/cpu-all.h Thu Mar 22 12:36:53 2007 +0000
+++ b/cpu-all.h Sat Mar 24 15:08:16 2007 -0500
@@ -339,7 +339,9 @@ static inline void stl_le_p(void *ptr, i
static inline void stq_le_p(void *ptr, uint64_t v)
{
- *(uint64_t *)ptr = v;
+ uint8_t *p = ptr;
+ stl_le_p(p, (uint32_t)v);
+ stl_le_p(p + 4, v >> 32);
}
/* float access */
diff -r 83ff8e3c6392 cvtasm.c
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/cvtasm.c Sat Mar 24 15:08:16 2007 -0500
@@ -0,0 +1,802 @@
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdarg.h>
+#include <string.h>
+#include <ctype.h>
+
+static void error(const char* fmt, ...) __attribute__((noreturn));
+
+static int cvt_asm(FILE* in, FILE* out);
+
+int main(int argc, char** argv)
+{
+ FILE *in,*out;
+ int r;
+ if ( argc != 3 ) {
+ error("Usage: %s in.s out.s\n", argv[0]);
+ }
+ in = fopen(argv[1],"r");
+ if ( in == NULL ) {
+ error("%s: could not open %s\n", argv[0], argv[1]);
+ }
+ out= fopen(argv[2],"w");
+ if ( out == NULL ) {
+ error("%s: could not open %s\n", argv[0], argv[2]);
+ }
+ r = cvt_asm(in,out);
+ fclose(out);
+ fclose(in);
+ return 0;
+}
+
+static void error(const char* fmt, ...)
+{
+ int i;
+ char buf[1024];
+ va_list va;
+ va_start(va,fmt);
+ i=vsnprintf(buf,sizeof(buf),fmt,va);
+ fwrite(buf,i,1,stderr);
+ exit(3);
+}
+
+static void alloc_error() __attribute__((noreturn));
+static void alloc_error()
+{
+ fputs("allocation error\n",stderr);
+ exit(3);
+}
+
+static void* smalloc( size_t s )
+{
+ void* p= malloc(s);
+ if ( p == NULL )
+ alloc_error ();
+ return p;
+}
+
+static void* srealloc(void* p, size_t ns)
+{
+ void* np= realloc( p, ns );
+ if ( np== NULL)
+ alloc_error ();
+ return np;
+}
+
+#if defined(__i386__) || defined (__x86_64__)
+#define ARCH_HAS_CVT_ASM 1
+
+#define OP_FUNC_PFX "op_"
+#define ASM_RET "ret"
+#define ASM_ALIGN1 ".p2align"
+#define ASM_ALIGN2 ".align"
+#define ASM_DBG_LOC ".loc"
+#define ASM_DBG_FILE ".file"
+#define ASM_JMP_LABEL1 "__op_gen_label1"
+#define ASM_JMP_LABEL2 "__op_jmp0"
+#define ASM_JMP_LABEL3 "__op_jmp1"
+
+static const char* ASM_JUMP_NAMES[]={
+ "jmp",
+ "ja", "jnbe",
+ "jae", "jnb",
+ "jb", "jnae",
+ "jbe", "jna",
+ "je", "jz",
+ "jg", "jnle",
+ "jge", "jnl",
+ "jl", "jnge",
+ "jle", "jng",
+ "jne", "jnz",
+ "jno",
+ "jnp", "jpo",
+ "jns", "jo",
+ "jp", "jpe",
+ "js",
+ 0
+};
+
+static int white(const char* p)
+{
+ return ( (*p == ' ') || (*p == '\t') || (*p == '\r') ||
+ (*p == '\n'));
+
+}
+
+static int white_or_zero(const char* p)
+{
+ return ( white(p) || (*p == 0));
+}
+
+static const char* eat_white(const char* p)
+{
+ while ( white(p) )
+ ++p;
+ return p;
+}
+
+static const char* eat_non_white(const char* p)
+{
+ while ( !white(p) && (*p != 0) )
+ ++p;
+ return p;
+}
+
+struct line_s {
+ char* l;
+ size_t n;
+ size_t alloc;
+};
+
+static void line_init( struct line_s* s)
+{
+ memset(s,0,sizeof(*s));
+ s->alloc=128;
+ s->l = smalloc(s->alloc);
+ s->l[0]=0;
+}
+
+static void line_destroy( struct line_s* s)
+{
+ free(s->l);
+ s->l =0;
+ s->alloc=0;
+ s->n=0;
+}
+
+static void line_free( struct line_s* s)
+{
+ line_destroy(s);
+ free(s);
+}
+
+static struct line_s* line_copy( const struct line_s* r)
+{
+ struct line_s* l= smalloc(sizeof(*l));
+ l->alloc= r->alloc;
+ l->l= smalloc( l->alloc );
+ l->n = r->n;
+ memcpy(l->l, r->l, r->n+1);
+ return l;
+}
+
+static void line_clear(struct line_s* s)
+{
+ s->n =0;
+ s->l[0] = 0;
+}
+
+static void line_push_char( struct line_s* s, char c)
+{
+ if ( s->n == s->alloc - 1) {
+ size_t a= s->alloc * 2;
+ char* l= srealloc(s->l, a );
+ s->l = l;
+ s->alloc = a;
+ }
+ s->l[s->n] = c;
+ ++s->n;
+ s->l[s->n] = 0;
+}
+
+static struct line_s* line_read(FILE* in)
+{
+ struct line_s* l= smalloc(sizeof(*l));
+ int c;
+ line_init(l);
+ while ( (c=fgetc(in)) != EOF ) {
+ line_push_char(l,c);
+ if ( c == '\n')
+ break;
+ }
+ return l;
+}
+
+static void line_write(const struct line_s* l, FILE* out)
+{
+ fwrite(l->l, l->n, 1, out);
+}
+
+static int line_ptr_in_comment(const struct line_s* l,
+ const char* p)
+{
+ const char* comment = strchr(l->l,'#');
+ int r= p==0 || ( comment == 0 ? 0 : p >= comment);
+ return r;
+}
+
+static size_t line_label_name( const struct line_s* l,
+ char* name, size_t bufsize)
+{
+ const char* colon= strchr(l->l,':');
+ size_t s=0;
+ if (colon != 0) {
+ const char *start= l->l;
+ const char *p;
+ /* eat white spaces */
+ start = eat_white(l->l);
+ /* check for no white spaces between start and colon */
+ p = eat_non_white(start);
+ if ( (p >= colon) && !line_ptr_in_comment(l,p) ) {
+ s= colon - start + 1;
+ if ((name) && (s <= bufsize)) {
+ memcpy(name,start,s-1);
+ name[s-1] = 0;
+ }
+ }
+ }
+ return s;
+}
+
+static int line_is_label( const struct line_s* l)
+{
+ /* asm labels: white spaces Non#: */
+ int r= line_label_name(l, 0, 0) != 0;
+ return r;
+}
+
+static const char* line_find_token( const struct line_s* l,
+ const char* token) {
+ const char* p= strstr(l->l,token);
+ if ( line_ptr_in_comment(l,p) )
+ p=0;
+ return p;
+}
+
+static size_t line_function_start_name( const struct line_s* l,
+ char* name, size_t bufsize)
+{
+ const char* type= line_find_token(l,".type");
+ const char* func= line_find_token(l,"@function");
+ size_t s=0;
+ if ( (type != NULL) && (func > type)) {
+ /* extract the function name */
+ const char* typeend=eat_non_white(type);
+ ++typeend;
+ const char* fname=eat_white(typeend);
+ const char* fnend=strchr(fname,',');
+ if ( !line_ptr_in_comment(l,fnend) && (fnend > fname)) {
+ s = fnend -fname + 1;
+ if ((name!=0) && (s <= bufsize)) {
+ memcpy(name,fname,s-1);
+ name[s-1]=0;
+ }
+ }
+ }
+ return s;
+}
+
+static int line_is_function_start(const struct line_s* l)
+{
+ int r=line_function_start_name(l,0,0) !=0;
+ return r;
+}
+
+static size_t line_function_end_name( const struct line_s* l,
+ char* name, size_t bufsize)
+{
+ const char* size= line_find_token(l,".size");
+ size_t s=0;
+ if (size != NULL) {
+ /* extract the function name */
+ const char* sizeend=eat_non_white(size);
+ ++sizeend;
+ const char* fname=eat_white(sizeend);
+ const char* fnend=strchr(fname,',');
+ if ( !line_ptr_in_comment(l,fnend) && (fnend > fname)) {
+ s = fnend -fname + 1;
+ if ((name!=0) && (s <= bufsize)) {
+ memcpy(name,fname,s-1);
+ name[s-1]=0;
+ }
+ }
+ }
+ return s;
+}
+
+static size_t line_jxx_target_name( const struct line_s* l,
+ const char* jmpname,
+ char* name, size_t bufsize)
+{
+ const char* jmp= line_find_token(l,jmpname);
+ size_t s=0;
+ if ( jmp ) {
+ const char* jmp_end=jmp+strlen(jmpname);
+ if ( white_or_zero(jmp_end) &&
+ ((l->l == jmp) || white(jmp-1))) {
+ const char* start=eat_white(jmp_end);
+ const char* end=eat_non_white (start);
+ if ( !line_ptr_in_comment(l,end) ) {
+ s= end -start +1;
+ if ((name!=0) && (s <= bufsize)) {
+ memcpy(name,start,s-1);
+ name[s-1]=0;
+ }
+ }
+ }
+ }
+ return s;
+}
+
+static int line_is_jxx( const struct line_s* l, const char* jmpname)
+{
+ int r= line_jxx_target_name(l,jmpname,0,0)!=0;
+ return r;
+}
+
+static size_t line_jump_target_name(const struct line_s* l,
+ char* name, size_t bufsize)
+{
+ const char** p= ASM_JUMP_NAMES;
+ size_t s=0;
+ while (*p) {
+ size_t k= line_jxx_target_name(l,*p,name,bufsize);
+ if (k ) {
+ s = k;
+ break;
+ }
+ ++p;
+ }
+ return s;
+}
+
+static int line_is_jump( const struct line_s* l)
+{
+ int r=line_jump_target_name(l,0,0)!=0;
+ return r;
+}
+
+
+static size_t line_is_ret(const struct line_s* l)
+{
+ const char* ret= line_find_token(l,ASM_RET);
+ size_t s=0;
+ if ( ret ) {
+ if ( white_or_zero(ret+strlen(ASM_RET)+1) &&
+ ((l->l == ret) || white(ret-1)))
+ s=1;
+ }
+ return s;
+}
+
+static size_t line_is_align(const struct line_s* l)
+{
+ const char* a= line_find_token(l,ASM_ALIGN1);
+ const char* token = ASM_ALIGN1;
+ size_t s=0;
+ if ( a == 0) {
+ a= line_find_token(l,ASM_ALIGN2);
+ token = ASM_ALIGN2;
+ }
+ if ( a ) {
+ if ( white_or_zero(a+strlen(token)+1) &&
+ ((l->l == a) || white(a-1)))
+ s=1;
+ }
+ return s;
+}
+
+static size_t line_is_dbg(const struct line_s* l)
+{
+ const char* dbg= line_find_token(l,ASM_DBG_LOC);
+ const char* token= ASM_DBG_LOC;
+ size_t s=0;
+ if ( dbg == 0) {
+ dbg = line_find_token(l,ASM_DBG_FILE);
+ token = ASM_DBG_FILE;
+ }
+ if ( dbg ) {
+ if ( white_or_zero(dbg+strlen(token)+1) &&
+ ((l->l == dbg) || white(dbg-1)))
+ s=1;
+ }
+ return s;
+}
+
+static int line_is_function_end(const struct line_s* l)
+{
+ int r=line_function_end_name(l,0,0) !=0;
+ return r;
+}
+
+struct func_lines_s {
+ /* line pointer, contains allocated pointer to allocated lines */
+ struct line_s** line;
+ /* line count */
+ int n;
+ /* function name. contains allocated pointer to function name */
+ char* fname;
+};
+
+static void func_lines_init(struct func_lines_s* s)
+{
+ memset(s,0,sizeof(*s));
+}
+
+static struct func_lines_s* func_lines_create()
+{
+ struct func_lines_s* l= smalloc(sizeof(*l));
+ func_lines_init (l);
+ return l;
+}
+
+static struct func_lines_s* func_lines_copy(const struct func_lines_s* r)
+{
+ struct func_lines_s* l= func_lines_create ();
+ int i;
+ l->line = (struct line_s**)smalloc(r->n* sizeof(struct line_s*));
+ l->n = r->n;
+ l->fname = strdup(r->fname);
+ for (i=0; i< l->n;++i) {
+ l->line[i]= line_copy(r->line[i]);
+ }
+ return l;
+}
+
+static void func_lines_destroy(struct func_lines_s* s)
+{
+ if ( s->line) {
+ int i;
+ for ( i = 0; i< s->n; ++i) {
+ line_free(s->line[i]);
+ }
+ free(s->line);
+ s->line =0; // sanity
+ }
+ s->n=0;
+ if ( s->fname ) {
+ free(s->fname);
+ s->fname =0;
+ }
+}
+
+static void func_lines_delete( struct func_lines_s* s)
+{
+ func_lines_destroy(s);
+ free(s);
+}
+
+static void func_lines_append_line( struct func_lines_s* f,
+ const struct line_s* l)
+{
+ struct line_s** line= (struct line_s**)
+ srealloc(f->line, (f->n + 1) * sizeof(l));
+ f->line = line;
+ struct line_s* p= line_copy(l);
+ f->line[f->n]=p;
+ ++f->n;
+
+ if ( f->fname == 0) {
+ size_t s;
+ if ( (s=line_function_start_name(p,0,0))!=0 ) {
+ f->fname = (char*)smalloc(s);
+ line_function_start_name (p,f->fname,s);
+ }
+ }
+}
+
+static void func_lines_write(const struct func_lines_s* f, FILE* o)
+{
+ if ( f->line ) {
+ int i=0;
+ for (i=0; i< f->n; ++i ) {
+ line_write(f->line[i],o);
+ }
+ }
+}
+
+struct line_info_s {
+ /* jump < 0: the line jumps to an external function; jump == 0: no
+ jump; jump > 0: line number of the jump target. Jumps to local
+ labels may point to the wrong line. Jumps through registers
+ and to magic targets (__op_gen_label1) contain the number
+ of the line itself.
+ */
+ int jump;
+ /* ret != 0: the line contains a return instruction */
+ int ret;
+ /* line with label ? */
+ int label;
+ /* does the line contain an instruction ? */
+ int instr;
+};
+
+struct transform_s {
+ int name;
+ int ret_cnt;
+ int ret_not_at_end;
+ int external_jumps;
+ struct line_info_s* line_info;
+};
+
+void transform_init( struct transform_s* s,
+ const struct func_lines_s* f)
+{
+ int i,j,n;
+ n= f->n;
+ memset(s,0,sizeof(*s));
+ s->line_info=(struct line_info_s*)
+ smalloc(n * sizeof(*s->line_info));
+ memset(s->line_info,0,n*sizeof(*s->line_info));
+
+ /* initialise the static information */
+ for (i = 0; i< n; ++i) {
+ const struct line_s* l= f->line[i];
+ if ( line_is_label (l) )
+ s->line_info[i].label=1;
+ if ( line_is_jump (l) )
+ s->line_info[i].jump=-1;
+ if ( line_is_ret(l) )
+ s->line_info[i].ret=1;
+ if ( !(line_is_function_end (l) ||
+ line_is_function_start (l) ||
+ line_is_label(l) ||
+ line_is_align (l) ||
+ line_is_dbg(l))) {
+ s->line_info[i].instr=1;
+ }
+ }
+ /* now collect the jump target information */
+ for (i=0; i<n; ++i) {
+ if ( s->line_info[i].jump == 0)
+ continue;
+ const struct line_s* l= f->line[i];
+ /* find the corresponding label */
+ size_t js=line_jump_target_name(l,0,0);
+ char* b1= smalloc(js);
+ line_jump_target_name(l,b1,js);
+ /* jumps with address in register are ok, as are jumps to */
+ /* __op_gen_label1 and the other ASM_JMP_LABELn targets */
+ if ( strchr(b1,'%') != 0 ||
+ strcmp(b1,ASM_JMP_LABEL1)==0 ||
+ strcmp(b1,ASM_JMP_LABEL2)==0 ||
+ strcmp(b1,ASM_JMP_LABEL3)==0) {
+ s->line_info[i].jump = i;
+ free(b1); /* do not leak the name on the early exit */
+ continue;
+ }
+ /* js contains the size with trailing 0 */
+ if ((isdigit(b1[0])) &&
+ ((js>1) &&
+ ((b1[js-2]=='f') || (b1[js-2]=='b')))) {
+ // fprintf(stdout, "jmp to local '%s' found\n", b1);
+ b1[js-2]=0;
+ }
+ // fprintf(stdout, "jmp to '%s' found\n", b1);
+ int label_found = 0;
+ for ( j =0 ; j< n && label_found == 0; ++j) {
+ if ( j == i )
+ continue;
+ if ( s->line_info[j].label == 0)
+ continue;
+ const struct line_s* ll= f->line[j];
+ size_t ls=line_label_name(ll,0,0);
+ char* b2= smalloc(ls);
+ line_label_name(ll,b2,ls);
+ // fprintf(stdout, "label '%s'\n", b2);
+ if ( strcmp(b1,b2)==0) {
+ label_found=1;
+ s->line_info[i].jump = j;
+ }
+ free(b2);
+ }
+ free(b1);
+ }
+}
+
+void transform_check( struct transform_s* t,
+ const struct func_lines_s* f)
+{
+ int i;
+ int n=f->n;
+ for (i=0; i<n; ++i) {
+ const struct line_info_s* info= t->line_info+i;
+ if ( info->ret )
+ ++t->ret_cnt;
+ if ( info->jump < 0)
+ ++t->external_jumps;
+ }
+ for (i=n-1;i>=0;--i) {
+ const struct line_info_s* info= t->line_info+i;
+ if ( (info->instr != 0) && (info->ret !=0) )
+ break;
+ if ( (info->instr != 0) && (info->ret ==0) ) {
+ ++t->ret_not_at_end;
+ break;
+ }
+ }
+}
+
+static void transform_fix ( const struct transform_s* t,
+ struct func_lines_s* f)
+{
+ int i;
+ int n= f->n;
+ int last_instr=0;
+ // find the last instruction:
+ for (i=n-1;i>=0;--i) {
+ if ( t->line_info[i].instr != 0 ) {
+ last_instr=i;
+ break;
+ }
+ }
+ // produce a label name
+ char label[128];
+ snprintf(label,sizeof(label),".L%s_exit",f->fname);
+ // replace external jumps with calls
+ for (i=0;i<n;++i) {
+ if (t->line_info[i].jump >= 0)
+ continue;
+ char jmp_target[512];
+ struct line_s* l= f->line[i];
+ line_jump_target_name(l,jmp_target,sizeof(jmp_target));
+ char call[4096];
+ size_t s;
+ char jmp_ret[512];
+ jmp_ret[0]=0;
+ if ( i != last_instr ) {
+ snprintf(jmp_ret,sizeof(jmp_ret),
+ "# CVTASM FIX ret after jmp\n"
+ "\tjmp\t%s\n",label);
+ }
+#if defined (__i386__)
+ s=snprintf(call,sizeof(call),
+ "# CVTASM FIX jmp --> call\n"
+ "\tcall\t%s\n%s",
+ jmp_target,jmp_ret);
+#endif
+#if defined (__x86_64__)
+ s=snprintf(call,sizeof(call),
+ "# CVTASM FIX jmp --> call\n"
+ "\tsubq\t$8, %%rsp\n"
+ "\tcall\t%s\n"
+ "\taddq\t$8, %%rsp\n%s",
+ jmp_target,jmp_ret);
+#endif
+ line_clear(l);
+ int j;
+ for (j=0;j<s;++j)
+ line_push_char(l,call[j]);
+ }
+ struct line_s* l= f->line[last_instr];
+ char newlines[4096];
+ size_t s;
+ if ( t->line_info[last_instr].ret !=0 ) {
+ // insert label before the ret.
+ s=snprintf(newlines,sizeof(newlines),
+ "# CVTASM FIX label before ret \n"
+ "%s:\n%s",
+ label, l->l);
+ } else {
+ // last instruction is not a ret: append the label and a ret.
+ s=snprintf(newlines,sizeof(newlines),
+ "%s%s:\n"
+ "# CVTASM FIX ret at end\n"
+ "\tret\n",l->l,label);
+ }
+ // convert the line
+ line_clear(l);
+ for (i =0; i< (int)s; ++i)
+ line_push_char(l,newlines[i]);
+ // replace internal rets with jmps to the generated label.
+ for (i=0;i<last_instr;++i) {
+ if ( t->line_info[i].ret == 0)
+ continue;
+ l= f->line[i];
+ line_clear(l);
+ s=snprintf(newlines,sizeof(newlines),
+ "# CVTASM FIX ret --> jmp to end\n"
+ "\tjmp %s\n",
+ label);
+ int j;
+ for (j=0;j<s;++j)
+ line_push_char(l,newlines[j]);
+ }
+}
+
+struct func_lines_s* transform_function ( const struct func_lines_s* f)
+{
+ struct func_lines_s* fn= 0;
+ int n;
+ struct transform_s t;
+ transform_init (&t,f);
+ n=f->n;
+ // do the transformation checks.
+ do {
+ // Name must start with op.
+ if ( strncmp(f->fname,OP_FUNC_PFX,3)!=0)
+ break;
+ // exit tb has more than one exit
+ if ( strcmp(f->fname,"op_exit_tb")==0)
+ break;
+ transform_check (&t,f);
+ if ( t.ret_cnt != 1 )
+ fprintf(stdout,
+ "'%s' needs fixing (return count %i)\n",
+ f->fname, t.ret_cnt);
+ if ( t.external_jumps != 0 )
+ fprintf(stdout,
+ "'%s' needs fixing (external jmps %i)\n",
+ f->fname, t.external_jumps);
+ if ( t.ret_not_at_end != 0 )
+ fprintf(stdout,
+ "'%s' needs fixing "
+ "(return not last instruction)\n",
+ f->fname);
+ if (t.ret_cnt != 1 || t.ret_not_at_end ||
+ t.external_jumps) {
+ fn = func_lines_copy (f);
+ transform_fix(&t,fn);
+ }
+ } while (0);
+ /* allocated by transform_init, not needed any more */
+ free(t.line_info);
+ return fn;
+}
+
+int cvt_asm( FILE* in, FILE* out)
+{
+ struct line_s* l;
+ struct func_lines_s* f=0;
+ int done =0;
+ do {
+ l = line_read(in);
+ if ( l->n == 0) {
+ if ( f != 0 )
+ error("Not terminated function");
+ done = 1;
+ }
+ if ( f != NULL ) {
+ /* collecting into f */
+ func_lines_append_line (f, l);
+ if ( line_is_function_end (l) ) {
+ /* check for the right function end here */
+ /* Transformation is done here */
+ struct func_lines_s* fn=
+ transform_function(f);
+ if ( fn ) {
+ func_lines_write(fn,out);
+ func_lines_delete(fn);
+ } else {
+ func_lines_write(f,out);
+ }
+ func_lines_delete(f);
+ f=0;
+ }
+ } else {
+ /* check if we must start collecting into a new f */
+ if ( line_is_function_start(l) ) {
+ f= func_lines_create();
+ func_lines_append_line(f,l);
+ } else {
+ /* otherwise copy to output */
+ line_write(l,out);
+ }
+ }
+ line_free(l);
+ } while ( !done );
+ if ( f )
+ func_lines_delete (f);
+ return 0;
+}
+
+#endif
+
+#if !defined (ARCH_HAS_CVT_ASM)
+int cvt_asm(FILE* in, FILE* out)
+{
+ int c;
+ while ( (c=fgetc(in))!= EOF)
+ fputc(c,out);
+ return 0;
+}
+#endif
diff -r 83ff8e3c6392 exec-all.h
--- a/exec-all.h Thu Mar 22 12:36:53 2007 +0000
+++ b/exec-all.h Sat Mar 24 15:08:16 2007 -0500
@@ -337,6 +337,26 @@ do {\
"1:\n");\
} while (0)
+#elif defined (__x86_64__)
+
+/* GCC 4 optimises away the labels after the goto :-( */
+/* This is the main reason for the crashes of qemu if compiled with */
+/* gcc 4 */
+#define GOTO_TB(opname, tbparam, n) \
+do { \
+ void* target=(void *)(((TranslationBlock *)tbparam)->tb_next[n]); \
+ __asm__ __volatile__ \
+ ( ".data\n\t" \
+ ".align 8 \n" \
+ ASM_OP_LABEL_NAME(n, opname) ":\n" \
+ ".quad 1f\n" \
+ ".previous \n\t" \
+ "jmp *%0\n\t" \
+ "1:\n\t" \
+ : \
+ :"a"(target)); \
+} while (0)
+
#else
/* jump to next block operations (more portable code, does not need
diff -r 83ff8e3c6392 target-i386/exec.h
--- a/target-i386/exec.h Thu Mar 22 12:36:53 2007 +0000
+++ b/target-i386/exec.h Sat Mar 24 15:08:16 2007 -0500
@@ -231,10 +231,14 @@ static inline void stfq(target_ulong ptr
{
union {
double d;
- uint64_t i;
+ struct {
+ uint32_t lo;
+ uint32_t hi;
+ } i;
} u;
u.d = v;
- stq(ptr, u.i);
+ stl(ptr, u.i.lo);
+ stl(ptr + 4, u.i.hi);
}
static inline float ldfl(target_ulong ptr)
@@ -316,7 +320,13 @@ typedef union {
typedef union {
long double d;
struct {
- unsigned long long lower;
+ union {
+ unsigned long long lower;
+ struct {
+ uint32_t lo;
+ uint32_t hi;
+ } split;
+ };
unsigned short upper;
} l;
} CPU86_LDoubleU;
@@ -444,7 +454,8 @@ static inline void helper_fstt(CPU86_LDo
CPU86_LDoubleU temp;
temp.d = f;
- stq(ptr, temp.l.lower);
+ stl(ptr, temp.l.split.lo);
+ stl(ptr + 4, temp.l.split.hi);
stw(ptr + 8, temp.l.upper);
}
@@ -501,6 +512,7 @@ void helper_hlt(void);
void helper_hlt(void);
void helper_monitor(void);
void helper_mwait(void);
+void helper_pshufw(uint16_t *dst, uint16_t *src, int order);
extern const uint8_t parity_table[256];
extern const uint8_t rclw_table[32];
diff -r 83ff8e3c6392 target-i386/helper.c
--- a/target-i386/helper.c Thu Mar 22 12:36:53 2007 +0000
+++ b/target-i386/helper.c Sat Mar 24 15:08:16 2007 -0500
@@ -3452,8 +3452,10 @@ void helper_fxrstor(target_ulong ptr, in
nb_xmm_regs = 8 << data64;
addr = ptr + 0xa0;
for(i = 0; i < nb_xmm_regs; i++) {
- env->xmm_regs[i].XMM_Q(0) = ldq(addr);
- env->xmm_regs[i].XMM_Q(1) = ldq(addr + 8);
+ env->xmm_regs[i].XMM_L(0) = ldl(addr);
+ env->xmm_regs[i].XMM_L(1) = ldl(addr + 4);
+ env->xmm_regs[i].XMM_L(2) = ldl(addr + 8);
+ env->xmm_regs[i].XMM_L(3) = ldl(addr + 12);
addr += 16;
}
}
diff -r 83ff8e3c6392 target-i386/helper2.c
--- a/target-i386/helper2.c Thu Mar 22 12:36:53 2007 +0000
+++ b/target-i386/helper2.c Sat Mar 24 15:08:16 2007 -0500
@@ -1034,3 +1034,11 @@ void save_native_fp_state(CPUState *env)
env->native_fp_regs = 0;
}
#endif
+
+void helper_pshufw(uint16_t *dst, uint16_t *src, int order)
+{
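+ /* read all source words first: dst and src may alias */
+ uint16_t w0 = src[order & 3], w1 = src[(order >> 2) & 3];
+ uint16_t w2 = src[(order >> 4) & 3], w3 = src[(order >> 6) & 3];
+ dst[0] = w0; dst[1] = w1; dst[2] = w2; dst[3] = w3;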
+}
diff -r 83ff8e3c6392 target-i386/op.c
--- a/target-i386/op.c Thu Mar 22 12:36:53 2007 +0000
+++ b/target-i386/op.c Sat Mar 24 15:08:16 2007 -0500
@@ -18,7 +18,7 @@
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
-#define ASM_SOFTMMU
+//#define ASM_SOFTMMU
#include "exec.h"
/* n must be a constant to be efficient */
diff -r 83ff8e3c6392 target-i386/ops_sse.h
--- a/target-i386/ops_sse.h Thu Mar 22 12:36:53 2007 +0000
+++ b/target-i386/ops_sse.h Sat Mar 24 15:08:16 2007 -0500
@@ -580,16 +580,9 @@ void OPPROTO glue(op_movq_T0_mm, SUFFIX)
#if SHIFT == 0
void OPPROTO glue(op_pshufw, SUFFIX) (void)
{
- Reg r, *d, *s;
- int order;
- d = (Reg *)((char *)env + PARAM1);
- s = (Reg *)((char *)env + PARAM2);
- order = PARAM3;
- r.W(0) = s->W(order & 3);
- r.W(1) = s->W((order >> 2) & 3);
- r.W(2) = s->W((order >> 4) & 3);
- r.W(3) = s->W((order >> 6) & 3);
- *d = r;
+ helper_pshufw((uint16_t *)((char *)env + PARAM1),
+ (uint16_t *)((char *)env + PARAM2),
+ PARAM3);
}
#else
void OPPROTO op_shufps(void)
* Re: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
2007-03-24 20:15 ` Anthony Liguori
@ 2007-03-25 10:15 ` Axel Zeuner
2007-03-25 23:46 ` Anthony Liguori
2007-03-25 12:12 ` Axel Zeuner
2007-04-20 16:57 ` qemu + gcc4 (Was: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works) Gwenole Beauchesne
2 siblings, 1 reply; 17+ messages in thread
From: Axel Zeuner @ 2007-03-25 10:15 UTC (permalink / raw)
To: Anthony Liguori; +Cc: qemu-devel
On Saturday 24 March 2007 21:15, Anthony Liguori wrote:
> Axel Zeuner wrote:
> > Hi,
>
> Hi Axel,
>
> By adding some GCC4 fixes on top of your patch, I was able to get qemu
> for i386 (on i386) to compile and run. So far, I've only tested a win2k
> guest.
Hi Anthony,
thank you for testing, I am glad to hear about your success. I have applied your
patches, compiled qemu-i386-softmmu on i386 without kqemu and checked it with
FreeDOS. It works as well.
> The big problem (which pbrook helped me with) was GCC4 freaking out over
> some stq's. Splitting up the 64bit ops into 32bit ops seemed to address
> most of the problems.
>
> The tricky thing I still can't figure out is how to get ASM_SOFTMMU
> working. The problem is GLUE(st, SUFFIX) function. First GCC cannot
> deal with the register pressure. The problem I can't seem to fix though
> is that GCC sticks %1 in %esi because we're only using an "r"
> constraint, not a "q" constraint. This results in the generation of
> %sib which is an invalid register. However, refactoring the code to not
> require a "q" constraint doesn't seem to help either.
In the past I made some patches (not published yet) to speed up the helpers
for 64-bit operations in target-i386/helper.c on x86_64 and i386 using gcc inline
assembly. x86_64 was really easy, but for i386 I had to use "m" and "=m"
constraints and as few inputs and outputs as possible.
> The attached patch is what I have so far. Some help with people more
> familiar with gcc asm foo would be appreciated!
May I suggest some changes?
I would prefer not to split the 64-bit accesses on hosts that support them
natively, i.e. something like this:
===================================================================
--- cpu-all.h (revision 16)
+++ cpu-all.h (working copy)
@@ -339,7 +339,13 @@
static inline void stq_le_p(void *ptr, uint64_t v)
{
- *(uint64_t *)ptr = v;
+#if (HOST_LONG_BITS < 64)
+ uint8_t *p = ptr;
+ stl_le_p(p, (uint32_t)v);
+ stl_le_p(p + 4, v >> 32);
+#else
+ *(uint64_t*)ptr = v;
+#endif
}
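As a standalone illustration of the split store (a minimal sketch, not part of
the patch: store32_le and store64_le_split are hypothetical stand-ins for
stl_le_p and stq_le_p, written byte-wise so that they do not depend on the
endianness of the host):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for stl_le_p: store a 32-bit value little-endian. */
static void store32_le(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Hypothetical stand-in for the split stq_le_p: the low 32 bits go to
   ptr, the high 32 bits to ptr + 4, so no 64-bit store is emitted. */
static void store64_le_split(void *ptr, uint64_t v)
{
    uint8_t *p = ptr;
    store32_le(p, (uint32_t)v);
    store32_le(p + 4, (uint32_t)(v >> 32));
}
```

The memory contents should be the same as after a single 64-bit little-endian
store; only the number and width of the generated store instructions differ.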
Furthermore I think one should move helper_pshufw() from target-i386/helper2.c
into target-i386/helper.c where all the other helper methods reside.
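For reference, the word selection done by helper_pshufw() can be sketched
standalone. This is only an illustration under my own assumptions:
shuffle_words is a hypothetical name, and it copies the source first so that
it also works when dst and src refer to the same register.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical standalone version of the pshufw word shuffle.
   Each 2-bit field of 'order' selects the source word that ends up
   in one destination word: bits 1:0 for word 0, bits 3:2 for word 1,
   and so on. */
static void shuffle_words(uint16_t *dst, const uint16_t *src, int order)
{
    uint16_t tmp[4];
    int i;

    /* copy first, so that dst and src may alias */
    memcpy(tmp, src, sizeof(tmp));
    for (i = 0; i < 4; ++i)
        dst[i] = tmp[(order >> (2 * i)) & 3];
}
```

With order 0x1B the four words come out reversed, with order 0xE4 they are
copied unchanged.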
Kind Regards
Axel
> Regards,
>
> Anthony Liguori
>
* Re: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
2007-03-24 20:15 ` Anthony Liguori
2007-03-25 10:15 ` Axel Zeuner
@ 2007-03-25 12:12 ` Axel Zeuner
2007-03-25 23:44 ` Anthony Liguori
2007-04-20 16:57 ` qemu + gcc4 (Was: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works) Gwenole Beauchesne
2 siblings, 1 reply; 17+ messages in thread
From: Axel Zeuner @ 2007-03-25 12:12 UTC (permalink / raw)
To: Anthony Liguori; +Cc: qemu-devel
[-- Attachment #1: Type: text/plain, Size: 677 bytes --]
On Saturday 24 March 2007 21:15, Anthony Liguori wrote:
> The tricky thing I still can't figure out is how to get ASM_SOFTMMU
> working. The problem is GLUE(st, SUFFIX) function. First GCC cannot
> deal with the register pressure. The problem I can't seem to fix though
> is that GCC sticks %1 in %esi because we're only using an "r"
> constraint, not a "q" constraint. This results in the generation of
> %sib which is an invalid register. However, refactoring the code to not
> require a "q" constraint doesn't seem to help either.
Hi Anthony,
could you please try the attached patch for softmmu_header.h? Allows compiling
with gcc4 and ASM_SOFTMMU.
Kind regards
Axel
[-- Attachment #2: softmmu.h.diff.zip --]
[-- Type: application/x-zip, Size: 813 bytes --]
* Re: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
2007-03-24 17:50 [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works Axel Zeuner
2007-03-24 20:15 ` Anthony Liguori
@ 2007-03-25 13:40 ` Avi Kivity
2007-03-26 17:14 ` Axel Zeuner
1 sibling, 1 reply; 17+ messages in thread
From: Avi Kivity @ 2007-03-25 13:40 UTC (permalink / raw)
To: qemu-devel
Axel Zeuner wrote:
> A full featured converter (cvtasm) has a lot of dependencies: it has to
> support all hosts (M) (with all assembler dialects M') and all targets N,
> i.e. in the worst case one would end with M'x N variants of it, or M x N if
> one supports only one assembler dialect per host. It is clear, that the
> number of variants is one of the biggest disadvantages of such an approach.
>
Perhaps a mixed approach can be made for gradual conversion: for
combinations where cvtasm has been written, use that. Where that's
still to be done, have dyngen generate call instructions to the ops
instead of pasting the ops text directly.
--
error compiling committee.c: too many arguments to function
* Re: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
2007-03-25 12:12 ` Axel Zeuner
@ 2007-03-25 23:44 ` Anthony Liguori
2007-03-26 6:16 ` Axel Zeuner
0 siblings, 1 reply; 17+ messages in thread
From: Anthony Liguori @ 2007-03-25 23:44 UTC (permalink / raw)
To: Axel Zeuner; +Cc: qemu-devel
Axel Zeuner wrote:
> On Saturday 24 March 2007 21:15, Anthony Liguori wrote:
>
>> The tricky thing I still can't figure out is how to get ASM_SOFTMMU
>> working. The problem is GLUE(st, SUFFIX) function. First GCC cannot
>> deal with the register pressure. The problem I can't seem to fix though
>> is that GCC sticks %1 in %esi because we're only using an "r"
>> constraint, not a "q" constraint. This results in the generation of
>> %sib which is an invalid register. However, refactoring the code to not
>> require a "q" constraint doesn't seem to help either.
>>
> Hi Anthony,
> could you please try the attached patch for softmmu_header.h? Allows compiling
> with gcc4 and ASM_SOFTMMU.
>
That did the trick. Could you explain what your changes did?
Regards,
Anthony Liguori
> Kind regards
> Axel
>
* Re: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
2007-03-25 10:15 ` Axel Zeuner
@ 2007-03-25 23:46 ` Anthony Liguori
2007-03-26 5:49 ` Axel Zeuner
0 siblings, 1 reply; 17+ messages in thread
From: Anthony Liguori @ 2007-03-25 23:46 UTC (permalink / raw)
To: Axel Zeuner; +Cc: qemu-devel
Axel Zeuner wrote:
> On Saturday 24 March 2007 21:15, Anthony Liguori wrote:
>
>> Axel Zeuner wrote:
>>
>>> Hi,
>>>
>> Hi Axel,
>>
>> By adding some GCC4 fixes on top of your patch, I was able to get qemu
>> for i386 (on i386) to compile and run. So far, I've only tested a win2k
>> guest.
>>
> Hi Anthony,
>
> thank you for testing, I am glad to hear about your success. I have applied your
> patches, compiled qemu-i386-softmmu on i386 without kqemu and checked it with
> FreeDOS. It works as well.
>
>
>> The big problem (which pbrook helped me with) was GCC4 freaking out over
>> some stq's. Splitting up the 64bit ops into 32bit ops seemed to address
>> most of the problems.
>>
>> The tricky thing I still can't figure out is how to get ASM_SOFTMMU
>> working. The problem is GLUE(st, SUFFIX) function. First GCC cannot
>> deal with the register pressure. The problem I can't seem to fix though
>> is that GCC sticks %1 in %esi because we're only using an "r"
>> constraint, not a "q" constraint. This results in the generation of
>> %sib which is an invalid register. However, refactoring the code to not
>> require a "q" constraint doesn't seem to help either.
>>
> In the past I made some patches (not published yet) to speed up the helpers
> for 64-bit operations in target-i386/helper.c on x86_64 and i386 using gcc inline
> assembly. x86_64 was really easy, but for i386 I had to use "m" and "=m"
> constraints and as few inputs and outputs as possible.
>
>> The attached patch is what I have so far. Some help with people more
>> familiar with gcc asm foo would be appreciated!
>>
>
> May I suggest some changes?
> I would prefer not to split the 64-bit accesses on hosts that support them
> natively, i.e. something like this:
> ===================================================================
> --- cpu-all.h (revision 16)
> +++ cpu-all.h (working copy)
> @@ -339,7 +339,13 @@
>
> static inline void stq_le_p(void *ptr, uint64_t v)
> {
> - *(uint64_t *)ptr = v;
> +#if (HOST_LONG_BITS < 64)
> + uint8_t *p = ptr;
> + stl_le_p(p, (uint32_t)v);
> + stl_le_p(p + 4, v >> 32);
> +#else
> + *(uint64_t*)ptr = v;
> +#endif
> }
>
Yes, I think the proper thing to do is to use a configure check for the GCC
version to determine whether to use the 32-bit or the 64-bit version of
stq_le_p.
There is already a function in cpu-all.h that does the 32 bit version.
> Furthermore I think one should move helper_pshufw() from target-i386/helper2.c
> into target-i386/helper.c where all the other helper methods reside.
>
I moved to helper2.c because AFAICT helper.c is compiled with the same
sort of restrictions as op.c which leads to the compile failure.
Regards,
Anthony Liguori
> Kind Regards
> Axel
>
>
>> Regards,
>>
>> Anthony Liguori
>>
>>
>
>
* Re: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
2007-03-25 23:46 ` Anthony Liguori
@ 2007-03-26 5:49 ` Axel Zeuner
2007-03-26 22:53 ` Paul Brook
0 siblings, 1 reply; 17+ messages in thread
From: Axel Zeuner @ 2007-03-26 5:49 UTC (permalink / raw)
To: Anthony Liguori, qemu-devel
Hi Anthony,
On Monday 26 March 2007 01:46, you wrote:
> Axel Zeuner wrote:
> > On Saturday 24 March 2007 21:15, Anthony Liguori wrote:
> >> Axel Zeuner wrote:
> >>> Hi,
> >>
>
> > Furthermore I think one should move helper_pshufw() from
> > target-i386/helper2.c into target-i386/helper.c where all the other
> > helper methods reside.
>
> I moved to helper2.c because AFAICT helper.c is compiled with the same
> sort of restrictions as op.c which leads to the compile failure.
Yes, helper.c is compiled with the global register variables and the code is
called directly from the op_xxx functions, but one needs the global register
variables to access global data, these contain the required environment for
the emulation. AFAIK helper2.c is used by the CODE_COPY branch on i386 with
even stronger restrictions, but I may be wrong here.
Kind Regards
Axel Zeuner
>
> Regards,
>
> Anthony Liguori
>
* Re: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
2007-03-25 23:44 ` Anthony Liguori
@ 2007-03-26 6:16 ` Axel Zeuner
2007-03-29 2:07 ` Anthony Liguori
0 siblings, 1 reply; 17+ messages in thread
From: Axel Zeuner @ 2007-03-26 6:16 UTC (permalink / raw)
To: Anthony Liguori; +Cc: qemu-devel
Hi Anthony,
On Monday 26 March 2007 01:44, you wrote:
> Axel Zeuner wrote:
> > On Saturday 24 March 2007 21:15, Anthony Liguori wrote:
> >> The tricky thing I still can't figure out is how to get ASM_SOFTMMU
> >> working. The problem is GLUE(st, SUFFIX) function. First GCC cannot
> >> deal with the register pressure. The problem I can't seem to fix though
> >> is that GCC sticks %1 in %esi because we're only using an "r"
> >> constraint, not a "q" constraint. This results in the generation of
> >> %sib which is an invalid register. However, refactoring the code to not
> >> require a "q" constraint doesn't seem to help either.
> >
> > Hi Anthony,
> > could you please try the attached patch for softmmu_header.h? Allows
> > compiling with gcc4 and ASM_SOFTMMU.
>
> That did the trick. Could you explain what your changes did?
QEMU/i386 has only three available registers if TARGET_I386 is selected,
because ebx, ebp, esi and edi are used by the environment and T0, T1, T2 (AKA A0).
This makes inline assembly really ugly. The external C functions called in
ASM_SOFTMMU are REGPARM(1,2), i.e. they require their first arguments in eax and edx.
In the two ld functions three registers (eax, edx, ecx) are required and
destroyed, because an external C function may be called. We relax the register
pressure a little bit by forcing the return value (res) into eax, because
the return value is returned in a destroyed register. Furthermore the called
C function returns its value in eax anyway (call %7).
The st functions are a little more tricky: we need three registers, and the
assembly code requires a reload of %0 (ptr) after the check whether the external
function must be called. The external function destroys the three remaining
registers. After the call, %1 (v) also has to be reloaded into a register,
i.e. we need more registers. Saving registers on the stack
does not work, because there are already two "m" constraints: if the code is
compiled with -fomit-frame-pointer these are expressed as offsets relative
to %esp, i.e. X(%esp), and would become invalid after pushes onto the stack.
One solution was to force all inputs to the asm block onto the stack. That is
what replacing the "r" constraints with "m" constraints does: they
force a memory reference. Because i386 cannot do direct memory-to-memory moves,
one has to reload "m"(v) into ecx again, otherwise the generated assembler
code is invalid.
It must be mentioned that the generated code is a little bit slower than the
original one.
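To show the "m" constraint idea in isolation, here is a minimal sketch. It is
not the code from the softmmu_header.h patch: store_via_mem is a hypothetical
helper, and the plain C fallback is there only so that it builds on hosts
other than i386 and x86_64.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the value is passed to the asm block as a
   memory operand ("m") instead of a register operand ("r"), so the
   compiler does not have to find a free register for it. The asm
   code reloads it into ecx itself, because x86 has no direct
   memory-to-memory move. */
static inline void store_via_mem(uint32_t *ptr, uint32_t v)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__(
        "movl %1, %%ecx\n\t"   /* reload v from its memory slot */
        "movl %%ecx, %0\n\t"   /* store it to the destination */
        : "=m" (*ptr)
        : "m" (v)
        : "ecx");
#else
    *ptr = v;                  /* fallback for other hosts */
#endif
}
```

The extra load from memory is the price for the reduced register pressure,
which is why the code gets a little slower.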
Kind Regards
Axel
>
> Regards,
>
> Anthony Liguori
>
> > Kind regards
> > Axel
* Re: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
2007-03-25 13:40 ` [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works Avi Kivity
@ 2007-03-26 17:14 ` Axel Zeuner
2007-04-06 21:04 ` Rob Landley
0 siblings, 1 reply; 17+ messages in thread
From: Axel Zeuner @ 2007-03-26 17:14 UTC (permalink / raw)
To: qemu-devel
Hi Avi,
On Sunday 25 March 2007 15:40, Avi Kivity wrote:
> Axel Zeuner wrote:
> > A full featured converter (cvtasm) has a lot of dependencies: it has to
> > support all hosts (M) (with all assembler dialects M') and all targets N,
> > i.e. in the worst case one would end with M'x N variants of it, or M x N
> > if one supports only one assembler dialect per host. It is clear, that
> > the number of variants is one of the biggest disadvantages of such an
> > approach.
>
> Perhaps a mixed approach can be made for gradual conversion: for
> combinations where cvtasm has been written, use that. Where that's
> still to be done, have dyngen generate call instructions to the ops
> instead of pasting the ops text directly.
Perhaps, but I am not sure whether the changes required in dyngen to generate
calls with parameters to functions, instead of copying code, are much smaller
than hand-written code generators. Furthermore one would surely lose some
performance.
Kind Regards
Axel
* Re: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
2007-03-26 5:49 ` Axel Zeuner
@ 2007-03-26 22:53 ` Paul Brook
2007-03-27 5:48 ` Axel Zeuner
0 siblings, 1 reply; 17+ messages in thread
From: Paul Brook @ 2007-03-26 22:53 UTC (permalink / raw)
To: qemu-devel; +Cc: Axel Zeuner
> > I moved to helper2.c because AFAICT helper.c is compiled with the same
> > sort of restrictions as op.c which leads to the compile failure.
>
> Yes, helper.c is compiled with the global register variables and the code
> is called directly from the op_xxx functions, but one needs the global
> register variables to access global data, these contain the required
> environment for the emulation. AFAIK helper2.c is used by the CODE_COPY
> branch on i386 with even stronger restrictions, but I may be wrong here.
helper.c is compiled with the same setting as op.c, so has direct access to
the dyngen state ("T0", "env" etc). helper2.c is regular code. Either may be
used from op.c, the difference is whether all arguments are explicit. Also,
if a helper throws an exception it must be in helper.c to avoid clobbering
CPU state before calling raise_exception.
Note that some targets use a different naming scheme. They use helper.c for
regular code and op_helper.c for op.c-like code. IMHO this is a much better
naming scheme.
Paul
* Re: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
2007-03-26 22:53 ` Paul Brook
@ 2007-03-27 5:48 ` Axel Zeuner
0 siblings, 0 replies; 17+ messages in thread
From: Axel Zeuner @ 2007-03-27 5:48 UTC (permalink / raw)
To: qemu-devel
Hi Paul,
On Tuesday 27 March 2007 00:53, Paul Brook wrote:
> > > I moved to helper2.c because AFAICT helper.c is compiled with the same
> > > sort of restrictions as op.c which leads to the compile failure.
> >
> > Yes, helper.c is compiled with the global register variables and the code
> > is called directly from the op_xxx functions, but one needs the global
> > register variables to access global data, these contain the required
> > environment for the emulation. AFAIK helper2.c is used by the CODE_COPY
> > branch on i386 with even stronger restrictions, but I may be wrong here.
>
> helper.c is compiled with the same setting as op.c, so has direct access to
> the dyngen state ("T0", "env" etc). helper2.c is regular code. Either may
> be used from op.c, the difference is whether all arguments are explicit.
> Also, if a helper throws an exception it must be in helper.c to avoid
> clobbering CPU state before calling raise_exception.
Thank you for the clarification, I was wrong.
Kind regards
Axel
> Note that some targets use a different naming scheme. They use helper.c for
> regular code and op_helper.c for op.c-like code. IMHO this is a much better
> naming scheme.
>
> Paul
* Re: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
2007-03-26 6:16 ` Axel Zeuner
@ 2007-03-29 2:07 ` Anthony Liguori
2007-03-29 6:03 ` Axel Zeuner
0 siblings, 1 reply; 17+ messages in thread
From: Anthony Liguori @ 2007-03-29 2:07 UTC (permalink / raw)
To: Axel Zeuner; +Cc: qemu-devel
[-- Attachment #1: Type: text/plain, Size: 3131 bytes --]
Axel Zeuner wrote:
> Hi Anthony,
>
> On Monday 26 March 2007 01:44, you wrote:
>
>> Axel Zeuner wrote:
>>
>>> On Saturday 24 March 2007 21:15, Anthony Liguori wrote:
>>>
>>>> The tricky thing I still can't figure out is how to get ASM_SOFTMMU
>>>> working. The problem is GLUE(st, SUFFIX) function. First GCC cannot
>>>> deal with the register pressure. The problem I can't seem to fix though
>>>> is that GCC sticks %1 in %esi because we're only using an "r"
>>>> constraint, not a "q" constraint. This results in the generation of
>>>> %sib which is an invalid register. However, refactoring the code to not
>>>> require a "q" constraint doesn't seem to help either.
>>>>
>>> Hi Anthony,
>>> could you please try the attached patch for softmmu_header.h? Allows
>>> compiling with gcc4 and ASM_SOFTMMU.
>>>
>> That did the trick. Could you explain what your changes did?
>>
>
> QEMU/i386 has only three available registers if TARGET_I386 is selected,
> because ebx, ebp, esi and edi are used by the environment and T0, T1, T2 (AKA A0).
> This makes inline assembly really ugly. The external C functions called in
> ASM_SOFTMMU are REGPARM(1,2), i.e. they require their first arguments in eax and edx.
>
Based on some feedback from Paul Brook, I wrote another patch that just
disables the use of register variables for GCC4. I think this is a
considerably less hackish way to go about this.
The generated code won't be as nice of course but at least it works.
The patch applies against your cvtasm patches.
Regards,
Anthony Liguori
> In the two ld functions three registers (eax, edx, ecx) are required and
> destroyed because an external C function may be called. We relax the register
> pressure a little by forcing the return value (res) into eax, because the
> return value ends up in a destroyed register anyway. Furthermore, the called
> C function returns its value in eax (call %7).
>
> The st functions are a little more tricky: we need three registers, and the
> assembly code requires a reload of %0 (ptr) after the check whether the
> external function must be called. The external function destroys the three
> remaining registers, so after the call %1 (v) must also be reloaded into a
> register, i.e. we need more registers. Saving registers on the stack
> does not work, because there are already two "m" constraints: if the code is
> compiled with -fomit-frame-pointer these are expressed as offsets relative
> to %esp, i.e. X(%esp), and would become invalid after pushes onto the stack.
>
> One solution was to force all inputs to the asm block onto the stack; that is
> what replacing the "r" constraints with "m" constraints does: it forces a
> memory reference. Because i386 cannot do direct memory-to-memory moves,
> one has to reload "m"(v) into ecx again, otherwise the generated assembler
> code is invalid.
> It must be mentioned that the generated code is a little slower than the
> original one.
>
> Kind Regards
> Axel
>
>> Regards,
>>
>> Anthony Liguori
>>
>>
>>> Kind regards
>>> Axel
>>>
>
>
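The difference between the "r" and "q" constraints discussed above can be seen in a much smaller case than GLUE(st, SUFFIX). The following is only a minimal sketch and not code from softmmu_header.h; store_low_byte is a hypothetical helper. It stores the low byte of a value through inline asm, which is exactly the situation in which i386 needs a register with an 8-bit sub-register.

```c
#include <stdint.h>

/* Hypothetical helper (not part of QEMU): store the low byte of v at ptr. */
static inline void store_low_byte(uint8_t *ptr, uint32_t v)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* "q" limits v to a register with an 8-bit low part; on i386 these
     * are eax, ebx, ecx and edx. With "r" the compiler may pick esi or
     * edi, and the %b1 operand then has no valid byte name on i386
     * (the invalid "%sib" operand reported in this thread). */
    __asm__ volatile ("movb %b1, %0"
                      : "=m" (*ptr)
                      : "q" (v));
#else
    /* Portable fallback for other hosts and compilers. */
    *ptr = (uint8_t)v;
#endif
}
```

When T0, T1 and A0 are pinned to ebx, esi and edi as global register variables, very few "q"-capable registers remain, which is why the original code could not simply switch to "q".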
[-- Attachment #2: gcc4-register-pressure.diff --]
[-- Type: text/x-patch, Size: 2053 bytes --]
diff -r d19a5903d749 softmmu_header.h
--- a/softmmu_header.h Tue Mar 27 13:23:10 2007 -0500
+++ b/softmmu_header.h Tue Mar 27 13:23:21 2007 -0500
@@ -240,9 +240,13 @@ static inline void glue(glue(st, SUFFIX)
"2:\n"
:
: "r" (ptr),
+#ifdef USE_REGISTER_VARIABLES
/* NOTE: 'q' would be needed as constraint, but we could not use it
with T1 ! */
"r" (v),
+#else
+ "q" (v),
+#endif
"i" ((CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS),
"i" (TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS),
"i" (TARGET_PAGE_MASK | (DATA_SIZE - 1)),
diff -r d19a5903d749 target-i386/cpu.h
--- a/target-i386/cpu.h Tue Mar 27 13:23:10 2007 -0500
+++ b/target-i386/cpu.h Tue Mar 27 13:23:21 2007 -0500
@@ -26,6 +26,10 @@
#define TARGET_LONG_BITS 64
#else
#define TARGET_LONG_BITS 32
+#endif
+
+#if TARGET_LONG_BITS <= HOST_LONG_BITS && __GNUC__ < 4
+#define USE_REGISTER_VARIABLES
#endif
/* target supports implicit self modifying code */
@@ -424,7 +428,7 @@ typedef union {
#endif
typedef struct CPUX86State {
-#if TARGET_LONG_BITS > HOST_LONG_BITS
+#ifndef USE_REGISTER_VARIABLES
/* temporaries if we cannot store them in host registers */
target_ulong t0, t1, t2;
#endif
diff -r d19a5903d749 target-i386/exec.h
--- a/target-i386/exec.h Tue Mar 27 13:23:10 2007 -0500
+++ b/target-i386/exec.h Tue Mar 27 13:23:21 2007 -0500
@@ -27,12 +27,16 @@
#define TARGET_LONG_BITS 32
#endif
+#if TARGET_LONG_BITS <= HOST_LONG_BITS && __GNUC__ < 4
+#define USE_REGISTER_VARIABLES
+#endif
+
#include "cpu-defs.h"
/* at least 4 register variables are defined */
register struct CPUX86State *env asm(AREG0);
-#if TARGET_LONG_BITS > HOST_LONG_BITS
+#ifndef USE_REGISTER_VARIABLES
/* no registers can be used */
#define T0 (env->t0)
@@ -88,7 +92,7 @@ register target_ulong EDI asm(AREG11);
#define reg_EDI
#endif
-#endif /* ! (TARGET_LONG_BITS > HOST_LONG_BITS) */
+#endif /* ! USE_REGISTER_VARIABLES */
#define A0 T2
* Re: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
2007-03-29 2:07 ` Anthony Liguori
@ 2007-03-29 6:03 ` Axel Zeuner
2007-03-29 15:51 ` Anthony Liguori
0 siblings, 1 reply; 17+ messages in thread
From: Axel Zeuner @ 2007-03-29 6:03 UTC (permalink / raw)
To: Anthony Liguori, qemu-devel
Hi Anthony,
On Thursday 29 March 2007 04:07, you wrote:
> Axel Zeuner wrote:
> > Hi Anthony,
> >
> > On Monday 26 March 2007 01:44, you wrote:
> >> Axel Zeuner wrote:
> >>> On Saturday 24 March 2007 21:15, Anthony Liguori wrote:
> >>>> The tricky thing I still can't figure out is how to get ASM_SOFTMMU
> >>>> working. The problem is GLUE(st, SUFFIX) function. First GCC cannot
> >>>> deal with the register pressure. The problem I can't seem to fix
> >>>> though is that GCC sticks %1 in %esi because we're only using an "r"
> >>>> constraint, not a "q" constraint. This results in the generation of
> >>>> %sib which is an invalid register. However, refactoring the code to
> >>>> not require a "q" constraint doesn't seem to help either.
> >>>
> >>> Hi Anthony,
> >>> could you please try the attached patch for softmmu_header.h? Allows
> >>> compiling with gcc4 and ASM_SOFTMMU.
> >>
> >> That did the trick. Could you explain what your changes did?
> >
> > QEMU/i386 has only three available registers if TARGET_I386 is selected,
> > because ebx, ebp, esi and edi are used by the environment and T0, T1, T2 (AKA
> > A0). This makes inline assembly really ugly. The external C functions
> > called in ASM_SOFTMMU are REGPARM(1,2), i.e. they require their first
> > arguments in eax, edx.
>
> Based on some feedback from Paul Brook, I wrote another patch that just
> disables the use of register variables for GCC4. I think this is a
> considerably less hackish way to go about this.
>
> The generated code won't be as nice of course but at least it works.
> The patch applies against your cvtasm patches.
Looks good to me; sorry, I have not had time to test your patch yet. Did you
check the performance impact of your changes?
Perhaps register variables could be enabled depending on the number of
registers the host processor provides.
Kind Regards
Axel
* Re: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
2007-03-29 6:03 ` Axel Zeuner
@ 2007-03-29 15:51 ` Anthony Liguori
0 siblings, 0 replies; 17+ messages in thread
From: Anthony Liguori @ 2007-03-29 15:51 UTC (permalink / raw)
To: Axel Zeuner; +Cc: qemu-devel
Axel Zeuner wrote:
> Hi Anthony,
>
> On Thursday 29 March 2007 04:07, you wrote:
>
>> Axel Zeuner wrote:
>>
>>> Hi Anthony,
>>>
>>> On Monday 26 March 2007 01:44, you wrote:
>>>
>>>> Axel Zeuner wrote:
>>>>
>>>>> On Saturday 24 March 2007 21:15, Anthony Liguori wrote:
>>>>>
>>>>>> The tricky thing I still can't figure out is how to get ASM_SOFTMMU
>>>>>> working. The problem is GLUE(st, SUFFIX) function. First GCC cannot
>>>>>> deal with the register pressure. The problem I can't seem to fix
>>>>>> though is that GCC sticks %1 in %esi because we're only using an "r"
>>>>>> constraint, not a "q" constraint. This results in the generation of
>>>>>> %sib which is an invalid register. However, refactoring the code to
>>>>>> not require a "q" constraint doesn't seem to help either.
>>>>>>
>>>>> Hi Anthony,
>>>>> could you please try the attached patch for softmmu_header.h? Allows
>>>>> compiling with gcc4 and ASM_SOFTMMU.
>>>>>
>>>> That did the trick. Could you explain what your changes did?
>>>>
>>> QEMU/i386 has only three available registers if TARGET_I386 is selected,
>>> because ebx, ebp, esi and edi are used by the environment and T0, T1, T2 (AKA
>>> A0). This makes inline assembly really ugly. The external C functions
>>> called in ASM_SOFTMMU are REGPARM(1,2), i.e. they require their first
>>> arguments in eax, edx.
>>>
>> Based on some feedback from Paul Brook, I wrote another patch that just
>> disables the use of register variables for GCC4. I think this is a
>> considerably less hackish way to go about this.
>>
>> The generated code won't be as nice of course but at least it works.
>> The patch applies against your cvtasm patches.
>>
> Looks good to me; sorry, I have not had time to test your patch yet. Did you
> check the performance impact of your changes?
> Perhaps register variables could be enabled depending on the number of
> registers the host processor provides.
>
Yes, I need to update the patch to include a && defined(__i386__) and
also to add the proper guards to the other architectures.
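The updated condition itself is not shown in the thread. As a rough sketch only, the block below appends the defined(__i386__) clause literally to the condition from the patch; that placement is an assumption, as are the placeholder struct layout, the fixed HOST_LONG_BITS value and the register names. It shows how the same op source works whether T0 and T1 are host registers or fields of env.

```c
#include <stdint.h>

typedef uint32_t target_ulong;

#define TARGET_LONG_BITS 32
#define HOST_LONG_BITS   32   /* assumption: i386 host, as in the thread */

/* Assumed form of the guard: the patch's condition plus the i386 clause. */
#if TARGET_LONG_BITS <= HOST_LONG_BITS && defined(__GNUC__) \
    && __GNUC__ < 4 && defined(__i386__)
#define USE_REGISTER_VARIABLES
#endif

typedef struct CPUX86State {
#ifndef USE_REGISTER_VARIABLES
    target_ulong t0, t1, t2;  /* temporaries kept in memory */
#endif
    target_ulong eip;         /* placeholder for the rest of the state */
} CPUX86State;

static CPUX86State cpu_state;
static CPUX86State *env = &cpu_state;

#ifdef USE_REGISTER_VARIABLES
/* Register names are assumptions for an i386 host. */
register target_ulong T0 __asm__("ebx");
register target_ulong T1 __asm__("esi");
#else
/* No host registers reserved: the temporaries are fields of env. */
#define T0 (env->t0)
#define T1 (env->t1)
#endif

/* Toy op in the style of op.c; its source is identical in both cases. */
static void op_addl_T0_T1(void)
{
    T0 += T1;
}
```

With GCC 4 the fallback branch is taken, so every access to T0 goes through memory, which is the expected source of the slower generated code.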
Regards,
Anthony Liguori
> Kind Regards
> Axel
>
>
* Re: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
2007-03-26 17:14 ` Axel Zeuner
@ 2007-04-06 21:04 ` Rob Landley
0 siblings, 0 replies; 17+ messages in thread
From: Rob Landley @ 2007-04-06 21:04 UTC (permalink / raw)
To: qemu-devel; +Cc: Axel Zeuner
On Monday 26 March 2007 1:14 pm, Axel Zeuner wrote:
> Hi Avi,
> On Sunday 25 March 2007 15:40, Avi Kivity wrote:
> > Axel Zeuner wrote:
> > > A full featured converter (cvtasm) has a lot of dependencies: it has to
> > > support all hosts (M) (with all assembler dialects M') and all targets N,
> > > i.e. in the worst case one would end with M'x N variants of it, or M x N
> > > if one supports only one assembler dialect per host. It is clear, that
> > > the number of variants is one of the biggest disadvantages of such an
> > > approach.
> >
> > Perhaps a mixed approach can be made for gradual conversion: for
> > combinations where cvtasm has been written, use that. Where that's
> > still to be done, have dyngen generate call instructions to the ops
> > instead of pasting the ops text directly.
> Perhaps, but I am not sure whether the changes required in dyngen to generate
> calls with parameters to functions, instead of copied code, are much smaller
> than hand-written code generators. Furthermore, one would surely lose some
> performance.
>
> Kind Regards
> Axel
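To make the trade-off in the quoted exchange concrete, here is a small sketch of the call-based alternative. It is not dyngen output: the "generated code" is modeled as a table of function pointers rather than real emitted call instructions, and Temps, OpCall and run_block are hypothetical names. Each op is reached through an indirect call and receives its parameter as an argument instead of having it patched into copied machine code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the T0/T1 temporaries. */
typedef struct {
    uint32_t t0, t1;
} Temps;

/* Every op has the same signature so a block can call it uniformly. */
typedef void (*op_fn)(Temps *t, uint32_t param);

static void op_movl_T0_im(Temps *t, uint32_t param) { t->t0 = param; }
static void op_movl_T1_im(Temps *t, uint32_t param) { t->t1 = param; }
static void op_addl_T0_T1(Temps *t, uint32_t param)
{
    (void)param;              /* this op takes no parameter */
    t->t0 += t->t1;
}

/* One entry of a "translated block": which op to call, with what parameter. */
typedef struct {
    op_fn    fn;
    uint32_t param;
} OpCall;

/* Execute a block by calling each op in turn. */
static void run_block(Temps *t, const OpCall *ops, size_t n)
{
    for (size_t i = 0; i < n; i++)
        ops[i].fn(t, ops[i].param);
}
```

The per-op call and return are the overhead Axel refers to; in exchange, the compiler is free to lay out each op as it likes, because no op body is ever copied.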
On a related note, I have this vague urge from time to time to get qemu to
build with tcc.
Haven't even come close to making it work yet, of course... :)
Rob
--
Penguicon 5.0 Apr 20-22, Linux Expo/SF Convention. Bruce Schneier, Christine
Peterson, Steve Jackson, Randy Milholland, Elizabeth Bear, Charlie Stross...
* qemu + gcc4 (Was: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works)
2007-03-24 20:15 ` Anthony Liguori
2007-03-25 10:15 ` Axel Zeuner
2007-03-25 12:12 ` Axel Zeuner
@ 2007-04-20 16:57 ` Gwenole Beauchesne
2 siblings, 0 replies; 17+ messages in thread
From: Gwenole Beauchesne @ 2007-04-20 16:57 UTC (permalink / raw)
To: qemu-devel
Hi,
> By adding some GCC4 fixes on top of your patch, I was able to get qemu for
> i386 (on i386) to compile and run. So far, I've only tested a win2k guest.
For op_pshufw(), please keep the temporary destination register, since S and D
may reference the same register.
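A small sketch of why the temporary matters, assuming a hypothetical four-lane register type (Reg4x16 stands in for MMXReg here; neither function is taken from the patch). When source and destination are the same register, writing lanes directly overwrites values that later lanes still need to read.

```c
#include <stdint.h>

/* Hypothetical stand-in for MMXReg: four 16-bit lanes. */
typedef struct {
    uint16_t w[4];
} Reg4x16;

/* Shuffle through a temporary: correct even when dst == src. */
static void pshufw_with_temp(Reg4x16 *dst, const Reg4x16 *src, int order)
{
    Reg4x16 r;
    r.w[0] = src->w[order & 3];
    r.w[1] = src->w[(order >> 2) & 3];
    r.w[2] = src->w[(order >> 4) & 3];
    r.w[3] = src->w[(order >> 6) & 3];
    *dst = r;
}

/* Shuffle directly into dst: wrong when dst == src, because lanes
 * written early are read again by later assignments. */
static void pshufw_direct(Reg4x16 *dst, const Reg4x16 *src, int order)
{
    dst->w[0] = src->w[order & 3];
    dst->w[1] = src->w[(order >> 2) & 3];
    dst->w[2] = src->w[(order >> 4) & 3];
    dst->w[3] = src->w[(order >> 6) & 3];
}
```

With order 0x1B (a full lane reversal) and the same register passed as both arguments, only the variant with the temporary actually reverses the lanes.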
FYI, I am experimenting with an alternate gcc4 patch (inlined below).
<http://svn.mandriva.com/svn/packages/cooker/qemu/current/SOURCES/qemu-0.9.0-gcc4.patch>
I have only tested the following configurations with -no-kvm -no-kqemu
- compiler: gcc 4.1.2-1mdv
- guest OS: { winXPsp2, linux }
- guest CPU: { i386, x86_64 (linux-only) }
- host CPU (compiled as): { i386, x86_64 }
PS: I have not yet tested on Mac OS X.
Regards,
Gwenole
2007-04-20 Gwenole Beauchesne <gbeauchesne@mandriva.com>
* gcc4 host support.
--- qemu-0.9.0/target-i386/ops_template.h.gcc4 2005-02-21 20:23:59.000000000 +0000
+++ qemu-0.9.0/target-i386/ops_template.h 2007-04-20 14:53:32.000000000 +0000
@@ -268,7 +268,7 @@ static int glue(compute_all_mul, SUFFIX)
/* various optimized jumps cases */
-void OPPROTO glue(op_jb_sub, SUFFIX)(void)
+DEFINE_OP(glue(op_jb_sub, SUFFIX),
{
target_long src1, src2;
src1 = CC_DST + CC_SRC;
@@ -277,23 +277,23 @@ void OPPROTO glue(op_jb_sub, SUFFIX)(voi
if ((DATA_TYPE)src1 < (DATA_TYPE)src2)
GOTO_LABEL_PARAM(1);
FORCE_RET();
-}
+})
-void OPPROTO glue(op_jz_sub, SUFFIX)(void)
+DEFINE_OP(glue(op_jz_sub, SUFFIX),
{
if ((DATA_TYPE)CC_DST == 0)
GOTO_LABEL_PARAM(1);
FORCE_RET();
-}
+})
-void OPPROTO glue(op_jnz_sub, SUFFIX)(void)
+DEFINE_OP(glue(op_jnz_sub, SUFFIX),
{
if ((DATA_TYPE)CC_DST != 0)
GOTO_LABEL_PARAM(1);
FORCE_RET();
-}
+})
-void OPPROTO glue(op_jbe_sub, SUFFIX)(void)
+DEFINE_OP(glue(op_jbe_sub, SUFFIX),
{
target_long src1, src2;
src1 = CC_DST + CC_SRC;
@@ -302,16 +302,16 @@ void OPPROTO glue(op_jbe_sub, SUFFIX)(vo
if ((DATA_TYPE)src1 <= (DATA_TYPE)src2)
GOTO_LABEL_PARAM(1);
FORCE_RET();
-}
+})
-void OPPROTO glue(op_js_sub, SUFFIX)(void)
+DEFINE_OP(glue(op_js_sub, SUFFIX),
{
if (CC_DST & SIGN_MASK)
GOTO_LABEL_PARAM(1);
FORCE_RET();
-}
+})
-void OPPROTO glue(op_jl_sub, SUFFIX)(void)
+DEFINE_OP(glue(op_jl_sub, SUFFIX),
{
target_long src1, src2;
src1 = CC_DST + CC_SRC;
@@ -320,10 +320,9 @@ void OPPROTO glue(op_jl_sub, SUFFIX)(voi
if ((DATA_STYPE)src1 < (DATA_STYPE)src2)
GOTO_LABEL_PARAM(1);
FORCE_RET();
-}
+})
-void OPPROTO glue(op_jle_sub, SUFFIX)(void)
-{
+DEFINE_OP(glue(op_jle_sub, SUFFIX), {
target_long src1, src2;
src1 = CC_DST + CC_SRC;
src2 = CC_SRC;
@@ -331,39 +330,39 @@ void OPPROTO glue(op_jle_sub, SUFFIX)(vo
if ((DATA_STYPE)src1 <= (DATA_STYPE)src2)
GOTO_LABEL_PARAM(1);
FORCE_RET();
-}
+})
/* oldies */
#if DATA_BITS >= 16
-void OPPROTO glue(op_loopnz, SUFFIX)(void)
+DEFINE_OP(glue(op_loopnz, SUFFIX),
{
if ((DATA_TYPE)ECX != 0 && !(T0 & CC_Z))
GOTO_LABEL_PARAM(1);
FORCE_RET();
-}
+})
-void OPPROTO glue(op_loopz, SUFFIX)(void)
+DEFINE_OP(glue(op_loopz, SUFFIX),
{
if ((DATA_TYPE)ECX != 0 && (T0 & CC_Z))
GOTO_LABEL_PARAM(1);
FORCE_RET();
-}
+})
-void OPPROTO glue(op_jz_ecx, SUFFIX)(void)
+DEFINE_OP(glue(op_jz_ecx, SUFFIX),
{
if ((DATA_TYPE)ECX == 0)
GOTO_LABEL_PARAM(1);
FORCE_RET();
-}
+})
-void OPPROTO glue(op_jnz_ecx, SUFFIX)(void)
+DEFINE_OP(glue(op_jnz_ecx, SUFFIX),
{
if ((DATA_TYPE)ECX != 0)
GOTO_LABEL_PARAM(1);
FORCE_RET();
-}
+})
#endif
--- qemu-0.9.0/target-i386/op.c.gcc4 2007-02-02 12:45:51.000000000 +0000
+++ qemu-0.9.0/target-i386/op.c 2007-04-20 15:20:55.000000000 +0000
@@ -18,7 +18,9 @@
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
+#if __GNUC__ < 4
#define ASM_SOFTMMU
+#endif
#include "exec.h"
/* n must be a constant to be efficient */
@@ -250,6 +252,7 @@ void OPPROTO op_imulb_AL_T0(void)
EAX = (EAX & ~0xffff) | (res & 0xffff);
CC_DST = res;
CC_SRC = (res != (int8_t)res);
+ FORCE_RET();
}
void OPPROTO op_mulw_AX_T0(void)
@@ -270,6 +273,7 @@ void OPPROTO op_imulw_AX_T0(void)
EDX = (EDX & ~0xffff) | ((res >> 16) & 0xffff);
CC_DST = res;
CC_SRC = (res != (int16_t)res);
+ FORCE_RET();
}
void OPPROTO op_mull_EAX_T0(void)
@@ -290,6 +294,7 @@ void OPPROTO op_imull_EAX_T0(void)
EDX = (uint32_t)(res >> 32);
CC_DST = res;
CC_SRC = (res != (int32_t)res);
+ FORCE_RET();
}
void OPPROTO op_imulw_T0_T1(void)
@@ -299,6 +304,7 @@ void OPPROTO op_imulw_T0_T1(void)
T0 = res;
CC_DST = res;
CC_SRC = (res != (int16_t)res);
+ FORCE_RET();
}
void OPPROTO op_imull_T0_T1(void)
@@ -308,6 +314,7 @@ void OPPROTO op_imull_T0_T1(void)
T0 = res;
CC_DST = res;
CC_SRC = (res != (int32_t)res);
+ FORCE_RET();
}
#ifdef TARGET_X86_64
--- qemu-0.9.0/target-i386/exec.h.gcc4 2006-09-24 18:40:46.000000000 +0000
+++ qemu-0.9.0/target-i386/exec.h 2007-04-20 15:14:38.000000000 +0000
@@ -501,6 +501,7 @@ void update_fp_status(void);
void helper_hlt(void);
void helper_monitor(void);
void helper_mwait(void);
+void helper_pshufw(MMXReg *dst, MMXReg *src, int order);
extern const uint8_t parity_table[256];
extern const uint8_t rclw_table[32];
--- qemu-0.9.0/target-i386/helper.c.gcc4 2007-04-20 14:49:44.000000000 +0000
+++ qemu-0.9.0/target-i386/helper.c 2007-04-20 15:00:02.000000000 +0000
@@ -3522,8 +3522,15 @@ void helper_fxrstor(target_ulong ptr, in
nb_xmm_regs = 8 << data64;
addr = ptr + 0xa0;
for(i = 0; i < nb_xmm_regs; i++) {
+#if __GNUC__ < 4
env->xmm_regs[i].XMM_Q(0) = ldq(addr);
env->xmm_regs[i].XMM_Q(1) = ldq(addr + 8);
+#else
+ env->xmm_regs[i].XMM_L(0) = ldl(addr);
+ env->xmm_regs[i].XMM_L(1) = ldl(addr + 4);
+ env->xmm_regs[i].XMM_L(2) = ldl(addr + 8);
+ env->xmm_regs[i].XMM_L(3) = ldl(addr + 12);
+#endif
addr += 16;
}
}
--- qemu-0.9.0/target-i386/ops_sse.h.gcc4 2007-01-16 19:28:58.000000000 +0000
+++ qemu-0.9.0/target-i386/ops_sse.h 2007-04-20 15:11:19.000000000 +0000
@@ -581,14 +581,9 @@ void OPPROTO glue(op_movq_T0_mm, SUFFIX)
void OPPROTO glue(op_pshufw, SUFFIX) (void)
{
Reg r, *d, *s;
- int order;
d = (Reg *)((char *)env + PARAM1);
s = (Reg *)((char *)env + PARAM2);
- order = PARAM3;
- r.W(0) = s->W(order & 3);
- r.W(1) = s->W((order >> 2) & 3);
- r.W(2) = s->W((order >> 4) & 3);
- r.W(3) = s->W((order >> 6) & 3);
+ helper_pshufw(&r, s, PARAM3);
*d = r;
}
#else
--- qemu-0.9.0/target-i386/helper2.c.gcc4 2007-04-20 14:49:44.000000000 +0000
+++ qemu-0.9.0/target-i386/helper2.c 2007-04-20 15:15:22.000000000 +0000
@@ -1038,3 +1038,11 @@ void save_native_fp_state(CPUState *env)
env->native_fp_regs = 0;
}
#endif
+
+void helper_pshufw(MMXReg *dst, MMXReg *src, int order)
+{
+ dst->MMX_W(0) = src->MMX_W(order & 3);
+ dst->MMX_W(1) = src->MMX_W((order >> 2) & 3);
+ dst->MMX_W(2) = src->MMX_W((order >> 4) & 3);
+ dst->MMX_W(3) = src->MMX_W((order >> 6) & 3);
+}
--- qemu-0.9.0/dyngen-exec.h.gcc4 2007-04-20 14:49:44.000000000 +0000
+++ qemu-0.9.0/dyngen-exec.h 2007-04-20 14:54:50.000000000 +0000
@@ -279,4 +279,24 @@ extern int __op_jmp0, __op_jmp1, __op_jm
#define EXIT_TB() asm volatile ("rts")
#endif
+#if defined __i386__ || defined __x86_64__
+#define DEFINE_OP(NAME, ...) \
+static void OPPROTO glue(impl_, NAME)(void) __attribute__((used)); \
+void OPPROTO glue(impl_, NAME)(void) \
+{ \
+ asm volatile (".globl " ASM_NAME(NAME)); \
+ asm volatile (".type " ASM_NAME(NAME) ", @function"); \
+ asm volatile (ASM_NAME(NAME) ":"); \
+ __VA_ARGS__; \
+ asm volatile ("ret"); \
+ asm volatile (".size " ASM_NAME(NAME) ", .-" ASM_NAME(NAME)); \
+}
+#else
+#define DEFINE_OP(NAME, ...) \
+void OPPROTO NAME(void) \
+{ \
+ __VA_ARGS__; \
+}
+#endif
+
#endif /* !defined(__DYNGEN_EXEC_H__) */
--- qemu-0.9.0/cpu-all.h.gcc4 2007-04-20 14:49:44.000000000 +0000
+++ qemu-0.9.0/cpu-all.h 2007-04-20 14:58:38.000000000 +0000
@@ -339,7 +339,13 @@ static inline void stl_le_p(void *ptr, i
static inline void stq_le_p(void *ptr, uint64_t v)
{
+#if __GNUC__ < 4
*(uint64_t *)ptr = v;
+#else
+ uint8_t *p = ptr;
+ stl_le_p(p, (uint32_t)v);
+ stl_le_p(p + 4, v >> 32);
+#endif
}
/* float access */
--- qemu-0.9.0/cpu-exec.c.gcc4 2007-04-20 15:43:06.000000000 +0000
+++ qemu-0.9.0/cpu-exec.c 2007-04-20 15:50:20.000000000 +0000
@@ -737,6 +737,18 @@ int cpu_exec(CPUState *env1)
);
}
}
+#elif defined(__i386__) || defined(__x86_64__)
+ asm volatile ("call *%0"
+ : /* no outputs */
+ : "r" (gen_func)
+ : AREG0, AREG1, AREG2, AREG3
+#ifdef AREG4
+ , AREG4
+#endif
+#ifdef AREG5
+ , AREG5
+#endif
+ );
#elif defined(__ia64)
struct fptr {
void *ip;
Thread overview: 17+ messages
2007-03-24 17:50 [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works Axel Zeuner
2007-03-24 20:15 ` Anthony Liguori
2007-03-25 10:15 ` Axel Zeuner
2007-03-25 23:46 ` Anthony Liguori
2007-03-26 5:49 ` Axel Zeuner
2007-03-26 22:53 ` Paul Brook
2007-03-27 5:48 ` Axel Zeuner
2007-03-25 12:12 ` Axel Zeuner
2007-03-25 23:44 ` Anthony Liguori
2007-03-26 6:16 ` Axel Zeuner
2007-03-29 2:07 ` Anthony Liguori
2007-03-29 6:03 ` Axel Zeuner
2007-03-29 15:51 ` Anthony Liguori
2007-04-20 16:57 ` qemu + gcc4 (Was: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works) Gwenole Beauchesne
2007-03-25 13:40 ` [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works Avi Kivity
2007-03-26 17:14 ` Axel Zeuner
2007-04-06 21:04 ` Rob Landley