* [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
@ 2007-03-24 17:50 Axel Zeuner
  2007-03-24 20:15 ` Anthony Liguori
  2007-03-25 13:40 ` [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works Avi Kivity
  0 siblings, 2 replies; 17+ messages in thread
From: Axel Zeuner @ 2007-03-24 17:50 UTC (permalink / raw)
  To: qemu-devel

[-- Attachment #1: Type: text/plain, Size: 3980 bytes --]

Hi,

there has been a lot of discussion about compiling qemu with gcc4 or higher.
The summary of that discussion was, as I understood it, that compiling qemu
with gcc4 requires changing the code generation engine of most of the
supported targets. These changes require a lot of work and time.

How about splitting the current static code generation process further?
Today gcc produces object code and dyngen adapts it for the purposes of qemu,
i.e. it produces the generation function, patches in parameters, etc.:
gcc -c -o op.o op.c ; dyngen -o op.h ... op.o
The op_XXX functions generated by gcc may not contain more than one exit,
this exit must be at the end, and no unintended jumps to external
functions may occur.

It is possible to split the transformation into the following steps:
Generate assembly output from the C sources: gcc -S -o op-0.s op.c
Convert the assembly output: cvtasm op-0.s op.s
Assemble the converted assembler sources: as -o op.o op.s
Use dyngen as before: dyngen -o op.h ... op.o
Nothing will change if cvtasm only copies the input to the output, i.e. this
additional pass will not break existing code.
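
To make the "nothing will change" case concrete, here is a minimal sketch of a
pass-through converter in C. It is an illustration only and not the attached
cvtasm.c; the function name cvt_identity is an assumption made for this
example.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical pass-through converter: copy the assembly input to the
 * output byte for byte. Returns 0 on success and -1 on an I/O error.
 */
int cvt_identity(FILE *in, FILE *out)
{
    char buf[4096];
    size_t got;

    assert(in != NULL && out != NULL);
    while ((got = fread(buf, 1, sizeof(buf), in)) > 0) {
        if (fwrite(buf, 1, got, out) != got)
            return -1;          /* short write */
    }
    return ferror(in) ? -1 : 0; /* distinguish EOF from a read error */
}
```

Placed between "gcc -S" and "as", a pass like this should leave op.o
unchanged, so the extra build step can be introduced first and the rewriting
rules added afterwards.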

A full-featured converter (cvtasm) has a lot of dependencies: it has to
support all hosts (M) (with all assembler dialects M') and all targets N,
i.e. in the worst case one would end up with M' x N variants of it, or M x N
if one supports only one assembler dialect per host. It is clear that the
number of variants is one of the biggest disadvantages of such an approach.

Now I will focus on the x86_64 host and the x86_64-softmmu target.
cvtasm has to do the following tasks in this case:
0) convert "repXXX; ret" to a plain ret. (Not done yet, x86_64 only, but does
no harm.)
1) append a ret instruction to every function whose last instruction is not a
return.
2) add a label before the last return of every function with more than one
return.
3) replace all returns not at the end of a function with an unconditional jump
to the generated end label. Avoid touching op_exit_tb here.
4) check all jump instructions for jumps to external labels and
replace jumps to external labels with calls to the labels.
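
Tasks 2 and 3 can be sketched in a few lines of C. This is a simplified
illustration under stated assumptions, not the attached cvtasm.c: the body of
one function is held in a fixed-size array, its last line is already a ret
(task 1 done), and the names is_ret_line, rewrite_early_rets and the label
.Lop_end are made up for this example. The op_exit_tb exception is not
handled here.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAX_LINE 64

/* Return 1 if the line holds only a bare "ret" (surrounding blanks ignored). */
int is_ret_line(const char *line)
{
    while (isspace((unsigned char)*line))
        ++line;
    if (strncmp(line, "ret", 3) != 0)
        return 0;
    line += 3;
    while (isspace((unsigned char)*line))
        ++line;
    return *line == '\0';
}

/*
 * Rewrite one function body in place. lines[0..n-1] is the body and must
 * end in a ret; cap is the capacity of the array. Every earlier ret becomes
 * "jmp end_label" and the label is placed before the final ret.
 * Returns the new line count, n if nothing had to change, 0 if out of room.
 */
size_t rewrite_early_rets(char lines[][MAX_LINE], size_t n, size_t cap,
                          const char *end_label)
{
    size_t i, early = 0;

    assert(n > 0 && is_ret_line(lines[n - 1]));
    for (i = 0; i + 1 < n; ++i)
        if (is_ret_line(lines[i]))
            ++early;
    if (early == 0)
        return n;               /* already a single exit at the end */
    if (n + 1 > cap)
        return 0;               /* no room for the extra label line */

    for (i = 0; i + 1 < n; ++i)
        if (is_ret_line(lines[i]))
            snprintf(lines[i], MAX_LINE, "\tjmp\t%s", end_label);

    /* Move the final ret down one slot and put the label in front of it. */
    memcpy(lines[n], lines[n - 1], MAX_LINE);
    snprintf(lines[n - 1], MAX_LINE, "%s:", end_label);
    return n + 1;
}
```

A real converter would have to generate a unique end label per function and
cope with comments and operands on the ret line; the sketch only shows the
shape of the rewrite.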

Tasks 0-2 are easy; task 3 may be target/host dependent and task 4 definitely
is, because some jumps to external labels, i.e. labels outside of the
function, are intentional, for instance in op_goto_tb.
Please correct me if I am wrong or if something is not mentioned above.
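
The decision behind task 4 can be sketched as follows. This is an assumption
about how such a check might look, not the logic of the attached cvtasm.c:
the function names and the idea of exempting functions by the name prefix
"op_goto_tb" are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if target is one of the labels defined inside the function. */
int is_local_label(const char *target, const char *const *labels, size_t n)
{
    size_t i;

    for (i = 0; i < n; ++i)
        if (strcmp(target, labels[i]) == 0)
            return 1;
    return 0;
}

/* Hypothetical policy: functions whose external jumps are intentional. */
int keeps_external_jumps(const char *func)
{
    return strncmp(func, "op_goto_tb", strlen("op_goto_tb")) == 0;
}

/*
 * Return 1 if a jump to target inside func should be rewritten as a call:
 * the target is not a local label and func is not on the exemption list.
 */
int should_become_call(const char *func, const char *target,
                       const char *const *labels, size_t n)
{
    assert(func != NULL && target != NULL);
    if (is_local_label(target, labels, n))
        return 0;
    return !keeps_external_jumps(func);
}
```

The exemption list is exactly the target/host dependent part: which functions
jump outside on purpose differs between targets, so it cannot be hard-coded
once for all of them.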

The attached cvtasm.c allows compiling op.c/op.s/op.o without any disabled
optimisations in Makefile.target (patches for Makefile and Makefile.target are
attached). The program itself definitely needs a rewrite, is not failsafe and
produces too much output on stdout.

In the general case, the macro OP_GOTO_TB from exec-all.h contains two
variables and label definitions that force a reference from a variable into
the op_goto_tbXXX functions. Unfortunately gcc4 detects that these variables
and labels are unused and suppresses their generation; as a result dyngen does
not generate two lines in op.h:
case INDEX_op_goto_tb0: 
	...
	label_offsets[0] = 8 + (gen_code_ptr - gen_code_buf); // <--
	...
case INDEX_op_goto_tb1: 
	...
	label_offsets[1] = 8 + (gen_code_ptr - gen_code_buf); // <-- 
	...
and qemu dies with a SIGSEGV on the first jump from one buffer to the next.
I was not able to force gcc4 to generate the two variables, therefore I had to
replace the general macro with a host-dependent one for x86_64, similar to the
x86 one but using the indirect branch method.
After the replacement qemu worked when compiled with gcc4.

I made my checks with the following compilers using Debian testing amd64: gcc 
version 3.4.6 (Debian 3.4.6-5) and gcc version 4.1.2 20061115 (prerelease) 
(Debian 4.1.1-21).

Please note: These patches work only for x86_64 hosts and x86_64 targets. They 
will break all other architectures. I did not check i386-softmmu. It works 
for me. 

I apologise for the size of the attachments.

Kind regards
Axel

[-- Attachment #2: exec-all.h.diff.zip --]
[-- Type: application/x-zip, Size: 694 bytes --]


[-- Attachment #3: Makefile.diff.zip --]
[-- Type: application/x-zip, Size: 631 bytes --]


[-- Attachment #4: Makefile.target.diff.zip --]
[-- Type: application/x-zip, Size: 888 bytes --]


[-- Attachment #5: cvtasm.c.zip --]
[-- Type: application/x-zip, Size: 4784 bytes --]



* Re: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
  2007-03-24 17:50 [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works Axel Zeuner
@ 2007-03-24 20:15 ` Anthony Liguori
  2007-03-25 10:15   ` Axel Zeuner
                     ` (2 more replies)
  2007-03-25 13:40 ` [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works Avi Kivity
  1 sibling, 3 replies; 17+ messages in thread
From: Anthony Liguori @ 2007-03-24 20:15 UTC (permalink / raw)
  To: qemu-devel

[-- Attachment #1: Type: text/plain, Size: 5357 bytes --]

Axel Zeuner wrote:
> Hi,
>   

Hi Axel,

By adding some GCC4 fixes on top of your patch, I was able to get qemu 
for i386 (on i386) to compile and run.  So far, I've only tested a win2k 
guest.

The big problem (which pbrook helped me with) was GCC4 freaking out over 
some stq's.  Splitting up the 64bit ops into 32bit ops seemed to address 
most of the problems.
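
The splitting can be illustrated with a small, host-independent C sketch. It
mirrors the idea of the stq_le_p change in the patch (one 64-bit store
replaced by two 32-bit stores) but is not that code: the names store32_le and
store64_le_split are assumptions, and the 32-bit store is written byte by
byte here so that the example does not depend on the host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value in little-endian byte order, one byte at a time. */
void store32_le(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Store a 64-bit value as two 32-bit halves: low word first, then high. */
void store64_le_split(uint8_t *p, uint64_t v)
{
    assert(p != NULL);
    store32_le(p, (uint32_t)v);
    store32_le(p + 4, (uint32_t)(v >> 32));
}
```

Each half fits in a single 32-bit register, which is presumably why the split
form is easier on the register allocator of a 32-bit host.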

The tricky thing I still can't figure out is how to get ASM_SOFTMMU
working.  The problem is the GLUE(st, SUFFIX) function.  First, GCC cannot
deal with the register pressure.  The problem I can't seem to fix, though,
is that GCC sticks %1 in %esi because we're only using an "r"
constraint, not a "q" constraint.  This results in the generation of
%sil, which is not a valid register on i386.  However, refactoring the code
to not require a "q" constraint doesn't seem to help either.
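
For readers less familiar with the two constraints: on i386 only %eax, %ebx,
%ecx and %edx have byte sub-registers, and "q" restricts the operand to those,
while "r" also allows %esi and %edi. The following sketch shows a byte store
written with "q". It is not the GLUE(st, SUFFIX) code from qemu; the function
name store_low_byte is an assumption, and non-x86 hosts fall back to plain C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store the low byte of value to *dst. */
void store_low_byte(uint8_t *dst, uint32_t value)
{
    assert(dst != NULL);
#if defined(__i386__) || defined(__x86_64__)
    /*
     * %b1 prints the byte-sized name of operand 1. The "q" constraint
     * guarantees a register that has such a name on i386 (a, b, c or d).
     */
    __asm__ volatile ("movb %b1, %0" : "=m" (*dst) : "q" (value));
#else
    *dst = (uint8_t)value;      /* portable fallback for other hosts */
#endif
}
```

The price of "q" is that it leaves the allocator only four candidate registers
on i386, which makes the register pressure mentioned above worse.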

The attached patch is what I have so far.  Some help from people more 
familiar with gcc asm foo would be appreciated!

Regards,

Anthony Liguori

> there were a lot of discussions about compiling qemu with gcc4 or higher. The 
> summary of the discussions were, as I understood, that compiling qemu with 
> gcc4 requires changing the code generation engine of the most of the 
> supported targets. These changes require a lot of work and time.
>
> How about splitting the current static code generation process further? 
> Today gcc produces object code and dyngen adapts it for the purposes of qemu, 
> i.e produces the generation function, patches in parameters ..:
> gcc -c op.o op.c ;dyngen -o op.h ... op.o . 
> The op_XXX functions generated by gcc may not contain more than one exit
> and this exit must be at the end, no not intended jumps to external
> functions may occur.
>
> It is possible to split the transformation into the following steps: 
> Generate assembly output from the C-Sources: gcc -S -o op-0.s op.c.
> Convert the assembly output: cvtasm op.s op-0.s. 
> Assemble the converted assembler sources: as -o op.o op.s. 
> Use dyngen as before: dyngen -o op.h ... op.o. 
> Nothing will change if cvtasm copies only the input to the output, i.e. this 
> additional pass will not break existing code.
>
> A full featured converter (cvtasm) has a lot of dependencies: it has to 
> support all hosts (M) (with all assembler dialects M') and all targets N, 
> i.e. in the worst case one would end with M'x N variants of it, or M x N if 
> one supports only one assembler dialect per host.  It is clear, that the 
> number of variants is one of the biggest disadvantages of such an approach.
>
> Now I will focus on x86_64 host and x86_64-softmmu target.
> cvtasm has to do the following tasks in this case:
> 0) convert repXXX; ret to ret only. (Not done yet, x86_64 only, but does not 
> harm).
> 1) append to all functions, where the last instruction is not a return a ret 
> instruction.
> 2) add a label to all functions with more than one return before the last  
> return.
> 3) replace all returns not at the end of a function with an unconditional jump 
> to the generated end label. Avoid touching op_exit_tb here.
> 4) check all jump instructions if they contain jumps to external labels,  
> replace jumps to external labels with calls to the labels.
>
> The task 0-2 are easy, task 3 may, task 4 is definitely target/host dependent, 
> because there exist intentionally some jumps to external labels, i.e. outside 
> of the function, for instance op_goto_tb. 
> Please correct me, if I am wrong or something is not mentioned above. 
>
> The attached cvtasm.c allows compiling op.c/op.s/op.o without any disabled 
> optimisations in Makefile.target (patches for Makefile and Makefile.target are 
> attached). The program itself definitely needs a rewrite, is not failsafe and 
> produces to much output on stdout. 
>
> The macro OP_GOTO_TB from exec-all.h in the general case contains two nice 
> variables and label definitions to force a reference from a variable into the 
> op_goto_tbXXX functions. Unfortunately gcc4 detects that these variables and 
> lables are unused and suppresses their generation, as result dyngen does not 
> generate two lines in op.h:
> case INDEX_op_goto_tb0: 
> 	...
> 	label_offsets[0] = 8 + (gen_code_ptr - gen_code_buf); // <--
> 	...
> case INDEX_op_goto_tb1: 
> 	...
> 	label_offsets[1] = 8 + (gen_code_ptr - gen_code_buf); // <-- 
> 	...
> and qemu produces a SIGSEGV on the first jump from one buffer to the next.
> I was not able to force gcc4 to generate the two variables, therefore I had to 
> replace the general macro with a host dependent one for x86_64 similar to x86 
> but using the indirect branch method. 
> After the replacement qemu worked when compiled with gcc4.
>
> I made my checks with the following compilers using Debian testing amd64: gcc 
> version 3.4.6 (Debian 3.4.6-5) and gcc version 4.1.2 20061115 (prerelease) 
> (Debian 4.1.1-21).
>
> Please note: These patches work only for x86_64 hosts and x86_64 targets. They 
> will break all other architectures. I did not check i386-softmmu. It works 
> for me. 
>
> I apologise for the size of the attachments.
>
> Kind regards
> Axel
>   


[-- Attachment #2: as-postprocesss.diff --]
[-- Type: text/x-patch, Size: 26257 bytes --]

diff -r 83ff8e3c6392 Makefile
--- a/Makefile	Thu Mar 22 12:36:53 2007 +0000
+++ b/Makefile	Sat Mar 24 15:08:16 2007 -0500
@@ -28,7 +28,7 @@ LIBS+=$(AIOLIBS)
 
 all: $(TOOLS) $(DOCS) recurse-all
 
-subdir-%: dyngen$(EXESUF)
+subdir-%: dyngen$(EXESUF) cvtasm$(EXESUF)
 	$(MAKE) -C $(subst subdir-,,$@) all
 
 recurse-all: $(patsubst %,subdir-%, $(TARGET_DIRS))
@@ -39,10 +39,14 @@ dyngen$(EXESUF): dyngen.c
 dyngen$(EXESUF): dyngen.c
 	$(HOST_CC) $(CFLAGS) $(CPPFLAGS) $(BASE_CFLAGS) -o $@ $^
 
+cvtasm$(EXESUF): cvtasm.c
+	$(HOST_CC) $(CFLAGS) $(CPPFLAGS) $(BASE_CFLAGS) -o $@ $^
+
 clean:
 # avoid old build problems by removing potentially incorrect old files
 	rm -f config.mak config.h op-i386.h opc-i386.h gen-op-i386.h op-arm.h opc-arm.h gen-op-arm.h 
 	rm -f *.o *.a $(TOOLS) dyngen$(EXESUF) TAGS *.pod *~ */*~
+	rm -rf cvtasm$(EXESUF)
 	$(MAKE) -C tests clean
 	for d in $(TARGET_DIRS); do \
 	$(MAKE) -C $$d $@ || exit 1 ; \
diff -r 83ff8e3c6392 Makefile.target
--- a/Makefile.target	Thu Mar 22 12:36:53 2007 +0000
+++ b/Makefile.target	Sat Mar 24 15:08:16 2007 -0500
@@ -27,6 +27,7 @@ LIBS=
 LIBS=
 HELPER_CFLAGS=$(CFLAGS)
 DYNGEN=../dyngen$(EXESUF)
+CVTASM=../cvtasm$(EXESUF)
 # user emulator name
 TARGET_ARCH2=$(TARGET_ARCH)
 ifeq ($(TARGET_ARCH),arm)
@@ -78,11 +79,11 @@ cc-option = $(shell if $(CC) $(OP_CFLAGS
 cc-option = $(shell if $(CC) $(OP_CFLAGS) $(1) -S -o /dev/null -xc /dev/null \
               > /dev/null 2>&1; then echo "$(1)"; else echo "$(2)"; fi ;)
 
-OP_CFLAGS+=$(call cc-option, -fno-reorder-blocks, "")
-OP_CFLAGS+=$(call cc-option, -fno-gcse, "")
-OP_CFLAGS+=$(call cc-option, -fno-tree-ch, "")
-OP_CFLAGS+=$(call cc-option, -fno-optimize-sibling-calls, "")
-OP_CFLAGS+=$(call cc-option, -fno-crossjumping, "")
+#OP_CFLAGS+=$(call cc-option, -fno-reorder-blocks, "")
+#OP_CFLAGS+=$(call cc-option, -fno-gcse, "")
+#OP_CFLAGS+=$(call cc-option, -fno-tree-ch, "")
+#OP_CFLAGS+=$(call cc-option, -fno-optimize-sibling-calls, "")
+#OP_CFLAGS+=$(call cc-option, -fno-crossjumping, "")
 OP_CFLAGS+=$(call cc-option, -fno-align-labels, "")
 OP_CFLAGS+=$(call cc-option, -fno-align-jumps, "")
 OP_CFLAGS+=$(call cc-option, -fno-align-functions, $(call cc-option, -malign-functions=0, ""))
@@ -512,7 +513,12 @@ gen-op.h: op.o $(DYNGEN)
 gen-op.h: op.o $(DYNGEN)
 	$(DYNGEN) -g -o $@ $<
 
-op.o: op.c
+op.s: op.c $(CVTASM)
+#	$(CC) $(OP_CFLAGS) $(CPPFLAGS) -E -o op-0.i $<
+	$(CC) $(OP_CFLAGS) $(CPPFLAGS) -fverbose-asm -S -o op-0.s $<
+	$(CVTASM) op-0.s op.s
+
+op.o: op.s
 	$(CC) $(OP_CFLAGS) $(CPPFLAGS) -c -o $@ $<
 
 # HELPER_CFLAGS is used for all the code compiled with static register
@@ -581,7 +587,7 @@ endif
 	$(CC) $(CPPFLAGS) -c -o $@ $<
 
 clean:
-	rm -f *.o  *.a *~ $(PROGS) gen-op.h opc.h op.h nwfpe/*.o slirp/*.o fpu/*.o
+	rm -f *.o op-0.s op.s *.a *~ $(PROGS) gen-op.h opc.h op.h nwfpe/*.o slirp/*.o fpu/*.o
 
 install: all 
 ifneq ($(PROGS),)
diff -r 83ff8e3c6392 cpu-all.h
--- a/cpu-all.h	Thu Mar 22 12:36:53 2007 +0000
+++ b/cpu-all.h	Sat Mar 24 15:08:16 2007 -0500
@@ -339,7 +339,9 @@ static inline void stl_le_p(void *ptr, i
 
 static inline void stq_le_p(void *ptr, uint64_t v)
 {
-    *(uint64_t *)ptr = v;
+    uint8_t *p = ptr;
+    stl_le_p(p, (uint32_t)v);
+    stl_le_p(p + 4, v >> 32);
 }
 
 /* float access */
diff -r 83ff8e3c6392 cvtasm.c
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/cvtasm.c	Sat Mar 24 15:08:16 2007 -0500
@@ -0,0 +1,802 @@
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdarg.h>
+#include <string.h>
+#include <ctype.h>
+
+static void error(const char* fmt, ...) __attribute__((noreturn));
+
+static int cvt_asm(FILE* in, FILE* out);
+
+int main(int argc, char** argv)
+{
+    FILE *in,*out;
+    int r;
+    if ( argc != 3 ) {
+	error("Usage: %s in.s out.s\n", argv[0]);
+    }
+    in = fopen(argv[1],"r");
+    if ( in == NULL ) {
+	error("%s: could not open %s\n", argv[0], argv[1]);
+    }
+    out= fopen(argv[2],"w");
+    if ( out == NULL ) {
+	error("%s: could not open %s\n", argv[0], argv[2]);
+    }
+    r = cvt_asm(in,out);
+    fclose(out);
+    fclose(in);
+    return 0;
+}
+
+static void error(const char* fmt, ...)
+{
+    /* Format straight to stderr: no fixed-size buffer whose length */
+    /* vsnprintf() could overstate when the message is truncated. */
+    va_list va;
+    va_start(va,fmt);
+    vfprintf(stderr,fmt,va);
+    va_end(va);
+    exit(3);
+}
+
+static void alloc_error() __attribute__((noreturn));
+static void alloc_error()
+{
+    fputs("allocation error\n",stderr);
+    exit(3);
+}
+
+static void* smalloc( size_t s ) 
+{
+    void* p= malloc(s);
+    if ( p == NULL )
+	alloc_error ();
+    return p;
+}
+
+static void* srealloc(void* p, size_t ns)
+{
+    void* np= realloc( p, ns );
+    if ( np== NULL)
+	alloc_error ();
+    return np;
+}
+
+#if defined(__i386__) || defined (__x86_64__)
+#define ARCH_HAS_CVT_ASM 1
+
+#define OP_FUNC_PFX "op_"
+#define ASM_RET "ret"
+#define ASM_ALIGN1 ".p2align"
+#define ASM_ALIGN2 ".align"
+#define ASM_DBG_LOC ".loc"
+#define ASM_DBG_FILE ".file"
+#define ASM_JMP_LABEL1 "__op_gen_label1"
+#define ASM_JMP_LABEL2 "__op_jmp0"
+#define ASM_JMP_LABEL3 "__op_jmp1"
+
+static const char* ASM_JUMP_NAMES[]={
+    "jmp",
+    "ja", "jnbe",
+    "jae", "jnb",
+    "jb", "jnae",
+    "jbe", "jna",
+    "je", "jz",
+    "jg", "jnle",
+    "jge", "jnl",
+    "jl", "jnge",
+    "jle", "jng",
+    "jne", "jnz",
+    "jno",
+    "jnp", "jpo",
+    "jns", "jo",
+    "jp", "jpe",
+    "js",
+    0
+};
+
+static int white(const char* p)
+{
+    return ( (*p == ' ') || (*p == '\t') || (*p == '\r') ||
+	     (*p == '\n'));
+	
+}
+
+static int white_or_zero(const char* p)
+{
+    return ( white(p) || (*p == 0));
+}
+
+static const char* eat_white(const char* p)
+{
+    while ( white(p) )
+	++p;
+    return p;
+}
+
+static const char* eat_non_white(const char* p)
+{
+    while ( !white(p) && (*p != 0) )
+	++p;
+    return p;
+}
+
+struct line_s {
+    char* l;
+    size_t n;
+    size_t alloc;
+};
+
+static void line_init( struct line_s* s)
+{
+    memset(s,0,sizeof(*s));
+    s->alloc=128;
+    s->l = smalloc(s->alloc);
+    s->l[0]=0;
+}
+
+static void line_destroy( struct line_s* s)
+{
+    free(s->l);
+    s->l =0;
+    s->alloc=0;
+    s->n=0;
+}
+
+static void line_free( struct line_s* s)
+{
+    line_destroy(s);
+    free(s);
+}
+
+static struct line_s* line_copy( const struct line_s* r)
+{
+    struct line_s* l= smalloc(sizeof(*l));
+    l->alloc= r->alloc;
+    l->l= smalloc( l->alloc );
+    l->n = r->n;
+    memcpy(l->l, r->l, r->n+1);
+    return l;
+}
+
+static void line_clear(struct line_s* s)
+{
+    s->n =0;
+    s->l[0] = 0;
+}
+
+static void line_push_char( struct line_s* s, char c)
+{
+    if ( s->n == s->alloc - 1) {
+	size_t a= s->alloc * 2;
+	char* l= srealloc(s->l, a );
+	s->l = l;
+	s->alloc = a;
+    }
+    s->l[s->n] = c;
+    ++s->n;
+    s->l[s->n] = 0;
+}
+
+static struct line_s* line_read(FILE* in)
+{
+    struct line_s* l= smalloc(sizeof(*l));
+    int c;
+    line_init(l);
+    while ( (c=fgetc(in)) != EOF ) {
+	line_push_char(l,c);
+	if ( c == '\n')
+	    break;
+    }
+    return l;
+}
+
+static void line_write(const struct line_s* l, FILE* out)
+{
+    fwrite(l->l, l->n, 1, out);
+}
+
+static int line_ptr_in_comment(const struct line_s* l,
+			       const char* p)
+{
+    const char* comment = strchr(l->l,'#');
+    int r= p==0 || ( comment == 0 ? 0 : p >= comment);
+    return r;
+}
+
+static size_t line_label_name( const struct line_s* l, 
+			       char* name, size_t bufsize)
+{
+    const char* colon= strchr(l->l,':');
+    size_t s=0;
+    if (colon != 0) {
+	const char *start= l->l;
+	const char *p;
+	/* eat white spaces */
+	start = eat_white(l->l);
+	/* check for no white spaces between start and colon */
+	p = eat_non_white(start);
+	if ( (p >= colon) && !line_ptr_in_comment(l,p) ) {
+	    s= colon - start + 1;
+	    if ((name) && (s <= bufsize)) {
+		memcpy(name,start,s-1);
+		name[s-1] = 0;
+	    }
+	}
+    }
+    return s;
+}
+
+static int line_is_label( const struct line_s* l)
+{
+    /* asm labels:  white spaces Non#: */
+    int r= line_label_name(l, 0, 0) != 0;
+    return r;
+}
+
+static const char* line_find_token( const struct line_s* l,
+				    const char* token) {
+    const char* p= strstr(l->l,token);
+    if ( line_ptr_in_comment(l,p) )
+	p=0;
+    return p;
+}
+
+static size_t line_function_start_name( const struct line_s* l,
+					char* name, size_t bufsize)
+{
+    const char* type= line_find_token(l,".type");
+    const char* func= line_find_token(l,"@function");
+    size_t s=0;
+    if ( (type != NULL) && (func > type)) {
+	/* extract the function name */
+	const char* typeend=eat_non_white(type);
+	++typeend;
+	const char* fname=eat_white(typeend);
+	const char* fnend=strchr(fname,',');
+	if ( !line_ptr_in_comment(l,fnend) && (fnend > fname)) {
+	    s = fnend -fname + 1;
+	    if ((name!=0) && (s <= bufsize)) {
+		memcpy(name,fname,s-1);
+		name[s-1]=0;
+	    }
+	}
+    }
+    return s;
+}
+
+static int line_is_function_start(const struct line_s* l)
+{
+    int r=line_function_start_name(l,0,0) !=0;
+    return r;
+}
+
+static size_t line_function_end_name( const struct line_s* l,
+				      char* name, size_t bufsize)
+{
+    const char* size= line_find_token(l,".size");
+    size_t s=0;
+    if (size != NULL) {
+	/* extract the function name */
+	const char* sizeend=eat_non_white(size);
+	++sizeend;
+	const char* fname=eat_white(sizeend);
+	const char* fnend=strchr(fname,',');
+	if ( !line_ptr_in_comment(l,fnend) && (fnend > fname)) {
+	    s = fnend -fname + 1;
+	    if ((name!=0) && (s <= bufsize)) {
+		memcpy(name,fname,s-1);
+		name[s-1]=0;
+	    }
+	}
+    }
+    return s;
+}
+
+static size_t line_jxx_target_name( const struct line_s* l, 
+				    const char* jmpname,
+				    char* name, size_t bufsize)
+{
+    const char* jmp= line_find_token(l,jmpname);
+    size_t s=0;
+    if ( jmp ) {
+	const char* jmp_end=jmp+strlen(jmpname);
+	if ( white_or_zero(jmp_end) &&
+	     ((l->l == jmp) || white(jmp-1))) {
+	    const char* start=eat_white(jmp_end);
+	    const char* end=eat_non_white (start);
+	    if ( !line_ptr_in_comment(l,end) ) {
+		s= end -start +1;
+		if ((name!=0) && (s <= bufsize)) {
+		    memcpy(name,start,s-1);
+		    name[s-1]=0;
+		}
+	    }
+	}
+    }
+    return s;
+}
+
+static int line_is_jxx( const struct line_s* l, const char* jmpname)
+{
+    int r= line_jxx_target_name(l,jmpname,0,0)!=0;
+    return r;
+}
+
+static size_t line_jump_target_name(const struct line_s* l, 
+				    char* name, size_t bufsize)
+{
+    const char** p= ASM_JUMP_NAMES;
+    size_t s=0;
+    while (*p) {
+	size_t k= line_jxx_target_name(l,*p,name,bufsize);
+	if (k ) {
+	    s = k;
+	    break;
+	}
+	++p;
+    }
+    return s;
+}
+
+static int line_is_jump( const struct line_s* l)
+{
+    int r=line_jump_target_name(l,0,0)!=0;
+    return r;
+}
+
+
+static size_t line_is_ret(const struct line_s* l)
+{
+    const char* ret= line_find_token(l,ASM_RET);
+    size_t s=0;
+    if ( ret  ) {
+	if ( white_or_zero(ret+strlen(ASM_RET)+1) &&
+	     ((l->l == ret) || white(ret-1)))
+	    s=1;
+    }
+    return s;
+}
+
+static size_t line_is_align(const struct line_s* l)
+{
+    const char* a= line_find_token(l,ASM_ALIGN1);
+    const char* token = ASM_ALIGN1;
+    size_t s=0;
+    if ( a == 0) {
+	a= line_find_token(l,ASM_ALIGN2);
+	token = ASM_ALIGN2;
+    }
+    if ( a ) {
+	if ( white_or_zero(a+strlen(token)+1) &&
+	     ((l->l == a) || white(a-1)))
+	    s=1;
+    }
+    return s;
+}
+
+static size_t line_is_dbg(const struct line_s* l)
+{
+    const char* dbg= line_find_token(l,ASM_DBG_LOC);
+    const char* token= ASM_DBG_LOC;
+    size_t s=0;
+    if ( dbg == 0) {
+	dbg = line_find_token(l,ASM_DBG_FILE);
+	token = ASM_DBG_FILE;
+    }
+    if ( dbg  ) {
+	if ( white_or_zero(dbg+strlen(token)+1) &&
+	     ((l->l == dbg) || white(dbg-1)))
+	    s=1;
+    }
+    return s;
+}
+
+static int line_is_function_end(const struct line_s* l)
+{
+    int r=line_function_end_name(l,0,0) !=0;
+    return r;
+}
+
+struct func_lines_s {
+    /* line pointer, contains allocated pointer to allocated lines */
+    struct line_s** line;
+    /* line count */
+    int n; 
+    /* function name. contains allocated pointer to function name */
+    char* fname;
+};
+
+static void func_lines_init(struct func_lines_s* s)
+{
+    memset(s,0,sizeof(*s));
+}
+
+static struct func_lines_s* func_lines_create()
+{
+    struct func_lines_s* l= smalloc(sizeof(*l));
+    func_lines_init (l);
+    return l;
+}
+
+static struct func_lines_s* func_lines_copy(const struct func_lines_s* r)
+{
+    struct func_lines_s* l= func_lines_create ();
+    int i;
+    l->line = (struct line_s**)smalloc(r->n* sizeof(struct line_s*));
+    l->n = r->n;
+    l->fname = strdup(r->fname);
+    for (i=0; i< l->n;++i) {
+	l->line[i]= line_copy(r->line[i]);
+    }
+    return l;
+}
+
+static void func_lines_destroy(struct func_lines_s* s)
+{
+    if ( s->line) {
+	int i;
+	for ( i = 0; i< s->n; ++i) {
+	    line_free(s->line[i]);
+	}
+	free(s->line);
+	s->line =0; // sanity
+    }
+    s->n=0;
+    if ( s->fname ) {
+	free(s->fname);
+	s->fname =0;
+    }
+}
+
+static void func_lines_delete( struct func_lines_s* s)
+{
+    func_lines_destroy(s);
+    free(s);
+}
+
+static void func_lines_append_line( struct func_lines_s* f, 
+				    const struct line_s* l)
+{
+    struct line_s** line= (struct line_s**)
+	srealloc(f->line, (f->n + 1) * sizeof(l));
+    f->line = line;
+    struct line_s* p= line_copy(l);
+    f->line[f->n]=p;
+    ++f->n;
+	
+    if ( f->fname == 0) {
+	size_t s;
+	if ( (s=line_function_start_name(p,0,0))!=0 ) {
+	    f->fname = (char*)smalloc(s);
+	    line_function_start_name (p,f->fname,s);
+	}
+    }
+}
+
+static void func_lines_write(const struct func_lines_s* f, FILE* o)
+{
+    if ( f->line ) {
+	int i=0;
+	for (i=0; i< f->n; ++i ) {
+	    line_write(f->line[i],o);
+	}
+    }
+}
+
+struct line_info_s {
+    /* jump < 0: the line jumps to an external function; jump == 0:
+       no jump; jump > 0: line number of the target.  Jumps to local
+       labels may point to the wrong line.  Jumps through registers
+       and to magic targets (__op_gen_label1) contain the number
+       of the line itself.
+    */
+    int jump;
+    /* ret != 0: the line contains a return instruction */
+    int ret;
+    /* line with label ? */
+    int label;
+    /* does the line contain an instruction ? */
+    int instr;
+};
+
+struct transform_s {
+    int name;
+    int ret_cnt;
+    int ret_not_at_end;
+    int external_jumps;
+    struct line_info_s* line_info;
+};
+
+void transform_init( struct transform_s* s, 
+		     const struct func_lines_s* f)
+{
+    int i,j,n;
+    n= f->n;
+    memset(s,0,sizeof(*s));
+    s->line_info=(struct line_info_s*)
+	smalloc(n * sizeof(*s->line_info));
+    memset(s->line_info,0,n*sizeof(*s->line_info));
+
+    /* initialise the static information */
+    for (i = 0; i< n; ++i) {
+	const struct line_s* l= f->line[i];
+	if ( line_is_label (l) )
+	    s->line_info[i].label=1;
+	if ( line_is_jump (l) )
+	    s->line_info[i].jump=-1;
+	if ( line_is_ret(l) )
+	    s->line_info[i].ret=1;
+	if ( !(line_is_function_end (l) ||
+	       line_is_function_start (l) ||
+	       line_is_label(l) || 
+	       line_is_align (l) ||
+	       line_is_dbg(l))) {
+	    s->line_info[i].instr=1;
+	}
+    }
+    /* now collect the jump target information */
+    for (i=0; i<n; ++i) {
+	if ( s->line_info[i].jump == 0)
+	    continue;
+	const struct line_s* l= f->line[i];
+	/* find the corresponding label */
+	size_t js=line_jump_target_name(l,0,0);
+	char* b1= smalloc(js);
+	line_jump_target_name(l,b1,js);
+	/* jumps with address in register are ok */
+	if ( strchr(b1,'%') != 0) {
+	    s->line_info[i].jump = i;
+	    continue;
+	}
+	/* jumps to __op_gen_label1 are ok */
+	if ( strcmp(b1,ASM_JMP_LABEL1)==0) {
+	    s->line_info[i].jump = i;
+	    continue;
+	}
+	if ( strcmp(b1,ASM_JMP_LABEL2)==0) {
+	    s->line_info[i].jump = i;
+	    continue;
+	}
+	if ( strcmp(b1,ASM_JMP_LABEL3)==0) {
+	    s->line_info[i].jump = i;
+	    continue;
+	}
+	/* js contains the size with trailing 0 */
+	if ((isdigit(b1[0])) && 
+	    ((js>1) && 
+	     ((b1[js-2]=='f') || (b1[js-2]=='b')))) {
+	    // fprintf(stdout, "jmp to local '%s' found\n", b1);
+	    b1[js-2]=0;
+	}
+	// fprintf(stdout, "jmp to '%s' found\n", b1);
+	int label_found = 0;
+	for ( j =0 ; j< n && label_found == 0; ++j) {
+	    if ( j == i )
+		continue;
+	    if ( s->line_info[j].label == 0)
+		continue;
+	    const struct line_s* ll= f->line[j];
+	    size_t ls=line_label_name(ll,0,0);
+	    char* b2= smalloc(ls);
+	    line_label_name(ll,b2,ls);
+	    // fprintf(stdout, "label '%s'\n", b2);
+	    if ( strcmp(b1,b2)==0) {
+		label_found=1;
+		s->line_info[i].jump = j;
+	    }
+	    free(b2);
+	}
+	free(b1);
+    }
+}
+
+void transform_check( struct transform_s* t,
+		      const struct func_lines_s* f)
+{
+    int i;
+    int n=f->n;
+    for (i=0; i<n; ++i) {
+	const struct line_info_s* info= t->line_info+i;
+	if ( info->ret )
+	    ++t->ret_cnt;
+	if ( info->jump < 0) 
+	    ++t->external_jumps;
+    }
+    for (i=n-1;i>=0;--i) {
+	const struct line_info_s* info= t->line_info+i;
+	if ( (info->instr != 0) && (info->ret !=0) ) 
+	    break;
+	if ( (info->instr != 0) && (info->ret ==0) ) {
+	    ++t->ret_not_at_end;
+	    break;
+	}
+    }
+}
+
+static void transform_fix ( const struct transform_s* t,
+			    struct func_lines_s* f)
+{
+    int i;
+    int n= f->n;
+    int last_instr=0;
+    // find the last instruction:
+    for (i=n-1;i>=0;--i) {
+	if ( t->line_info[i].instr != 0 ) {
+	    last_instr=i;
+	    break;
+	}
+    }
+    // produce a label name 
+    char label[128];
+    snprintf(label,sizeof(label),".L%s_exit",f->fname);
+    // replace external jumps with calls
+    for (i=0;i<n;++i) {
+	if (t->line_info[i].jump >= 0) 
+	    continue;
+	char jmp_target[512];
+	struct line_s* l= f->line[i];
+	line_jump_target_name(l,jmp_target,sizeof(jmp_target));
+	char call[4096];
+	size_t s;
+	char jmp_ret[512];
+	jmp_ret[0]=0;
+	if ( i != last_instr ) {
+	    snprintf(jmp_ret,sizeof(jmp_ret),
+		     "# CVTASM FIX ret after jmp\n"
+		     "\tjmp\t%s\n",label);
+	}
+#if defined (__i386__)
+	s=snprintf(call,sizeof(call),
+		   "# CVTASM FIX jmp --> call\n"
+		   "\tcall\t%s\n%s",
+		   jmp_target,jmp_ret);
+#endif
+#if defined (__x86_64__)
+	s=snprintf(call,sizeof(call),
+		   "# CVTASM FIX jmp --> call\n"
+		   "\tsubq\t$8, %%rsp\n"
+		   "\tcall\t%s\n"
+		   "\taddq\t$8, %%rsp\n%s",
+		   jmp_target,jmp_ret);
+#endif
+	line_clear(l);
+	int j;
+	for (j=0;j<s;++j)
+	    line_push_char(l,call[j]);
+    }
+    struct line_s* l= f->line[last_instr];
+    char newlines[4096];
+    size_t s;
+    if ( t->line_info[last_instr].ret !=0 ) {
+	// insert label before the ret.
+	s=snprintf(newlines,sizeof(newlines),
+		   "# CVTASM FIX label before ret \n"
+		   "%s:\n%s",
+		   label, l->l);
+    } else {
+	// no ret at the end: append the label and a ret.
+	s=snprintf(newlines,sizeof(newlines),
+		   "%s%s:\n"
+		   "# CVTASM FIX ret at end\n"
+		   "\tret\n",l->l,label);
+    }
+    // convert the line
+    line_clear(l);
+    for (i =0; i< (int)s; ++i)
+	line_push_char(l,newlines[i]);
+    // replace internal rets with jmps to the generated label.
+    for (i=0;i<last_instr;++i) {
+	if ( t->line_info[i].ret == 0)
+	    continue;
+	l= f->line[i];
+	line_clear(l);
+	s=snprintf(newlines,sizeof(newlines),
+		   "# CVTASM FIX ret --> jmp to end\n"
+		   "\tjmp %s\n",
+		   label);
+	int j;
+	for (j=0;j<s;++j)
+	    line_push_char(l,newlines[j]);
+    }
+}
+
+struct func_lines_s* transform_function ( const struct func_lines_s* f)
+{
+    struct func_lines_s* fn= 0;
+    int n;
+    struct transform_s t;
+    transform_init (&t,f);
+    n=f->n;
+    // do the transformation checks.
+    do {
+	// Name must start with op.
+	if ( strncmp(f->fname,OP_FUNC_PFX,3)!=0) 
+	    break;
+	// exit tb has more than one exit
+	if ( strcmp(f->fname,"op_exit_tb")==0)
+	    break;
+	// exit tb has more than one exit
+	if ( strcmp(f->fname,"op_exit_tb")==0)
+	    break;
+	transform_check (&t,f);
+	if ( t.ret_cnt != 1 )
+	    fprintf(stdout, 
+		    "'%s' needs fixing (return count %i)\n", 
+		    f->fname, t.ret_cnt);
+	if ( t.external_jumps != 0 )
+	    fprintf(stdout, 
+		    "'%s' needs fixing (external jmps %i)\n", 
+		    f->fname, t.external_jumps);
+	if ( t.ret_not_at_end != 0 )
+	    fprintf(stdout, 
+		    "'%s' needs fixing "
+		    "(return not last instruction)\n", 
+		    f->fname);
+	if (t.ret_cnt != 1 || t.ret_not_at_end || 
+	    t.external_jumps)  {
+	    fn = func_lines_copy (f);
+	    transform_fix(&t,fn);
+	}
+    } while (0);
+    return fn;
+}
+
+int cvt_asm( FILE* in, FILE* out)
+{
+    struct line_s* l;
+    struct func_lines_s* f=0;
+    int done =0;
+    do {
+	l = line_read(in);
+	if ( l->n == 0) {
+	    if ( f != 0 )
+		error("Not terminated function");
+	    done = 1;
+	}
+	if ( f != NULL ) {
+	    /* collecting into f */
+	    func_lines_append_line (f, l);
+	    if ( line_is_function_end (l) ) {
+		/* check for the right function end here */
+		/* Transformation is done here */
+		struct func_lines_s* fn=
+		    transform_function(f);
+		if ( fn ) {
+		    func_lines_write(fn,out);
+		    func_lines_delete(fn);
+		} else {
+		    func_lines_write(f,out);
+		} 
+		func_lines_delete(f);
+		f=0;
+	    }
+	} else {
+	    /* check if we must start collecting into a new f */
+	    if ( line_is_function_start(l) ) {
+		f= func_lines_create();
+		func_lines_append_line(f,l);
+	    } else {
+		/* otherwise copy to output */
+		line_write(l,out);
+	    }
+	}
+	free(l);
+    } while ( !done );
+    if ( f )
+	free (f);
+    return 0;
+}
+
+#endif
+
+#if !defined (ARCH_HAS_CVT_ASM)
+int cvt_asm(FILE* in, FILE* out)
+{
+    int c;
+    while ( (c=fgetc(in))!= EOF) 
+	fputc(c,out);
+    return 0;
+}
+#endif
diff -r 83ff8e3c6392 exec-all.h
--- a/exec-all.h	Thu Mar 22 12:36:53 2007 +0000
+++ b/exec-all.h	Sat Mar 24 15:08:16 2007 -0500
@@ -337,6 +337,26 @@ do {\
 		  "1:\n");\
 } while (0)
 
+#elif defined (__x86_64__)
+
+/* GCC 4 optimises away the labels after the goto :-( */
+/* This is the main reason for the crashes of qemu if compiled with */
+/* gcc 4 */
+#define GOTO_TB(opname, tbparam, n)					\
+do {									\
+    void* target=(void *)(((TranslationBlock *)tbparam)->tb_next[n]);	\
+    __asm__ __volatile__						\
+	    ( ".data\n\t"						\
+	      ".align 8 \n"						\
+	      ASM_OP_LABEL_NAME(n, opname) ":\n"			\
+              ".quad 1f\n"						\
+              ".previous \n\t"						\
+	      "jmp *%0\n\t"						\
+	      "1:\n\t"							\
+	      :								\
+	      :"a"(target));						\
+} while (0)
+
 #else
 
 /* jump to next block operations (more portable code, does not need
diff -r 83ff8e3c6392 target-i386/exec.h
--- a/target-i386/exec.h	Thu Mar 22 12:36:53 2007 +0000
+++ b/target-i386/exec.h	Sat Mar 24 15:08:16 2007 -0500
@@ -231,10 +231,14 @@ static inline void stfq(target_ulong ptr
 {
     union {
         double d;
-        uint64_t i;
+	struct {
+	    uint32_t lo;
+	    uint32_t hi;
+	} i;
     } u;
     u.d = v;
-    stq(ptr, u.i);
+    stl(ptr, u.i.lo);
+    stl(ptr + 4, u.i.hi);
 }
 
 static inline float ldfl(target_ulong ptr)
@@ -316,7 +320,13 @@ typedef union {
 typedef union {
     long double d;
     struct {
-        unsigned long long lower;
+	union {
+	    unsigned long long lower;
+            struct {
+		uint32_t lo;
+		uint32_t hi;
+	    } split;
+	};
         unsigned short upper;
     } l;
 } CPU86_LDoubleU;
@@ -444,7 +454,8 @@ static inline void helper_fstt(CPU86_LDo
     CPU86_LDoubleU temp;
     
     temp.d = f;
-    stq(ptr, temp.l.lower);
+    stl(ptr, temp.l.split.lo);
+    stl(ptr + 4, temp.l.split.hi);
     stw(ptr + 8, temp.l.upper);
 }
 
@@ -501,6 +512,7 @@ void helper_hlt(void);
 void helper_hlt(void);
 void helper_monitor(void);
 void helper_mwait(void);
+void helper_pshufw(uint16_t *dst, uint16_t *src, int order);
 
 extern const uint8_t parity_table[256];
 extern const uint8_t rclw_table[32];
diff -r 83ff8e3c6392 target-i386/helper.c
--- a/target-i386/helper.c	Thu Mar 22 12:36:53 2007 +0000
+++ b/target-i386/helper.c	Sat Mar 24 15:08:16 2007 -0500
@@ -3452,8 +3452,10 @@ void helper_fxrstor(target_ulong ptr, in
         nb_xmm_regs = 8 << data64;
         addr = ptr + 0xa0;
         for(i = 0; i < nb_xmm_regs; i++) {
-            env->xmm_regs[i].XMM_Q(0) = ldq(addr);
-            env->xmm_regs[i].XMM_Q(1) = ldq(addr + 8);
+            env->xmm_regs[i].XMM_L(0) = ldl(addr);
+            env->xmm_regs[i].XMM_L(1) = ldl(addr + 4);
+            env->xmm_regs[i].XMM_L(2) = ldl(addr + 8);
+            env->xmm_regs[i].XMM_L(3) = ldl(addr + 12);
             addr += 16;
         }
     }
diff -r 83ff8e3c6392 target-i386/helper2.c
--- a/target-i386/helper2.c	Thu Mar 22 12:36:53 2007 +0000
+++ b/target-i386/helper2.c	Sat Mar 24 15:08:16 2007 -0500
@@ -1034,3 +1034,11 @@ void save_native_fp_state(CPUState *env)
     env->native_fp_regs = 0;
 }
 #endif
+
+void helper_pshufw(uint16_t *dst, uint16_t *src, int order)
+{
+    dst[0] = src[order & 3];
+    dst[1] = src[(order >> 2) & 3];
+    dst[2] = src[(order >> 4) & 3];
+    dst[3] = src[(order >> 6) & 3];
+}
diff -r 83ff8e3c6392 target-i386/op.c
--- a/target-i386/op.c	Thu Mar 22 12:36:53 2007 +0000
+++ b/target-i386/op.c	Sat Mar 24 15:08:16 2007 -0500
@@ -18,7 +18,7 @@
  * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
  */
 
-#define ASM_SOFTMMU
+//#define ASM_SOFTMMU
 #include "exec.h"
 
 /* n must be a constant to be efficient */
diff -r 83ff8e3c6392 target-i386/ops_sse.h
--- a/target-i386/ops_sse.h	Thu Mar 22 12:36:53 2007 +0000
+++ b/target-i386/ops_sse.h	Sat Mar 24 15:08:16 2007 -0500
@@ -580,16 +580,9 @@ void OPPROTO glue(op_movq_T0_mm, SUFFIX)
 #if SHIFT == 0
 void OPPROTO glue(op_pshufw, SUFFIX) (void)
 {
-    Reg r, *d, *s;
-    int order;
-    d = (Reg *)((char *)env + PARAM1);
-    s = (Reg *)((char *)env + PARAM2);
-    order = PARAM3;
-    r.W(0) = s->W(order & 3);
-    r.W(1) = s->W((order >> 2) & 3);
-    r.W(2) = s->W((order >> 4) & 3);
-    r.W(3) = s->W((order >> 6) & 3);
-    *d = r;
+    helper_pshufw((uint16_t *)((char *)env + PARAM1),
+		  (uint16_t *)((char *)env + PARAM2),
+		  PARAM3);
 }
 #else
 void OPPROTO op_shufps(void)
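
To illustrate the kind of rewrite that transform_fix performs, here is a
minimal stand-alone sketch. It is not the code from the patch: it works on a
plain array of strings instead of struct line_s, the names is_ret, fix_rets
and LINE_MAX_LEN are hypothetical, and it assumes that the last line of the
function is a ret. Every earlier ret becomes a jump to an exit label, and the
label is placed in front of the final ret.

```c
#include <stdio.h>
#include <string.h>

/* Size of each output line buffer (assumption for this sketch). */
#define LINE_MAX_LEN 128

/* Hypothetical helper: true if the line is a bare "ret" instruction. */
static int is_ret(const char *line)
{
    while (*line == ' ' || *line == '\t')
        ++line;
    return strncmp(line, "ret", 3) == 0 &&
           (line[3] == '\0' || line[3] == '\n' ||
            line[3] == ' ' || line[3] == '\t');
}

/*
 * Copy n assembly lines from in[] to out[].  A ret that is not the last
 * line is replaced by a jump to .L<fname>_exit; the last line, if it is
 * a ret, gets that label in front of it.  Returns the number of rets
 * that were replaced by jumps.
 */
static int fix_rets(const char *const *in, char out[][LINE_MAX_LEN],
                    int n, const char *fname)
{
    int i, fixed = 0;
    for (i = 0; i < n; ++i) {
        if (is_ret(in[i]) && i != n - 1) {
            snprintf(out[i], LINE_MAX_LEN, "\tjmp\t.L%s_exit\n", fname);
            ++fixed;
        } else if (is_ret(in[i])) {
            snprintf(out[i], LINE_MAX_LEN, ".L%s_exit:\n%s", fname, in[i]);
        } else {
            snprintf(out[i], LINE_MAX_LEN, "%s", in[i]);
        }
    }
    return fixed;
}
```

After this pass the function has a single exit at its end, which is the
property dyngen relies on.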

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
  2007-03-24 20:15 ` Anthony Liguori
@ 2007-03-25 10:15   ` Axel Zeuner
  2007-03-25 23:46     ` Anthony Liguori
  2007-03-25 12:12   ` Axel Zeuner
  2007-04-20 16:57   ` qemu + gcc4 (Was: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works) Gwenole Beauchesne
  2 siblings, 1 reply; 17+ messages in thread
From: Axel Zeuner @ 2007-03-25 10:15 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: qemu-devel

On Saturday 24 March 2007 21:15, Anthony Liguori wrote:
> Axel Zeuner wrote:
> > Hi,
>
> Hi Axel,
>
> By adding some GCC4 fixes on top of your patch, I was able to get qemu
> for i386 (on i386) to compile and run.  So far, I've only tested a win2k
> guest.
Hi Anthony,

Thank you for testing; I am glad to hear about your success. I have applied your 
patches, then compiled and checked qemu-i386-softmmu on i386 without kqemu with 
FreeDOS. It works as well.

> The big problem (which pbrook helped me with) was GCC4 freaking out over
> some stq's.  Splitting up the 64bit ops into 32bit ops seemed to address
> most of the problems.
>
> The tricky thing I still can't figure out is how to get ASM_SOFTMMU
> working.  The problem is GLUE(st, SUFFIX) function.  First GCC cannot
> deal with the register pressure.  The problem I can't seem to fix though
> is that GCC sticks %1 in %esi because we're only using an "r"
> constraint, not a "q" constraint.  This results in the generation of
> %sib which is an invalid register.  However, refactoring the code to not
> require a "q" constraint doesn't seem to help either.
In the past I made some patches (not published yet) to speed up the helpers 
for 64 bit operations in target-i386/helper.c on x86_64 and i386 using gcc inline 
assembly.  x86_64 was really easy, but for i386 I had to use "m" and "=m" 
constraints and as few inputs and outputs as possible. 
> The attached patch is what I have so far.  Some help with people more
> familiar with gcc asm foo would be appreciated!

May I suggest some changes?  
I would prefer not to split the 64 bit accesses on hosts that support them 
natively, i.e. something like this:
===================================================================
--- cpu-all.h   (revision 16)
+++ cpu-all.h   (working copy)
@@ -339,7 +339,13 @@

 static inline void stq_le_p(void *ptr, uint64_t v)
 {
-    *(uint64_t *)ptr = v;
+#if (HOST_LONG_BITS < 64)
+    uint8_t *p = ptr;
+    stl_le_p(p, (uint32_t)v);
+    stl_le_p(p + 4, v >> 32);
+#else
+    *(uint64_t*)ptr = v;
+#endif
 }
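
As a stand-alone illustration of the split (only a sketch: store_le32 and
store_le64_split are hypothetical names, not the functions from cpu-all.h),
a 64 bit little-endian store can be written as two 32 bit stores, low half
first:

```c
#include <stdint.h>

/* Store a 32 bit value byte by byte in little-endian order. */
static void store_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Store a 64 bit value as two 32 bit halves: low word at p, high at p + 4. */
static void store_le64_split(void *ptr, uint64_t v)
{
    uint8_t *p = ptr;
    store_le32(p, (uint32_t)v);
    store_le32(p + 4, (uint32_t)(v >> 32));
}
```

The bytes in memory are the same as for a single 64 bit store on a
little-endian host; only the number of store instructions differs.
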
Furthermore I think one should move helper_pshufw() from target-i386/helper2.c 
into target-i386/helper.c where all the other helper methods reside.   
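
For reference, the word shuffle itself can be sketched in plain C as below.
This is an illustration under assumptions, not the code from the patch:
pshufw_sketch is a hypothetical name, and the temporary copy is an addition
so that the result stays correct when dst and src point to the same register
(the original op built its result in a temporary Reg, presumably for the same
reason).

```c
#include <stdint.h>
#include <string.h>

/*
 * Select four 16 bit words from src according to the 2 bit fields of
 * 'order'.  The source is copied first, so dst == src is safe.
 */
static void pshufw_sketch(uint16_t *dst, const uint16_t *src, int order)
{
    uint16_t tmp[4];
    memcpy(tmp, src, sizeof(tmp));
    dst[0] = tmp[order & 3];
    dst[1] = tmp[(order >> 2) & 3];
    dst[2] = tmp[(order >> 4) & 3];
    dst[3] = tmp[(order >> 6) & 3];
}
```

With order 0x1B the four words are reversed, also when source and
destination are the same array.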

Kind Regards
Axel

> Regards,
>
> Anthony Liguori
>

* Re: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
  2007-03-24 20:15 ` Anthony Liguori
  2007-03-25 10:15   ` Axel Zeuner
@ 2007-03-25 12:12   ` Axel Zeuner
  2007-03-25 23:44     ` Anthony Liguori
  2007-04-20 16:57   ` qemu + gcc4 (Was: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works) Gwenole Beauchesne
  2 siblings, 1 reply; 17+ messages in thread
From: Axel Zeuner @ 2007-03-25 12:12 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: qemu-devel

[-- Attachment #1: Type: text/plain, Size: 677 bytes --]

On Saturday 24 March 2007 21:15, Anthony Liguori wrote:
> The tricky thing I still can't figure out is how to get ASM_SOFTMMU
> working.  The problem is GLUE(st, SUFFIX) function.  First GCC cannot
> deal with the register pressure.  The problem I can't seem to fix though
> is that GCC sticks %1 in %esi because we're only using an "r"
> constraint, not a "q" constraint.  This results in the generation of
> %sib which is an invalid register.  However, refactoring the code to not
> require a "q" constraint doesn't seem to help either.
Hi Anthony,
could you please try the attached patch for softmmu_header.h? It allows compiling 
with gcc4 and ASM_SOFTMMU.

Kind regards
Axel

[-- Attachment #2: softmmu.h.diff.zip --]
[-- Type: application/x-zip, Size: 813 bytes --]

* Re: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
  2007-03-24 17:50 [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works Axel Zeuner
  2007-03-24 20:15 ` Anthony Liguori
@ 2007-03-25 13:40 ` Avi Kivity
  2007-03-26 17:14   ` Axel Zeuner
  1 sibling, 1 reply; 17+ messages in thread
From: Avi Kivity @ 2007-03-25 13:40 UTC (permalink / raw)
  To: qemu-devel

Axel Zeuner wrote:
> A full featured converter (cvtasm) has a lot of dependencies: it has to 
> support all hosts (M) (with all assembler dialects M') and all targets N, 
> i.e. in the worst case one would end with M'x N variants of it, or M x N if 
> one supports only one assembler dialect per host.  It is clear, that the 
> number of variants is one of the biggest disadvantages of such an approach.
>   

Perhaps a mixed approach can be made for gradual conversion: for 
combinations where cvtasm has been written, use that.  Where that's 
still to be done, have dyngen generate call instructions to the ops 
instead of pasting the ops text directly.
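
The difference can be pictured with a small C sketch (purely illustrative;
op_fn, run_ops and the two ops are hypothetical and not dyngen's real
interface): instead of copying the machine code of each op into the
translation buffer, the translated block becomes a sequence of calls to the
op functions.

```c
#include <stddef.h>

/* Hypothetical op signature: each op works on a shared state word. */
typedef void (*op_fn)(long *state);

static void op_inc(long *state)    { *state += 1; }
static void op_double(long *state) { *state *= 2; }

/* "Translated block" as a list of ops that are called one after another. */
static void run_ops(const op_fn *ops, size_t n, long *state)
{
    size_t i;
    for (i = 0; i < n; ++i)
        ops[i](state);   /* one call per op instead of pasted op text */
}
```

Such ops can be compiled by any compiler version, since nothing depends on
the shape of the generated code; the price is a call and return per op.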


-- 
error compiling committee.c: too many arguments to function

* Re: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
  2007-03-25 12:12   ` Axel Zeuner
@ 2007-03-25 23:44     ` Anthony Liguori
  2007-03-26  6:16       ` Axel Zeuner
  0 siblings, 1 reply; 17+ messages in thread
From: Anthony Liguori @ 2007-03-25 23:44 UTC (permalink / raw)
  To: Axel Zeuner; +Cc: qemu-devel

Axel Zeuner wrote:
> On Saturday 24 March 2007 21:15, Anthony Liguori wrote:
>   
>> The tricky thing I still can't figure out is how to get ASM_SOFTMMU
>> working.  The problem is GLUE(st, SUFFIX) function.  First GCC cannot
>> deal with the register pressure.  The problem I can't seem to fix though
>> is that GCC sticks %1 in %esi because we're only using an "r"
>> constraint, not a "q" constraint.  This results in the generation of
>> %sib which is an invalid register.  However, refactoring the code to not
>> require a "q" constraint doesn't seem to help either.
>>     
> Hi Anthony,
> could you please try the attached patch for softmmu_header.h? Allows compiling 
> with gcc4 and ASM_SOFTMMU.
>   

That did the trick.  Could you explain what your changes did?

Regards,

Anthony Liguori

> Kind regards
> Axel
>   

* Re: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
  2007-03-25 10:15   ` Axel Zeuner
@ 2007-03-25 23:46     ` Anthony Liguori
  2007-03-26  5:49       ` Axel Zeuner
  0 siblings, 1 reply; 17+ messages in thread
From: Anthony Liguori @ 2007-03-25 23:46 UTC (permalink / raw)
  To: Axel Zeuner; +Cc: qemu-devel

Axel Zeuner wrote:
> On Saturday 24 March 2007 21:15, Anthony Liguori wrote:
>   
>> Axel Zeuner wrote:
>>     
>>> Hi,
>>>       
>> Hi Axel,
>>
>> By adding some GCC4 fixes on top of your patch, I was able to get qemu
>> for i386 (on i386) to compile and run.  So far, I've only tested a win2k
>> guest.
>>     
> Hi Anthony,
>
> thank you for the test, I like to hear about your success. I have applied your 
> patches, compiled and checked qemu-i386-softmmu on i386 without kqemu with 
> FreeDos. It works also.
>
>   
>> The big problem (which pbrook helped me with) was GCC4 freaking out over
>> some stq's.  Splitting up the 64bit ops into 32bit ops seemed to address
>> most of the problems.
>>
>> The tricky thing I still can't figure out is how to get ASM_SOFTMMU
>> working.  The problem is GLUE(st, SUFFIX) function.  First GCC cannot
>> deal with the register pressure.  The problem I can't seem to fix though
>> is that GCC sticks %1 in %esi because we're only using an "r"
>> constraint, not a "q" constraint.  This results in the generation of
>> %sib which is an invalid register.  However, refactoring the code to not
>> require a "q" constraint doesn't seem to help either.
>>     
> In the past I made some patches (not published yet) to speed up the helpers 
> for 64 operations in target-i386/helper.c on x86_64 and i386 using gcc inline 
> assembly.  x86_64 was really easy, but for i386 I had to use "m" and "=m" 
> constraints and as less inputs and outputs as possible. 
>   
>> The attached patch is what I have so far.  Some help with people more
>> familiar with gcc asm foo would be appreciated!
>>     
>
> May I suggest some changes?  
> I would like to try not to split the 64 bit accesses on hosts supporting it 
> native, i.e. something like this:
> ===================================================================
> --- cpu-all.h   (revision 16)
> +++ cpu-all.h   (working copy)
> @@ -339,7 +339,13 @@
>
>  static inline void stq_le_p(void *ptr, uint64_t v)
>  {
> -    *(uint64_t *)ptr = v;
> +#if (HOST_LONG_BITS < 64)
> +    uint8_t *p = ptr;
> +    stl_le_p(p, (uint32_t)v);
> +    stl_le_p(p + 4, v >> 32);
> +#else
> +    *(uint64_t*)ptr = v;
> +#endif
>  }
>   

Yes, I think the proper thing to do is to use a configure check for GCC 
version to determine whether to use the 32 bit or the 64 bit version of 
stq_le_p.

There is already a function in cpu-all.h that does the 32 bit version.

> Furthermore I think one should move helper_pshufw() from target-i386/helper2.c 
> into target-i386/helper.c where all the other helper methods reside.   
>   

I moved it to helper2.c because AFAICT helper.c is compiled with the same 
sort of restrictions as op.c which leads to the compile failure.

Regards,

Anthony Liguori

> Kind Regards
> Axel
>
>   
>> Regards,
>>
>> Anthony Liguori
>>
>>     
>
>   

* Re: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
  2007-03-25 23:46     ` Anthony Liguori
@ 2007-03-26  5:49       ` Axel Zeuner
  2007-03-26 22:53         ` Paul Brook
  0 siblings, 1 reply; 17+ messages in thread
From: Axel Zeuner @ 2007-03-26  5:49 UTC (permalink / raw)
  To: Anthony Liguori, qemu-devel

Hi Anthony,

On Monday 26 March 2007 01:46, you wrote:
> Axel Zeuner wrote:
> > On Saturday 24 March 2007 21:15, Anthony Liguori wrote:
> >> Axel Zeuner wrote:
> >>> Hi,
> >>
>
> > Furthermore I think one should move helper_pshufw() from
> > target-i386/helper2.c into target-i386/helper.c where all the other
> > helper methods reside.
>
> I moved to helper2.c because AFAICT helper.c is compiled with the same
> sort of restrictions as op.c which leads to the compile failure.
Yes, helper.c is compiled with the global register variables and its code is 
called directly from the op_xxx functions. One needs the global register 
variables to access global data, since they contain the environment required 
for the emulation. AFAIK helper2.c is used by the CODE_COPY branch on i386 with 
even stronger restrictions, but I may be wrong here.

Kind Regards
Axel Zeuner
>
> Regards,
>
> Anthony Liguori
>

* Re: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
  2007-03-25 23:44     ` Anthony Liguori
@ 2007-03-26  6:16       ` Axel Zeuner
  2007-03-29  2:07         ` Anthony Liguori
  0 siblings, 1 reply; 17+ messages in thread
From: Axel Zeuner @ 2007-03-26  6:16 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: qemu-devel

Hi Anthony,

On Monday 26 March 2007 01:44, you wrote:
> Axel Zeuner wrote:
> > On Saturday 24 March 2007 21:15, Anthony Liguori wrote:
> >> The tricky thing I still can't figure out is how to get ASM_SOFTMMU
> >> working.  The problem is GLUE(st, SUFFIX) function.  First GCC cannot
> >> deal with the register pressure.  The problem I can't seem to fix though
> >> is that GCC sticks %1 in %esi because we're only using an "r"
> >> constraint, not a "q" constraint.  This results in the generation of
> >> %sib which is an invalid register.  However, refactoring the code to not
> >> require a "q" constraint doesn't seem to help either.
> >
> > Hi Anthony,
> > could you please try the attached patch for softmmu_header.h? Allows
> > compiling with gcc4 and ASM_SOFTMMU.
>
> That did the trick.  Could you explain what your changes did?

QEMU/i386 has only three available registers if TARGET_I386 is selected, 
because ebx, ebp, esi and edi are used by the environment and by T0, T1 and 
T2 (AKA A0). This makes inline assembly really ugly. The external C functions 
called in ASM_SOFTMMU are REGPARM(1,2), i.e. they require their first 
arguments in eax and edx.

In the two ld functions three registers (eax, edx, ecx) are required and 
destroyed because an external C function may be called. We relax the register 
pressure a little by forcing the return value (res) into eax, because the 
return value ends up in a destroyed register anyway. Furthermore, the called 
C function returns its value in eax (call %7). 

The st functions are a little more tricky: we need three registers, and the 
assembly code requires a reload of %0 (ptr) after the check whether the 
external function must be called. The external function destroys the three 
remaining registers. After the call, %1 (v) must also be reloaded into a 
register, i.e. we need more registers. Saving registers on the stack does not 
work, because there are already two "m" constraints: if the code is compiled 
with -fomit-frame-pointer, these are expressed as offsets relative to %esp, 
i.e. X(%esp), and would become invalid after pushes onto the stack.

One solution was to force all inputs of the asm block onto the stack. That is 
what replacing the "r" constraints with "m" constraints does: it forces a 
memory reference. Because i386 cannot do direct memory to memory moves, one 
has to reload "m"(v) into ecx again; otherwise the generated assembler code 
is invalid.
It must be mentioned that the generated code is a little slower than the 
original one.
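
A reduced sketch of the idea, assuming GCC-style inline assembly on an x86
host (store_via_mem is a hypothetical name, and this is not the code from
softmmu_header.h): both operands are passed with memory constraints, and the
value is reloaded into ecx by hand because there is no memory to memory move.
On other hosts the sketch falls back to plain C.

```c
#include <stdint.h>

/*
 * Store v to *ptr.  On x86 the operands are memory references ("m" and
 * "=m"), so the compiler does not have to find free registers for them;
 * ecx is used as scratch register and declared as clobbered.
 */
static void store_via_mem(uint32_t *ptr, uint32_t v)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__(
        "movl %1, %%ecx\n\t"   /* reload v from memory into ecx */
        "movl %%ecx, %0\n\t"   /* store ecx to the destination */
        : "=m"(*ptr)
        : "m"(v)
        : "ecx");
#else
    *ptr = v;                  /* portable fallback for other hosts */
#endif
}
```

The extra load from memory is the reason why code of this kind is a bit
slower than a version that keeps v in a register.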

Kind Regards 
Axel
>
> Regards,
>
> Anthony Liguori
>
> > Kind regards
> > Axel

* Re: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
  2007-03-25 13:40 ` [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works Avi Kivity
@ 2007-03-26 17:14   ` Axel Zeuner
  2007-04-06 21:04     ` Rob Landley
  0 siblings, 1 reply; 17+ messages in thread
From: Axel Zeuner @ 2007-03-26 17:14 UTC (permalink / raw)
  To: qemu-devel

Hi Avi,
On Sunday 25 March 2007 15:40, Avi Kivity wrote:
> Axel Zeuner wrote:
> > A full featured converter (cvtasm) has a lot of dependencies: it has to
> > support all hosts (M) (with all assembler dialects M') and all targets N,
> > i.e. in the worst case one would end with M'x N variants of it, or M x N
> > if one supports only one assembler dialect per host.  It is clear, that
> > the number of variants is one of the biggest disadvantages of such an
> > approach.
>
> Perhaps a mixed approach can be made for gradual conversion: for
> combinations where cvtasm has been written, use that.  Where that's
> still to be done, have dyngen generate call instructions to the ops
> instead of pasting the ops text directly.
Perhaps, but I am not sure whether the changes required in dyngen to generate 
calls with parameters to functions, instead of copied code, are much smaller 
than hand-written code generators. Furthermore, one would surely lose some 
performance.

Kind Regards
Axel

* Re: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
  2007-03-26  5:49       ` Axel Zeuner
@ 2007-03-26 22:53         ` Paul Brook
  2007-03-27  5:48           ` Axel Zeuner
  0 siblings, 1 reply; 17+ messages in thread
From: Paul Brook @ 2007-03-26 22:53 UTC (permalink / raw)
  To: qemu-devel; +Cc: Axel Zeuner

> > I moved to helper2.c because AFAICT helper.c is compiled with the same
> > sort of restrictions as op.c which leads to the compile failure.
>
> Yes, helper.c is compiled with the global register variables and the code
> is called directly from the op_xxx functions, but one needs the global
> register variables to access global data, these contain the required
> environment for the emulation. AFAIK helper2.c is used by the CODE_COPY
> branch on i386 with even stronger restrictions, but I may be wrong here.

helper.c is compiled with the same setting as op.c, so has direct access to 
the dyngen state ("T0", "env" etc). helper2.c is regular code. Either may be 
used from op.c, the difference is whether all arguments are explicit. Also, 
if a helper throws an exception it must be in helper.c to avoid clobbering 
CPU state before calling raise_exception.
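
A minimal sketch of that difference, with made-up names (ToyCPUState and the
toy_helper_* functions are not QEMU code, and a plain global pointer stands in
for the "env" global register variable):

```c
#include <stdint.h>

/* Hypothetical, heavily simplified CPU state. */
typedef struct ToyCPUState {
    uint32_t t0;
    uint32_t regs[8];
} ToyCPUState;

/* In op.c-like code env is a global register variable; an ordinary global
   pointer is used here so that the sketch builds anywhere. */
static ToyCPUState *env;
#define T0 (env->t0)

/* op.c-like helper: the state is implicit and reached through env/T0. */
static void toy_helper_implicit(void)
{
    T0 = T0 + env->regs[0];
}

/* Regular code: every input is an explicit argument. */
static uint32_t toy_helper_explicit(ToyCPUState *s, uint32_t value)
{
    return value + s->regs[0];
}
```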

Note that some targets use a different naming scheme. They use helper.c for 
regular code and op_helper.c for op.c-like code. IMHO this is a much better 
naming scheme.

Paul

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
  2007-03-26 22:53         ` Paul Brook
@ 2007-03-27  5:48           ` Axel Zeuner
  0 siblings, 0 replies; 17+ messages in thread
From: Axel Zeuner @ 2007-03-27  5:48 UTC (permalink / raw)
  To: qemu-devel

Hi Paul,
On Tuesday 27 March 2007 00:53, Paul Brook wrote:
> > > I moved to helper2.c because AFAICT helper.c is compiled with the same
> > > sort of restrictions as op.c which leads to the compile failure.
> >
> > Yes, helper.c is compiled with the global register variables and the code
> > is called directly from the op_xxx functions, but one needs the global
> > register variables to access global data, these contain the required
> > environment for the emulation. AFAIK helper2.c is used by the CODE_COPY
> > branch on i386 with even stronger restrictions, but I may be wrong here.
>
> helper.c is compiled with the same setting as op.c, so has direct access to
> the dyngen state ("T0", "env" etc). helper2.c is regular code. Either may
> be used from op.c, the difference is whether all arguments are explicit.
> Also, if a helper throws an exception it must be in helper.c to avoid
> clobbering CPU state before calling raise_exception.

Thank you for the clarification, I was wrong. 

Kind regards
Axel

> Note that some targets use a different naming scheme. They use helper.c for
> regular code and op_helper.c for op.c-like code. IMHO this is a much better
> naming scheme.
>
> Paul

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
  2007-03-26  6:16       ` Axel Zeuner
@ 2007-03-29  2:07         ` Anthony Liguori
  2007-03-29  6:03           ` Axel Zeuner
  0 siblings, 1 reply; 17+ messages in thread
From: Anthony Liguori @ 2007-03-29  2:07 UTC (permalink / raw)
  To: Axel Zeuner; +Cc: qemu-devel

[-- Attachment #1: Type: text/plain, Size: 3131 bytes --]

Axel Zeuner wrote:
> Hi Anthony,
>
> On Monday 26 March 2007 01:44, you wrote:
>   
>> Axel Zeuner wrote:
>>     
>>> On Saturday 24 March 2007 21:15, Anthony Liguori wrote:
>>>       
>>>> The tricky thing I still can't figure out is how to get ASM_SOFTMMU
>>>> working.  The problem is GLUE(st, SUFFIX) function.  First GCC cannot
>>>> deal with the register pressure.  The problem I can't seem to fix though
>>>> is that GCC sticks %1 in %esi because we're only using an "r"
>>>> constraint, not a "q" constraint.  This results in the generation of
>>>> %sib which is an invalid register.  However, refactoring the code to not
>>>> require a "q" constraint doesn't seem to help either.
>>>>         
>>> Hi Anthony,
>>> could you please try the attached patch for softmmu_header.h? Allows
>>> compiling with gcc4 and ASM_SOFTMMU.
>>>       
>> That did the trick.  Could you explain what your changes did?
>>     
>
> QEMU/i386 has only three available registers if TARGET_I386 is selected,
> because ebx, ebp, esi and edi are used by the environment and by T0, T1 and
> T2 (AKA A0). This makes inline assembly really ugly. The external C functions
> called in ASM_SOFTMMU are REGPARM(1,2), i.e. they require their first
> arguments in eax and edx.
>   

Based on some feedback from Paul Brook, I wrote another patch that just 
disables the use of register variables for GCC4.  I think this is a 
considerably less hackish way to go about this.

The generated code won't be as nice of course but at least it works.  
The patch applies against your cvtasm patches.
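
Roughly, the fallback path looks like the sketch below. It is a simplified
stand-in rather than the patch itself: ToyCPUState and toy_op_addl_T0_T1 are
made-up names, and env is a plain global here, whereas in the patch env stays
a register variable.

```c
#include <stdint.h>

typedef uint32_t target_ulong;

typedef struct ToyCPUState {
    /* temporaries live here when no host registers are reserved for them */
    target_ulong t0, t1, t2;
} ToyCPUState;

static ToyCPUState *env;

/* USE_REGISTER_VARIABLES is deliberately left undefined in this sketch, so
   the memory-backed definitions below are the ones that get used. */
#ifdef USE_REGISTER_VARIABLES
#error "the register-variable path is not part of this sketch"
#else
#define T0 (env->t0)
#define T1 (env->t1)
#define T2 (env->t2)
#endif
#define A0 T2

/* An op body reads the same either way; only where T0/T1 live changes. */
static void toy_op_addl_T0_T1(void)
{
    T0 += T1;
}
```

Every access to T0/T1 becomes a load or store through env, which is why the
generated code is less nice.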

Regards,

Anthony Liguori

> In the two ld functions three registers (eax, edx, ecx) are required and 
> destroyed because an external C function may be called. We relax the register 
> pressure a little bit by forcing the return value (res) into eax, because 
> the return value is returned in a destroyed register. Furthermore the called 
> C function returns its value in eax anyway (call %7). 
>
> The st functions are a little more tricky: we need three registers and the 
> assembly code requires a reload of %0 (ptr) after the check whether the 
> external function must be called. In the external function the three 
> remaining registers are destroyed. After the call, %1 (v) also needs to be 
> reloaded into a register, i.e. we need more registers. Saving registers on 
> the stack does not work, because there are already two "m" constraints: if 
> the code is compiled with -fomit-frame-pointer these are expressed as offsets 
> relative to %esp, i.e. X(%esp), and would become invalid after pushes onto 
> the stack.
>
> One solution was to force all inputs to the asm block onto the stack; that is 
> what replacing the "r" constraints with "m" constraints does: it forces a 
> memory reference. Because i386 cannot do direct memory-to-memory moves, one 
> has to reload "m"(v) into ecx again, otherwise the generated assembler code 
> is invalid.
> It must be mentioned that the generated code is a little bit slower than the 
> original one.
>
> Kind Regards 
> Axel
>   
>> Regards,
>>
>> Anthony Liguori
>>
>>     
>>> Kind regards
>>> Axel
>>>       
>
>   
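
For reference, the "r" versus "q" issue can be shown in isolation. The sketch
below is not the softmmu_header.h code; toy_store_byte is a made-up name, and
the inline asm is only used on x86 hosts with a GCC-compatible compiler, with
plain C as the fallback elsewhere. On i386 only eax, ebx, ecx and edx have
byte-sized subregisters, so a byte store needs "q"; with "r" the compiler may
pick esi and emit the nonexistent %sib.

```c
#include <stdint.h>

/* Store the low byte of v at *ptr. */
static void toy_store_byte(uint8_t *ptr, uint32_t v)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* "q" restricts v to a register whose low byte can be named;
       %b1 prints that byte register (e.g. %al, %cl). */
    __asm__ volatile ("movb %b1, %0"
                      : "=m" (*ptr)
                      : "q" (v));
#else
    /* Portable fallback so the sketch still builds on other hosts. */
    *ptr = (uint8_t)v;
#endif
}
```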


[-- Attachment #2: gcc4-register-pressure.diff --]
[-- Type: text/x-patch, Size: 2053 bytes --]

diff -r d19a5903d749 softmmu_header.h
--- a/softmmu_header.h	Tue Mar 27 13:23:10 2007 -0500
+++ b/softmmu_header.h	Tue Mar 27 13:23:21 2007 -0500
@@ -240,9 +240,13 @@ static inline void glue(glue(st, SUFFIX)
                   "2:\n"
                   : 
                   : "r" (ptr), 
+#ifdef USE_REGISTER_VARIABLES
 /* NOTE: 'q' would be needed as constraint, but we could not use it
    with T1 ! */
                   "r" (v), 
+#else
+		  "q" (v),
+#endif
                   "i" ((CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS), 
                   "i" (TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS), 
                   "i" (TARGET_PAGE_MASK | (DATA_SIZE - 1)),
diff -r d19a5903d749 target-i386/cpu.h
--- a/target-i386/cpu.h	Tue Mar 27 13:23:10 2007 -0500
+++ b/target-i386/cpu.h	Tue Mar 27 13:23:21 2007 -0500
@@ -26,6 +26,10 @@
 #define TARGET_LONG_BITS 64
 #else
 #define TARGET_LONG_BITS 32
+#endif
+
+#if TARGET_LONG_BITS <= HOST_LONG_BITS && __GNUC__ < 4
+#define USE_REGISTER_VARIABLES
 #endif
 
 /* target supports implicit self modifying code */
@@ -424,7 +428,7 @@ typedef union {
 #endif
 
 typedef struct CPUX86State {
-#if TARGET_LONG_BITS > HOST_LONG_BITS
+#ifndef USE_REGISTER_VARIABLES
     /* temporaries if we cannot store them in host registers */
     target_ulong t0, t1, t2;
 #endif
diff -r d19a5903d749 target-i386/exec.h
--- a/target-i386/exec.h	Tue Mar 27 13:23:10 2007 -0500
+++ b/target-i386/exec.h	Tue Mar 27 13:23:21 2007 -0500
@@ -27,12 +27,16 @@
 #define TARGET_LONG_BITS 32
 #endif
 
+#if TARGET_LONG_BITS <= HOST_LONG_BITS && __GNUC__ < 4
+#define USE_REGISTER_VARIABLES
+#endif
+
 #include "cpu-defs.h"
 
 /* at least 4 register variables are defined */
 register struct CPUX86State *env asm(AREG0);
 
-#if TARGET_LONG_BITS > HOST_LONG_BITS
+#ifndef USE_REGISTER_VARIABLES
 
 /* no registers can be used */
 #define T0 (env->t0)
@@ -88,7 +92,7 @@ register target_ulong EDI asm(AREG11);
 #define reg_EDI
 #endif
 
-#endif /* ! (TARGET_LONG_BITS > HOST_LONG_BITS) */
+#endif /* ! USE_REGISTER_VARIABLES */
 
 #define A0 T2
 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
  2007-03-29  2:07         ` Anthony Liguori
@ 2007-03-29  6:03           ` Axel Zeuner
  2007-03-29 15:51             ` Anthony Liguori
  0 siblings, 1 reply; 17+ messages in thread
From: Axel Zeuner @ 2007-03-29  6:03 UTC (permalink / raw)
  To: Anthony Liguori, qemu-devel

Hi Anthony,

On Thursday 29 March 2007 04:07, you wrote:
> Axel Zeuner wrote:
> > Hi Anthony,
> >
> > On Monday 26 March 2007 01:44, you wrote:
> >> Axel Zeuner wrote:
> >>> On Saturday 24 March 2007 21:15, Anthony Liguori wrote:
> >>>> The tricky thing I still can't figure out is how to get ASM_SOFTMMU
> >>>> working.  The problem is GLUE(st, SUFFIX) function.  First GCC cannot
> >>>> deal with the register pressure.  The problem I can't seem to fix
> >>>> though is that GCC sticks %1 in %esi because we're only using an "r"
> >>>> constraint, not a "q" constraint.  This results in the generation of
> >>>> %sib which is an invalid register.  However, refactoring the code to
> >>>> not require a "q" constraint doesn't seem to help either.
> >>>
> >>> Hi Anthony,
> >>> could you please try the attached patch for softmmu_header.h? Allows
> >>> compiling with gcc4 and ASM_SOFTMMU.
> >>
> >> That did the trick.  Could you explain what your changes did?
> >
> > > QEMU/i386 has only three available registers if TARGET_I386 is selected,
> > > because ebx, ebp, esi and edi are used by the environment and by T0, T1
> > > and T2 (AKA A0). This makes inline assembly really ugly. The external C
> > > functions called in ASM_SOFTMMU are REGPARM(1,2), i.e. they require their
> > > first arguments in eax and edx.
>
> Based on some feedback from Paul Brook, I wrote another patch that just
> disables the use of register variables for GCC4.  I think this is a
> considerably less hackish way to go about this.
>
> The generated code won't be as nice of course but at least it works.
> The patch applies against your cvtasm patches.
Looks good to me; sorry, I have not had time to test your patch yet. Did you
check the performance impact of your changes?
Perhaps it is possible to use register variables depending on the number of
registers of the host processor.

Kind Regards
Axel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
  2007-03-29  6:03           ` Axel Zeuner
@ 2007-03-29 15:51             ` Anthony Liguori
  0 siblings, 0 replies; 17+ messages in thread
From: Anthony Liguori @ 2007-03-29 15:51 UTC (permalink / raw)
  To: Axel Zeuner; +Cc: qemu-devel

Axel Zeuner wrote:
> Hi Anthony,
>
> On Thursday 29 March 2007 04:07, you wrote:
>   
>> Axel Zeuner wrote:
>>     
>>> Hi Anthony,
>>>
>>> On Monday 26 March 2007 01:44, you wrote:
>>>       
>>>> Axel Zeuner wrote:
>>>>         
>>>>> On Saturday 24 March 2007 21:15, Anthony Liguori wrote:
>>>>>           
>>>>>> The tricky thing I still can't figure out is how to get ASM_SOFTMMU
>>>>>> working.  The problem is GLUE(st, SUFFIX) function.  First GCC cannot
>>>>>> deal with the register pressure.  The problem I can't seem to fix
>>>>>> though is that GCC sticks %1 in %esi because we're only using an "r"
>>>>>> constraint, not a "q" constraint.  This results in the generation of
>>>>>> %sib which is an invalid register.  However, refactoring the code to
>>>>>> not require a "q" constraint doesn't seem to help either.
>>>>>>             
>>>>> Hi Anthony,
>>>>> could you please try the attached patch for softmmu_header.h? Allows
>>>>> compiling with gcc4 and ASM_SOFTMMU.
>>>>>           
>>>> That did the trick.  Could you explain what your changes did?
>>>>         
>>> QEMU/i386 has only three available registers if TARGET_I386 is selected,
>>> because ebx, ebp, esi and edi are used by the environment and by T0, T1
>>> and T2 (AKA A0). This makes inline assembly really ugly. The external C
>>> functions called in ASM_SOFTMMU are REGPARM(1,2), i.e. they require their
>>> first arguments in eax and edx.
>>>       
>> Based on some feedback from Paul Brook, I wrote another patch that just
>> disables the use of register variables for GCC4.  I think this is a
>> considerably less hackish way to go about this.
>>
>> The generated code won't be as nice of course but at least it works.
>> The patch applies against your cvtasm patches.
>>     
> Looks good to me; sorry, I have not had time to test your patch yet. Did you
> check the performance impact of your changes?
> Perhaps it is possible to use register variables depending on the number of
> registers of the host processor.
>   

Yes, I need to update the patch to include a && defined(__i386__) and 
also to add the proper guards to the other architectures.

Regards,

Anthony Liguori

> Kind Regards
> Axel
>
>   

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works
  2007-03-26 17:14   ` Axel Zeuner
@ 2007-04-06 21:04     ` Rob Landley
  0 siblings, 0 replies; 17+ messages in thread
From: Rob Landley @ 2007-04-06 21:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: Axel Zeuner

On Monday 26 March 2007 1:14 pm, Axel Zeuner wrote:
> Hi Avi,
> On Sunday 25 March 2007 15:40, Avi Kivity wrote:
> > Axel Zeuner wrote:
> > > A full featured converter (cvtasm) has a lot of dependencies: it has to
> > > support all hosts (M) (with all assembler dialects M') and all targets N,
> > > i.e. in the worst case one would end with M'x N variants of it, or M x N
> > > if one supports only one assembler dialect per host.  It is clear, that
> > > the number of variants is one of the biggest disadvantages of such an
> > > approach.
> >
> > Perhaps a mixed approach can be made for gradual conversion: for
> > combinations where cvtasm has been written, use that.  Where that's
> > still to be done, have dyngen generate call instructions to the ops
> > instead of pasting the ops text directly.
> Perhaps, but I am not sure that the changes required in dyngen to generate
> calls with parameters to functions, instead of copying code, are much smaller
> than hand-written code generators. Furthermore, one would surely lose some
> performance.
> 
> Kind Regards
> Axel

On a related note, I have this vague urge from time to time to get qemu to 
build with tcc.

Haven't even come close to making it work yet, of course... :)

Rob
-- 
Penguicon 5.0 Apr 20-22, Linux Expo/SF Convention.  Bruce Schneier, Christine 
Peterson, Steve Jackson, Randy Milholland, Elizabeth Bear, Charlie Stross...

^ permalink raw reply	[flat|nested] 17+ messages in thread

* qemu + gcc4 (Was: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works)
  2007-03-24 20:15 ` Anthony Liguori
  2007-03-25 10:15   ` Axel Zeuner
  2007-03-25 12:12   ` Axel Zeuner
@ 2007-04-20 16:57   ` Gwenole Beauchesne
  2 siblings, 0 replies; 17+ messages in thread
From: Gwenole Beauchesne @ 2007-04-20 16:57 UTC (permalink / raw)
  To: qemu-devel

Hi,

> By adding some GCC4 fixes on top of your patch, I was able to get qemu for 
> i386 (on i386) to compile and run.  So far, I've only tested a win2k guest.

For op_pshufw(), please keep the temporary destination register as S and D 
may reference the same register.
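
A small sketch of why the temporary matters, using made-up names (ToyMMXReg
and the toy_pshufw_* functions are illustrative, not the QEMU definitions):

```c
#include <stdint.h>

typedef struct { uint16_t w[4]; } ToyMMXReg;

/* Writes straight into dst. This goes wrong when dst and src are the same
   object, because later lanes read words that were already overwritten. */
static void toy_pshufw_in_place(ToyMMXReg *dst, const ToyMMXReg *src, int order)
{
    dst->w[0] = src->w[order & 3];
    dst->w[1] = src->w[(order >> 2) & 3];
    dst->w[2] = src->w[(order >> 4) & 3];
    dst->w[3] = src->w[(order >> 6) & 3];
}

/* Builds the result in a temporary first, so aliasing is harmless. */
static void toy_pshufw_with_temp(ToyMMXReg *dst, const ToyMMXReg *src, int order)
{
    ToyMMXReg r;
    toy_pshufw_in_place(&r, src, order);
    *dst = r;
}
```

With order 0x1B (reverse the four words) and dst == src holding {1, 2, 3, 4},
the variant with the temporary yields {4, 3, 2, 1}, while the in-place variant
yields {4, 3, 3, 4}.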

FYI, I am experimenting with an alternate gcc4 patch (inlined hereunder).
<http://svn.mandriva.com/svn/packages/cooker/qemu/current/SOURCES/qemu-0.9.0-gcc4.patch>

I have only tested the following configurations with -no-kvm -no-kqemu
- compiler: gcc 4.1.2-1mdv
- guest OS: { winXPsp2, linux }
- guest CPU: { i386, x86_64 (linux-only) }
- host CPU (compiled as): { i386, x86_64 }

PS: I have not tested yet on MacOS X.

Regards,
Gwenole

2007-04-20  Gwenole Beauchesne  <gbeauchesne@mandriva.com>

 	* gcc4 host support.

--- qemu-0.9.0/target-i386/ops_template.h.gcc4	2005-02-21 20:23:59.000000000 +0000
+++ qemu-0.9.0/target-i386/ops_template.h	2007-04-20 14:53:32.000000000 +0000
@@ -268,7 +268,7 @@ static int glue(compute_all_mul, SUFFIX)

  /* various optimized jumps cases */

-void OPPROTO glue(op_jb_sub, SUFFIX)(void)
+DEFINE_OP(glue(op_jb_sub, SUFFIX),
  {
      target_long src1, src2;
      src1 = CC_DST + CC_SRC;
@@ -277,23 +277,23 @@ void OPPROTO glue(op_jb_sub, SUFFIX)(voi
      if ((DATA_TYPE)src1 < (DATA_TYPE)src2)
          GOTO_LABEL_PARAM(1);
      FORCE_RET();
-}
+})

-void OPPROTO glue(op_jz_sub, SUFFIX)(void)
+DEFINE_OP(glue(op_jz_sub, SUFFIX),
  {
      if ((DATA_TYPE)CC_DST == 0)
          GOTO_LABEL_PARAM(1);
      FORCE_RET();
-}
+})

-void OPPROTO glue(op_jnz_sub, SUFFIX)(void)
+DEFINE_OP(glue(op_jnz_sub, SUFFIX),
  {
      if ((DATA_TYPE)CC_DST != 0)
          GOTO_LABEL_PARAM(1);
      FORCE_RET();
-}
+})

-void OPPROTO glue(op_jbe_sub, SUFFIX)(void)
+DEFINE_OP(glue(op_jbe_sub, SUFFIX),
  {
      target_long src1, src2;
      src1 = CC_DST + CC_SRC;
@@ -302,16 +302,16 @@ void OPPROTO glue(op_jbe_sub, SUFFIX)(vo
      if ((DATA_TYPE)src1 <= (DATA_TYPE)src2)
          GOTO_LABEL_PARAM(1);
      FORCE_RET();
-}
+})

-void OPPROTO glue(op_js_sub, SUFFIX)(void)
+DEFINE_OP(glue(op_js_sub, SUFFIX),
  {
      if (CC_DST & SIGN_MASK)
          GOTO_LABEL_PARAM(1);
      FORCE_RET();
-}
+})

-void OPPROTO glue(op_jl_sub, SUFFIX)(void)
+DEFINE_OP(glue(op_jl_sub, SUFFIX),
  {
      target_long src1, src2;
      src1 = CC_DST + CC_SRC;
@@ -320,10 +320,9 @@ void OPPROTO glue(op_jl_sub, SUFFIX)(voi
      if ((DATA_STYPE)src1 < (DATA_STYPE)src2)
          GOTO_LABEL_PARAM(1);
      FORCE_RET();
-}
+})

-void OPPROTO glue(op_jle_sub, SUFFIX)(void)
-{
+DEFINE_OP(glue(op_jle_sub, SUFFIX), {
      target_long src1, src2;
      src1 = CC_DST + CC_SRC;
      src2 = CC_SRC;
@@ -331,39 +330,39 @@ void OPPROTO glue(op_jle_sub, SUFFIX)(vo
      if ((DATA_STYPE)src1 <= (DATA_STYPE)src2)
          GOTO_LABEL_PARAM(1);
      FORCE_RET();
-}
+})

  /* oldies */

  #if DATA_BITS >= 16

-void OPPROTO glue(op_loopnz, SUFFIX)(void)
+DEFINE_OP(glue(op_loopnz, SUFFIX),
  {
      if ((DATA_TYPE)ECX != 0 && !(T0 & CC_Z))
          GOTO_LABEL_PARAM(1);
      FORCE_RET();
-}
+})

-void OPPROTO glue(op_loopz, SUFFIX)(void)
+DEFINE_OP(glue(op_loopz, SUFFIX),
  {
      if ((DATA_TYPE)ECX != 0 && (T0 & CC_Z))
          GOTO_LABEL_PARAM(1);
      FORCE_RET();
-}
+})

-void OPPROTO glue(op_jz_ecx, SUFFIX)(void)
+DEFINE_OP(glue(op_jz_ecx, SUFFIX),
  {
      if ((DATA_TYPE)ECX == 0)
          GOTO_LABEL_PARAM(1);
      FORCE_RET();
-}
+})

-void OPPROTO glue(op_jnz_ecx, SUFFIX)(void)
+DEFINE_OP(glue(op_jnz_ecx, SUFFIX),
  {
      if ((DATA_TYPE)ECX != 0)
          GOTO_LABEL_PARAM(1);
      FORCE_RET();
-}
+})

  #endif

--- qemu-0.9.0/target-i386/op.c.gcc4	2007-02-02 12:45:51.000000000 +0000
+++ qemu-0.9.0/target-i386/op.c	2007-04-20 15:20:55.000000000 +0000
@@ -18,7 +18,9 @@
   * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
   */

+#if __GNUC__ < 4
  #define ASM_SOFTMMU
+#endif
  #include "exec.h"

  /* n must be a constant to be efficient */
@@ -250,6 +252,7 @@ void OPPROTO op_imulb_AL_T0(void)
      EAX = (EAX & ~0xffff) | (res & 0xffff);
      CC_DST = res;
      CC_SRC = (res != (int8_t)res);
+    FORCE_RET();
  }

  void OPPROTO op_mulw_AX_T0(void)
@@ -270,6 +273,7 @@ void OPPROTO op_imulw_AX_T0(void)
      EDX = (EDX & ~0xffff) | ((res >> 16) & 0xffff);
      CC_DST = res;
      CC_SRC = (res != (int16_t)res);
+    FORCE_RET();
  }

  void OPPROTO op_mull_EAX_T0(void)
@@ -290,6 +294,7 @@ void OPPROTO op_imull_EAX_T0(void)
      EDX = (uint32_t)(res >> 32);
      CC_DST = res;
      CC_SRC = (res != (int32_t)res);
+    FORCE_RET();
  }

  void OPPROTO op_imulw_T0_T1(void)
@@ -299,6 +304,7 @@ void OPPROTO op_imulw_T0_T1(void)
      T0 = res;
      CC_DST = res;
      CC_SRC = (res != (int16_t)res);
+    FORCE_RET();
  }

  void OPPROTO op_imull_T0_T1(void)
@@ -308,6 +314,7 @@ void OPPROTO op_imull_T0_T1(void)
      T0 = res;
      CC_DST = res;
      CC_SRC = (res != (int32_t)res);
+    FORCE_RET();
  }

  #ifdef TARGET_X86_64
--- qemu-0.9.0/target-i386/exec.h.gcc4	2006-09-24 18:40:46.000000000 +0000
+++ qemu-0.9.0/target-i386/exec.h	2007-04-20 15:14:38.000000000 +0000
@@ -501,6 +501,7 @@ void update_fp_status(void);
  void helper_hlt(void);
  void helper_monitor(void);
  void helper_mwait(void);
+void helper_pshufw(MMXReg *dst, MMXReg *src, int order);

  extern const uint8_t parity_table[256];
  extern const uint8_t rclw_table[32];
--- qemu-0.9.0/target-i386/helper.c.gcc4	2007-04-20 14:49:44.000000000 +0000
+++ qemu-0.9.0/target-i386/helper.c	2007-04-20 15:00:02.000000000 +0000
@@ -3522,8 +3522,15 @@ void helper_fxrstor(target_ulong ptr, in
          nb_xmm_regs = 8 << data64;
          addr = ptr + 0xa0;
          for(i = 0; i < nb_xmm_regs; i++) {
+#if __GNUC__ < 4
              env->xmm_regs[i].XMM_Q(0) = ldq(addr);
              env->xmm_regs[i].XMM_Q(1) = ldq(addr + 8);
+#else
+            env->xmm_regs[i].XMM_L(0) = ldl(addr);
+            env->xmm_regs[i].XMM_L(1) = ldl(addr + 4);
+            env->xmm_regs[i].XMM_L(2) = ldl(addr + 8);
+            env->xmm_regs[i].XMM_L(3) = ldl(addr + 12);
+#endif
              addr += 16;
          }
      }
--- qemu-0.9.0/target-i386/ops_sse.h.gcc4	2007-01-16 19:28:58.000000000 +0000
+++ qemu-0.9.0/target-i386/ops_sse.h	2007-04-20 15:11:19.000000000 +0000
@@ -581,14 +581,9 @@ void OPPROTO glue(op_movq_T0_mm, SUFFIX)
  void OPPROTO glue(op_pshufw, SUFFIX) (void)
  {
      Reg r, *d, *s;
-    int order;
      d = (Reg *)((char *)env + PARAM1);
      s = (Reg *)((char *)env + PARAM2);
-    order = PARAM3;
-    r.W(0) = s->W(order & 3);
-    r.W(1) = s->W((order >> 2) & 3);
-    r.W(2) = s->W((order >> 4) & 3);
-    r.W(3) = s->W((order >> 6) & 3);
+    helper_pshufw(&r, s, PARAM3);
      *d = r;
  }
  #else
--- qemu-0.9.0/target-i386/helper2.c.gcc4	2007-04-20 14:49:44.000000000 +0000
+++ qemu-0.9.0/target-i386/helper2.c	2007-04-20 15:15:22.000000000 +0000
@@ -1038,3 +1038,11 @@ void save_native_fp_state(CPUState *env)
      env->native_fp_regs = 0;
  }
  #endif
+
+void helper_pshufw(MMXReg *dst, MMXReg *src, int order)
+{
+    dst->MMX_W(0) = src->MMX_W(order & 3);
+    dst->MMX_W(1) = src->MMX_W((order >> 2) & 3);
+    dst->MMX_W(2) = src->MMX_W((order >> 4) & 3);
+    dst->MMX_W(3) = src->MMX_W((order >> 6) & 3);
+}
--- qemu-0.9.0/dyngen-exec.h.gcc4	2007-04-20 14:49:44.000000000 +0000
+++ qemu-0.9.0/dyngen-exec.h	2007-04-20 14:54:50.000000000 +0000
@@ -279,4 +279,24 @@ extern int __op_jmp0, __op_jmp1, __op_jm
  #define EXIT_TB() asm volatile ("rts")
  #endif

+#if defined __i386__ || defined __x86_64__
+#define DEFINE_OP(NAME, ...)						\
+static void OPPROTO glue(impl_, NAME)(void) __attribute__((used));	\
+void OPPROTO glue(impl_, NAME)(void)					\
+{									\
+    asm volatile (".globl " ASM_NAME(NAME));				\
+    asm volatile (".type " ASM_NAME(NAME) ", @function");		\
+    asm volatile (ASM_NAME(NAME) ":");					\
+    __VA_ARGS__;							\
+    asm volatile ("ret");						\
+    asm volatile (".size " ASM_NAME(NAME) ", .-" ASM_NAME(NAME));	\
+}
+#else
+#define DEFINE_OP(NAME, ...)			\
+void OPPROTO NAME(void)				\
+{						\
+    __VA_ARGS__;				\
+}
+#endif
+
  #endif /* !defined(__DYNGEN_EXEC_H__) */
--- qemu-0.9.0/cpu-all.h.gcc4	2007-04-20 14:49:44.000000000 +0000
+++ qemu-0.9.0/cpu-all.h	2007-04-20 14:58:38.000000000 +0000
@@ -339,7 +339,13 @@ static inline void stl_le_p(void *ptr, i

  static inline void stq_le_p(void *ptr, uint64_t v)
  {
+#if __GNUC__ < 4
      *(uint64_t *)ptr = v;
+#else
+    uint8_t *p = ptr;
+    stl_le_p(p, (uint32_t)v);
+    stl_le_p(p + 4, v >> 32);
+#endif
  }

  /* float access */
--- qemu-0.9.0/cpu-exec.c.gcc4	2007-04-20 15:43:06.000000000 +0000
+++ qemu-0.9.0/cpu-exec.c	2007-04-20 15:50:20.000000000 +0000
@@ -737,6 +737,18 @@ int cpu_exec(CPUState *env1)
              );
      }
  }
+#elif defined(__i386__) || defined(__x86_64__)
+		asm volatile ("call *%0"
+			      : /* no outputs */
+			      : "r" (gen_func)
+			      : AREG0, AREG1, AREG2, AREG3
+#ifdef AREG4
+			      , AREG4
+#endif
+#ifdef AREG5
+			      , AREG5
+#endif
+			      );
  #elif defined(__ia64)
  		struct fptr {
  			void *ip;

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2007-04-20 17:02 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-03-24 17:50 [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works Axel Zeuner
2007-03-24 20:15 ` Anthony Liguori
2007-03-25 10:15   ` Axel Zeuner
2007-03-25 23:46     ` Anthony Liguori
2007-03-26  5:49       ` Axel Zeuner
2007-03-26 22:53         ` Paul Brook
2007-03-27  5:48           ` Axel Zeuner
2007-03-25 12:12   ` Axel Zeuner
2007-03-25 23:44     ` Anthony Liguori
2007-03-26  6:16       ` Axel Zeuner
2007-03-29  2:07         ` Anthony Liguori
2007-03-29  6:03           ` Axel Zeuner
2007-03-29 15:51             ` Anthony Liguori
2007-04-20 16:57   ` qemu + gcc4 (Was: [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works) Gwenole Beauchesne
2007-03-25 13:40 ` [Qemu-devel] [RFC/experimental patch] qemu (x86_64 on x86_64 -no-kqemu) compiles with gcc4 and works Avi Kivity
2007-03-26 17:14   ` Axel Zeuner
2007-04-06 21:04     ` Rob Landley
