Hi list!

Here is some news.

----------------------------------------------------------------------------------------------------------

This file is an introduction to a patch trying to provide accelerated OpenGL
to QEMU.

Since the first release of the patch, things have improved a bit, but its
status must still be considered experimental. There is still a great deal
left to do.

Changes/Improvements since first patch
-------------------------------------

* More complete implementation of OpenGL, enough to run real-life programs.
* The communication protocol has been rewritten.
* More KQEMU-friendly in terms of performance, thanks to buffering of
  non-blocking calls. Best performance is now achieved with KQEMU enabled,
  which sounds right, doesn't it?
* The code compiles with fewer warnings and with the -O2 flag (though I still
  hit a GCC bug that makes this ugly my_strlen function pointer in
  helper_opengl.c necessary).
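
To illustrate the buffering of non-blocking calls mentioned above, here is a
minimal sketch. It is not the code from the patch: the gl_batch structure, the
batch_* function names and the record layout are all made up for illustration.
The idea is only that calls returning nothing are queued in memory, and the
expensive guest-to-host transition happens once per batch, when the buffer is
full or when a call needs an answer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BATCH_CAPACITY 4096

/* Hypothetical batching state; not the patch's real data structures. */
typedef struct {
    uint8_t data[BATCH_CAPACITY];
    size_t  used;        /* bytes currently queued */
    int     round_trips; /* number of simulated guest->host transitions */
} gl_batch;

/* Stand-in for the expensive transition to the host. */
static void batch_flush(gl_batch *b)
{
    if (b->used == 0)
        return;
    b->round_trips++;    /* a real client would hand b->data to the host here */
    b->used = 0;
}

/* Queue a call that returns nothing; flush first only if it would not fit. */
static void batch_queue(gl_batch *b, uint32_t func_id,
                        const void *args, uint32_t args_len)
{
    size_t need = sizeof func_id + sizeof args_len + args_len;

    assert(need <= BATCH_CAPACITY);
    if (b->used + need > BATCH_CAPACITY)
        batch_flush(b);

    memcpy(b->data + b->used, &func_id, sizeof func_id);
    b->used += sizeof func_id;
    memcpy(b->data + b->used, &args_len, sizeof args_len);
    b->used += sizeof args_len;
    if (args_len != 0)
        memcpy(b->data + b->used, args, args_len);
    b->used += args_len;
}

/* A call whose result is needed: queue it, then flush immediately. */
static void batch_call_blocking(gl_batch *b, uint32_t func_id,
                                const void *args, uint32_t args_len)
{
    batch_queue(b, func_id, args, args_len);
    batch_flush(b);
}
```

With this shape, ten queued calls followed by one blocking call cost a single
round-trip instead of eleven.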

Architecture summary
--------------------

The patch is composed of three parts:
- a patch for QEMU itself that intercepts any call to 'int $0x99', which is the
  interface I have chosen to dedicate to communication between guest user
  programs and QEMU.
  The modification of existing QEMU code is very slight.
  This part is not very elegant, as Fabrice Bellard pointed out. He proposed an
  alternative that sounds better, but I certainly lack the time and knowledge
  of the related fields to go down that track.
- an OpenGL 'server' that receives OpenGL calls and executes them with the host
  GL library (preferably with accelerated drivers).
- an OpenGL-compliant (well, aiming to be...) library that must be installed
  on the guest virtual machine.
  This library implements the OpenGL API and sends the calls to the QEMU host.
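
The three parts above can be sketched roughly as follows. This is only an
illustration under loud assumptions: the function numbers, host_state,
guest_encode and host_dispatch are hypothetical names, the 'int $0x99' trap is
replaced by a plain function call, and the handlers just record what they were
asked to do instead of calling the host GL library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical function numbers; real ones would come from generated code. */
enum { FN_CLEAR = 0, FN_VIEWPORT = 1, FN_COUNT };

/* Records what the "host GL" was asked to do (no GL context needed). */
typedef struct {
    int     clear_calls;
    int32_t viewport[4];
} host_state;

typedef void (*host_handler)(host_state *hs, const int32_t *args);

static void host_clear(host_state *hs, const int32_t *args)
{
    (void)args;          /* a real handler would pass the mask to glClear */
    hs->clear_calls++;
}

static void host_viewport(host_state *hs, const int32_t *args)
{
    memcpy(hs->viewport, args, sizeof hs->viewport);
}

/* Host-side table: handler and expected argument count per function number. */
static const struct { host_handler fn; uint32_t nargs; } host_table[FN_COUNT] = {
    [FN_CLEAR]    = { host_clear,    1 },
    [FN_VIEWPORT] = { host_viewport, 4 },
};

/* Guest side: pack the function number and its integer arguments.
 * Returns the number of 32-bit words written to msg. */
static size_t guest_encode(int32_t *msg, uint32_t fn,
                           const int32_t *args, uint32_t nargs)
{
    msg[0] = (int32_t)fn;
    memcpy(&msg[1], args, nargs * sizeof *args);
    return 1 + (size_t)nargs;
}

/* Host side: decode and dispatch. Returns 0 on success, -1 if the function
 * number is unknown or the argument count does not match. */
static int host_dispatch(host_state *hs, const int32_t *msg, size_t nwords)
{
    uint32_t fn;

    if (nwords < 1)
        return -1;
    fn = (uint32_t)msg[0];
    if (fn >= FN_COUNT || nwords - 1 != host_table[fn].nargs)
        return -1;
    host_table[fn].fn(hs, &msg[1]);
    return 0;
}
```

Keeping the table on the host side means the guest library only ever sends
numbers and raw arguments, and the host decides what is actually executed.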


Currently supported platforms
-----------------------------
For the moment, only GNU/Linux i386 on both host and guest sides.
This limitation is just the current state of the patch.
I'm pretty confident that the code can be adapted to support 32/64-bit
combinations, and maybe mixed endianness with more work and probably reduced
performance.


How to use it? (ONLY on Linux i386 and with only i386-softmmu as target-list)
---------------
 * Apply the patch and recompile QEMU (it applies cleanly on 10 Feb 2007 CVS)
 * Launch QEMU with the -enable-gl option
 * Get the compiled libGL.so from ./i386-softmmu and copy it to the guest OS.
   You may need to create a libGL.so.1 symlink pointing to libGL.so
 * In the guest OS : LD_LIBRARY_PATH=/path/to/replacement/libdir glxgears
   (LD_LIBRARY_PATH must name the directory containing libGL.so.1, not the
   file itself)

Debugging infrastructure
------------------------

I found it very valuable and time-saving to add the following two small
utilities.

* An OpenGL TCP/IP server (opengl_server.c) that plays the same role as QEMU.
It works together with the OpenGL client library, recompiled with the
'#define TCP_COMMUNICATION' macro, which replaces the 'int $0x99'
communication with a TCP connection to the server. You can then very easily
debug the pure OpenGL part of the mechanism.
(Please note that it is actually a poor server: you must kill it each time
a client has terminated, and it won't work with more than one client
connected at a time.)
The server accepts two command-line arguments:
  * '-debug', which displays the names of all OpenGL functions sent by the
    client
  * '-save', which saves all the data in the /tmp/debug_gl.bin file
For the past few weeks, I've been working almost 99% of the time with that
TCP/IP stuff only, so QEMU integration is still way behind. I've noticed
crashes when playing with QEMU that do not occur when playing with the
TCP/IP stuff, so the QEMU-specific communication protocol may still be buggy.
    
* An OpenGL player (opengl_player.c). This one plays a file
recorded either by the QEMU host side when the environment variable WRITE_GL
is set on the guest side, or by opengl_server when it's launched with
the argument '-save'. Very useful to replay a sequence that has made the
server crash.
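
The record-and-replay idea can be sketched like this. Be careful: the record
layout used here (a function number and a payload length, followed by the
payload) is purely hypothetical, as are the names gl_record_hdr, record_call
and replay_file. The actual layout of /tmp/debug_gl.bin is whatever
opengl_server.c writes, which is not described in this file.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record header; NOT the real debug_gl.bin layout. */
typedef struct {
    uint32_t func_id;
    uint32_t payload_len;
} gl_record_hdr;

/* Append one call to the recording. Returns 0 on success, -1 on I/O error. */
static int record_call(FILE *f, uint32_t func_id,
                       const void *payload, uint32_t len)
{
    gl_record_hdr h = { func_id, len };

    if (fwrite(&h, sizeof h, 1, f) != 1)
        return -1;
    if (len != 0 && fwrite(payload, 1, len, f) != len)
        return -1;
    return 0;
}

typedef void (*replay_cb)(uint32_t func_id, const void *payload,
                          uint32_t len, void *user);

/* Read records until end of file and hand each one to cb.
 * Returns the number of records replayed, or -1 on a truncated or
 * oversized record. */
static int replay_file(FILE *f, replay_cb cb, void *user)
{
    unsigned char buf[1024];
    gl_record_hdr h;
    int count = 0;

    while (fread(&h, sizeof h, 1, f) == 1) {
        if (h.payload_len > sizeof buf)
            return -1;
        if (h.payload_len != 0 &&
            fread(buf, 1, h.payload_len, f) != h.payload_len)
            return -1;
        cb(h.func_id, buf, h.payload_len, user);
        count++;
    }
    return count;
}

/* Example callback: a real player would execute the call; this one only
 * sums the function numbers it sees. */
static void sum_ids(uint32_t func_id, const void *payload,
                    uint32_t len, void *user)
{
    (void)payload;
    (void)len;
    *(uint32_t *)user += func_id;
}
```

Because the player only needs the file, a crashing sequence can be replayed
as many times as needed without the guest being involved at all.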

Status of the OpenGL implementation
-----------------------------------

Well, quite a few things are implemented, but I can't honestly say whether
it's 100% OpenGL 1.0 (I'm pretty sure it's not), 90% OpenGL 1.1, etc... But I
consider that most useful OpenGL calls are now more or less implemented. There
are still extensions missing, of course. When you run parse_gl_h, you can see
the list of what is missing.

I also wouldn't bet too much on the correctness of what is implemented today.
Indeed, I've implemented the API with a very pragmatic approach, adding each
call when a tested application required it.
There are quite a few 'shortcuts' in the implementation because
I'm sometimes too lazy to fully implement the whole specification.
The implementation of some OpenGL calls has been specifically tuned to be
non-blocking, since this reduces the number of client/server round-trips and
greatly improves performance (the same applies to the KQEMU case). As above,
this has been done only on a case-by-case basis.
(By the way, I'd be very happy if someone could explain to me why
 client/server round-trips seem to cost a 40 ms delay even on the 127.0.0.1
 interface...)
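
On the 40 ms question: a possible explanation, not verified against this code,
is the interaction between Nagle's algorithm and delayed ACKs, which on Linux
commonly shows up as a delay of about that size when small writes are followed
by a blocking read. If that is the cause, setting TCP_NODELAY on the sockets of
both the client library and the server should remove it. A minimal sketch (the
helper names are made up; setsockopt, getsockopt and TCP_NODELAY are the
standard socket API):

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Disable Nagle's algorithm on a TCP socket.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
static int disable_nagle(int fd)
{
    int one = 1;

    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof one);
}

/* Query the option back. Returns 1 if TCP_NODELAY is set, 0 if it is not,
 * -1 on error. */
static int nagle_disabled(int fd)
{
    int value = 0;
    socklen_t len = sizeof value;

    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &value, &len) != 0)
        return -1;
    return value != 0;
}
```

The call would go right after socket() or accept(), before the first
request is written.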

Known-to-work (and-not-work) programs
----------------------
(Host computer : Athlon 64 3200+, 512 MB RAM, Ubuntu Edgy 32-bit,
                 ATI X300 with ATI proprietary drivers
Virtual computer : Fedora Core 5)

* Many programs in 'Mesa-6.5.1/progs/demos' that run natively on my computer,
that is to say:
    - arbfplight        85 fps QEMU/KQEMU , 824 fps native
    - arbfslight        100 fps QEMU/KQEMU, 900 fps native
    - arbocclude        Makes QEMU/KQEMU crash, OpenGL server OK (8 fps)
                          (would require asynchronous glGetQueryObjectivARB
                           and glGetQueryObjectuivARB for good performance)
    - bounce            OK 
    - bufferobj         Makes QEMU/KQEMU crash, OpenGL server OK
        (and it has just corrupted my FC5 image right now... grrr, so the next
         tests are done with the TCP/IP server only.)
    - clearspd          OK
    - cubemap           KO on host computer ('<glutCreateWindow> called without
                                                first calling 'glutInit')
    - drawpix           KO (something wrong with glBitmap implementation)
    - engine            OK (would require asynchronous glIsEnabled for good
                            performance)
    - fire              OK
    - fogcoord          OK
    - fplight           KO on host computer ('Sorry, this demo requires
                                                GL_NV_vertex_program')
    - gamma             KO : nothing drawn in the window
    - gearbox           OK
    - gears             OK
    - geartrain         KO on host computer ('<glutCreateWindow> called without
                                                first calling 'glutInit')
    - glinfo            OK
    - gloss             OK
    - glslnoise         OK (though very poor performance on native hardware,
                            probably due to a fallback execution of shaders)
    - gltestperf        KO on host computer ('<glutCreateWindow> called without
                                                first calling 'glutInit')
    - glutfx            OK (though fullscreen prevents mouse events from
                            being sent...)
    - ipers             OK
    - isosurf           KO on host computer ('<glutCreateWindow> called without
                                                first calling 'glutInit')
    - loadbias          OK
    - morph3d           OK (would require asynchronous glGenLists for better
                            performance)
    - multiarb          OK
    - paltex            KO on host computer ('Sorry, GL_EXT_paletted_texture
                                                not supported')
    - pointblast        OK
    - ray               OK
    - readpix           OK
    - reflect           OK
    - renormal          OK
    - shadowtex         OK
    - singlebuffer      OK
    - spectex           OK
    - spriteblast       OK
    - stex3d            OK
    - teapot            OK
    - terrain           OK
    - tessdemo          KO (no handling of several gl contexts)
    - texcyl            OK
    - texdown           OK
    - texenv            OK
    - texobj            KO (Assertion `glIsTexture(TexObj[0])' failed)
    - trispd            OK
    - tunnel            OK
    - tunnel2           KO (the two windows are superposed)
    - vao_demo          KO on host computer ('Sorry, this program requires
                                                GL_APPLE_vertex_array_object')
    - winpos            KO on host computer ('<glutCreateWindow> called without
                                                first calling 'glutInit')
    
    (wow, that's the first time I test the whole set of programs. I'm pleased
     to see that some of them work without fixes or additions to the code ;-))

* fgl_glxgears with and without -fbo. OK in KQEMU/QEMU with very good
  performance

Now, the more fun stuff:

* ppracer : very good performance in KQEMU/QEMU : 40 FPS
            host computer : 60 FPS
* openquartz-glx : good performance with TCP/IP server.
                   Doesn't work in KQEMU/QEMU (even with Mesa rendering)
* darkplaces-linux-686-glx (same game as previous one but different 3D engine)
* googleearth : good performance with TCP/IP server. Not tried in QEMU
* ww2d : good performance with TCP/IP server. Not tried in QEMU
* earth3d : works. Performance not very good even on native hardware.
            Not tried in QEMU
* doom3-demo : good performance with TCP/IP server. Not tried in QEMU
* Mandriva 2007 Live CD : didn't manage to make XGL work, but the drak3d
                          program seemed to believe that there was real 3D
                          hardware.


TODO LIST (almost as long as the first time)
---------
- correctly implement the full OpenGL API
- better integrate the window that pops up on the guest side with the
  main qemu window. I've tried a bit to make OpenGL drawing in the SDL window,
  but I didn't reach a satisfactory result. I didn't try very hard though.
  There are several challenges to make it work: do proper viewport/matrix
  stuff to draw only on the part of the SDL window that corresponds to the
  client GL window, display the mouse pointer, etc...
- integrate it properly into QEMU build system
- fix how the end of the guest process is detected / enable several gl
  contexts at the same time / enable several guest OS processes to use OpenGL
  at the same time / thread safety of the client library...
- much testing and debugging
- clean the code / code review
- more optimizations to reduce the number of necessary round-trips
- improve the way OpenGL extensions are handled, and do what is necessary to
  only require OpenGL 1.0 symbols for host side linking with the host OpenGL
  library. (we could also load all symbols dynamically).
- make parse_gl_h.c parse mesa headers instead of /usr/include GL headers ??
  (this would enable us to have the same generated code and ease binary
   compatibility of the communication protocol between hosts. To be clear, if
   Alice compiles QEMU on her machine and sends the )
- make it run on x86_64, and allow any combination of guest x86/x86_64 and
  host x86/x86_64, and other archs. That means a complete scan of the code,
  checking each time whether it's really an int, long, void*, etc etc....
- improve security if possible (preventing malicious guest code from crashing 
  host qemu)
- port it to other UNIX-like OS with X11
- port it to Windows platform (first OpenGL / WGL, then D3D through Wine 
  libs ?)
- (make a patch to Valgrind to make it happy with 'int $0x99' when running on 
   guest side ?)
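
For the x86_64 item in the list above, one way to stop depending on the size of
int, long or void* is to define the wire format with fixed-width types and an
explicit byte order. The following is only a sketch under assumptions: the
12-byte header, the little-endian choice and the wire_* names are hypothetical
and are not what the patch currently does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header: 32-bit function number + 64-bit guest pointer. */
#define WIRE_HDR_SIZE 12

static void put_u32le(uint8_t *p, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        p[i] = (uint8_t)(v >> (8 * i));
}

static uint32_t get_u32le(const uint8_t *p)
{
    uint32_t v = 0;

    for (int i = 0; i < 4; i++)
        v |= (uint32_t)p[i] << (8 * i);
    return v;
}

static void put_u64le(uint8_t *p, uint64_t v)
{
    for (int i = 0; i < 8; i++)
        p[i] = (uint8_t)(v >> (8 * i));
}

static uint64_t get_u64le(const uint8_t *p)
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}

/* Encode a header. Guest pointers always travel as 64 bits, so a 32-bit
 * guest and a 64-bit host agree on the layout. Returns bytes written. */
static size_t wire_encode(uint8_t *out, uint32_t func_id, uint64_t guest_ptr)
{
    put_u32le(out, func_id);
    put_u64le(out + 4, guest_ptr);
    return WIRE_HDR_SIZE;
}

/* Decode a header written by wire_encode on any architecture. */
static void wire_decode(const uint8_t *in, uint32_t *func_id,
                        uint64_t *guest_ptr)
{
    *func_id = get_u32le(in);
    *guest_ptr = get_u64le(in + 4);
}
```

Serializing byte by byte costs a little compared with copying a struct, which
matches the expectation that mixed combinations would reduce performance.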


SECURITY/ROBUSTNESS ISSUES
--------------------------

Security is really a huge challenge. It's really easy for guest code to make
QEMU/OpenGL crash, most of the time due to implementation bugs of mine of
course, but also because of some "features" (to be polite) of some host OpenGL
drivers themselves... During my repeated tests, I have even managed to make my
X server crash from time to time.
I don't really see how we can ensure that QEMU won't crash even once we have
fixed most bugs on our side. One way is certainly to execute OpenGL calls
in an external process, like the opengl_server. But in that case, how do we
integrate it into the QEMU window?
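
To make the external-process idea concrete, here is a minimal POSIX sketch of
why it helps: the risky work runs in a child created with fork(), and the
parent only observes how the child ended. The names run_isolated, gl_job,
well_behaved_job and crashing_job are made up for illustration, and the sketch
does not address the window integration question at all.

```c
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef void (*gl_job)(void);

/* Run job in a child process.
 * Returns 0 if the child exited normally with status 0, the terminating
 * signal number if it was killed by a signal, and -1 on any other failure. */
static int run_isolated(gl_job job)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        job();           /* child: do the risky work */
        _exit(0);        /* skip atexit handlers and stdio flushing */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
        return 0;
    return -1;
}

/* Stand-ins for a sequence of GL calls that works, and one that crashes. */
static void well_behaved_job(void)
{
}

static void crashing_job(void)
{
    abort();             /* raises SIGABRT, like a failed assertion would */
}
```

In this arrangement a driver bug takes down only the child, and the parent
(QEMU in our case) can report the failure to the guest and carry on.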


FILE LIST
---------

target-i386/
 - opengl_client.c  : the OpenGL guest library
 - helper_opengl.c : decoding of OpenGL calls in QEMU
 - opengl_exec.c : execution of OpenGL calls on host/server side
 - gl_func.h, server_stub.c, client_stub.c : files generated by parse_gl_h.c 
     from gl.h parsing
 - glgetv_cst.h  : file generated by parse_mesa_get_c.c
 - gl_func_perso.h, opengl_func.h : hand-written "prototypes"
 - opengl_player.c : see above
 - opengl_server.c : see above
 - mesa_gl.h, mesa_gl_ext.h, mesa_get.c, mesa_enums.c :
            directly taken from the Mesa project and just renamed with a mesa_
            prefix. Needed by parse_mesa_get_c

CONTRIBUTING
------------

I hope I haven't discouraged people of good will from contributing. The TODO
list is certainly a good place to start. In the short term, for example, I'd
appreciate help with integration into the qemu window.

~~~~~~~~~~~~
Maybe time to stop chatting and go back to code or bed... Have fun!
~~~~~~~~~~~~