* kvm crashes with spice while loading qxl
@ 2011-02-26 11:43 xming
  2011-02-26 12:29   ` [Qemu-devel] " Jan Kiszka
  0 siblings, 1 reply; 56+ messages in thread
From: xming @ 2011-02-26 11:43 UTC (permalink / raw)
  To: kvm

When trying to start X (which loads the qxl driver), the kvm process just crashes.

qemu-kvm 0.14

Startup line:

/usr/bin/kvm -name spaceball,process=spaceball -m 1024 -kernel
/boot/bzImage-2.6.37.2-guest -append "root=/dev/vda ro" -smp 1 -netdev
type=tap,id=spaceball0,script=kvm-ifup-brloc,vhost=on -device
virtio-net-pci,netdev=spaceball0,mac=00:16:3e:00:08:01 -drive
file=/dev/volume01/G-spaceball,if=virtio -vga qxl -spice
port=5957,disable-ticketing -monitor
telnet:192.168.0.254:10007,server,nowait,nodelay -pidfile
/var/run/kvm/spaceball.pid

The host is running vanilla 2.6.37.1 on amd64.

Here is the backtrace:

# gdb /usr/bin/qemu-system-x86_64
GNU gdb (Gentoo 7.2 p1) 7.2
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.gentoo.org/>...
Reading symbols from /usr/bin/qemu-system-x86_64...done.
(gdb) set args -name spaceball,process=spaceball -m 1024 -kernel
/boot/bzImage-2.6.37.2-guest -append "root=/dev/vda ro" -smp 1 -netdev
type=tap,id=spaceball0,script=kvm-ifup-brloc,vhost=on -device
virtio-net-pci,netdev=spaceball0,mac=00:16:3e:00:08:01 -drive
file=/dev/volume01/G-spaceball,if=virtio -vga qxl -spice
port=5957,disable-ticketing -monitor
telnet:192.168.0.254:10007,server,nowait,nodelay -pidfile
/var/run/kvm/spaceball.pid
(gdb) run
Starting program: /usr/bin/qemu-system-x86_64 -name
spaceball,process=spaceball -m 1024 -kernel
/boot/bzImage-2.6.37.2-guest -append "root=/dev/vda ro" -smp 1 -netdev
type=tap,id=spaceball0,script=kvm-ifup-brloc,vhost=on -device
virtio-net-pci,netdev=spaceball0,mac=00:16:3e:00:08:01 -drive
file=/dev/volume01/G-spaceball,if=virtio -vga qxl -spice
port=5957,disable-ticketing -monitor
telnet:192.168.0.254:10007,server,nowait,nodelay -pidfile
/var/run/kvm/spaceball.pid
[Thread debugging using libthread_db enabled]
do_spice_init: starting 0.6.0
spice_server_add_interface: SPICE_INTERFACE_KEYBOARD
spice_server_add_interface: SPICE_INTERFACE_MOUSE
[New Thread 0x7ffff4802710 (LWP 30294)]
spice_server_add_interface: SPICE_INTERFACE_QXL
[New Thread 0x7fffaacae710 (LWP 30295)]
red_worker_main: begin
handle_dev_destroy_surfaces:
handle_dev_destroy_surfaces:
handle_dev_input: start
[New Thread 0x7fffaa4ad710 (LWP 30298)]
[New Thread 0x7fffa9cac710 (LWP 30299)]
[New Thread 0x7fffa94ab710 (LWP 30300)]
[New Thread 0x7fffa8caa710 (LWP 30301)]
[New Thread 0x7fffa3fff710 (LWP 30302)]
[New Thread 0x7fffa37fe710 (LWP 30303)]
[New Thread 0x7fffa2ffd710 (LWP 30304)]
[New Thread 0x7fffa27fc710 (LWP 30305)]
[New Thread 0x7fffa1ffb710 (LWP 30306)]
[New Thread 0x7fffa17fa710 (LWP 30307)]
reds_handle_main_link:
reds_show_new_channel: channel 1:0, connected successfully, over Non Secure link
reds_main_handle_message: net test: latency 5.636000 ms, bitrate
11027768 bps (10.516899 Mbps)
reds_show_new_channel: channel 2:0, connected successfully, over Non Secure link
red_dispatcher_set_peer:
handle_dev_input: connect
handle_new_display_channel: jpeg disabled
handle_new_display_channel: zlib-over-glz disabled
reds_show_new_channel: channel 4:0, connected successfully, over Non Secure link
red_dispatcher_set_cursor_peer:
handle_dev_input: cursor connect
reds_show_new_channel: channel 3:0, connected successfully, over Non Secure link
inputs_link:
[New Thread 0x7fffa07f8710 (LWP 30312)]
[New Thread 0x7fff9fff7710 (LWP 30313)]
[New Thread 0x7fff9f7f6710 (LWP 30314)]
[New Thread 0x7fff9eff5710 (LWP 30315)]
[New Thread 0x7fff9e7f4710 (LWP 30316)]
[New Thread 0x7fff9dff3710 (LWP 30317)]
[New Thread 0x7fff9d7f2710 (LWP 30318)]
qemu-system-x86_64:
/var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1724:
kvm_mutex_unlock: Assertion `!cpu_single_env' failed.

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x7ffff4802710 (LWP 30294)]
0x00007ffff5daa165 in raise () from /lib/libc.so.6
(gdb)
(gdb)
(gdb)
(gdb)
(gdb) bt
#0  0x00007ffff5daa165 in raise () from /lib/libc.so.6
#1  0x00007ffff5dab580 in abort () from /lib/libc.so.6
#2  0x00007ffff5da3201 in __assert_fail () from /lib/libc.so.6
#3  0x0000000000436f7e in kvm_mutex_unlock ()
    at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1724
#4  qemu_mutex_unlock_iothread ()
    at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1737
#5  0x00000000005e84ee in qxl_hard_reset (d=0x15d3080, loadvm=0)
    at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/hw/qxl.c:665
#6  0x00000000005e9f9a in ioport_write (opaque=0x15d3080, addr=<value
optimized out>, val=0)
    at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/hw/qxl.c:979
#7  0x0000000000439d4e in kvm_handle_io (env=0x11a3e00)
    at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/kvm-all.c:818
#8  kvm_run (env=0x11a3e00)
    at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:617
#9  0x0000000000439f79 in kvm_cpu_exec (env=0x764b)
    at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1233
#10 0x000000000043b2d7 in kvm_main_loop_cpu (_env=0x11a3e00)
    at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1419
#11 ap_main_loop (_env=0x11a3e00)
    at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1466
#12 0x00007ffff77bb944 in start_thread () from /lib/libpthread.so.0
#13 0x00007ffff5e491dd in clone () from /lib/libc.so.6
(gdb)

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: kvm crashes with spice while loading qxl
  2011-02-26 11:43 kvm crashes with spice while loading qxl xming
@ 2011-02-26 12:29   ` Jan Kiszka
  0 siblings, 0 replies; 56+ messages in thread
From: Jan Kiszka @ 2011-02-26 12:29 UTC (permalink / raw)
  To: xming, Gerd Hoffmann; +Cc: kvm, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 7086 bytes --]

On 2011-02-26 12:43, xming wrote:
> When trying to start X (and it loads qxl driver) the kvm process just crashes.
> 
> qemu-kvm 0.14
> 
> startup line
> 
> /usr/bin/kvm -name spaceball,process=spaceball -m 1024 -kernel
> /boot/bzImage-2.6.37.2-guest -append "root=/dev/vda ro" -smp 1 -netdev
> type=tap,id=spaceball0,script=kvm-ifup-brloc,vhost=on -device
> virtio-net-pci,netdev=spaceball0,mac=00:16:3e:00:08:01 -drive
> file=/dev/volume01/G-spaceball,if=virtio -vga qxl -spice
> port=5957,disable-ticketing -monitor
> telnet:192.168.0.254:10007,server,nowait,nodelay -pidfile
> /var/run/kvm/spaceball.pid
> 
> host is running vanilla 2.6.37.1 on amd64.
> 
> Here is the bt
> 
> # gdb /usr/bin/qemu-system-x86_64
> GNU gdb (Gentoo 7.2 p1) 7.2
> Copyright (C) 2010 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-pc-linux-gnu".
> For bug reporting instructions, please see:
> <http://bugs.gentoo.org/>...
> Reading symbols from /usr/bin/qemu-system-x86_64...done.
> (gdb) set args -name spaceball,process=spaceball -m 1024 -kernel
> /boot/bzImage-2.6.37.2-guest -append "root=/dev/vda ro" -smp 1 -netdev
> type=tap,id=spaceball0,script=kvm-ifup-brloc,vhost=on -device
> virtio-net-pci,netdev=spaceball0,mac=00:16:3e:00:08:01 -drive
> file=/dev/volume01/G-spaceball,if=virtio -vga qxl -spice
> port=5957,disable-ticketing -monitor
> telnet:192.168.0.254:10007,server,nowait,nodelay -pidfile
> /var/run/kvm/spaceball.pid
> (gdb) run
> Starting program: /usr/bin/qemu-system-x86_64 -name
> spaceball,process=spaceball -m 1024 -kernel
> /boot/bzImage-2.6.37.2-guest -append "root=/dev/vda ro" -smp 1 -netdev
> type=tap,id=spaceball0,script=kvm-ifup-brloc,vhost=on -device
> virtio-net-pci,netdev=spaceball0,mac=00:16:3e:00:08:01 -drive
> file=/dev/volume01/G-spaceball,if=virtio -vga qxl -spice
> port=5957,disable-ticketing -monitor
> telnet:192.168.0.254:10007,server,nowait,nodelay -pidfile
> /var/run/kvm/spaceball.pid
> [Thread debugging using libthread_db enabled]
> do_spice_init: starting 0.6.0
> spice_server_add_interface: SPICE_INTERFACE_KEYBOARD
> spice_server_add_interface: SPICE_INTERFACE_MOUSE
> [New Thread 0x7ffff4802710 (LWP 30294)]
> spice_server_add_interface: SPICE_INTERFACE_QXL
> [New Thread 0x7fffaacae710 (LWP 30295)]
> red_worker_main: begin
> handle_dev_destroy_surfaces:
> handle_dev_destroy_surfaces:
> handle_dev_input: start
> [New Thread 0x7fffaa4ad710 (LWP 30298)]
> [New Thread 0x7fffa9cac710 (LWP 30299)]
> [New Thread 0x7fffa94ab710 (LWP 30300)]
> [New Thread 0x7fffa8caa710 (LWP 30301)]
> [New Thread 0x7fffa3fff710 (LWP 30302)]
> [New Thread 0x7fffa37fe710 (LWP 30303)]
> [New Thread 0x7fffa2ffd710 (LWP 30304)]
> [New Thread 0x7fffa27fc710 (LWP 30305)]
> [New Thread 0x7fffa1ffb710 (LWP 30306)]
> [New Thread 0x7fffa17fa710 (LWP 30307)]
> reds_handle_main_link:
> reds_show_new_channel: channel 1:0, connected successfully, over Non Secure link
> reds_main_handle_message: net test: latency 5.636000 ms, bitrate
> 11027768 bps (10.516899 Mbps)
> reds_show_new_channel: channel 2:0, connected successfully, over Non Secure link
> red_dispatcher_set_peer:
> handle_dev_input: connect
> handle_new_display_channel: jpeg disabled
> handle_new_display_channel: zlib-over-glz disabled
> reds_show_new_channel: channel 4:0, connected successfully, over Non Secure link
> red_dispatcher_set_cursor_peer:
> handle_dev_input: cursor connect
> reds_show_new_channel: channel 3:0, connected successfully, over Non Secure link
> inputs_link:
> [New Thread 0x7fffa07f8710 (LWP 30312)]
> [New Thread 0x7fff9fff7710 (LWP 30313)]
> [New Thread 0x7fff9f7f6710 (LWP 30314)]
> [New Thread 0x7fff9eff5710 (LWP 30315)]
> [New Thread 0x7fff9e7f4710 (LWP 30316)]
> [New Thread 0x7fff9dff3710 (LWP 30317)]
> [New Thread 0x7fff9d7f2710 (LWP 30318)]
> qemu-system-x86_64:
> /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1724:
> kvm_mutex_unlock: Assertion `!cpu_single_env' failed.
> 
> Program received signal SIGABRT, Aborted.
> [Switching to Thread 0x7ffff4802710 (LWP 30294)]
> 0x00007ffff5daa165 in raise () from /lib/libc.so.6
> (gdb)
> (gdb)
> (gdb)
> (gdb)
> (gdb) bt
> #0  0x00007ffff5daa165 in raise () from /lib/libc.so.6
> #1  0x00007ffff5dab580 in abort () from /lib/libc.so.6
> #2  0x00007ffff5da3201 in __assert_fail () from /lib/libc.so.6
> #3  0x0000000000436f7e in kvm_mutex_unlock ()
>     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1724
> #4  qemu_mutex_unlock_iothread ()
>     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1737
> #5  0x00000000005e84ee in qxl_hard_reset (d=0x15d3080, loadvm=0)
>     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/hw/qxl.c:665
> #6  0x00000000005e9f9a in ioport_write (opaque=0x15d3080, addr=<value
> optimized out>, val=0)
>     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/hw/qxl.c:979
> #7  0x0000000000439d4e in kvm_handle_io (env=0x11a3e00)
>     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/kvm-all.c:818
> #8  kvm_run (env=0x11a3e00)
>     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:617
> #9  0x0000000000439f79 in kvm_cpu_exec (env=0x764b)
>     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1233
> #10 0x000000000043b2d7 in kvm_main_loop_cpu (_env=0x11a3e00)
>     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1419
> #11 ap_main_loop (_env=0x11a3e00)
>     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1466
> #12 0x00007ffff77bb944 in start_thread () from /lib/libpthread.so.0
> #13 0x00007ffff5e491dd in clone () from /lib/libc.so.6
> (gdb)

That's a spice bug. In fact, there are a lot of
qemu_mutex_lock/unlock_iothread calls in that subsystem. I bet at
least a few of them can cause even more subtle problems.

Two general issues with dropping the global mutex like this:
 - The caller of mutex_unlock is responsible for maintaining
   cpu_single_env across the unlocked phase (that's related to the
   abort above).
 - Dropping the lock in the middle of a callback is risky. That may
   enable re-entrance into code sections that weren't designed for this
   (I'm skeptical about the side effects of
   qemu_spice_vm_change_state_handler - why drop the lock here?).

Spice requires a careful review regarding such issues. Alternatively,
it could pioneer the introduction of its own lock so that we can
handle at least the related I/O activity of the VCPUs without holding
the global mutex (though I bet it's not the simplest candidate for
such a new scheme).
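
For illustration, a minimal sketch of the save/restore pattern the
first point implies. It assumes qemu-kvm's cpu_single_env global and
the iothread lock helpers seen in the backtrace above; the saved_env
variable and the function names are illustrative only, not actual
qemu-kvm code:

    /* Sketch only: park cpu_single_env while the global mutex is
     * dropped, so the !cpu_single_env assertion in kvm_mutex_unlock()
     * holds, and restore it once the lock has been re-acquired. */
    static void *saved_env;   /* illustrative; real code would keep this per device */

    static void unlock_iothread_keeping_env(void)
    {
        if (cpu_single_env) {
            assert(saved_env == NULL);
            saved_env = cpu_single_env;
            cpu_single_env = NULL;
        }
        qemu_mutex_unlock_iothread();
    }

    static void lock_iothread_restoring_env(void)
    {
        qemu_mutex_lock_iothread();
        if (saved_env) {
            assert(cpu_single_env == NULL);
            cpu_single_env = saved_env;
            saved_env = NULL;
        }
    }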

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 259 bytes --]

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: kvm crashes with spice while loading qxl
  2011-02-26 12:29   ` [Qemu-devel] " Jan Kiszka
@ 2011-02-26 14:44     ` xming
  -1 siblings, 0 replies; 56+ messages in thread
From: xming @ 2011-02-26 14:44 UTC (permalink / raw)
  To: kvm, qemu-devel

Oops, I forgot to send this to the list too, so here we go:

> That's a spice bug. In fact, there are a lot of
> qemu_mutex_lock/unlock_iothread in that subsystem. I bet at least a few
> of them can cause even more subtle problems.

I just tried spice 0.7.3 (I was using 0.6.x) and it's still the same.
Should I file a bug against spice?

> Two general issues with dropping the global mutex like this:
>  - The caller of mutex_unlock is responsible for maintaining
>   cpu_single_env across the unlocked phase (that's related to the
>   abort above).
>  - Dropping the lock in the middle of a callback is risky. That may
>   enable re-entrances of code sections that weren't designed for this
>   (I'm skeptic about the side effects of
>   qemu_spice_vm_change_state_handler - why dropping the lock here?).
>
> Spice requires a careful review regarding such issues. Or it should
> pioneer with introducing its own lock so that we can handle at least
> related I/O activities over the VCPUs without holding the global mutex
> (but I bet it's not the simplest candidate for such a new scheme).
>
> Jan

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] Re: kvm crashes with spice while loading qxl
  2011-02-26 12:29   ` [Qemu-devel] " Jan Kiszka
@ 2011-02-27 19:03     ` Alon Levy
  -1 siblings, 0 replies; 56+ messages in thread
From: Alon Levy @ 2011-02-27 19:03 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xming, Gerd Hoffmann, qemu-devel, kvm

[-- Attachment #1: Type: text/plain, Size: 7473 bytes --]

On Sat, Feb 26, 2011 at 01:29:01PM +0100, Jan Kiszka wrote:
> On 2011-02-26 12:43, xming wrote:
> > When trying to start X (and it loads qxl driver) the kvm process just crashes.

This is fixed by Gerd's attached patch (taken from the RHEL repository; I don't
know why it wasn't pushed to qemu-kvm upstream). I'll send it to the kvm list as
well (in a separate email).

> > 
> > qemu-kvm 0.14
> > 
> > startup line
> > 
> > /usr/bin/kvm -name spaceball,process=spaceball -m 1024 -kernel
> > /boot/bzImage-2.6.37.2-guest -append "root=/dev/vda ro" -smp 1 -netdev
> > type=tap,id=spaceball0,script=kvm-ifup-brloc,vhost=on -device
> > virtio-net-pci,netdev=spaceball0,mac=00:16:3e:00:08:01 -drive
> > file=/dev/volume01/G-spaceball,if=virtio -vga qxl -spice
> > port=5957,disable-ticketing -monitor
> > telnet:192.168.0.254:10007,server,nowait,nodelay -pidfile
> > /var/run/kvm/spaceball.pid
> > 
> > host is running vanilla 2.6.37.1 on amd64.
> > 
> > Here is the bt
> > 
> > # gdb /usr/bin/qemu-system-x86_64
> > GNU gdb (Gentoo 7.2 p1) 7.2
> > Copyright (C) 2010 Free Software Foundation, Inc.
> > License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> > This is free software: you are free to change and redistribute it.
> > There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> > and "show warranty" for details.
> > This GDB was configured as "x86_64-pc-linux-gnu".
> > For bug reporting instructions, please see:
> > <http://bugs.gentoo.org/>...
> > Reading symbols from /usr/bin/qemu-system-x86_64...done.
> > (gdb) set args -name spaceball,process=spaceball -m 1024 -kernel
> > /boot/bzImage-2.6.37.2-guest -append "root=/dev/vda ro" -smp 1 -netdev
> > type=tap,id=spaceball0,script=kvm-ifup-brloc,vhost=on -device
> > virtio-net-pci,netdev=spaceball0,mac=00:16:3e:00:08:01 -drive
> > file=/dev/volume01/G-spaceball,if=virtio -vga qxl -spice
> > port=5957,disable-ticketing -monitor
> > telnet:192.168.0.254:10007,server,nowait,nodelay -pidfile
> > /var/run/kvm/spaceball.pid
> > (gdb) run
> > Starting program: /usr/bin/qemu-system-x86_64 -name
> > spaceball,process=spaceball -m 1024 -kernel
> > /boot/bzImage-2.6.37.2-guest -append "root=/dev/vda ro" -smp 1 -netdev
> > type=tap,id=spaceball0,script=kvm-ifup-brloc,vhost=on -device
> > virtio-net-pci,netdev=spaceball0,mac=00:16:3e:00:08:01 -drive
> > file=/dev/volume01/G-spaceball,if=virtio -vga qxl -spice
> > port=5957,disable-ticketing -monitor
> > telnet:192.168.0.254:10007,server,nowait,nodelay -pidfile
> > /var/run/kvm/spaceball.pid
> > [Thread debugging using libthread_db enabled]
> > do_spice_init: starting 0.6.0
> > spice_server_add_interface: SPICE_INTERFACE_KEYBOARD
> > spice_server_add_interface: SPICE_INTERFACE_MOUSE
> > [New Thread 0x7ffff4802710 (LWP 30294)]
> > spice_server_add_interface: SPICE_INTERFACE_QXL
> > [New Thread 0x7fffaacae710 (LWP 30295)]
> > red_worker_main: begin
> > handle_dev_destroy_surfaces:
> > handle_dev_destroy_surfaces:
> > handle_dev_input: start
> > [New Thread 0x7fffaa4ad710 (LWP 30298)]
> > [New Thread 0x7fffa9cac710 (LWP 30299)]
> > [New Thread 0x7fffa94ab710 (LWP 30300)]
> > [New Thread 0x7fffa8caa710 (LWP 30301)]
> > [New Thread 0x7fffa3fff710 (LWP 30302)]
> > [New Thread 0x7fffa37fe710 (LWP 30303)]
> > [New Thread 0x7fffa2ffd710 (LWP 30304)]
> > [New Thread 0x7fffa27fc710 (LWP 30305)]
> > [New Thread 0x7fffa1ffb710 (LWP 30306)]
> > [New Thread 0x7fffa17fa710 (LWP 30307)]
> > reds_handle_main_link:
> > reds_show_new_channel: channel 1:0, connected successfully, over Non Secure link
> > reds_main_handle_message: net test: latency 5.636000 ms, bitrate
> > 11027768 bps (10.516899 Mbps)
> > reds_show_new_channel: channel 2:0, connected successfully, over Non Secure link
> > red_dispatcher_set_peer:
> > handle_dev_input: connect
> > handle_new_display_channel: jpeg disabled
> > handle_new_display_channel: zlib-over-glz disabled
> > reds_show_new_channel: channel 4:0, connected successfully, over Non Secure link
> > red_dispatcher_set_cursor_peer:
> > handle_dev_input: cursor connect
> > reds_show_new_channel: channel 3:0, connected successfully, over Non Secure link
> > inputs_link:
> > [New Thread 0x7fffa07f8710 (LWP 30312)]
> > [New Thread 0x7fff9fff7710 (LWP 30313)]
> > [New Thread 0x7fff9f7f6710 (LWP 30314)]
> > [New Thread 0x7fff9eff5710 (LWP 30315)]
> > [New Thread 0x7fff9e7f4710 (LWP 30316)]
> > [New Thread 0x7fff9dff3710 (LWP 30317)]
> > [New Thread 0x7fff9d7f2710 (LWP 30318)]
> > qemu-system-x86_64:
> > /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1724:
> > kvm_mutex_unlock: Assertion `!cpu_single_env' failed.
> > 
> > Program received signal SIGABRT, Aborted.
> > [Switching to Thread 0x7ffff4802710 (LWP 30294)]
> > 0x00007ffff5daa165 in raise () from /lib/libc.so.6
> > (gdb)
> > (gdb)
> > (gdb)
> > (gdb)
> > (gdb) bt
> > #0  0x00007ffff5daa165 in raise () from /lib/libc.so.6
> > #1  0x00007ffff5dab580 in abort () from /lib/libc.so.6
> > #2  0x00007ffff5da3201 in __assert_fail () from /lib/libc.so.6
> > #3  0x0000000000436f7e in kvm_mutex_unlock ()
> >     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1724
> > #4  qemu_mutex_unlock_iothread ()
> >     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1737
> > #5  0x00000000005e84ee in qxl_hard_reset (d=0x15d3080, loadvm=0)
> >     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/hw/qxl.c:665
> > #6  0x00000000005e9f9a in ioport_write (opaque=0x15d3080, addr=<value
> > optimized out>, val=0)
> >     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/hw/qxl.c:979
> > #7  0x0000000000439d4e in kvm_handle_io (env=0x11a3e00)
> >     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/kvm-all.c:818
> > #8  kvm_run (env=0x11a3e00)
> >     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:617
> > #9  0x0000000000439f79 in kvm_cpu_exec (env=0x764b)
> >     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1233
> > #10 0x000000000043b2d7 in kvm_main_loop_cpu (_env=0x11a3e00)
> >     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1419
> > #11 ap_main_loop (_env=0x11a3e00)
> >     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1466
> > #12 0x00007ffff77bb944 in start_thread () from /lib/libpthread.so.0
> > #13 0x00007ffff5e491dd in clone () from /lib/libc.so.6
> > (gdb)
> 
> That's a spice bug. In fact, there are a lot of
> qemu_mutex_lock/unlock_iothread in that subsystem. I bet at least a few
> of them can cause even more subtle problems.
> 
> Two general issues with dropping the global mutex like this:
>  - The caller of mutex_unlock is responsible for maintaining
>    cpu_single_env across the unlocked phase (that's related to the
>    abort above).
>  - Dropping the lock in the middle of a callback is risky. That may
>    enable re-entrances of code sections that weren't designed for this
>    (I'm skeptic about the side effects of
>    qemu_spice_vm_change_state_handler - why dropping the lock here?).
> 
> Spice requires a careful review regarding such issues. Or it should
> pioneer with introducing its own lock so that we can handle at least
> related I/O activities over the VCPUs without holding the global mutex
> (but I bet it's not the simplest candidate for such a new scheme).
> 
> Jan
> 



[-- Attachment #2: 0001-spice-qxl-locking-fix-for-qemu-kvm.patch --]
[-- Type: text/plain, Size: 5329 bytes --]

From bef7336f79e3325451d2ead120b958a401b0ccc4 Mon Sep 17 00:00:00 2001
From: Gerd Hoffmann <kraxel@redhat.com>
Date: Mon, 24 Jan 2011 09:18:32 -0200
Subject: [PATCH] spice/qxl: locking fix for qemu-kvm

qxl needs to release the qemu lock before calling some libspice
functions (and re-acquire it later).  In upstream qemu, qxl can just
use qemu_mutex_{unlock,lock}_iothread.  In qemu-kvm this doesn't
work; qxl additionally needs to save and restore the cpu_single_env
pointer on unlock/lock.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
 hw/qxl.c           |   37 +++++++++++++++++++++++++++++--------
 ui/spice-display.c |   12 ++++++------
 ui/spice-display.h |    6 ++++++
 3 files changed, 41 insertions(+), 14 deletions(-)

diff --git a/hw/qxl.c b/hw/qxl.c
index fe4212b..117f7c8 100644
--- a/hw/qxl.c
+++ b/hw/qxl.c
@@ -125,6 +125,27 @@ static void qxl_reset_memslots(PCIQXLDevice *d);
 static void qxl_reset_surfaces(PCIQXLDevice *d);
 static void qxl_ring_set_dirty(PCIQXLDevice *qxl);
 
+/* qemu-kvm locking ... */
+void qxl_unlock_iothread(SimpleSpiceDisplay *ssd)
+{
+    if (cpu_single_env) {
+        assert(ssd->env == NULL);
+        ssd->env = cpu_single_env;
+        cpu_single_env = NULL;
+    }
+    qemu_mutex_unlock_iothread();
+}
+
+void qxl_lock_iothread(SimpleSpiceDisplay *ssd)
+{
+    qemu_mutex_lock_iothread();
+    if (ssd->env) {
+        assert(cpu_single_env == NULL);
+        cpu_single_env = ssd->env;
+        ssd->env = NULL;
+    }
+}
+
 static inline uint32_t msb_mask(uint32_t val)
 {
     uint32_t mask;
@@ -662,10 +683,10 @@ static void qxl_hard_reset(PCIQXLDevice *d, int loadvm)
     dprint(d, 1, "%s: start%s\n", __FUNCTION__,
            loadvm ? " (loadvm)" : "");
 
-    qemu_mutex_unlock_iothread();
+    qxl_unlock_iothread(&d->ssd);
     d->ssd.worker->reset_cursor(d->ssd.worker);
     d->ssd.worker->reset_image_cache(d->ssd.worker);
-    qemu_mutex_lock_iothread();
+    qxl_lock_iothread(&d->ssd);
     qxl_reset_surfaces(d);
     qxl_reset_memslots(d);
 
@@ -795,9 +816,9 @@ static void qxl_reset_surfaces(PCIQXLDevice *d)
 {
     dprint(d, 1, "%s:\n", __FUNCTION__);
     d->mode = QXL_MODE_UNDEFINED;
-    qemu_mutex_unlock_iothread();
+    qxl_unlock_iothread(&d->ssd);
     d->ssd.worker->destroy_surfaces(d->ssd.worker);
-    qemu_mutex_lock_iothread();
+    qxl_lock_iothread(&d->ssd);
     memset(&d->guest_surfaces.cmds, 0, sizeof(d->guest_surfaces.cmds));
 }
 
@@ -866,9 +887,9 @@ static void qxl_destroy_primary(PCIQXLDevice *d)
     dprint(d, 1, "%s\n", __FUNCTION__);
 
     d->mode = QXL_MODE_UNDEFINED;
-    qemu_mutex_unlock_iothread();
+    qxl_unlock_iothread(&d->ssd);
     d->ssd.worker->destroy_primary_surface(d->ssd.worker, 0);
-    qemu_mutex_lock_iothread();
+    qxl_lock_iothread(&d->ssd);
 }
 
 static void qxl_set_mode(PCIQXLDevice *d, int modenr, int loadvm)
@@ -938,10 +959,10 @@ static void ioport_write(void *opaque, uint32_t addr, uint32_t val)
     case QXL_IO_UPDATE_AREA:
     {
         QXLRect update = d->ram->update_area;
-        qemu_mutex_unlock_iothread();
+        qxl_unlock_iothread(&d->ssd);
         d->ssd.worker->update_area(d->ssd.worker, d->ram->update_surface,
                                    &update, NULL, 0, 0);
-        qemu_mutex_lock_iothread();
+        qxl_lock_iothread(&d->ssd);
         break;
     }
     case QXL_IO_NOTIFY_CMD:
diff --git a/ui/spice-display.c b/ui/spice-display.c
index 020b423..defe652 100644
--- a/ui/spice-display.c
+++ b/ui/spice-display.c
@@ -186,18 +186,18 @@ void qemu_spice_create_host_primary(SimpleSpiceDisplay *ssd)
     surface.mem        = (intptr_t)ssd->buf;
     surface.group_id   = MEMSLOT_GROUP_HOST;
 
-    qemu_mutex_unlock_iothread();
+    qxl_unlock_iothread(ssd);
     ssd->worker->create_primary_surface(ssd->worker, 0, &surface);
-    qemu_mutex_lock_iothread();
+    qxl_lock_iothread(ssd);
 }
 
 void qemu_spice_destroy_host_primary(SimpleSpiceDisplay *ssd)
 {
     dprint(1, "%s:\n", __FUNCTION__);
 
-    qemu_mutex_unlock_iothread();
+    qxl_unlock_iothread(ssd);
     ssd->worker->destroy_primary_surface(ssd->worker, 0);
-    qemu_mutex_lock_iothread();
+    qxl_lock_iothread(ssd);
 }
 
 void qemu_spice_vm_change_state_handler(void *opaque, int running, int reason)
@@ -207,9 +207,9 @@ void qemu_spice_vm_change_state_handler(void *opaque, int running, int reason)
     if (running) {
         ssd->worker->start(ssd->worker);
     } else {
-        qemu_mutex_unlock_iothread();
+        qxl_unlock_iothread(ssd);
         ssd->worker->stop(ssd->worker);
-        qemu_mutex_lock_iothread();
+        qxl_lock_iothread(ssd);
     }
     ssd->running = running;
 }
diff --git a/ui/spice-display.h b/ui/spice-display.h
index aef0464..df74828 100644
--- a/ui/spice-display.h
+++ b/ui/spice-display.h
@@ -43,6 +43,9 @@ typedef struct SimpleSpiceDisplay {
     QXLRect dirty;
     int notify;
     int running;
+
+    /* qemu-kvm locking ... */
+    void *env;
 } SimpleSpiceDisplay;
 
 typedef struct SimpleSpiceUpdate {
@@ -52,6 +55,9 @@ typedef struct SimpleSpiceUpdate {
     uint8_t *bitmap;
 } SimpleSpiceUpdate;
 
+void qxl_unlock_iothread(SimpleSpiceDisplay *ssd);
+void qxl_lock_iothread(SimpleSpiceDisplay *ssd);
+
 int qemu_spice_rect_is_empty(const QXLRect* r);
 void qemu_spice_rect_union(QXLRect *dest, const QXLRect *r);
 
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] Re: kvm crashes with spice while loading qxl
  2011-02-27 19:03     ` Alon Levy
@ 2011-02-27 19:11       ` Jan Kiszka
  -1 siblings, 0 replies; 56+ messages in thread
From: Jan Kiszka @ 2011-02-27 19:11 UTC (permalink / raw)
  To: Alon Levy; +Cc: xming, Gerd Hoffmann, qemu-devel, kvm

[-- Attachment #1: Type: text/plain, Size: 653 bytes --]

On 2011-02-27 20:03, Alon Levy wrote:
> On Sat, Feb 26, 2011 at 01:29:01PM +0100, Jan Kiszka wrote:
>> On 2011-02-26 12:43, xming wrote:
>>> When trying to start X (and it loads qxl driver) the kvm process just crashes.
> 
> This is fixed by Gerd's attached patch (taken from rhel repository, don't know
> why it wasn't pushed to qemu-kvm upstream). I'll send it to kvm list as well (separate email).

The patch looks OK at first glance, but the changelog is misleading: this
was broken for _both_ trees, but upstream didn't detect the bug.

My concerns regarding other side effects of juggling with the global
mutex in the spice code remain.

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 259 bytes --]

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] Re: kvm crashes with spice while loading qxl
  2011-02-27 19:11       ` Jan Kiszka
@ 2011-02-27 19:16         ` Alon Levy
  -1 siblings, 0 replies; 56+ messages in thread
From: Alon Levy @ 2011-02-27 19:16 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xming, Gerd Hoffmann, qemu-devel, kvm

On Sun, Feb 27, 2011 at 08:11:26PM +0100, Jan Kiszka wrote:
> On 2011-02-27 20:03, Alon Levy wrote:
> > On Sat, Feb 26, 2011 at 01:29:01PM +0100, Jan Kiszka wrote:
> >> On 2011-02-26 12:43, xming wrote:
> >>> When trying to start X (and it loads qxl driver) the kvm process just crashes.
> > 
> > This is fixed by Gerd's attached patch (taken from rhel repository, don't know
> > why it wasn't pushed to qemu-kvm upstream). I'll send it to kvm list as well (separate email).
> 
> Patch looks OK on first glance, but the changelog is misleading: This
> was broken for _both_ trees, but upstream didn't detect the bug.
> 

The trees the patch commit message refers to are qemu and qemu-kvm. qemu doesn't even
have cpu_single_env. It didn't talk about two qemu-kvm trees.

> My concerns regarding other side effects of juggling with global mutex
> in spice code remain.

I know there used to be a mutex in the spice code, and during the upstreaming
process it got ditched in favor of the qemu global io mutex. I would rather have
deferred this to Gerd since he wrote it, but he is not available atm.

> 
> Jan
> 



^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] Re: kvm crashes with spice while loading qxl
  2011-02-27 19:16         ` Alon Levy
@ 2011-02-27 19:27           ` Jan Kiszka
  -1 siblings, 0 replies; 56+ messages in thread
From: Jan Kiszka @ 2011-02-27 19:27 UTC (permalink / raw)
  To: Alon Levy; +Cc: xming, Gerd Hoffmann, qemu-devel, kvm

[-- Attachment #1: Type: text/plain, Size: 1348 bytes --]

On 2011-02-27 20:16, Alon Levy wrote:
> On Sun, Feb 27, 2011 at 08:11:26PM +0100, Jan Kiszka wrote:
>> On 2011-02-27 20:03, Alon Levy wrote:
>>> On Sat, Feb 26, 2011 at 01:29:01PM +0100, Jan Kiszka wrote:
>>>> On 2011-02-26 12:43, xming wrote:
>>>>> When trying to start X (and it loads qxl driver) the kvm process just crashes.
>>>
>>> This is fixed by Gerd's attached patch (taken from rhel repository, don't know
>>> why it wasn't pushed to qemu-kvm upstream). I'll send it to kvm list as well (separate email).
>>
>> Patch looks OK on first glance, but the changelog is misleading: This
>> was broken for _both_ trees, but upstream didn't detect the bug.
>>
> 
> The trees the patch commit message refers to are qemu and qemu-kvm.

So did I.

> qemu doesn't even have cpu_single_env.

Really? Check again. :)

> It didn't talk about two qemu-kvm trees.
> 
>> My concerns regarding other side effects of juggling with global mutex
>> in spice code remain.
> 
> I know there used to be a mutex in spice code and during the upstreaming process it
> got ditched in favor of the qemu global io mutex. I would have rather deferred this
> to Gerd since he wrote this, but he is not available atm.

It's not necessarily bad to drop the io mutex, but it is more tricky
than it may appear on first glance.

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 259 bytes --]

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] Re: kvm crashes with spice while loading qxl
  2011-02-27 19:27           ` Jan Kiszka
@ 2011-02-27 19:29             ` Alon Levy
  -1 siblings, 0 replies; 56+ messages in thread
From: Alon Levy @ 2011-02-27 19:29 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xming, Gerd Hoffmann, qemu-devel, kvm

On Sun, Feb 27, 2011 at 08:27:01PM +0100, Jan Kiszka wrote:
> On 2011-02-27 20:16, Alon Levy wrote:
> > On Sun, Feb 27, 2011 at 08:11:26PM +0100, Jan Kiszka wrote:
> >> On 2011-02-27 20:03, Alon Levy wrote:
> >>> On Sat, Feb 26, 2011 at 01:29:01PM +0100, Jan Kiszka wrote:
> >>>> On 2011-02-26 12:43, xming wrote:
> >>>>> When trying to start X (and it loads qxl driver) the kvm process just crashes.
> >>>
> >>> This is fixed by Gerd's attached patch (taken from rhel repository, don't know
> >>> why it wasn't pushed to qemu-kvm upstream). I'll send it to kvm list as well (separate email).
> >>
> >> Patch looks OK on first glance, but the changelog is misleading: This
> >> was broken for _both_ trees, but upstream didn't detect the bug.
> >>
> > 
> > The trees the patch commit message refers to are qemu and qemu-kvm.
> 
> The same did I.
> 
> > qemu doesn't even have cpu_single_env.
> 
> Really? Check again. :)

Sorry, grepped the wrong repo. I'll send this to qemu-devel too then.

> 
> > It didn't talk about two qemu-kvm trees.
> > 
> >> My concerns regarding other side effects of juggling with global mutex
> >> in spice code remain.
> > 
> > I know there used to be a mutex in spice code and during the upstreaming process it
> > got ditched in favor of the qemu global io mutex. I would have rather deferred this
> > to Gerd since he wrote this, but he is not available atm.
> 
> It's not necessarily bad to drop the io mutex, but it is more tricky
> than it may appear on first glance.
> 
> Jan
> 



^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] Re: kvm crashes with spice while loading qxl
  2011-02-27 19:27           ` Jan Kiszka
@ 2011-02-27 19:32             ` Alon Levy
  -1 siblings, 0 replies; 56+ messages in thread
From: Alon Levy @ 2011-02-27 19:32 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xming, Gerd Hoffmann, qemu-devel, kvm

On Sun, Feb 27, 2011 at 08:27:01PM +0100, Jan Kiszka wrote:
> On 2011-02-27 20:16, Alon Levy wrote:
> > On Sun, Feb 27, 2011 at 08:11:26PM +0100, Jan Kiszka wrote:
> >> On 2011-02-27 20:03, Alon Levy wrote:
> >>> On Sat, Feb 26, 2011 at 01:29:01PM +0100, Jan Kiszka wrote:
> >>>> On 2011-02-26 12:43, xming wrote:
> >>>>> When trying to start X (and it loads qxl driver) the kvm process just crashes.
> >>>
> >>> This is fixed by Gerd's attached patch (taken from rhel repository, don't know
> >>> why it wasn't pushed to qemu-kvm upstream). I'll send it to kvm list as well (separate email).
> >>
> >> Patch looks OK on first glance, but the changelog is misleading: This
> >> was broken for _both_ trees, but upstream didn't detect the bug.
> >>
> > 
> > The trees the patch commit message refers to are qemu and qemu-kvm.
> 
> The same did I.
> 
> > qemu doesn't even have cpu_single_env.
> 
> Really? Check again. :)
> 
> > It didn't talk about two qemu-kvm trees.
> > 
> >> My concerns regarding other side effects of juggling with global mutex
> >> in spice code remain.
> > 
> > I know there used to be a mutex in spice code and during the upstreaming process it
> > got ditched in favor of the qemu global io mutex. I would have rather deferred this
> > to Gerd since he wrote this, but he is not available atm.
> 
> It's not necessarily bad to drop the io mutex, but it is more tricky
> than it may appear on first glance.

The problem with not dropping it is that in vga mode we may create updates
synthetically (i.e. created by qemu, not by the guest driver) that access the
framebuffer, and those need the lock so the framebuffer isn't updated at the
same time. We drop the mutex only when we are about to call the dispatcher,
which basically waits on red_worker (a libspice-server thread) to do some work.
red_worker may in turn call back into the qxl code in qemu, which may try to
acquire the lock. (The many "may"s here just reflect the different codepaths.)
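
To make the shape of that concrete, here is a small self-contained pthread
sketch of the same handoff (the names are stand-ins, not the real hw/qxl.c
code): the "qxl" side must drop the io mutex around its synchronous wait,
because the worker side takes that mutex when it calls back in.

    #include <pthread.h>
    #include <stdio.h>

    /* Stand-ins for qemu's global io mutex and the qxl<->red_worker handoff. */
    static pthread_mutex_t iothread_mutex = PTHREAD_MUTEX_INITIALIZER;
    static pthread_mutex_t disp_mutex = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  disp_cond  = PTHREAD_COND_INITIALIZER;
    static int work_pending, work_done;

    /* "red_worker": waits for a request, then calls back into "qemu",
     * which means taking the io mutex again. */
    static void *red_worker(void *arg)
    {
        (void)arg;
        pthread_mutex_lock(&disp_mutex);
        while (!work_pending)
            pthread_cond_wait(&disp_cond, &disp_mutex);
        pthread_mutex_unlock(&disp_mutex);

        pthread_mutex_lock(&iothread_mutex);   /* callback into qxl/qemu */
        puts("red_worker: touching the framebuffer under the io mutex");
        pthread_mutex_unlock(&iothread_mutex);

        pthread_mutex_lock(&disp_mutex);
        work_done = 1;
        pthread_cond_signal(&disp_cond);
        pthread_mutex_unlock(&disp_mutex);
        return NULL;
    }

    /* "qxl handler": entered with the io mutex held; it must drop the mutex
     * around the synchronous wait, or red_worker deadlocks above. */
    static void qxl_dispatch(void)
    {
        pthread_mutex_unlock(&iothread_mutex);

        pthread_mutex_lock(&disp_mutex);
        work_pending = 1;
        pthread_cond_signal(&disp_cond);
        while (!work_done)
            pthread_cond_wait(&disp_cond, &disp_mutex);
        pthread_mutex_unlock(&disp_mutex);

        pthread_mutex_lock(&iothread_mutex);
    }

    int main(void)
    {
        pthread_t t;
        pthread_create(&t, NULL, red_worker, NULL);
        pthread_mutex_lock(&iothread_mutex);   /* the vcpu thread holds it */
        qxl_dispatch();
        pthread_mutex_unlock(&iothread_mutex);
        pthread_join(t, NULL);
        return 0;
    }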

> 
> Jan
> 



^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] Re: kvm crashes with spice while loading qxl
  2011-02-27 19:03     ` Alon Levy
  (?)
  (?)
@ 2011-02-28 12:56     ` xming
  -1 siblings, 0 replies; 56+ messages in thread
From: xming @ 2011-02-28 12:56 UTC (permalink / raw)
  To: qemu-devel, kvm; +Cc: Alon Levy

On Sun, Feb 27, 2011 at 8:03 PM, Alon Levy <alevy@redhat.com> wrote:
> On Sat, Feb 26, 2011 at 01:29:01PM +0100, Jan Kiszka wrote:
>> On 2011-02-26 12:43, xming wrote:
>> > When trying to start X (and it loads qxl driver) the kvm process just crashes.
>
> This is fixed by Gerd's attached patch (taken from rhel repository, don't know
> why it wasn't pushed to qemu-kvm upstream). I'll send it to kvm list as well (separate email).

I can confirm that this patch fixes the issue, thanks a lot

cheers

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Re: kvm crashes with spice while loading qxl
  2011-02-27 19:03     ` Alon Levy
@ 2011-03-01  3:56       ` Rick Vernam
  -1 siblings, 0 replies; 56+ messages in thread
From: Rick Vernam @ 2011-03-01  3:56 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Kiszka, xming, Alon Levy, Gerd Hoffmann, kvm

On Sunday 27 February 2011 13:03:14 Alon Levy wrote:
> On Sat, Feb 26, 2011 at 01:29:01PM +0100, Jan Kiszka wrote:
> > On 2011-02-26 12:43, xming wrote:
> > > When trying to start X (and it loads qxl driver) the kvm process just
> > > crashes.
> 
> This is fixed by Gerd's attached patch (taken from rhel repository, don't
> know why it wasn't pushed to qemu-kvm upstream). I'll send it to kvm list
> as well (separate email).
> 

This patch also fixed
https://bugs.launchpad.net/bugs/723871
I created the bug report on launchpad, but I suppose it should be left open 
until the patch hits qemu-kvm?

-Rick

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] Re: kvm crashes with spice while loading qxl
  2011-02-27 19:11       ` Jan Kiszka
@ 2011-03-01 12:58         ` Alon Levy
  -1 siblings, 0 replies; 56+ messages in thread
From: Alon Levy @ 2011-03-01 12:58 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xming, Gerd Hoffmann, qemu-devel, kvm

On Sun, Feb 27, 2011 at 08:11:26PM +0100, Jan Kiszka wrote:
> On 2011-02-27 20:03, Alon Levy wrote:
> > On Sat, Feb 26, 2011 at 01:29:01PM +0100, Jan Kiszka wrote:
> >> On 2011-02-26 12:43, xming wrote:
> >>> When trying to start X (and it loads qxl driver) the kvm process just crashes.
> > 
> > This is fixed by Gerd's attached patch (taken from rhel repository, don't know
> > why it wasn't pushed to qemu-kvm upstream). I'll send it to kvm list as well (separate email).
> 
> Patch looks OK on first glance, but the changelog is misleading: This
> was broken for _both_ trees, but upstream didn't detect the bug.

So I didn't test with qemu not having this patch, but according to the discussion in the
launchpad bug the problem only happens with qemu-kvm. This doesn't rule out it being a
bug, perhaps it is just triggered much less frequently I guess.

> 
> My concerns regarding other side effects of juggling with global mutex
> in spice code remain.
> 
> Jan
> 



^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] Re: kvm crashes with spice while loading qxl
  2011-03-01 12:58         ` Alon Levy
  (?)
@ 2011-03-02  8:22         ` Jan Kiszka
  2011-03-02 10:56             ` Alon Levy
  -1 siblings, 1 reply; 56+ messages in thread
From: Jan Kiszka @ 2011-03-02  8:22 UTC (permalink / raw)
  To: xming, Gerd Hoffmann, qemu-devel, kvm

[-- Attachment #1: Type: text/plain, Size: 1097 bytes --]

On 2011-03-01 13:58, Alon Levy wrote:
> On Sun, Feb 27, 2011 at 08:11:26PM +0100, Jan Kiszka wrote:
>> On 2011-02-27 20:03, Alon Levy wrote:
>>> On Sat, Feb 26, 2011 at 01:29:01PM +0100, Jan Kiszka wrote:
>>>> On 2011-02-26 12:43, xming wrote:
>>>>> When trying to start X (and it loads qxl driver) the kvm process just crashes.
>>>
>>> This is fixed by Gerd's attached patch (taken from rhel repository, don't know
>>> why it wasn't pushed to qemu-kvm upstream). I'll send it to kvm list as well (separate email).
>>
>> Patch looks OK on first glance, but the changelog is misleading: This
>> was broken for _both_ trees, but upstream didn't detect the bug.
> 
> So I didn't test with qemu not having this patch, but according to the discussion in the
> launchpad bug the problem only happens with qemu-kvm. This doesn't rule out it being a
> bug, perhaps it is just triggered much less frequently I guess.

Again: qemu-kvm has the instrumentation to detect the bug, qemu is
lacking this, but both trees will break subtly if cpu_single_env is not
properly restored.
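
As a minimal sketch of what "properly restored" means here (a fragment only,
assuming the qemu-kvm tree of that time; this is not Gerd's actual patch):

    /* Whoever drops the global mutex is responsible for cpu_single_env. */
    static void unlocked_spice_section(void)
    {
        CPUState *saved_env = cpu_single_env;   /* NULL outside vcpu context */

        qemu_mutex_unlock_iothread();
        /* ... hand work to the spice worker thread and wait for it ... */
        qemu_mutex_lock_iothread();

        cpu_single_env = saved_env;             /* qemu-kvm's instrumentation
                                                 * aborts if this goes stale */
    }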

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 259 bytes --]

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] Re: kvm crashes with spice while loading qxl
  2011-03-02  8:22         ` Jan Kiszka
@ 2011-03-02 10:56             ` Alon Levy
  0 siblings, 0 replies; 56+ messages in thread
From: Alon Levy @ 2011-03-02 10:56 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xming, Gerd Hoffmann, qemu-devel, kvm

On Wed, Mar 02, 2011 at 09:22:35AM +0100, Jan Kiszka wrote:
> On 2011-03-01 13:58, Alon Levy wrote:
> > On Sun, Feb 27, 2011 at 08:11:26PM +0100, Jan Kiszka wrote:
> >> On 2011-02-27 20:03, Alon Levy wrote:
> >>> On Sat, Feb 26, 2011 at 01:29:01PM +0100, Jan Kiszka wrote:
> >>>> On 2011-02-26 12:43, xming wrote:
> >>>>> When trying to start X (and it loads qxl driver) the kvm process just crashes.
> >>>
> >>> This is fixed by Gerd's attached patch (taken from rhel repository, don't know
> >>> why it wasn't pushed to qemu-kvm upstream). I'll send it to kvm list as well (separate email).
> >>
> >> Patch looks OK on first glance, but the changelog is misleading: This
> >> was broken for _both_ trees, but upstream didn't detect the bug.
> > 
> > So I didn't test with qemu not having this patch, but according to the discussion in the
> > launchpad bug the problem only happens with qemu-kvm. This doesn't rule out it being a
> > bug, perhaps it is just triggered much less frequently I guess.
> 
> Again: qemu-kvm has the instrumentation to detect the bug, qemu is
> lacking this, but both trees will break subtly if cpu_current_env is not
> properly restored.

ok, so what do you want to be done further before this patch is applied?

> 
> Jan
> 



^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] Re: kvm crashes with spice while loading qxl
  2011-03-02 10:56             ` Alon Levy
  (?)
@ 2011-03-02 11:34             ` Jan Kiszka
  2011-03-02 12:32                 ` Alon Levy
  -1 siblings, 1 reply; 56+ messages in thread
From: Jan Kiszka @ 2011-03-02 11:34 UTC (permalink / raw)
  To: xming, Gerd Hoffmann, qemu-devel, kvm

[-- Attachment #1: Type: text/plain, Size: 1438 bytes --]

On 2011-03-02 11:56, Alon Levy wrote:
> On Wed, Mar 02, 2011 at 09:22:35AM +0100, Jan Kiszka wrote:
>> On 2011-03-01 13:58, Alon Levy wrote:
>>> On Sun, Feb 27, 2011 at 08:11:26PM +0100, Jan Kiszka wrote:
>>>> On 2011-02-27 20:03, Alon Levy wrote:
>>>>> On Sat, Feb 26, 2011 at 01:29:01PM +0100, Jan Kiszka wrote:
>>>>>> On 2011-02-26 12:43, xming wrote:
>>>>>>> When trying to start X (and it loads qxl driver) the kvm process just crashes.
>>>>>
>>>>> This is fixed by Gerd's attached patch (taken from rhel repository, don't know
>>>>> why it wasn't pushed to qemu-kvm upstream). I'll send it to kvm list as well (separate email).
>>>>
>>>> Patch looks OK on first glance, but the changelog is misleading: This
>>>> was broken for _both_ trees, but upstream didn't detect the bug.
>>>
>>> So I didn't test with qemu not having this patch, but according to the discussion in the
>>> launchpad bug the problem only happens with qemu-kvm. This doesn't rule out it being a
>>> bug, perhaps it is just triggered much less frequently I guess.
>>
>> Again: qemu-kvm has the instrumentation to detect the bug, qemu is
>> lacking this, but both trees will break subtly if cpu_single_env is not
>> properly restored.
> 
> ok, so what do you want to be done further before this patch is applied?

The patch posted to qemu-devel just requires a changelog that correctly
reflects what it addresses (and where).

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 259 bytes --]

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] Re: kvm crashes with spice while loading qxl
  2011-03-02 11:34             ` Jan Kiszka
@ 2011-03-02 12:32                 ` Alon Levy
  0 siblings, 0 replies; 56+ messages in thread
From: Alon Levy @ 2011-03-02 12:32 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xming, Gerd Hoffmann, qemu-devel, kvm

On Wed, Mar 02, 2011 at 12:34:24PM +0100, Jan Kiszka wrote:
> On 2011-03-02 11:56, Alon Levy wrote:
> > On Wed, Mar 02, 2011 at 09:22:35AM +0100, Jan Kiszka wrote:
> >> On 2011-03-01 13:58, Alon Levy wrote:
> >>> On Sun, Feb 27, 2011 at 08:11:26PM +0100, Jan Kiszka wrote:
> >>>> On 2011-02-27 20:03, Alon Levy wrote:
> >>>>> On Sat, Feb 26, 2011 at 01:29:01PM +0100, Jan Kiszka wrote:
> >>>>>> On 2011-02-26 12:43, xming wrote:
> >>>>>>> When trying to start X (and it loads qxl driver) the kvm process just crashes.
> >>>>>
> >>>>> This is fixed by Gerd's attached patch (taken from rhel repository, don't know
> >>>>> why it wasn't pushed to qemu-kvm upstream). I'll send it to kvm list as well (separate email).
> >>>>
> >>>> Patch looks OK on first glance, but the changelog is misleading: This
> >>>> was broken for _both_ trees, but upstream didn't detect the bug.
> >>>
> >>> So I didn't test with qemu not having this patch, but according to the discussion in the
> >>> launchpad bug the problem only happens with qemu-kvm. This doesn't rule out it being a
> >>> bug, perhaps it is just triggered much less frequently I guess.
> >>
> >> Again: qemu-kvm has the instrumentation to detect the bug, qemu is
> >> lacking this, but both trees will break subtly if cpu_single_env is not
> >> properly restored.
> > 
> > ok, so what do you want to be done further before this patch is applied?
> 
> The patch posted to qemu-devel just requires a changelog that correctly
> reflects what it addresses (and where).

Just sent,

Alon

> 
> Jan
> 



^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: kvm crashes with spice while loading qxl
  2011-02-26 12:29   ` [Qemu-devel] " Jan Kiszka
@ 2011-03-05 16:35     ` Marcelo Tosatti
  -1 siblings, 0 replies; 56+ messages in thread
From: Marcelo Tosatti @ 2011-03-05 16:35 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: xming, Gerd Hoffmann, kvm, qemu-devel, Paolo Bonzini, Avi Kivity

On Sat, Feb 26, 2011 at 01:29:01PM +0100, Jan Kiszka wrote:
> >     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1466
> > #12 0x00007ffff77bb944 in start_thread () from /lib/libpthread.so.0
> > #13 0x00007ffff5e491dd in clone () from /lib/libc.so.6
> > (gdb)
> 
> That's a spice bug. In fact, there are a lot of
> qemu_mutex_lock/unlock_iothread in that subsystem. I bet at least a few
> of them can cause even more subtle problems.
> 
> Two general issues with dropping the global mutex like this:
>  - The caller of mutex_unlock is responsible for maintaining
>    cpu_single_env across the unlocked phase (that's related to the
>    abort above).
>  - Dropping the lock in the middle of a callback is risky. That may
>    enable re-entrances of code sections that weren't designed for this
>    (I'm skeptic about the side effects of
>    qemu_spice_vm_change_state_handler - why dropping the lock here?).
> 
> Spice requires a careful review regarding such issues. Or it should
> pioneer with introducing its own lock so that we can handle at least
> related I/O activities over the VCPUs without holding the global mutex
> (but I bet it's not the simplest candidate for such a new scheme).
> 
> Jan
> 

Agree with the concern regarding spice.

Regarding global mutex, TCG and KVM execution behaviour can become more
similar wrt locking by dropping qemu_global_mutex during generation and
execution of TBs.

Of course for memory or PIO accesses from vcpu context qemu_global_mutex
must be acquired.

With that in place, it becomes easier to justify further improvements
regarding parallelization, such as using a read-write lock for
l1_phys_map / phys_page_find_alloc.


 21.62%               sh            3d38920b3f  [.] 0x00003d38920b3f                  
  6.38%               sh  qemu-system-x86_64    [.] phys_page_find_alloc              
  4.90%               sh  qemu-system-x86_64    [.] tb_find_fast                      
  4.34%               sh  qemu-system-x86_64    [.] tlb_flush  
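
Sketched with pthreads and hypothetical wrapper names (the real lookup lives
in exec.c), the read-write-lock idea would look roughly like:

    #include <pthread.h>

    static pthread_rwlock_t phys_map_lock = PTHREAD_RWLOCK_INITIALIZER;

    /* vcpu threads: many concurrent readers of l1_phys_map */
    PhysPageDesc *phys_page_find_locked(target_phys_addr_t index)
    {
        PhysPageDesc *pd;

        pthread_rwlock_rdlock(&phys_map_lock);
        pd = phys_page_find(index);          /* the existing lookup, unchanged */
        pthread_rwlock_unlock(&phys_map_lock);
        return pd;
    }

    /* io thread: rare writers, e.g. when registering memory */
    void phys_map_update_locked(void (*update)(void *opaque), void *opaque)
    {
        pthread_rwlock_wrlock(&phys_map_lock);
        update(opaque);
        pthread_rwlock_unlock(&phys_map_lock);
    }

Note that returning pd after dropping the read lock is only safe as long as
descriptors are never freed, which is part of why RCU comes up later in the
thread.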


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: kvm crashes with spice while loading qxl
  2011-03-05 16:35     ` [Qemu-devel] " Marcelo Tosatti
@ 2011-03-05 17:11       ` Paolo Bonzini
  -1 siblings, 0 replies; 56+ messages in thread
From: Paolo Bonzini @ 2011-03-05 17:11 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Jan Kiszka, xming, Gerd Hoffmann, kvm, qemu-devel, Avi Kivity

On 03/05/2011 05:35 PM, Marcelo Tosatti wrote:
> TCG and KVM execution behaviour can become more
> similar wrt locking by dropping qemu_global_mutex during generation and
> execution of TBs.
>
> Of course for memory or PIO accesses from vcpu context qemu_global_mutex
> must be acquired.

-icount already has most of the machinery needed for this.

At this point, I think people interested in TCG should choose between 
staying with legacy and gaining a bit more speed for uniprocessor 
simulation, or following KVM with a more complex architecture and 
finer-grained locking---but also a more future-proof design.

BTW, I'll soon post patches for iothread -icount that work (at least in 
my tests), are much simpler than anything I posted so far, and hopefully 
will show that iothread is not fundamentally incompatible with anything 
(and in fact can simplify things much more than complicate them).

> With that in place, it becomes easier to justify further improvements
> regarding parallelization, such as using a read-write lock for
> l1_phys_map / phys_page_find_alloc.

Or URCU (userspace RCU), too.

Paolo

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: kvm crashes with spice while loading qxl
  2011-03-05 16:35     ` [Qemu-devel] " Marcelo Tosatti
@ 2011-03-06 10:30       ` Alon Levy
  -1 siblings, 0 replies; 56+ messages in thread
From: Alon Levy @ 2011-03-06 10:30 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Jan Kiszka, xming, Gerd Hoffmann, kvm, qemu-devel, Paolo Bonzini,
	Avi Kivity

On Sat, Mar 05, 2011 at 01:35:58PM -0300, Marcelo Tosatti wrote:
> On Sat, Feb 26, 2011 at 01:29:01PM +0100, Jan Kiszka wrote:
> > >     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1466
> > > #12 0x00007ffff77bb944 in start_thread () from /lib/libpthread.so.0
> > > #13 0x00007ffff5e491dd in clone () from /lib/libc.so.6
> > > (gdb)
> > 
> > That's a spice bug. In fact, there are a lot of
> > qemu_mutex_lock/unlock_iothread in that subsystem. I bet at least a few
> > of them can cause even more subtle problems.
> > 
> > Two general issues with dropping the global mutex like this:
> >  - The caller of mutex_unlock is responsible for maintaining
> >    cpu_single_env across the unlocked phase (that's related to the
> >    abort above).
> >  - Dropping the lock in the middle of a callback is risky. That may
> >    enable re-entrances of code sections that weren't designed for this
> >    (I'm skeptic about the side effects of
> >    qemu_spice_vm_change_state_handler - why dropping the lock here?).
> > 
> > Spice requires a careful review regarding such issues. Or it should
> > pioneer with introducing its own lock so that we can handle at least
> > related I/O activities over the VCPUs without holding the global mutex
> > (but I bet it's not the simplest candidate for such a new scheme).
> > 
> > Jan
> > 
> 
> Agree with the concern regarding spice.
> 

What are the pros and cons of (re)introducing a spice specific lock?
 + simplicity. Only spice touches the spice lock.
 - ? what were the original reasons for Gerd dropping the spice lock?

I have no problem reintroducing this lock, I'm just concerned that it's
wasted effort because after I send that patch someone will jump and remind
me why it was removed in the first place.

> Regarding global mutex, TCG and KVM execution behaviour can become more
> similar wrt locking by dropping qemu_global_mutex during generation and
> execution of TBs.
> 
> Of course for memory or PIO accesses from vcpu context qemu_global_mutex
> must be acquired.
> 
> With that in place, it becomes easier to justify further improvements
> regarding parallelization, such as using a read-write lock for
> l1_phys_map / phys_page_find_alloc.
> 
> 
>  21.62%               sh            3d38920b3f  [.] 0x00003d38920b3f                  
>   6.38%               sh  qemu-system-x86_64    [.] phys_page_find_alloc              
>   4.90%               sh  qemu-system-x86_64    [.] tb_find_fast                      
>   4.34%               sh  qemu-system-x86_64    [.] tlb_flush  
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: kvm crashes with spice while loading qxl
  2011-03-05 16:35     ` [Qemu-devel] " Marcelo Tosatti
@ 2011-03-06 10:38       ` Avi Kivity
  -1 siblings, 0 replies; 56+ messages in thread
From: Avi Kivity @ 2011-03-06 10:38 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Jan Kiszka, xming, Gerd Hoffmann, kvm, qemu-devel, Paolo Bonzini

On 03/05/2011 06:35 PM, Marcelo Tosatti wrote:
> Regarding global mutex, TCG and KVM execution behaviour can become more
> similar wrt locking by dropping qemu_global_mutex during generation and
> execution of TBs.

How can you do that?  During generation, a device can assert the reset 
line, changing cpu modes, or move the memory map.

During execution, tcg accesses memory a lot.  So we'll need to acquire 
qemu_global_mutex for every memory access, and separate protection for TB.

kvm achieves lockless protection by forcing vcpus off and dropping their 
page tables while executing natively, and using srcu while emulating.  
We can do something similar for tcg, but it won't be easy.

> Of course for memory or PIO accesses from vcpu context qemu_global_mutex
> must be acquired.

Yes, and not just mmio - all memory accesses.

> With that in place, it becomes easier to justify further improvements
> regarding parallelization, such as using a read-write lock for
> l1_phys_map / phys_page_find_alloc.
>
>
>   21.62%               sh            3d38920b3f  [.] 0x00003d38920b3f
>    6.38%               sh  qemu-system-x86_64    [.] phys_page_find_alloc

should be replaced by a memslot list probably

>    4.90%               sh  qemu-system-x86_64    [.] tb_find_fast
>    4.34%               sh  qemu-system-x86_64    [.] tlb_flush
>


-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: kvm crashes with spice while loading qxl
  2011-03-06 10:30       ` [Qemu-devel] " Alon Levy
@ 2011-03-07 16:02         ` Marcelo Tosatti
  -1 siblings, 0 replies; 56+ messages in thread
From: Marcelo Tosatti @ 2011-03-07 16:02 UTC (permalink / raw)
  To: Jan Kiszka, xming, Gerd Hoffmann, kvm, qemu-devel, Paolo Bonzini

On Sun, Mar 06, 2011 at 12:30:59PM +0200, Alon Levy wrote:
> On Sat, Mar 05, 2011 at 01:35:58PM -0300, Marcelo Tosatti wrote:
> > On Sat, Feb 26, 2011 at 01:29:01PM +0100, Jan Kiszka wrote:
> > > >     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1466
> > > > #12 0x00007ffff77bb944 in start_thread () from /lib/libpthread.so.0
> > > > #13 0x00007ffff5e491dd in clone () from /lib/libc.so.6
> > > > (gdb)
> > > 
> > > That's a spice bug. In fact, there are a lot of
> > > qemu_mutex_lock/unlock_iothread in that subsystem. I bet at least a few
> > > of them can cause even more subtle problems.
> > > 
> > > Two general issues with dropping the global mutex like this:
> > >  - The caller of mutex_unlock is responsible for maintaining
> > >    cpu_single_env across the unlocked phase (that's related to the
> > >    abort above).
> > >  - Dropping the lock in the middle of a callback is risky. That may
> > >    enable re-entrances of code sections that weren't designed for this
> > >    (I'm skeptic about the side effects of
> > >    qemu_spice_vm_change_state_handler - why dropping the lock here?).
> > > 
> > > Spice requires a careful review regarding such issues. Or it should
> > > pioneer with introducing its own lock so that we can handle at least
> > > related I/O activities over the VCPUs without holding the global mutex
> > > (but I bet it's not the simplest candidate for such a new scheme).
> > > 
> > > Jan
> > > 
> > 
> > Agree with the concern regarding spice.
> > 
> 
> What are the pros and cons of (re)introducing a spice specific lock?
>  + simplicity. Only spice touches the spice lock.
>  - ? what were the original reasons for Gerd dropping the spice lock?
> 
> I have no problem reintroducing this lock, I'm just concerned that it's
> wasted effort because after I send that patch someone will jump and remind
> me why it was removed in the first place.

Well, can't comment on why it was done or a proper way to fix it.
Point is that dropping the global lock requires careful review to verify
safety, as Jan mentioned.
For example, a potential problem would be:

  vcpu context                  iothread context
  qxl pio write 
  drop lock 
                                acquire lock
                                1) change state
                                drop lock
  acquire lock

1) could be device hotunplug, system reset, etc.


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: kvm crashes with spice while loading qxl
  2011-03-06 10:38       ` [Qemu-devel] " Avi Kivity
@ 2011-03-07 16:13         ` Marcelo Tosatti
  -1 siblings, 0 replies; 56+ messages in thread
From: Marcelo Tosatti @ 2011-03-07 16:13 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Jan Kiszka, xming, Gerd Hoffmann, kvm, qemu-devel, Paolo Bonzini

On Sun, Mar 06, 2011 at 12:38:44PM +0200, Avi Kivity wrote:
> On 03/05/2011 06:35 PM, Marcelo Tosatti wrote:
> >Regarding global mutex, TCG and KVM execution behaviour can become more
> >similar wrt locking by dropping qemu_global_mutex during generation and
> >execution of TBs.
> 
> How can you do that?  During generation, a device can assert the
> reset line, changing cpu modes,

Writes to CPUState fields can be moved to vcpu thread context
(run_on_cpu), and reads performed under a lock.

Good thing is most CPUState accesses are local to vcpu context.
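
For example, using the existing run_on_cpu() mechanism (a sketch; what exactly
gets queued is beside the point here):

    /* Runs in the vcpu thread, so it may touch CPUState freely. */
    static void do_cpu_reset_request(void *data)
    {
        CPUState *env = data;
        cpu_reset(env);
    }

    /* Called from device/io-thread context instead of writing env fields
     * directly. */
    static void request_cpu_reset(CPUState *env)
    {
        run_on_cpu(env, do_cpu_reset_request, env);
    }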

>  or move the memory map.

Memory map can be protected by a read-write lock initially, so that vcpu
thread holds it for read. Later can be converted to URCU.

Is write access to memory map necessary from vcpu context?

> During execution, tcg accesses memory a lot.  So we'll need to
> acquire qemu_global_mutex for every memory access, and separate
> protection for TB.
> 
> kvm achieves lockless protection by forcing vcpus off and dropping
> their page tables while executing natively, and using srcu while
> emulating.  We can do something similar for tcg, but it won't be
> easy.
> 
> >Of course for memory or PIO accesses from vcpu context qemu_global_mutex
> >must be acquired.
> 
> Yes, and not just mmio - all memory accesses.
> 
> >With that in place, it becomes easier to justify further improvements
> >regarding parallelization, such as using a read-write lock for
> >l1_phys_map / phys_page_find_alloc.
> >
> >
> >  21.62%               sh            3d38920b3f  [.] 0x00003d38920b3f
> >   6.38%               sh  qemu-system-x86_64    [.] phys_page_find_alloc
> 
> should be replaced by a memslot list probably
> 
> >   4.90%               sh  qemu-system-x86_64    [.] tb_find_fast
> >   4.34%               sh  qemu-system-x86_64    [.] tlb_flush
> >
> 
> 
> -- 
> error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: kvm crashes with spice while loading qxl
  2011-03-07 16:13         ` [Qemu-devel] " Marcelo Tosatti
@ 2011-03-07 22:27           ` Paolo Bonzini
  -1 siblings, 0 replies; 56+ messages in thread
From: Paolo Bonzini @ 2011-03-07 22:27 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Avi Kivity, Jan Kiszka, xming, Gerd Hoffmann, kvm, qemu-devel

On 03/07/2011 05:13 PM, Marcelo Tosatti wrote:
> Is write access to memory map necessary from vcpu context?

At least in x86 world SMM is entered from vcpu context, so I guess yes.

Paolo

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: kvm crashes with spice while loading qxl
  2011-03-07 16:13         ` [Qemu-devel] " Marcelo Tosatti
@ 2011-03-08  9:17           ` Avi Kivity
  -1 siblings, 0 replies; 56+ messages in thread
From: Avi Kivity @ 2011-03-08  9:17 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Jan Kiszka, xming, Gerd Hoffmann, kvm, qemu-devel, Paolo Bonzini

On 03/07/2011 06:13 PM, Marcelo Tosatti wrote:
> On Sun, Mar 06, 2011 at 12:38:44PM +0200, Avi Kivity wrote:
> >  On 03/05/2011 06:35 PM, Marcelo Tosatti wrote:
> >  >Regarding global mutex, TCG and KVM execution behaviour can become more
> >  >similar wrt locking by dropping qemu_global_mutex during generation and
> >  >execution of TBs.
> >
> >  How can you do that?  During generation, a device can assert the
> >  reset line, changing cpu modes,
>
> Writes to CPUState fields can be moved to vcpu thread context
> (run_on_cpu), and reads performed under a lock.
>
> Good thing is most CPUState accesses are local to vcpu context.

That's a really good idea.  And in fact that's what we do in kvm: if we 
have something to do to a cpu, we queue it up in vcpu->requests and let 
the vcpu thread process it.
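
A rough userspace sketch of that request-queue pattern (the VCpu type, the
REQ_* bits and the SIGUSR1 kick are made up for illustration; KVM's
in-kernel vcpu->requests handling differs in detail):

#include <stdatomic.h>
#include <signal.h>
#include <pthread.h>

#define REQ_RESET      (1u << 0)   /* e.g. a device asserted the reset line */
#define REQ_FLUSH_TLB  (1u << 1)

typedef struct VCpu {
    pthread_t thread;
    atomic_uint requests;
    /* ... per-vcpu state, touched only from the vcpu thread ... */
} VCpu;

/* Any thread: queue work for the vcpu and kick it out of guest execution.
 * Assumes the vcpu thread installed a no-op SIGUSR1 handler, so the signal
 * only interrupts execution instead of killing the process. */
static void vcpu_request(VCpu *cpu, unsigned int req)
{
    atomic_fetch_or(&cpu->requests, req);
    pthread_kill(cpu->thread, SIGUSR1);
}

/* vcpu thread: drain requests at a safe point in its main loop. */
static void vcpu_process_requests(VCpu *cpu)
{
    unsigned int req = atomic_exchange(&cpu->requests, 0);

    if (req & REQ_RESET) {
        /* reset the CPU state here, in vcpu context */
    }
    if (req & REQ_FLUSH_TLB) {
        /* flush the software TLB here */
    }
}

The point is that other threads only set a bit and kick; the vcpu thread is
the only one that ever mutates its own state.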

> >   or move the memory map.
>
> Memory map can be protected by a read-write lock initially, so that vcpu
> thread holds it for read. Later can be converted to URCU.

An rwlock is insufficient; we need a way to force the vcpu off so that a
writer can actually make progress.

So we need some kind of priority rwlock where a reader tells the lock how
it can be forced off in case a writer comes along.

> Is write access to memory map necessary from vcpu context?

Yes; examples are writes to PCI BARs or the VGA window at 0xa0000.

We can move them to a device thread, but the vcpu thread will have to 
wait for it (so it needs to drop any locks anyway).

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: kvm crashes with spice while loading qxl
  2011-03-08  9:17           ` [Qemu-devel] " Avi Kivity
@ 2011-03-08  9:28             ` Paolo Bonzini
  -1 siblings, 0 replies; 56+ messages in thread
From: Paolo Bonzini @ 2011-03-08  9:28 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Marcelo Tosatti, Jan Kiszka, xming, Gerd Hoffmann, kvm, qemu-devel

On 03/08/2011 10:17 AM, Avi Kivity wrote:
>> Memory map can be protected by a read-write lock initially, so that vcpu
>> thread holds it for read. Later can be converted to URCU.
>
> rwlock is insufficient, need a way to force the vcpu off so a writer can
> actually do something.
>
> So we need some kind of priority rwlock where a reader lets the lock
> know how it can force it off in case a writer comes along.

If the reader is the VCPU thread, the VCPU thread can always be bounced 
via qemu_cpu_kick.

Paolo

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: kvm crashes with spice while loading qxl
  2011-03-08  9:28             ` [Qemu-devel] " Paolo Bonzini
@ 2011-03-08  9:32               ` Avi Kivity
  -1 siblings, 0 replies; 56+ messages in thread
From: Avi Kivity @ 2011-03-08  9:32 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Marcelo Tosatti, Jan Kiszka, xming, Gerd Hoffmann, kvm, qemu-devel

On 03/08/2011 11:28 AM, Paolo Bonzini wrote:
> On 03/08/2011 10:17 AM, Avi Kivity wrote:
>>> Memory map can be protected by a read-write lock initially, so that 
>>> vcpu
>>> thread holds it for read. Later can be converted to URCU.
>>
>> rwlock is insufficient, need a way to force the vcpu off so a writer can
>> actually do something.
>>
>> So we need some kind of priority rwlock where a reader lets the lock
>> know how it can force it off in case a writer comes along.
>
> If the reader is the VCPU thread, the VCPU thread can always be 
> bounced via qemu_cpu_kick.
>

Right, I just want the lock to do the kicking automatically instead of 
sprinkling it everywhere we take a write lock.
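
A sketch of what hiding the kick inside the lock could look like (entirely
hypothetical; the callback would be something like kicking every vcpu via
qemu_cpu_kick):

#include <pthread.h>

/* Hypothetical "priority rwlock": readers register, at init time, a callback
 * that the write side can use to bounce them off the lock. */
typedef struct PrioRwlock {
    pthread_rwlock_t lock;
    void (*kick_readers)(void *opaque);
    void *opaque;
} PrioRwlock;

static void prio_rwlock_init(PrioRwlock *l, void (*kick)(void *), void *opaque)
{
    pthread_rwlock_init(&l->lock, NULL);
    l->kick_readers = kick;
    l->opaque = opaque;
}

static void prio_read_lock(PrioRwlock *l)    { pthread_rwlock_rdlock(&l->lock); }
static void prio_read_unlock(PrioRwlock *l)  { pthread_rwlock_unlock(&l->lock); }

/* The write side kicks the current readers (vcpus) so they reach a point
 * where they drop the read lock, then blocks until it gets the lock itself.
 * Callers never have to remember to kick anything. */
static void prio_write_lock(PrioRwlock *l)
{
    if (l->kick_readers) {
        l->kick_readers(l->opaque);
    }
    pthread_rwlock_wrlock(&l->lock);
}

static void prio_write_unlock(PrioRwlock *l) { pthread_rwlock_unlock(&l->lock); }

A real implementation would also need a writer-pending flag so that kicked
vcpus do not immediately re-take the read lock and starve the writer; the
sketch leaves that out.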

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: kvm crashes with spice while loading qxl
  2011-02-26 12:29   ` [Qemu-devel] " Jan Kiszka
@ 2011-04-26  8:53     ` Gerd Hoffmann
  -1 siblings, 0 replies; 56+ messages in thread
From: Gerd Hoffmann @ 2011-04-26  8:53 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xming, qemu-devel, kvm

   Hi,

[ ... back online now ... ]

>> /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1724:
>> kvm_mutex_unlock: Assertion `!cpu_single_env' failed.

> That's a spice bug. In fact, there are a lot of
> qemu_mutex_lock/unlock_iothread in that subsystem. I bet at least a few
> of them can cause even more subtle problems.
>
> Two general issues with dropping the global mutex like this:
>   - The caller of mutex_unlock is responsible for maintaining
>     cpu_single_env across the unlocked phase (that's related to the
>     abort above).

This is true for qemu-kvm only, right?

qemu-kvm specific patches which add the cpu_single_env tracking (not 
polished yet) are here:

http://cgit.freedesktop.org/spice/qemu/log/?h=spice.kvm.v28
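
Roughly what such tracking amounts to (cpu_single_env and the
qemu_mutex_*_iothread functions are real; the wrapper names below are
invented for illustration): whoever drops the global lock from vcpu context
saves cpu_single_env and restores it after re-acquiring the lock, and a pair
of wrappers can do that centrally instead of at every call site.

typedef struct CPUState CPUState;

extern CPUState *cpu_single_env;   /* set while a vcpu executes device code */
extern void qemu_mutex_lock_iothread(void);
extern void qemu_mutex_unlock_iothread(void);

static void spice_unlock_iothread(CPUState **saved_env)
{
    *saved_env = cpu_single_env;    /* remember which vcpu we were */
    qemu_mutex_unlock_iothread();   /* cpu_single_env is unreliable from here */
}

static void spice_lock_iothread(CPUState *saved_env)
{
    qemu_mutex_lock_iothread();
    cpu_single_env = saved_env;     /* restore before returning to vcpu code */
}

Code such as the qxl dispatcher would then call these wrappers instead of
the raw lock/unlock functions.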

>   - Dropping the lock in the middle of a callback is risky. That may
>     enable re-entrances of code sections that weren't designed for this

Hmm, indeed.

> Spice requires a careful review regarding such issues. Or it should
> pioneer with introducing its own lock so that we can handle at least
> related I/O activities over the VCPUs without holding the global mutex
> (but I bet it's not the simplest candidate for such a new scheme).

spice/qxl used to have its own locking scheme.  That didn't work out 
though: the spice server is threaded and calls back into qxl from spice 
thread context, and some of these callbacks need access to qemu data 
structures (the display surface) and thus need lock protection that covers 
more than just the spice subsystem.
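
Reduced to a hypothetical shape (these structs are not the spice or qxl
API), the problem looks like this: a spice worker thread calls back into the
device, and the callback has to touch QEMU-owned state such as the display
surface, so nothing short of the global lock protects it.

#include <stddef.h>

/* Hypothetical shapes only. */
typedef struct DisplaySurface {
    int width, height;
    unsigned char *data;
} DisplaySurface;

typedef struct QXLState {
    DisplaySurface *surface;   /* owned by QEMU; swapped by the iothread on mode switches */
} QXLState;

/* Lock functions as discussed in this thread; declared here to keep the
 * sketch self-contained. */
extern void qemu_mutex_lock_iothread(void);
extern void qemu_mutex_unlock_iothread(void);

/* Runs in a spice worker thread, i.e. not under the global mutex. */
static void worker_update_area(QXLState *qxl)
{
    /* Taking the global lock here is exactly the pattern that triggered the
     * cpu_single_env assertion above; a qxl-private lock would not help,
     * because the iothread replaces qxl->surface under the global lock. */
    qemu_mutex_lock_iothread();
    if (qxl->surface != NULL) {
        /* ... copy the dirty region out of qxl->surface->data ... */
    }
    qemu_mutex_unlock_iothread();
}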

I'll take another hard look at whether I can find a way out of this 
(preferably by dropping the need for the global lock somehow).  For now I'm 
pretty busy with the email backlog though ...

cheers,
   Gerd

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: kvm crashes with spice while loading qxl
  2011-04-26  8:53     ` [Qemu-devel] " Gerd Hoffmann
@ 2011-04-26  9:06       ` Jan Kiszka
  -1 siblings, 0 replies; 56+ messages in thread
From: Jan Kiszka @ 2011-04-26  9:06 UTC (permalink / raw)
  To: Gerd Hoffmann; +Cc: xming, kvm, qemu-devel

On 2011-04-26 10:53, Gerd Hoffmann wrote:
>   Hi,
> 
> [ ... back online now ... ]
> 
>>> /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1724:
>>>
>>> kvm_mutex_unlock: Assertion `!cpu_single_env' failed.
> 
>> That's a spice bug. In fact, there are a lot of
>> qemu_mutex_lock/unlock_iothread in that subsystem. I bet at least a few
>> of them can cause even more subtle problems.
>>
>> Two general issues with dropping the global mutex like this:
>>   - The caller of mutex_unlock is responsible for maintaining
>>     cpu_single_env across the unlocked phase (that's related to the
>>     abort above).
> 
> This is true for qemu-kvm only, right?

Nope, this applies to both implementations.

> 
> qemu-kvm specific patches which add the cpu_single_env tracking (not
> polished yet) are here:
> 
> http://cgit.freedesktop.org/spice/qemu/log/?h=spice.kvm.v28

Cannot spot that quickly: In which way are they specific to qemu-kvm?

If they are, try to focus on upstream first. The qemu-kvm differences
are virtually deprecated, and I hope we can remove them really soon now
(my patches are all ready).

> 
>>   - Dropping the lock in the middle of a callback is risky. That may
>>     enable re-entrances of code sections that weren't designed for this
> 
> Hmm, indeed.
> 
>> Spice requires a careful review regarding such issues. Or it should
>> pioneer with introducing its own lock so that we can handle at least
>> related I/O activities over the VCPUs without holding the global mutex
>> (but I bet it's not the simplest candidate for such a new scheme).
> 
> spice/qxl used to have its own locking scheme.  That didn't work out
> though.  spice server is threaded and calls back into qxl from spice
> thread context, and some of these callbacks need access to qemu data
> structures (display surface) and thus lock protection which covers more
> than just the spice subsystem.
> 
> I'll look hard again whenever I can find a way out of this (preferably
> drop the need for the global lock somehow).  For now I'm pretty busy
> with the email backlog though ...

Yeah, I can imagine...

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: kvm crashes with spice while loading qxl
  2011-04-26  8:53     ` [Qemu-devel] " Gerd Hoffmann
@ 2011-04-26  9:34       ` Alon Levy
  -1 siblings, 0 replies; 56+ messages in thread
From: Alon Levy @ 2011-04-26  9:34 UTC (permalink / raw)
  To: Gerd Hoffmann; +Cc: xming, Jan Kiszka, qemu-devel, kvm

On Tue, Apr 26, 2011 at 10:53:04AM +0200, Gerd Hoffmann wrote:
>   Hi,
> 
> [ ... back online now ... ]
> 
> >>/var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1724:
> >>kvm_mutex_unlock: Assertion `!cpu_single_env' failed.
> 
> >That's a spice bug. In fact, there are a lot of
> >qemu_mutex_lock/unlock_iothread in that subsystem. I bet at least a few
> >of them can cause even more subtle problems.
> >
> >Two general issues with dropping the global mutex like this:
> >  - The caller of mutex_unlock is responsible for maintaining
> >    cpu_single_env across the unlocked phase (that's related to the
> >    abort above).
> 
> This is true for qemu-kvm only, right?
> 
> qemu-kvm specific patches which add the cpu_single_env tracking (not
> polished yet) are here:
> 
> http://cgit.freedesktop.org/spice/qemu/log/?h=spice.kvm.v28
> 
> >  - Dropping the lock in the middle of a callback is risky. That may
> >    enable re-entrances of code sections that weren't designed for this
> 
> Hmm, indeed.
> 
> >Spice requires a careful review regarding such issues. Or it should
> >pioneer with introducing its own lock so that we can handle at least
> >related I/O activities over the VCPUs without holding the global mutex
> >(but I bet it's not the simplest candidate for such a new scheme).
> 
> spice/qxl used to have its own locking scheme.  That didn't work out
> though.  spice server is threaded and calls back into qxl from spice
> thread context, and some of these callbacks need access to qemu data
> structures (display surface) and thus lock protection which covers
> more than just the spice subsystem.
> 
> I'll look hard again whenever I can find a way out of this
> (preferably drop the need for the global lock somehow).  For now I'm
> pretty busy with the email backlog though ...
> 

We (Hans, Uri, and I) have already sent a fix for this; it seems to have
passed everyone's testing, and it basically does just that - it drops the
use of the mutex. It's in http://cgit.freedesktop.org/spice/qemu/log/?h=spice.v32.kvm,
see the patches:

 qxl/spice-display: move pipe to ssd
 qxl: implement get_command in vga mode without locks
 qxl/spice: remove qemu_mutex_{un,}lock_iothread around dispatcher
 hw/qxl-render: drop cursor locks, replace with pipe

And see the patch comments in particular, too.
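
The general shape of that pipe-based approach, reduced to a hypothetical
sketch (invented names; see the patches above for the real code): the spice
thread only writes a small message into a pipe, and the iothread's main loop
reads it and touches the QEMU-side state, so neither thread has to take or
drop the global mutex from the wrong context.

#include <unistd.h>
#include <stdint.h>

/* Hypothetical message handed from the spice thread to the iothread. */
typedef struct CursorMsg {
    uint32_t type;
    uint32_t x, y;
} CursorMsg;

static int cursor_pipe[2];   /* [0] read end (iothread), [1] write end (spice thread) */

static int cursor_pipe_init(void)
{
    return pipe(cursor_pipe);
}

/* spice worker thread: no locks, just queue the event. */
static void spice_thread_notify_cursor(uint32_t type, uint32_t x, uint32_t y)
{
    CursorMsg msg = { type, x, y };
    (void)write(cursor_pipe[1], &msg, sizeof(msg));
}

/* iothread: the read end would be registered with the main loop (e.g. via
 * qemu_set_fd_handler), so this handler already runs under the global mutex. */
static void iothread_cursor_read(void *opaque)
{
    CursorMsg msg;

    (void)opaque;
    if (read(cursor_pipe[0], &msg, sizeof(msg)) == (ssize_t)sizeof(msg)) {
        /* update the QEMU-side cursor/display state here */
    }
}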

Alon

> cheers,
>   Gerd

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: kvm crashes with spice while loading qxl
  2011-04-26  9:06       ` [Qemu-devel] " Jan Kiszka
@ 2011-04-26  9:43         ` Gerd Hoffmann
  -1 siblings, 0 replies; 56+ messages in thread
From: Gerd Hoffmann @ 2011-04-26  9:43 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xming, kvm, qemu-devel

On 04/26/11 11:06, Jan Kiszka wrote:
> On 2011-04-26 10:53, Gerd Hoffmann wrote:
>>> Two general issues with dropping the global mutex like this:
>>>    - The caller of mutex_unlock is responsible for maintaining
>>>      cpu_single_env across the unlocked phase (that's related to the
>>>      abort above).
>>
>> This is true for qemu-kvm only, right?
>
> Nope, this applies to both implementations.

Oops.

>> qemu-kvm specific patches which add the cpu_single_env tracking (not
>> polished yet) are here:
>>
>> http://cgit.freedesktop.org/spice/qemu/log/?h=spice.kvm.v28
>
> Cannot spot that quickly: In which way are they specific to qemu-kvm?

cpu_single_env bookkeeping.  But if upstream needs that too, having 
specific patches is pretty pointless.  I'll go fix it up upstream then.

cheers,
   Gerd

^ permalink raw reply	[flat|nested] 56+ messages in thread

end of thread, other threads:[~2011-04-26  9:43 UTC | newest]

Thread overview: 56+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-02-26 11:43 kvm crashes with spice while loading qxl xming
2011-02-26 12:29 ` Jan Kiszka
2011-02-26 12:29   ` [Qemu-devel] " Jan Kiszka
2011-02-26 14:44   ` xming
2011-02-26 14:44     ` [Qemu-devel] " xming
2011-02-27 19:03   ` Alon Levy
2011-02-27 19:03     ` Alon Levy
2011-02-27 19:11     ` Jan Kiszka
2011-02-27 19:11       ` Jan Kiszka
2011-02-27 19:16       ` Alon Levy
2011-02-27 19:16         ` Alon Levy
2011-02-27 19:27         ` Jan Kiszka
2011-02-27 19:27           ` Jan Kiszka
2011-02-27 19:29           ` Alon Levy
2011-02-27 19:29             ` Alon Levy
2011-02-27 19:32           ` Alon Levy
2011-02-27 19:32             ` Alon Levy
2011-03-01 12:58       ` Alon Levy
2011-03-01 12:58         ` Alon Levy
2011-03-02  8:22         ` Jan Kiszka
2011-03-02 10:56           ` Alon Levy
2011-03-02 10:56             ` Alon Levy
2011-03-02 11:34             ` Jan Kiszka
2011-03-02 12:32               ` Alon Levy
2011-03-02 12:32                 ` Alon Levy
2011-02-28 12:56     ` xming
2011-03-01  3:56     ` Rick Vernam
2011-03-01  3:56       ` [Qemu-devel] " Rick Vernam
2011-03-05 16:35   ` Marcelo Tosatti
2011-03-05 16:35     ` [Qemu-devel] " Marcelo Tosatti
2011-03-05 17:11     ` Paolo Bonzini
2011-03-05 17:11       ` [Qemu-devel] " Paolo Bonzini
2011-03-06 10:30     ` Alon Levy
2011-03-06 10:30       ` [Qemu-devel] " Alon Levy
2011-03-07 16:02       ` Marcelo Tosatti
2011-03-07 16:02         ` [Qemu-devel] " Marcelo Tosatti
2011-03-06 10:38     ` Avi Kivity
2011-03-06 10:38       ` [Qemu-devel] " Avi Kivity
2011-03-07 16:13       ` Marcelo Tosatti
2011-03-07 16:13         ` [Qemu-devel] " Marcelo Tosatti
2011-03-07 22:27         ` Paolo Bonzini
2011-03-07 22:27           ` [Qemu-devel] " Paolo Bonzini
2011-03-08  9:17         ` Avi Kivity
2011-03-08  9:17           ` [Qemu-devel] " Avi Kivity
2011-03-08  9:28           ` Paolo Bonzini
2011-03-08  9:28             ` [Qemu-devel] " Paolo Bonzini
2011-03-08  9:32             ` Avi Kivity
2011-03-08  9:32               ` [Qemu-devel] " Avi Kivity
2011-04-26  8:53   ` Gerd Hoffmann
2011-04-26  8:53     ` [Qemu-devel] " Gerd Hoffmann
2011-04-26  9:06     ` Jan Kiszka
2011-04-26  9:06       ` [Qemu-devel] " Jan Kiszka
2011-04-26  9:43       ` Gerd Hoffmann
2011-04-26  9:43         ` [Qemu-devel] " Gerd Hoffmann
2011-04-26  9:34     ` Alon Levy
2011-04-26  9:34       ` [Qemu-devel] " Alon Levy
