* [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access
@ 2011-09-01 19:35 Luiz Capitulino
  2011-09-01 19:47 ` Daniel P. Berrange
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Luiz Capitulino @ 2011-09-01 19:35 UTC (permalink / raw)
  To: qemu-devel; +Cc: Marian Krcmarik, Alon Levy

Sometimes, when having lots of VMs running on a RHEV host and the user
attempts to close a SPICE window, libvirt will get corrupted json from
QEMU.

After some investigation, I found out that the problem is that different
SPICE threads are calling monitor functions (such as
monitor_protocol_event()) in parallel which causes concurrent access
to the monitor's internal buffer outbuf[].

This fixes the problem by protecting accesses to outbuf[] with a mutex.

Honestly speaking, I'm not completely sure this is the best thing to do
because the monitor itself and other qemu subsystems are not thread safe,
so having subsystems like SPICE assuming the contrary seems a bit
catastrophic to me...

Anyways, this commit fixes the problem at hand.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
---
 monitor.c |   16 +++++++++++++++-
 1 files changed, 15 insertions(+), 1 deletions(-)

diff --git a/monitor.c b/monitor.c
index 04f465a..61d4d93 100644
--- a/monitor.c
+++ b/monitor.c
@@ -57,6 +57,7 @@
 #include "json-parser.h"
 #include "osdep.h"
 #include "cpu.h"
+#include "qemu-thread.h"
 #ifdef CONFIG_SIMPLE_TRACE
 #include "trace.h"
 #endif
@@ -144,6 +145,7 @@ struct Monitor {
     int suspend_cnt;
     uint8_t outbuf[1024];
     int outbuf_index;
+    QemuMutex mutex;
     ReadLineState *rs;
     MonitorControl *mc;
     CPUState *mon_cpu;
@@ -246,10 +248,14 @@ static int monitor_read_password(Monitor *mon, ReadLineFunc *readline_func,
 
 void monitor_flush(Monitor *mon)
 {
+    qemu_mutex_lock(&mon->mutex);
+
     if (mon && mon->outbuf_index != 0 && !mon->mux_out) {
         qemu_chr_fe_write(mon->chr, mon->outbuf, mon->outbuf_index);
         mon->outbuf_index = 0;
     }
+
+    qemu_mutex_unlock(&mon->mutex);
 }
 
 /* flush at every end of line or if the buffer is full */
@@ -257,6 +263,8 @@ static void monitor_puts(Monitor *mon, const char *str)
 {
     char c;
 
+    qemu_mutex_lock(&mon->mutex);
+
     for(;;) {
         c = *str++;
         if (c == '\0')
@@ -265,9 +273,14 @@ static void monitor_puts(Monitor *mon, const char *str)
             mon->outbuf[mon->outbuf_index++] = '\r';
         mon->outbuf[mon->outbuf_index++] = c;
         if (mon->outbuf_index >= (sizeof(mon->outbuf) - 1)
-            || c == '\n')
+            || c == '\n') {
+            qemu_mutex_unlock(&mon->mutex);
             monitor_flush(mon);
+            qemu_mutex_lock(&mon->mutex);
+        }
     }
+
+    qemu_mutex_unlock(&mon->mutex);
 }
 
 void monitor_vprintf(Monitor *mon, const char *fmt, va_list ap)
@@ -5275,6 +5288,7 @@ void monitor_init(CharDriverState *chr, int flags)
 
     mon = g_malloc0(sizeof(*mon));
 
+    qemu_mutex_init(&mon->mutex);
     mon->chr = chr;
     mon->flags = flags;
     if (flags & MONITOR_USE_READLINE) {
-- 
1.7.7.rc0.72.g4b5ea

* Re: [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access
  2011-09-01 19:35 [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access Luiz Capitulino
@ 2011-09-01 19:47 ` Daniel P. Berrange
  2011-09-01 21:03 ` Jan Kiszka
  2011-09-02  1:34 ` Anthony Liguori
  2 siblings, 0 replies; 15+ messages in thread
From: Daniel P. Berrange @ 2011-09-01 19:47 UTC (permalink / raw)
  To: Luiz Capitulino; +Cc: Marian Krcmarik, Alon Levy, qemu-devel

On Thu, Sep 01, 2011 at 04:35:45PM -0300, Luiz Capitulino wrote:
> Sometimes, when having lots of VMs running on a RHEV host and the user
> attempts to close a SPICE window, libvirt will get corrupted json from
> QEMU.
> 
> After some investigation, I found out that the problem is that different
> SPICE threads are calling monitor functions (such as
> monitor_protocol_event()) in parallel which causes concurrent access
> to the monitor's internal buffer outbuf[].
> 
> This fixes the problem by protecting accesses to outbuf[] with a mutex.
> 
> Honestly speaking, I'm not completely sure this the best thing to do
> because the monitor itself and other qemu subsystems are not thread safe,
> so having subsystems like SPICE assuming the contrary seems a bit
> catastrophic to me...
> 
> Anyways, this commit fixes the problem at hand.

IMHO this patch should be applied to stable-0.15 as is, since it is
an important fix for SPICE, and this highly targeted mutex lock has
a low risk of regressions elsewhere.

I'd also apply it for master now, but at the same time perhaps start
work on adding broader locking that covers all APIs that monitor.c
exposes to internal QEMU code, so we're future-proofed against other
surprises.

> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>

  Signed-off-by: Daniel P. Berrange <berrange@redhat.com>


Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

* Re: [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access
  2011-09-01 19:35 [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access Luiz Capitulino
  2011-09-01 19:47 ` Daniel P. Berrange
@ 2011-09-01 21:03 ` Jan Kiszka
  2011-09-02  1:34 ` Anthony Liguori
  2 siblings, 0 replies; 15+ messages in thread
From: Jan Kiszka @ 2011-09-01 21:03 UTC (permalink / raw)
  To: Luiz Capitulino; +Cc: Marian Krcmarik, Alon Levy, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 1564 bytes --]

On 2011-09-01 21:35, Luiz Capitulino wrote:
> Sometimes, when having lots of VMs running on a RHEV host and the user
> attempts to close a SPICE window, libvirt will get corrupted json from
> QEMU.
> 
> After some investigation, I found out that the problem is that different
> SPICE threads are calling monitor functions (such as
> monitor_protocol_event()) in parallel which causes concurrent access
> to the monitor's internal buffer outbuf[].
> 
> This fixes the problem by protecting accesses to outbuf[] with a mutex.
> 
> Honestly speaking, I'm not completely sure this the best thing to do
> because the monitor itself and other qemu subsystems are not thread safe,
> so having subsystems like SPICE assuming the contrary seems a bit
> catastrophic to me...

I fully agree.

...

> @@ -246,10 +248,14 @@ static int monitor_read_password(Monitor *mon, ReadLineFunc *readline_func,
>  
>  void monitor_flush(Monitor *mon)
>  {
> +    qemu_mutex_lock(&mon->mutex);
> +
>      if (mon && mon->outbuf_index != 0 && !mon->mux_out) {
>          qemu_chr_fe_write(mon->chr, mon->outbuf, mon->outbuf_index);
>          mon->outbuf_index = 0;
>      }
> +
> +    qemu_mutex_unlock(&mon->mutex);

Here is another example of things that can break due to "optimistic"
parallelization: What protects the chardev state that will be touched by
calling qemu_chr_fe_write? Even when ignoring mux'ed channels for now, I
bet there are code paths that modify the state without holding the
frontend lock (i.e. Monitor::mutex).

Jan



* Re: [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access
  2011-09-01 19:35 [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access Luiz Capitulino
  2011-09-01 19:47 ` Daniel P. Berrange
  2011-09-01 21:03 ` Jan Kiszka
@ 2011-09-02  1:34 ` Anthony Liguori
  2011-09-02  9:41   ` Daniel P. Berrange
  2011-09-02 13:39   ` Gerd Hoffmann
  2 siblings, 2 replies; 15+ messages in thread
From: Anthony Liguori @ 2011-09-02  1:34 UTC (permalink / raw)
  To: Luiz Capitulino; +Cc: Marian Krcmarik, Alon Levy, qemu-devel

On 09/01/2011 02:35 PM, Luiz Capitulino wrote:
> Sometimes, when having lots of VMs running on a RHEV host and the user
> attempts to close a SPICE window, libvirt will get corrupted json from
> QEMU.
>
> After some investigation, I found out that the problem is that different
> SPICE threads are calling monitor functions (such as
> monitor_protocol_event()) in parallel which causes concurrent access
> to the monitor's internal buffer outbuf[].
>
> This fixes the problem by protecting accesses to outbuf[] with a mutex.
>
> Honestly speaking, I'm not completely sure this the best thing to do
> because the monitor itself and other qemu subsystems are not thread safe,
> so having subsystems like SPICE assuming the contrary seems a bit
> catastrophic to me...
>
> Anyways, this commit fixes the problem at hand.

Nack.

This is absolutely a Spice bug.  Spice should not be calling into QEMU 
code from multiple threads.  It should only call into QEMU code while 
it's holding the qemu_mutex.

The right way to fix this is probably to make all of the 
SpiceCoreInterface callbacks simply write to a file descriptor which can 
then wake up QEMU to do the operation on behalf of it.   It's ugly but 
the libspice interface is far too tied to QEMU internals in the first 
place which is the root of the problem.
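
A minimal sketch of the file-descriptor approach described above, in
plain POSIX C rather than QEMU code.  All names here (chan_event,
event_pipe_init, post_event, drain_one) are hypothetical and exist only
for illustration.  The callback side only writes a small record to a
pipe; the main loop, which would hold the global mutex, reads the
record and does the real work:

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical event record.  It is far smaller than PIPE_BUF (POSIX
 * guarantees at least 512 bytes), so each write() to the pipe is atomic
 * and records from concurrent writers cannot interleave. */
struct chan_event {
    int event;
    char peer[32];
};

static int wake_fd[2];  /* [0]: read end (main loop), [1]: write end */
static int handled;     /* touched by the main-loop side only */

/* Create the pipe; returns 0 on success, -1 on failure. */
static int event_pipe_init(void)
{
    return pipe(wake_fd);
}

/* Callback side: may run in any thread.  It touches no shared state
 * except the pipe itself. */
static int post_event(int event, const char *peer)
{
    struct chan_event ev;

    memset(&ev, 0, sizeof(ev));
    ev.event = event;
    strncpy(ev.peer, peer, sizeof(ev.peer) - 1);
    return write(wake_fd[1], &ev, sizeof(ev)) == (ssize_t)sizeof(ev) ? 0 : -1;
}

/* Main-loop side: called when the read end becomes readable.  This is
 * where the deferred operation would be carried out. */
static int drain_one(struct chan_event *out)
{
    if (read(wake_fd[0], out, sizeof(*out)) != (ssize_t)sizeof(*out)) {
        return -1;
    }
    handled++;
    return 0;
}
```

In a real main loop the read end would be registered with whatever
select()/poll() mechanism is in use, so that drain_one() only runs when
data is pending.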

Regards,

Anthony Liguori

>
> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
> ---
>   monitor.c |   16 +++++++++++++++-
>   1 files changed, 15 insertions(+), 1 deletions(-)
>
> diff --git a/monitor.c b/monitor.c
> index 04f465a..61d4d93 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -57,6 +57,7 @@
>   #include "json-parser.h"
>   #include "osdep.h"
>   #include "cpu.h"
> +#include "qemu-thread.h"
>   #ifdef CONFIG_SIMPLE_TRACE
>   #include "trace.h"
>   #endif
> @@ -144,6 +145,7 @@ struct Monitor {
>       int suspend_cnt;
>       uint8_t outbuf[1024];
>       int outbuf_index;
> +    QemuMutex mutex;
>       ReadLineState *rs;
>       MonitorControl *mc;
>       CPUState *mon_cpu;
> @@ -246,10 +248,14 @@ static int monitor_read_password(Monitor *mon, ReadLineFunc *readline_func,
>
>   void monitor_flush(Monitor *mon)
>   {
> +    qemu_mutex_lock(&mon->mutex);
> +
>       if (mon && mon->outbuf_index != 0 && !mon->mux_out) {
>           qemu_chr_fe_write(mon->chr, mon->outbuf, mon->outbuf_index);
>           mon->outbuf_index = 0;
>       }
> +
> +    qemu_mutex_unlock(&mon->mutex);
>   }
>
>   /* flush at every end of line or if the buffer is full */
> @@ -257,6 +263,8 @@ static void monitor_puts(Monitor *mon, const char *str)
>   {
>       char c;
>
> +    qemu_mutex_lock(&mon->mutex);
> +
>       for(;;) {
>           c = *str++;
>           if (c == '\0')
> @@ -265,9 +273,14 @@ static void monitor_puts(Monitor *mon, const char *str)
>               mon->outbuf[mon->outbuf_index++] = '\r';
>           mon->outbuf[mon->outbuf_index++] = c;
>           if (mon->outbuf_index >= (sizeof(mon->outbuf) - 1)
> -            || c == '\n')
> +            || c == '\n') {
> +            qemu_mutex_unlock(&mon->mutex);
>               monitor_flush(mon);
> +            qemu_mutex_lock(&mon->mutex);
> +        }
>       }
> +
> +    qemu_mutex_unlock(&mon->mutex);
>   }
>
>   void monitor_vprintf(Monitor *mon, const char *fmt, va_list ap)
> @@ -5275,6 +5288,7 @@ void monitor_init(CharDriverState *chr, int flags)
>
>       mon = g_malloc0(sizeof(*mon));
>
> +    qemu_mutex_init(&mon->mutex);
>       mon->chr = chr;
>       mon->flags = flags;
>       if (flags & MONITOR_USE_READLINE) {

* Re: [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access
  2011-09-02  1:34 ` Anthony Liguori
@ 2011-09-02  9:41   ` Daniel P. Berrange
  2011-09-02 11:26     ` Jan Kiszka
  2011-09-02 13:39   ` Gerd Hoffmann
  1 sibling, 1 reply; 15+ messages in thread
From: Daniel P. Berrange @ 2011-09-02  9:41 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Marian Krcmarik, Alon Levy, qemu-devel, Luiz Capitulino

On Thu, Sep 01, 2011 at 08:34:35PM -0500, Anthony Liguori wrote:
> On 09/01/2011 02:35 PM, Luiz Capitulino wrote:
> >Sometimes, when having lots of VMs running on a RHEV host and the user
> >attempts to close a SPICE window, libvirt will get corrupted json from
> >QEMU.
> >
> >After some investigation, I found out that the problem is that different
> >SPICE threads are calling monitor functions (such as
> >monitor_protocol_event()) in parallel which causes concurrent access
> >to the monitor's internal buffer outbuf[].
> >
> >This fixes the problem by protecting accesses to outbuf[] with a mutex.
> >
> >Honestly speaking, I'm not completely sure this the best thing to do
> >because the monitor itself and other qemu subsystems are not thread safe,
> >so having subsystems like SPICE assuming the contrary seems a bit
> >catastrophic to me...
> >
> >Anyways, this commit fixes the problem at hand.
> 
> Nack.
> 
> This is absolutely a Spice bug.  Spice should not be calling into
> QEMU code from multiple threads.  It should only call into QEMU code
> while it's holding the qemu_mutex.
> 
> The right way to fix this is probably to make all of the
> SpiceCoreInterface callbacks simply write to a file descriptor which
> can then wake up QEMU to do the operation on behalf of it.   It's
> ugly but the libspice interface is far too tied to QEMU internals in
> the first place which is the root of the problem.

This feels like a rather short-term approach to fixing the problem
to me. As QEMU becomes increasingly multi-threaded, there is a high
likelihood that we'll get other code in QEMU which wants to use the
monitor from multiple threads. The monitor code in QEMU is fairly
well isolated & thus comparatively easy to make threadsafe, so I
don't see why we wouldn't want to do that & avoid any chance of this
type of problem recurring in the future.

IMHO, "fixing" SPICE is not fixing the bug at all, it is just removing
the trigger of the bug in the monitor.

Regards,
Daniel

* Re: [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access
  2011-09-02  9:41   ` Daniel P. Berrange
@ 2011-09-02 11:26     ` Jan Kiszka
  0 siblings, 0 replies; 15+ messages in thread
From: Jan Kiszka @ 2011-09-02 11:26 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: Marian Krcmarik, Alon Levy, qemu-devel, Luiz Capitulino

On 2011-09-02 11:41, Daniel P. Berrange wrote:
> On Thu, Sep 01, 2011 at 08:34:35PM -0500, Anthony Liguori wrote:
>> On 09/01/2011 02:35 PM, Luiz Capitulino wrote:
>>> Sometimes, when having lots of VMs running on a RHEV host and the user
>>> attempts to close a SPICE window, libvirt will get corrupted json from
>>> QEMU.
>>>
>>> After some investigation, I found out that the problem is that different
>>> SPICE threads are calling monitor functions (such as
>>> monitor_protocol_event()) in parallel which causes concurrent access
>>> to the monitor's internal buffer outbuf[].
>>>
>>> This fixes the problem by protecting accesses to outbuf[] with a mutex.
>>>
>>> Honestly speaking, I'm not completely sure this the best thing to do
>>> because the monitor itself and other qemu subsystems are not thread safe,
>>> so having subsystems like SPICE assuming the contrary seems a bit
>>> catastrophic to me...
>>>
>>> Anyways, this commit fixes the problem at hand.
>>
>> Nack.
>>
>> This is absolutely a Spice bug.  Spice should not be calling into
>> QEMU code from multiple threads.  It should only call into QEMU code
>> while it's holding the qemu_mutex.
>>
>> The right way to fix this is probably to make all of the
>> SpiceCoreInterface callbacks simply write to a file descriptor which
>> can then wake up QEMU to do the operation on behalf of it.   It's
>> ugly but the libspice interface is far too tied to QEMU internals in
>> the first place which is the root of the problem.
> 
> This feels like a rather short-term approach to fixing the problem
> to me. As QEMU becomes increasingly multi-threaded, there is high
> liklihood that we'll get other code in QEMU which wants to use the
> monitor from multiple threads. The monitor code in QEMU is fairly
> well isolated & thus comparatively easy to make threadsafe, so I

As pointed out before, this assumption is not correct.

> don't see why we wouldn't want todo that & avoid any chance of this
> type of problem recurring in the future.
> 
> IMHO, "fixing" SPICE is not fixing the bug at all, it is just removing
> the trigger of the bug in the monitor.

Until we have officially thread-safe subsystems, SPICE must take the
qemu_global_mutex before calling core services. This patch does not make
the monitor thread-safe as it does not address indirectly called services.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

* Re: [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access
  2011-09-02  1:34 ` Anthony Liguori
  2011-09-02  9:41   ` Daniel P. Berrange
@ 2011-09-02 13:39   ` Gerd Hoffmann
  2011-09-02 14:03     ` Anthony Liguori
                       ` (2 more replies)
  1 sibling, 3 replies; 15+ messages in thread
From: Gerd Hoffmann @ 2011-09-02 13:39 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Marian Krcmarik, Alon Levy, qemu-devel, spice-devel, Luiz Capitulino

[-- Attachment #1: Type: text/plain, Size: 1804 bytes --]

   Hi,

>> After some investigation, I found out that the problem is that different
>> SPICE threads are calling monitor functions (such as
>> monitor_protocol_event()) in parallel which causes concurrent access
>> to the monitor's internal buffer outbuf[].

[ adding spice-list to Cc, see qemu-devel for the rest of the thread ]

spice isn't supposed to do that.

/me just added an assert in channel_event() and saw it trigger on display
channel disconnects.

#0  0x0000003ceba32a45 in raise () from /lib64/libc.so.6
#1  0x0000003ceba34225 in abort () from /lib64/libc.so.6
#2  0x0000003ceba2b9d5 in __assert_fail () from /lib64/libc.so.6
#3  0x0000000000503759 in channel_event (event=3, info=0x35e9340)
     at /home/kraxel/projects/qemu/ui/spice-core.c:223
#4  0x00007f9a77a9921b in reds_channel_event (s=0x35e92c0) at reds.c:400
#5  reds_stream_free (s=0x35e92c0) at reds.c:4981
#6  0x00007f9a77aac8b0 in red_disconnect_channel (channel=0x7f9a24069a80) at red_worker.c:8489
#7  0x00007f9a77ab53a8 in handle_dev_input (listener=0x7f9a3211ab20, events=<value optimized out>)
     at red_worker.c:10062
#8  0x00007f9a77ab436d in red_worker_main (arg=<value optimized out>) at red_worker.c:10304
#9  0x0000003cec2077e1 in start_thread () from /lib64/libpthread.so.0
#10 0x0000003cebae68ed in clone () from /lib64/libc.so.6

IMHO spice server should handle the display channel tear-down in the 
dispatcher instead of the worker thread.  Alon?

>> Anyways, this commit fixes the problem at hand.

Not really.  channel_event() itself isn't thread-safe either; it does
unlocked list operations which can also blow up when called from
different threads.

A patch like the attached (warning: untested) should do as a quick & dirty
fix for stable.  But IMO we really should fix spice instead.

cheers,
   Gerd


[-- Attachment #2: 0001-spice-workaround-a-spice-server-bug.patch --]
[-- Type: text/plain, Size: 2258 bytes --]

From 7496e573ff6085d3c42d7e65b72c85fd2a7b4a78 Mon Sep 17 00:00:00 2001
From: Gerd Hoffmann <kraxel@redhat.com>
Date: Fri, 2 Sep 2011 15:03:28 +0200
Subject: [PATCH] spice: workaround a spice server bug.

---
 ui/spice-core.c |   21 ++++++++++++++++++++-
 1 files changed, 20 insertions(+), 1 deletions(-)

diff --git a/ui/spice-core.c b/ui/spice-core.c
index dba11f0..c99cdc5 100644
--- a/ui/spice-core.c
+++ b/ui/spice-core.c
@@ -19,6 +19,7 @@
 #include <spice-experimental.h>
 
 #include <netdb.h>
+#include <pthread.h>
 
 #include "qemu-common.h"
 #include "qemu-spice.h"
@@ -44,6 +45,8 @@ static char *auth_passwd;
 static time_t auth_expires = TIME_MAX;
 int using_spice = 0;
 
+static pthread_t me;
+
 struct SpiceTimer {
     QEMUTimer *timer;
     QTAILQ_ENTRY(SpiceTimer) next;
@@ -216,6 +219,8 @@ static void channel_event(int event, SpiceChannelEventInfo *info)
     };
     QDict *server, *client;
     QObject *data;
+    bool need_lock = !pthread_equal(me, pthread_self());
+    static int first = 1;
 
     client = qdict_new();
     add_addr_info(client, &info->paddr, info->plen);
@@ -223,6 +228,14 @@ static void channel_event(int event, SpiceChannelEventInfo *info)
     server = qdict_new();
     add_addr_info(server, &info->laddr, info->llen);
 
+    if (need_lock) {
+        qemu_mutex_lock_iothread();
+        if (first) {
+            fprintf(stderr, "You are using a broken spice-server version\n");
+            first = 0;
+        }
+    }
+
     if (event == SPICE_CHANNEL_EVENT_INITIALIZED) {
         qdict_put(server, "auth", qstring_from_str(auth));
         add_channel_info(client, info);
@@ -236,6 +249,10 @@ static void channel_event(int event, SpiceChannelEventInfo *info)
                               QOBJECT(client), QOBJECT(server));
     monitor_protocol_event(qevent[event], data);
     qobject_decref(data);
+
+    if (need_lock) {
+        qemu_mutex_unlock_iothread();
+    }
 }
 
 #else /* SPICE_INTERFACE_CORE_MINOR >= 3 */
@@ -482,7 +499,9 @@ void qemu_spice_init(void)
     spice_image_compression_t compression;
     spice_wan_compression_t wan_compr;
 
-    if (!opts) {
+    me = pthread_self();
+
+   if (!opts) {
         return;
     }
     port = qemu_opt_get_number(opts, "port", 0);
-- 
1.7.1


* Re: [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access
  2011-09-02 13:39   ` Gerd Hoffmann
@ 2011-09-02 14:03     ` Anthony Liguori
  2011-09-02 14:24     ` Luiz Capitulino
  2011-09-02 14:28     ` Anthony Liguori
  2 siblings, 0 replies; 15+ messages in thread
From: Anthony Liguori @ 2011-09-02 14:03 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Marian Krcmarik, Alon Levy, qemu-devel, spice-devel, Luiz Capitulino

On 09/02/2011 08:39 AM, Gerd Hoffmann wrote:
> Hi,
>
>>> After some investigation, I found out that the problem is that different
>>> SPICE threads are calling monitor functions (such as
>>> monitor_protocol_event()) in parallel which causes concurrent access
>>> to the monitor's internal buffer outbuf[].
>
> [ adding spice-list to Cc, see qemu-devel for the rest of the thread ]
>
> spice isn't supposed to do that.
>
> /me just added a assert in channel_event() and saw it trigger in display
> channel disconnects.
>
> #0 0x0000003ceba32a45 in raise () from /lib64/libc.so.6
> #1 0x0000003ceba34225 in abort () from /lib64/libc.so.6
> #2 0x0000003ceba2b9d5 in __assert_fail () from /lib64/libc.so.6
> #3 0x0000000000503759 in channel_event (event=3, info=0x35e9340)
> at /home/kraxel/projects/qemu/ui/spice-core.c:223
> #4 0x00007f9a77a9921b in reds_channel_event (s=0x35e92c0) at reds.c:400
> #5 reds_stream_free (s=0x35e92c0) at reds.c:4981
> #6 0x00007f9a77aac8b0 in red_disconnect_channel (channel=0x7f9a24069a80)
> at red_worker.c:8489
> #7 0x00007f9a77ab53a8 in handle_dev_input (listener=0x7f9a3211ab20,
> events=<value optimized out>)
> at red_worker.c:10062
> #8 0x00007f9a77ab436d in red_worker_main (arg=<value optimized out>) at
> red_worker.c:10304
> #9 0x0000003cec2077e1 in start_thread () from /lib64/libpthread.so.0
> #10 0x0000003cebae68ed in clone () from /lib64/libc.so.6
>
> IMHO spice server should handle the display channel tear-down in the
> dispatcher instead of the worker thread. Alon?
>
>>> Anyways, this commit fixes the problem at hand.
>
> Not really. channel_event() itself isn't thread-safe too, it does
> unlocked list operations which can also blow up when called from
> different threads.
>
> A patch like the attached (warning: untested) should do as quick&dirty
> fix for stable. But IMO we really should fix spice instead.

Spice should not be calling *any* QEMU code without holding the global 
mutex.  That includes all of the QObject interactions.

Regards,

Anthony Liguori

>
> cheers,
> Gerd
>

* Re: [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access
  2011-09-02 13:39   ` Gerd Hoffmann
  2011-09-02 14:03     ` Anthony Liguori
@ 2011-09-02 14:24     ` Luiz Capitulino
  2011-09-02 14:28     ` Anthony Liguori
  2 siblings, 0 replies; 15+ messages in thread
From: Luiz Capitulino @ 2011-09-02 14:24 UTC (permalink / raw)
  To: Gerd Hoffmann; +Cc: spice-devel, Marian Krcmarik, Alon Levy, qemu-devel

On Fri, 02 Sep 2011 15:39:03 +0200
Gerd Hoffmann <kraxel@redhat.com> wrote:

>    Hi,
> 
> >> After some investigation, I found out that the problem is that different
> >> SPICE threads are calling monitor functions (such as
> >> monitor_protocol_event()) in parallel which causes concurrent access
> >> to the monitor's internal buffer outbuf[].
> 
> [ adding spice-list to Cc, see qemu-devel for the rest of the thread ]
> 
> spice isn't supposed to do that.
> 
> /me just added a assert in channel_event() and saw it trigger in display 
> channel disconnects.
> 
> #0  0x0000003ceba32a45 in raise () from /lib64/libc.so.6
> #1  0x0000003ceba34225 in abort () from /lib64/libc.so.6
> #2  0x0000003ceba2b9d5 in __assert_fail () from /lib64/libc.so.6
> #3  0x0000000000503759 in channel_event (event=3, info=0x35e9340)
>      at /home/kraxel/projects/qemu/ui/spice-core.c:223
> #4  0x00007f9a77a9921b in reds_channel_event (s=0x35e92c0) at reds.c:400
> #5  reds_stream_free (s=0x35e92c0) at reds.c:4981
> #6  0x00007f9a77aac8b0 in red_disconnect_channel 
> (channel=0x7f9a24069a80) at red_worker.c:8489
> #7  0x00007f9a77ab53a8 in handle_dev_input (listener=0x7f9a3211ab20, 
> events=<value optimized out>)
>      at red_worker.c:10062
> #8  0x00007f9a77ab436d in red_worker_main (arg=<value optimized out>) at 
> red_worker.c:10304
> #9  0x0000003cec2077e1 in start_thread () from /lib64/libpthread.so.0
> #10 0x0000003cebae68ed in clone () from /lib64/libc.so.6
> 
> IMHO spice server should handle the display channel tear-down in the 
> dispatcher instead of the worker thread.  Alon?
> 
> >> Anyways, this commit fixes the problem at hand.
> 
> Not really.  channel_event() itself isn't thread-safe too, it does 
> unlocked list operations which can also blow up when called from 
> different threads.

I thought my patch was at least a candidate for stable, but after this
thread I'm convinced the problem should be fixed in spice instead.

> 
> A patch like the attached (warning: untested) should do as quick&dirty 
> fix for stable.  But IMO we really should fix spice instead.
> 
> cheers,
>    Gerd
> 

* Re: [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access
  2011-09-02 13:39   ` Gerd Hoffmann
  2011-09-02 14:03     ` Anthony Liguori
  2011-09-02 14:24     ` Luiz Capitulino
@ 2011-09-02 14:28     ` Anthony Liguori
  2011-09-02 15:18       ` Gerd Hoffmann
  2 siblings, 1 reply; 15+ messages in thread
From: Anthony Liguori @ 2011-09-02 14:28 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Marian Krcmarik, Alon Levy, qemu-devel, spice-devel, Luiz Capitulino

On 09/02/2011 08:39 AM, Gerd Hoffmann wrote:
> Hi,
>
>>> After some investigation, I found out that the problem is that different
>>> SPICE threads are calling monitor functions (such as
>>> monitor_protocol_event()) in parallel which causes concurrent access
>>> to the monitor's internal buffer outbuf[].
>
> [ adding spice-list to Cc, see qemu-devel for the rest of the thread ]
>
> spice isn't supposed to do that.
>
> /me just added a assert in channel_event() and saw it trigger in display
> channel disconnects.
>
> #0 0x0000003ceba32a45 in raise () from /lib64/libc.so.6
> #1 0x0000003ceba34225 in abort () from /lib64/libc.so.6
> #2 0x0000003ceba2b9d5 in __assert_fail () from /lib64/libc.so.6
> #3 0x0000000000503759 in channel_event (event=3, info=0x35e9340)
> at /home/kraxel/projects/qemu/ui/spice-core.c:223
> #4 0x00007f9a77a9921b in reds_channel_event (s=0x35e92c0) at reds.c:400
> #5 reds_stream_free (s=0x35e92c0) at reds.c:4981
> #6 0x00007f9a77aac8b0 in red_disconnect_channel (channel=0x7f9a24069a80)
> at red_worker.c:8489
> #7 0x00007f9a77ab53a8 in handle_dev_input (listener=0x7f9a3211ab20,
> events=<value optimized out>)
> at red_worker.c:10062
> #8 0x00007f9a77ab436d in red_worker_main (arg=<value optimized out>) at
> red_worker.c:10304
> #9 0x0000003cec2077e1 in start_thread () from /lib64/libpthread.so.0
> #10 0x0000003cebae68ed in clone () from /lib64/libc.so.6
>
> IMHO spice server should handle the display channel tear-down in the
> dispatcher instead of the worker thread. Alon?
>
>>> Anyways, this commit fixes the problem at hand.
>
> Not really. channel_event() itself isn't thread-safe too, it does
> unlocked list operations which can also blow up when called from
> different threads.
>
> A patch like the attached (warning: untested) should do as quick&dirty
> fix for stable. But IMO we really should fix spice instead.

I agree.  I'm not sure I like the idea of still calling QEMU code 
without holding the mutex (even the QObject code).

Can you just use a bottom half to defer this work to the I/O thread? 
Bottom half scheduling has to be signal safe which means it will also be 
thread safe.
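
A rough sketch of why such a scheduling primitive is safe to call from
another thread, written as standalone C11 and not taken from QEMU.  The
names (struct bh, bh_init, bh_schedule, bh_poll) are hypothetical, and
the sketch assumes atomic_int is lock-free on the target.  Scheduling
only sets an atomic flag; the callback itself always runs in the thread
that polls:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef void bh_func(void *opaque);

/* Hypothetical bottom-half object: a callback plus a "pending" flag. */
struct bh {
    bh_func *cb;
    void *opaque;
    atomic_int scheduled;
};

static void bh_init(struct bh *bh, bh_func *cb, void *opaque)
{
    bh->cb = cb;
    bh->opaque = opaque;
    atomic_init(&bh->scheduled, 0);
}

/* May be called from any thread: it does nothing but set the flag. */
static void bh_schedule(struct bh *bh)
{
    atomic_store(&bh->scheduled, 1);
}

/* Called from the main loop only.  Runs the callback at most once per
 * poll, no matter how many times it was scheduled.  Returns 1 if the
 * callback ran, 0 otherwise. */
static int bh_poll(struct bh *bh)
{
    if (atomic_exchange(&bh->scheduled, 0)) {
        bh->cb(bh->opaque);
        return 1;
    }
    return 0;
}

/* Example callback: counts how often it was run. */
static void count_cb(void *opaque)
{
    ++*(int *)opaque;
}
```

A real main loop would additionally need some way to wake up when a
bottom half is scheduled while it is blocked; that part is left out
here.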

Regards,

Anthony Liguori

>
> cheers,
> Gerd
>

* Re: [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access
  2011-09-02 14:28     ` Anthony Liguori
@ 2011-09-02 15:18       ` Gerd Hoffmann
  2011-09-02 15:20         ` Anthony Liguori
  2011-09-02 15:31         ` Paolo Bonzini
  0 siblings, 2 replies; 15+ messages in thread
From: Gerd Hoffmann @ 2011-09-02 15:18 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Marian Krcmarik, Alon Levy, qemu-devel, spice-devel, Luiz Capitulino

   Hi,

>> A patch like the attached (warning: untested) should do as quick&dirty
>> fix for stable. But IMO we really should fix spice instead.
>
> I agree. I'm not sure I like the idea of still calling QEMU code without
> holding the mutex (even the QObject code).

I thought just creating the objects wasn't an issue, but if you disagree
we can just move up the lock to the head of the function.

> Can you just use a bottom half to defer this work to the I/O thread?
> Bottom half scheduling has to be signal safe which means it will also be
> thread safe.

It's not that straightforward, as I would have to pass arguments to the
bottom half.

cheers,
   Gerd

* Re: [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access
  2011-09-02 15:18       ` Gerd Hoffmann
@ 2011-09-02 15:20         ` Anthony Liguori
  2011-09-02 15:31         ` Paolo Bonzini
  1 sibling, 0 replies; 15+ messages in thread
From: Anthony Liguori @ 2011-09-02 15:20 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Marian Krcmarik, Alon Levy, qemu-devel, spice-devel, Luiz Capitulino

On 09/02/2011 10:18 AM, Gerd Hoffmann wrote:
> Hi,
>
>>> A patch like the attached (warning: untested) should do as quick&dirty
>>> fix for stable. But IMO we really should fix spice instead.
>>
>> I agree. I'm not sure I like the idea of still calling QEMU code without
>> holding the mutex (even the QObject code).
>
> I though just creating the objects isn't an issue, but if you disagree
> we can just move up the lock to the head of the function.

What I fear is that Spice will assume something is thread safe, but then 
someone will make a change that makes the subsystem non-reentrant.

I'd rather that we have very clear rules about what's thread safe and 
not thread safe.  If you want to audit the QObject subsystem, declare it 
thread safe, and document it as such, that would be okay.  But it needs 
to be systematic, not ad-hoc.

Regards,

Anthony Liguori

>
>> Can you just use a bottom half to defer this work to the I/O thread?
>> Bottom half scheduling has to be signal safe which means it will also be
>> thread safe.
>
> It's not that straightforward, as I would have to pass arguments to the
> bottom half.
>
> cheers,
> Gerd
>
>


* Re: [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access
  2011-09-02 15:18       ` Gerd Hoffmann
  2011-09-02 15:20         ` Anthony Liguori
@ 2011-09-02 15:31         ` Paolo Bonzini
  2011-09-02 15:37           ` Anthony Liguori
  2011-09-05  7:48           ` Gerd Hoffmann
  1 sibling, 2 replies; 15+ messages in thread
From: Paolo Bonzini @ 2011-09-02 15:31 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: qemu-devel, Luiz Capitulino, Marian Krcmarik, Alon Levy, spice-devel

On 09/02/2011 05:18 PM, Gerd Hoffmann wrote:
>
>> Can you just use a bottom half to defer this work to the I/O thread?
>> Bottom half scheduling has to be signal safe which means it will also be
>> thread safe.
>
> It's not that straightforward, as I would have to pass arguments to the
> bottom half.

Can you add a variant of qemu_bh_new that accepts a sizeof for the new 
bottom half?  Then the bottom half itself can be passed as the opaque 
and used for the arguments.

Paolo
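
Editor's sketch of the idea above, under loud assumptions: this is *not*
the QEMU bottom-half API. Deferred, deferred_new_sized(), EventCall and
event_call_run() are all hypothetical names. It only illustrates the
"allocate the deferred object with a caller-supplied size, then use the
object itself as the argument carrier" pattern that the suggestion
describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical deferred-call header; a stand-in, not a real QEMU type. */
typedef struct Deferred Deferred;
typedef void DeferredFunc(Deferred *d);

struct Deferred {
    DeferredFunc *cb;
};

/*
 * Allocate a zeroed object of `size` bytes whose first member is a
 * Deferred. The caller passes sizeof() of its own, larger struct.
 * Returns NULL on allocation failure.
 */
Deferred *deferred_new_sized(size_t size, DeferredFunc *cb)
{
    Deferred *d;

    assert(size >= sizeof(Deferred));
    d = calloc(1, size);
    if (d != NULL) {
        d->cb = cb;
    }
    return d;
}

/* Caller-defined struct: header first, arguments after it. */
typedef struct EventCall {
    Deferred base;   /* must be the first member */
    int event_id;    /* argument carried to the callback */
    int handled;     /* result written by the callback */
} EventCall;

/* The callback recovers its arguments from the object it was given. */
void event_call_run(Deferred *d)
{
    EventCall *call = (EventCall *)d;

    call->handled = call->event_id;
}
```

A caller would allocate with deferred_new_sized(sizeof(EventCall),
event_call_run), fill in event_id, and hand the object to whatever runs
the callback later. The sketch says nothing about locking or about who
frees the object, which a real implementation would have to settle.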


* Re: [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access
  2011-09-02 15:31         ` Paolo Bonzini
@ 2011-09-02 15:37           ` Anthony Liguori
  2011-09-05  7:48           ` Gerd Hoffmann
  1 sibling, 0 replies; 15+ messages in thread
From: Anthony Liguori @ 2011-09-02 15:37 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: qemu-devel, Luiz Capitulino, Marian Krcmarik, Alon Levy,
	Gerd Hoffmann, spice-devel

On 09/02/2011 10:31 AM, Paolo Bonzini wrote:
> On 09/02/2011 05:18 PM, Gerd Hoffmann wrote:
>>
>>> Can you just use a bottom half to defer this work to the I/O thread?
>>> Bottom half scheduling has to be signal safe which means it will also be
>>> thread safe.
>>
>> It's not that straightforward, as I would have to pass arguments to the
>> bottom half.
>
> Can you add a variant of qemu_bh_new that accepts a sizeof for the new
> bottom half? Then the bottom half itself can be passed as the opaque and
> used for the arguments.

Bottom halves are opaque to the caller.

Passing arguments would require careful consideration of locking too.  I 
think the best way to resolve this is to fix libspice and not try to 
work around the problem in QEMU.

Regards,

Anthony Liguori

>
> Paolo


* Re: [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access
  2011-09-02 15:31         ` Paolo Bonzini
  2011-09-02 15:37           ` Anthony Liguori
@ 2011-09-05  7:48           ` Gerd Hoffmann
  1 sibling, 0 replies; 15+ messages in thread
From: Gerd Hoffmann @ 2011-09-05  7:48 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: qemu-devel, Luiz Capitulino, Marian Krcmarik, Alon Levy, spice-devel

On 09/02/11 17:31, Paolo Bonzini wrote:
> On 09/02/2011 05:18 PM, Gerd Hoffmann wrote:
>>
>>> Can you just use a bottom half to defer this work to the I/O thread?
>>> Bottom half scheduling has to be signal safe which means it will also be
>>> thread safe.
>>
>> It's not that straightforward, as I would have to pass arguments to the
>> bottom half.
>
> Can you add a variant of qemu_bh_new that accepts a sizeof for the new
> bottom half? Then the bottom half itself can be passed as the opaque and
> used for the arguments.

That wouldn't help.  I would have to create some kind of job queue which 
is then processed by the bottom half.

cheers,
   Gerd
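
Editor's sketch of the kind of job queue described above, with the usual
caveats: it is not code from QEMU or spice, every name in it (Job,
JobQueue, job_queue_init(), job_queue_push(), job_queue_take_all()) is
hypothetical, and pthread mutexes stand in for QEMU's own primitives.
Producer threads append jobs under a lock; the consumer, which would be
the bottom half running in the I/O thread, detaches the whole list in one
step and processes it without holding the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* One queued job; the payload field is illustrative only. */
typedef struct Job {
    struct Job *next;
    char payload[64];
} Job;

typedef struct JobQueue {
    pthread_mutex_t lock;
    Job *head;
    Job **tail;   /* points at the `next` slot where the next job goes */
} JobQueue;

void job_queue_init(JobQueue *q)
{
    assert(q != NULL);
    pthread_mutex_init(&q->lock, NULL);
    q->head = NULL;
    q->tail = &q->head;
}

/*
 * Producer side, callable from any thread. Copies the payload so the
 * caller's buffer may go away. Returns 0 on success, -1 on allocation
 * failure. Real code would schedule the bottom half after the push.
 */
int job_queue_push(JobQueue *q, const char *payload)
{
    Job *job;

    assert(q != NULL && payload != NULL);
    job = calloc(1, sizeof(*job));
    if (job == NULL) {
        return -1;
    }
    /* calloc zeroed the buffer, so the copy stays NUL-terminated. */
    strncpy(job->payload, payload, sizeof(job->payload) - 1);

    pthread_mutex_lock(&q->lock);
    *q->tail = job;
    q->tail = &job->next;
    pthread_mutex_unlock(&q->lock);
    return 0;
}

/*
 * Consumer side. Detaches and returns the whole list in FIFO order, or
 * NULL if the queue is empty. The caller owns the returned jobs and
 * must free() each one after processing it.
 */
Job *job_queue_take_all(JobQueue *q)
{
    Job *list;

    assert(q != NULL);
    pthread_mutex_lock(&q->lock);
    list = q->head;
    q->head = NULL;
    q->tail = &q->head;
    pthread_mutex_unlock(&q->lock);
    return list;
}
```

Taking the whole list at once keeps the critical section short and means
the handlers themselves never run with the queue lock held.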


end of thread, other threads:[~2011-09-05  7:48 UTC | newest]

Thread overview: 15+ messages
2011-09-01 19:35 [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access Luiz Capitulino
2011-09-01 19:47 ` Daniel P. Berrange
2011-09-01 21:03 ` Jan Kiszka
2011-09-02  1:34 ` Anthony Liguori
2011-09-02  9:41   ` Daniel P. Berrange
2011-09-02 11:26     ` Jan Kiszka
2011-09-02 13:39   ` Gerd Hoffmann
2011-09-02 14:03     ` Anthony Liguori
2011-09-02 14:24     ` Luiz Capitulino
2011-09-02 14:28     ` Anthony Liguori
2011-09-02 15:18       ` Gerd Hoffmann
2011-09-02 15:20         ` Anthony Liguori
2011-09-02 15:31         ` Paolo Bonzini
2011-09-02 15:37           ` Anthony Liguori
2011-09-05  7:48           ` Gerd Hoffmann
