All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] Regression with commit 095497ffc66b7f031
@ 2016-07-15  6:32 Juergen Gross
  2016-07-15  7:39 ` Peter Lieven
  2016-07-15  7:39 ` [Qemu-devel] " Peter Lieven
  0 siblings, 2 replies; 28+ messages in thread
From: Juergen Gross @ 2016-07-15  6:32 UTC (permalink / raw)
  To: xen-devel, qemu-devel, Peter Lieven; +Cc: Gerd Hoffmann, Paolo Bonzini

Commit 095497ffc66b7f031ff2a17f1e50f5cb105ce588 ("vnc-enc-tight: use
thread local storage for palette") introduced a regression with Xen:
Since this commit qemu used as a device model is no longer capable
to register Xenstore watches (that's the effect visible to a user).
Reverting this commit makes qemu behave well again. I have no idea
why that commit would have this effect with Xen, may be some memory
is clobbered?


Juergen

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] Regression with commit 095497ffc66b7f031
  2016-07-15  6:32 [Qemu-devel] Regression with commit 095497ffc66b7f031 Juergen Gross
  2016-07-15  7:39 ` Peter Lieven
@ 2016-07-15  7:39 ` Peter Lieven
  2016-07-15  8:47   ` Juergen Gross
  2016-07-15  8:47   ` Juergen Gross
  1 sibling, 2 replies; 28+ messages in thread
From: Peter Lieven @ 2016-07-15  7:39 UTC (permalink / raw)
  To: Juergen Gross, xen-devel, qemu-devel; +Cc: Gerd Hoffmann, Paolo Bonzini

Am 15.07.2016 um 08:32 schrieb Juergen Gross:
> Commit 095497ffc66b7f031ff2a17f1e50f5cb105ce588 ("vnc-enc-tight: use
> thread local storage for palette") introduced a regression with Xen:
> Since this commit qemu used as a device model is no longer capable
> to register Xenstore watches (that's the effect visible to a user).
> Reverting this commit makes qemu behave well again. I have no idea
> why that commit would have this effect with Xen, may be some memory
> is clobbered?

I personally have no idea, maybe @Paolo has?

Maybe the corruption happens somewhere else and is just visible
due to this change.

Do you see sth when you ran qemu/xen in valgrind?

Peter

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Regression with commit 095497ffc66b7f031
  2016-07-15  6:32 [Qemu-devel] Regression with commit 095497ffc66b7f031 Juergen Gross
@ 2016-07-15  7:39 ` Peter Lieven
  2016-07-15  7:39 ` [Qemu-devel] " Peter Lieven
  1 sibling, 0 replies; 28+ messages in thread
From: Peter Lieven @ 2016-07-15  7:39 UTC (permalink / raw)
  To: Juergen Gross, xen-devel, qemu-devel; +Cc: Paolo Bonzini, Gerd Hoffmann

Am 15.07.2016 um 08:32 schrieb Juergen Gross:
> Commit 095497ffc66b7f031ff2a17f1e50f5cb105ce588 ("vnc-enc-tight: use
> thread local storage for palette") introduced a regression with Xen:
> Since this commit qemu used as a device model is no longer capable
> to register Xenstore watches (that's the effect visible to a user).
> Reverting this commit makes qemu behave well again. I have no idea
> why that commit would have this effect with Xen, may be some memory
> is clobbered?

I personally have no idea, maybe @Paolo has?

Maybe the corruption happens somewhere else and is just visible
due to this change.

Do you see sth when you ran qemu/xen in valgrind?

Peter


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] Regression with commit 095497ffc66b7f031
  2016-07-15  7:39 ` [Qemu-devel] " Peter Lieven
@ 2016-07-15  8:47   ` Juergen Gross
  2016-07-15  9:03     ` Peter Lieven
                       ` (3 more replies)
  2016-07-15  8:47   ` Juergen Gross
  1 sibling, 4 replies; 28+ messages in thread
From: Juergen Gross @ 2016-07-15  8:47 UTC (permalink / raw)
  To: Peter Lieven, xen-devel, qemu-devel; +Cc: Gerd Hoffmann, Paolo Bonzini

On 15/07/16 09:39, Peter Lieven wrote:
> Am 15.07.2016 um 08:32 schrieb Juergen Gross:
>> Commit 095497ffc66b7f031ff2a17f1e50f5cb105ce588 ("vnc-enc-tight: use
>> thread local storage for palette") introduced a regression with Xen:
>> Since this commit qemu used as a device model is no longer capable
>> to register Xenstore watches (that's the effect visible to a user).
>> Reverting this commit makes qemu behave well again. I have no idea
>> why that commit would have this effect with Xen, may be some memory
>> is clobbered?
> 
> I personally have no idea, maybe @Paolo has?
> 
> Maybe the corruption happens somewhere else and is just visible
> due to this change.
> 
> Do you see sth when you ran qemu/xen in valgrind?

Nothing scaring and no real difference between working and not working
variant.

Meanwhile I've been digging a little bit deeper and found the reason:
libxenstore is setting up a reader thread which is waiting for the
watch to fire. With above commit the stack size of that thread (16kB)
is too small. Setting it to 32kB made qemu work again.

So I'd recommend to have just a thread local palette pointer and
allocate the palette when needed and don't free it after using it but
keep it for reuse. Do you want to write that patch or should I do it?


Juergen

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Regression with commit 095497ffc66b7f031
  2016-07-15  7:39 ` [Qemu-devel] " Peter Lieven
  2016-07-15  8:47   ` Juergen Gross
@ 2016-07-15  8:47   ` Juergen Gross
  1 sibling, 0 replies; 28+ messages in thread
From: Juergen Gross @ 2016-07-15  8:47 UTC (permalink / raw)
  To: Peter Lieven, xen-devel, qemu-devel; +Cc: Paolo Bonzini, Gerd Hoffmann

On 15/07/16 09:39, Peter Lieven wrote:
> Am 15.07.2016 um 08:32 schrieb Juergen Gross:
>> Commit 095497ffc66b7f031ff2a17f1e50f5cb105ce588 ("vnc-enc-tight: use
>> thread local storage for palette") introduced a regression with Xen:
>> Since this commit qemu used as a device model is no longer capable
>> to register Xenstore watches (that's the effect visible to a user).
>> Reverting this commit makes qemu behave well again. I have no idea
>> why that commit would have this effect with Xen, may be some memory
>> is clobbered?
> 
> I personally have no idea, maybe @Paolo has?
> 
> Maybe the corruption happens somewhere else and is just visible
> due to this change.
> 
> Do you see sth when you ran qemu/xen in valgrind?

Nothing scaring and no real difference between working and not working
variant.

Meanwhile I've been digging a little bit deeper and found the reason:
libxenstore is setting up a reader thread which is waiting for the
watch to fire. With above commit the stack size of that thread (16kB)
is too small. Setting it to 32kB made qemu work again.

So I'd recommend to have just a thread local palette pointer and
allocate the palette when needed and don't free it after using it but
keep it for reuse. Do you want to write that patch or should I do it?


Juergen


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] Regression with commit 095497ffc66b7f031
  2016-07-15  8:47   ` Juergen Gross
@ 2016-07-15  9:03     ` Peter Lieven
  2016-07-15  9:23       ` Juergen Gross
  2016-07-15  9:23       ` Juergen Gross
  2016-07-15  9:03     ` Peter Lieven
                       ` (2 subsequent siblings)
  3 siblings, 2 replies; 28+ messages in thread
From: Peter Lieven @ 2016-07-15  9:03 UTC (permalink / raw)
  To: Juergen Gross, xen-devel, qemu-devel; +Cc: Gerd Hoffmann, Paolo Bonzini

Am 15.07.2016 um 10:47 schrieb Juergen Gross:
> On 15/07/16 09:39, Peter Lieven wrote:
>> Am 15.07.2016 um 08:32 schrieb Juergen Gross:
>>> Commit 095497ffc66b7f031ff2a17f1e50f5cb105ce588 ("vnc-enc-tight: use
>>> thread local storage for palette") introduced a regression with Xen:
>>> Since this commit qemu used as a device model is no longer capable
>>> to register Xenstore watches (that's the effect visible to a user).
>>> Reverting this commit makes qemu behave well again. I have no idea
>>> why that commit would have this effect with Xen, may be some memory
>>> is clobbered?
>> I personally have no idea, maybe @Paolo has?
>>
>> Maybe the corruption happens somewhere else and is just visible
>> due to this change.
>>
>> Do you see sth when you ran qemu/xen in valgrind?
> Nothing scaring and no real difference between working and not working
> variant.
>
> Meanwhile I've been digging a little bit deeper and found the reason:
> libxenstore is setting up a reader thread which is waiting for the
> watch to fire. With above commit the stack size of that thread (16kB)
> is too small. Setting it to 32kB made qemu work again.
>
> So I'd recommend to have just a thread local palette pointer and
> allocate the palette when needed and don't free it after using it but
> keep it for reuse. Do you want to write that patch or should I do it?

As you like. But as I have introduced this regression, maybe I should
fix it ;-)

Actually I do not understand what libxenstore confuses about 16 and 32kB,
but I have no knowledge about XEN. However, let me know if this here works for you:

diff --git a/ui/vnc-enc-tight.c b/ui/vnc-enc-tight.c
index b8581dd..5731cf6 100644
--- a/ui/vnc-enc-tight.c
+++ b/ui/vnc-enc-tight.c
@@ -1457,11 +1457,18 @@ static int send_sub_rect_jpeg(VncState *vs, int x, int y, int w, int h,
 }
 #endif
 
-static __thread VncPalette color_count_palette;
+static __thread VncPalette *color_count_palette = NULL;
+static __thread Notifier vnc_tight_cleanup_notifier;
+
+static void vnc_tight_cleanup(Notifier *n, void *value)
+{
+    printf("thread %d: free tight palette %p\n", qemu_get_thread_id(), color_count_palette);
+    g_free(color_count_palette);
+    color_count_palette = NULL;
+}
 
 static int send_sub_rect(VncState *vs, int x, int y, int w, int h)
 {
-    VncPalette *palette = &color_count_palette;
     uint32_t bg = 0, fg = 0;
     int colors;
     int ret = 0;
@@ -1470,6 +1477,13 @@ static int send_sub_rect(VncState *vs, int x, int y, int w, int h)
     bool allow_jpeg = true;
 #endif
 
+    if (!color_count_palette) {
+        color_count_palette = g_malloc(sizeof(VncPalette));
+        vnc_tight_cleanup_notifier.notify = vnc_tight_cleanup;
+        qemu_thread_atexit_add(&vnc_tight_cleanup_notifier);
+        printf("thread %d: alloc tight palette %p\n", qemu_get_thread_id(), color_count_palette);
+    }
+
     vnc_framebuffer_update(vs, x, y, w, h, vs->tight.type);
 
     vnc_tight_start(vs);
@@ -1490,17 +1504,17 @@ static int send_sub_rect(VncState *vs, int x, int y, int w, int h)
     }
 #endif
 
-    colors = tight_fill_palette(vs, x, y, w * h, &bg, &fg, palette);
+    colors = tight_fill_palette(vs, x, y, w * h, &bg, &fg, color_count_palette);
 
 #ifdef CONFIG_VNC_JPEG
     if (allow_jpeg && vs->tight.quality != (uint8_t)-1) {
-        ret = send_sub_rect_jpeg(vs, x, y, w, h, bg, fg, colors, palette,
+        ret = send_sub_rect_jpeg(vs, x, y, w, h, bg, fg, colors, color_count_palette,
                                  force_jpeg);
     } else {
-        ret = send_sub_rect_nojpeg(vs, x, y, w, h, bg, fg, colors, palette);
+        ret = send_sub_rect_nojpeg(vs, x, y, w, h, bg, fg, colors, color_count_palette);
     }
 #else
-    ret = send_sub_rect_nojpeg(vs, x, y, w, h, bg, fg, colors, palette);
+    ret = send_sub_rect_nojpeg(vs, x, y, w, h, bg, fg, colors, color_count_palette);
 #endif
 
     return ret;

Peter

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: Regression with commit 095497ffc66b7f031
  2016-07-15  8:47   ` Juergen Gross
  2016-07-15  9:03     ` Peter Lieven
@ 2016-07-15  9:03     ` Peter Lieven
  2016-07-15 10:02     ` [Qemu-devel] " Paolo Bonzini
  2016-07-15 10:02     ` Paolo Bonzini
  3 siblings, 0 replies; 28+ messages in thread
From: Peter Lieven @ 2016-07-15  9:03 UTC (permalink / raw)
  To: Juergen Gross, xen-devel, qemu-devel; +Cc: Paolo Bonzini, Gerd Hoffmann

Am 15.07.2016 um 10:47 schrieb Juergen Gross:
> On 15/07/16 09:39, Peter Lieven wrote:
>> Am 15.07.2016 um 08:32 schrieb Juergen Gross:
>>> Commit 095497ffc66b7f031ff2a17f1e50f5cb105ce588 ("vnc-enc-tight: use
>>> thread local storage for palette") introduced a regression with Xen:
>>> Since this commit qemu used as a device model is no longer capable
>>> to register Xenstore watches (that's the effect visible to a user).
>>> Reverting this commit makes qemu behave well again. I have no idea
>>> why that commit would have this effect with Xen, may be some memory
>>> is clobbered?
>> I personally have no idea, maybe @Paolo has?
>>
>> Maybe the corruption happens somewhere else and is just visible
>> due to this change.
>>
>> Do you see sth when you ran qemu/xen in valgrind?
> Nothing scaring and no real difference between working and not working
> variant.
>
> Meanwhile I've been digging a little bit deeper and found the reason:
> libxenstore is setting up a reader thread which is waiting for the
> watch to fire. With above commit the stack size of that thread (16kB)
> is too small. Setting it to 32kB made qemu work again.
>
> So I'd recommend to have just a thread local palette pointer and
> allocate the palette when needed and don't free it after using it but
> keep it for reuse. Do you want to write that patch or should I do it?

As you like. But as I have introduced this regression, maybe I should
fix it ;-)

Actually I do not understand what libxenstore confuses about 16 and 32kB,
but I have no knowledge about XEN. However, let me know if this here works for you:

diff --git a/ui/vnc-enc-tight.c b/ui/vnc-enc-tight.c
index b8581dd..5731cf6 100644
--- a/ui/vnc-enc-tight.c
+++ b/ui/vnc-enc-tight.c
@@ -1457,11 +1457,18 @@ static int send_sub_rect_jpeg(VncState *vs, int x, int y, int w, int h,
 }
 #endif
 
-static __thread VncPalette color_count_palette;
+static __thread VncPalette *color_count_palette = NULL;
+static __thread Notifier vnc_tight_cleanup_notifier;
+
+static void vnc_tight_cleanup(Notifier *n, void *value)
+{
+    printf("thread %d: free tight palette %p\n", qemu_get_thread_id(), color_count_palette);
+    g_free(color_count_palette);
+    color_count_palette = NULL;
+}
 
 static int send_sub_rect(VncState *vs, int x, int y, int w, int h)
 {
-    VncPalette *palette = &color_count_palette;
     uint32_t bg = 0, fg = 0;
     int colors;
     int ret = 0;
@@ -1470,6 +1477,13 @@ static int send_sub_rect(VncState *vs, int x, int y, int w, int h)
     bool allow_jpeg = true;
 #endif
 
+    if (!color_count_palette) {
+        color_count_palette = g_malloc(sizeof(VncPalette));
+        vnc_tight_cleanup_notifier.notify = vnc_tight_cleanup;
+        qemu_thread_atexit_add(&vnc_tight_cleanup_notifier);
+        printf("thread %d: alloc tight palette %p\n", qemu_get_thread_id(), color_count_palette);
+    }
+
     vnc_framebuffer_update(vs, x, y, w, h, vs->tight.type);
 
     vnc_tight_start(vs);
@@ -1490,17 +1504,17 @@ static int send_sub_rect(VncState *vs, int x, int y, int w, int h)
     }
 #endif
 
-    colors = tight_fill_palette(vs, x, y, w * h, &bg, &fg, palette);
+    colors = tight_fill_palette(vs, x, y, w * h, &bg, &fg, color_count_palette);
 
 #ifdef CONFIG_VNC_JPEG
     if (allow_jpeg && vs->tight.quality != (uint8_t)-1) {
-        ret = send_sub_rect_jpeg(vs, x, y, w, h, bg, fg, colors, palette,
+        ret = send_sub_rect_jpeg(vs, x, y, w, h, bg, fg, colors, color_count_palette,
                                  force_jpeg);
     } else {
-        ret = send_sub_rect_nojpeg(vs, x, y, w, h, bg, fg, colors, palette);
+        ret = send_sub_rect_nojpeg(vs, x, y, w, h, bg, fg, colors, color_count_palette);
     }
 #else
-    ret = send_sub_rect_nojpeg(vs, x, y, w, h, bg, fg, colors, palette);
+    ret = send_sub_rect_nojpeg(vs, x, y, w, h, bg, fg, colors, color_count_palette);
 #endif
 
     return ret;

Peter


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] Regression with commit 095497ffc66b7f031
  2016-07-15  9:03     ` Peter Lieven
@ 2016-07-15  9:23       ` Juergen Gross
  2016-07-15  9:23       ` Juergen Gross
  1 sibling, 0 replies; 28+ messages in thread
From: Juergen Gross @ 2016-07-15  9:23 UTC (permalink / raw)
  To: Peter Lieven, xen-devel, qemu-devel; +Cc: Gerd Hoffmann, Paolo Bonzini

On 15/07/16 11:03, Peter Lieven wrote:
> Am 15.07.2016 um 10:47 schrieb Juergen Gross:
>> On 15/07/16 09:39, Peter Lieven wrote:
>>> Am 15.07.2016 um 08:32 schrieb Juergen Gross:
>>>> Commit 095497ffc66b7f031ff2a17f1e50f5cb105ce588 ("vnc-enc-tight: use
>>>> thread local storage for palette") introduced a regression with Xen:
>>>> Since this commit qemu used as a device model is no longer capable
>>>> to register Xenstore watches (that's the effect visible to a user).
>>>> Reverting this commit makes qemu behave well again. I have no idea
>>>> why that commit would have this effect with Xen, may be some memory
>>>> is clobbered?
>>> I personally have no idea, maybe @Paolo has?
>>>
>>> Maybe the corruption happens somewhere else and is just visible
>>> due to this change.
>>>
>>> Do you see sth when you ran qemu/xen in valgrind?
>> Nothing scaring and no real difference between working and not working
>> variant.
>>
>> Meanwhile I've been digging a little bit deeper and found the reason:
>> libxenstore is setting up a reader thread which is waiting for the
>> watch to fire. With above commit the stack size of that thread (16kB)
>> is too small. Setting it to 32kB made qemu work again.
>>
>> So I'd recommend to have just a thread local palette pointer and
>> allocate the palette when needed and don't free it after using it but
>> keep it for reuse. Do you want to write that patch or should I do it?
> 
> As you like. But as I have introduced this regression, maybe I should
> fix it ;-)

Sure.

> Actually I do not understand what libxenstore confuses about 16 and 32kB,
> but I have no knowledge about XEN. However, let me know if this here works for you:

Thanks, is working again. You can add

Tested-by: Juergen Gross <jgross@suse.com>

when submitting it.

The thread local static variable you added is occupying stack space in
each thread started by qemu. As it is rather large (about 6kB on a 64
bit system) the stack of the thread created by libxenstore is exhausted
resulting in a failing library call.


Juergen

> 
> diff --git a/ui/vnc-enc-tight.c b/ui/vnc-enc-tight.c
> index b8581dd..5731cf6 100644
> --- a/ui/vnc-enc-tight.c
> +++ b/ui/vnc-enc-tight.c
> @@ -1457,11 +1457,18 @@ static int send_sub_rect_jpeg(VncState *vs, int x, int y, int w, int h,
>  }
>  #endif
>  
> -static __thread VncPalette color_count_palette;
> +static __thread VncPalette *color_count_palette = NULL;
> +static __thread Notifier vnc_tight_cleanup_notifier;
> +
> +static void vnc_tight_cleanup(Notifier *n, void *value)
> +{
> +    printf("thread %d: free tight palette %p\n", qemu_get_thread_id(), color_count_palette);
> +    g_free(color_count_palette);
> +    color_count_palette = NULL;
> +}
>  
>  static int send_sub_rect(VncState *vs, int x, int y, int w, int h)
>  {
> -    VncPalette *palette = &color_count_palette;
>      uint32_t bg = 0, fg = 0;
>      int colors;
>      int ret = 0;
> @@ -1470,6 +1477,13 @@ static int send_sub_rect(VncState *vs, int x, int y, int w, int h)
>      bool allow_jpeg = true;
>  #endif
>  
> +    if (!color_count_palette) {
> +        color_count_palette = g_malloc(sizeof(VncPalette));
> +        vnc_tight_cleanup_notifier.notify = vnc_tight_cleanup;
> +        qemu_thread_atexit_add(&vnc_tight_cleanup_notifier);
> +        printf("thread %d: alloc tight palette %p\n", qemu_get_thread_id(), color_count_palette);
> +    }
> +
>      vnc_framebuffer_update(vs, x, y, w, h, vs->tight.type);
>  
>      vnc_tight_start(vs);
> @@ -1490,17 +1504,17 @@ static int send_sub_rect(VncState *vs, int x, int y, int w, int h)
>      }
>  #endif
>  
> -    colors = tight_fill_palette(vs, x, y, w * h, &bg, &fg, palette);
> +    colors = tight_fill_palette(vs, x, y, w * h, &bg, &fg, color_count_palette);
>  
>  #ifdef CONFIG_VNC_JPEG
>      if (allow_jpeg && vs->tight.quality != (uint8_t)-1) {
> -        ret = send_sub_rect_jpeg(vs, x, y, w, h, bg, fg, colors, palette,
> +        ret = send_sub_rect_jpeg(vs, x, y, w, h, bg, fg, colors, color_count_palette,
>                                   force_jpeg);
>      } else {
> -        ret = send_sub_rect_nojpeg(vs, x, y, w, h, bg, fg, colors, palette);
> +        ret = send_sub_rect_nojpeg(vs, x, y, w, h, bg, fg, colors, color_count_palette);
>      }
>  #else
> -    ret = send_sub_rect_nojpeg(vs, x, y, w, h, bg, fg, colors, palette);
> +    ret = send_sub_rect_nojpeg(vs, x, y, w, h, bg, fg, colors, color_count_palette);
>  #endif
>  
>      return ret;
> 
> Peter
> 
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Regression with commit 095497ffc66b7f031
  2016-07-15  9:03     ` Peter Lieven
  2016-07-15  9:23       ` Juergen Gross
@ 2016-07-15  9:23       ` Juergen Gross
  1 sibling, 0 replies; 28+ messages in thread
From: Juergen Gross @ 2016-07-15  9:23 UTC (permalink / raw)
  To: Peter Lieven, xen-devel, qemu-devel; +Cc: Paolo Bonzini, Gerd Hoffmann

On 15/07/16 11:03, Peter Lieven wrote:
> Am 15.07.2016 um 10:47 schrieb Juergen Gross:
>> On 15/07/16 09:39, Peter Lieven wrote:
>>> Am 15.07.2016 um 08:32 schrieb Juergen Gross:
>>>> Commit 095497ffc66b7f031ff2a17f1e50f5cb105ce588 ("vnc-enc-tight: use
>>>> thread local storage for palette") introduced a regression with Xen:
>>>> Since this commit qemu used as a device model is no longer capable
>>>> to register Xenstore watches (that's the effect visible to a user).
>>>> Reverting this commit makes qemu behave well again. I have no idea
>>>> why that commit would have this effect with Xen, may be some memory
>>>> is clobbered?
>>> I personally have no idea, maybe @Paolo has?
>>>
>>> Maybe the corruption happens somewhere else and is just visible
>>> due to this change.
>>>
>>> Do you see sth when you ran qemu/xen in valgrind?
>> Nothing scaring and no real difference between working and not working
>> variant.
>>
>> Meanwhile I've been digging a little bit deeper and found the reason:
>> libxenstore is setting up a reader thread which is waiting for the
>> watch to fire. With above commit the stack size of that thread (16kB)
>> is too small. Setting it to 32kB made qemu work again.
>>
>> So I'd recommend to have just a thread local palette pointer and
>> allocate the palette when needed and don't free it after using it but
>> keep it for reuse. Do you want to write that patch or should I do it?
> 
> As you like. But as I have introduced this regression, maybe I should
> fix it ;-)

Sure.

> Actually I do not understand what libxenstore confuses about 16 and 32kB,
> but I have no knowledge about XEN. However, let me know if this here works for you:

Thanks, is working again. You can add

Tested-by: Juergen Gross <jgross@suse.com>

when submitting it.

The thread local static variable you added is occupying stack space in
each thread started by qemu. As it is rather large (about 6kB on a 64
bit system) the stack of the thread created by libxenstore is exhausted
resulting in a failing library call.


Juergen

> 
> diff --git a/ui/vnc-enc-tight.c b/ui/vnc-enc-tight.c
> index b8581dd..5731cf6 100644
> --- a/ui/vnc-enc-tight.c
> +++ b/ui/vnc-enc-tight.c
> @@ -1457,11 +1457,18 @@ static int send_sub_rect_jpeg(VncState *vs, int x, int y, int w, int h,
>  }
>  #endif
>  
> -static __thread VncPalette color_count_palette;
> +static __thread VncPalette *color_count_palette = NULL;
> +static __thread Notifier vnc_tight_cleanup_notifier;
> +
> +static void vnc_tight_cleanup(Notifier *n, void *value)
> +{
> +    printf("thread %d: free tight palette %p\n", qemu_get_thread_id(), color_count_palette);
> +    g_free(color_count_palette);
> +    color_count_palette = NULL;
> +}
>  
>  static int send_sub_rect(VncState *vs, int x, int y, int w, int h)
>  {
> -    VncPalette *palette = &color_count_palette;
>      uint32_t bg = 0, fg = 0;
>      int colors;
>      int ret = 0;
> @@ -1470,6 +1477,13 @@ static int send_sub_rect(VncState *vs, int x, int y, int w, int h)
>      bool allow_jpeg = true;
>  #endif
>  
> +    if (!color_count_palette) {
> +        color_count_palette = g_malloc(sizeof(VncPalette));
> +        vnc_tight_cleanup_notifier.notify = vnc_tight_cleanup;
> +        qemu_thread_atexit_add(&vnc_tight_cleanup_notifier);
> +        printf("thread %d: alloc tight palette %p\n", qemu_get_thread_id(), color_count_palette);
> +    }
> +
>      vnc_framebuffer_update(vs, x, y, w, h, vs->tight.type);
>  
>      vnc_tight_start(vs);
> @@ -1490,17 +1504,17 @@ static int send_sub_rect(VncState *vs, int x, int y, int w, int h)
>      }
>  #endif
>  
> -    colors = tight_fill_palette(vs, x, y, w * h, &bg, &fg, palette);
> +    colors = tight_fill_palette(vs, x, y, w * h, &bg, &fg, color_count_palette);
>  
>  #ifdef CONFIG_VNC_JPEG
>      if (allow_jpeg && vs->tight.quality != (uint8_t)-1) {
> -        ret = send_sub_rect_jpeg(vs, x, y, w, h, bg, fg, colors, palette,
> +        ret = send_sub_rect_jpeg(vs, x, y, w, h, bg, fg, colors, color_count_palette,
>                                   force_jpeg);
>      } else {
> -        ret = send_sub_rect_nojpeg(vs, x, y, w, h, bg, fg, colors, palette);
> +        ret = send_sub_rect_nojpeg(vs, x, y, w, h, bg, fg, colors, color_count_palette);
>      }
>  #else
> -    ret = send_sub_rect_nojpeg(vs, x, y, w, h, bg, fg, colors, palette);
> +    ret = send_sub_rect_nojpeg(vs, x, y, w, h, bg, fg, colors, color_count_palette);
>  #endif
>  
>      return ret;
> 
> Peter
> 
> 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] Regression with commit 095497ffc66b7f031
  2016-07-15  8:47   ` Juergen Gross
  2016-07-15  9:03     ` Peter Lieven
  2016-07-15  9:03     ` Peter Lieven
@ 2016-07-15 10:02     ` Paolo Bonzini
  2016-07-15 10:07       ` Peter Lieven
                         ` (2 more replies)
  2016-07-15 10:02     ` Paolo Bonzini
  3 siblings, 3 replies; 28+ messages in thread
From: Paolo Bonzini @ 2016-07-15 10:02 UTC (permalink / raw)
  To: Juergen Gross, Peter Lieven, xen-devel, qemu-devel; +Cc: Gerd Hoffmann



On 15/07/2016 10:47, Juergen Gross wrote:
> Nothing scaring and no real difference between working and not working
> variant.
> 
> Meanwhile I've been digging a little bit deeper and found the reason:
> libxenstore is setting up a reader thread which is waiting for the
> watch to fire. With above commit the stack size of that thread (16kB)
> is too small. Setting it to 32kB made qemu work again.

This makes very little sense (not your fault)...  The commit doesn't
change stack usage at all, TLS should not be on the stack.

Can you capture a backtrace where the 16K stack is exceeded?  Perhaps
it's only due to inlining decision on the compiler, in which case
Peter's patch from today is only a bandaid.

Thanks,

Paolo

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Regression with commit 095497ffc66b7f031
  2016-07-15  8:47   ` Juergen Gross
                       ` (2 preceding siblings ...)
  2016-07-15 10:02     ` [Qemu-devel] " Paolo Bonzini
@ 2016-07-15 10:02     ` Paolo Bonzini
  3 siblings, 0 replies; 28+ messages in thread
From: Paolo Bonzini @ 2016-07-15 10:02 UTC (permalink / raw)
  To: Juergen Gross, Peter Lieven, xen-devel, qemu-devel; +Cc: Gerd Hoffmann



On 15/07/2016 10:47, Juergen Gross wrote:
> Nothing scaring and no real difference between working and not working
> variant.
> 
> Meanwhile I've been digging a little bit deeper and found the reason:
> libxenstore is setting up a reader thread which is waiting for the
> watch to fire. With above commit the stack size of that thread (16kB)
> is too small. Setting it to 32kB made qemu work again.

This makes very little sense (not your fault)...  The commit doesn't
change stack usage at all, TLS should not be on the stack.

Can you capture a backtrace where the 16K stack is exceeded?  Perhaps
it's only due to inlining decision on the compiler, in which case
Peter's patch from today is only a bandaid.

Thanks,

Paolo

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] Regression with commit 095497ffc66b7f031
  2016-07-15 10:02     ` [Qemu-devel] " Paolo Bonzini
  2016-07-15 10:07       ` Peter Lieven
@ 2016-07-15 10:07       ` Peter Lieven
  2016-07-15 10:12           ` Paolo Bonzini
  2016-07-15 10:12         ` Gerd Hoffmann
  2 siblings, 1 reply; 28+ messages in thread
From: Peter Lieven @ 2016-07-15 10:07 UTC (permalink / raw)
  To: Paolo Bonzini, Juergen Gross, xen-devel, qemu-devel; +Cc: Gerd Hoffmann

Am 15.07.2016 um 12:02 schrieb Paolo Bonzini:
>
> On 15/07/2016 10:47, Juergen Gross wrote:
>> Nothing scaring and no real difference between working and not working
>> variant.
>>
>> Meanwhile I've been digging a little bit deeper and found the reason:
>> libxenstore is setting up a reader thread which is waiting for the
>> watch to fire. With above commit the stack size of that thread (16kB)
>> is too small. Setting it to 32kB made qemu work again.
> This makes very little sense (not your fault)...  The commit doesn't
> change stack usage at all, TLS should not be on the stack.

But we still allocate the VncPalette for every thread, right? Even
if it has nothing todo with VNC.

Peter

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Regression with commit 095497ffc66b7f031
  2016-07-15 10:02     ` [Qemu-devel] " Paolo Bonzini
@ 2016-07-15 10:07       ` Peter Lieven
  2016-07-15 10:07       ` [Qemu-devel] " Peter Lieven
  2016-07-15 10:12         ` Gerd Hoffmann
  2 siblings, 0 replies; 28+ messages in thread
From: Peter Lieven @ 2016-07-15 10:07 UTC (permalink / raw)
  To: Paolo Bonzini, Juergen Gross, xen-devel, qemu-devel; +Cc: Gerd Hoffmann

Am 15.07.2016 um 12:02 schrieb Paolo Bonzini:
>
> On 15/07/2016 10:47, Juergen Gross wrote:
>> Nothing scaring and no real difference between working and not working
>> variant.
>>
>> Meanwhile I've been digging a little bit deeper and found the reason:
>> libxenstore is setting up a reader thread which is waiting for the
>> watch to fire. With above commit the stack size of that thread (16kB)
>> is too small. Setting it to 32kB made qemu work again.
> This makes very little sense (not your fault)...  The commit doesn't
> change stack usage at all, TLS should not be on the stack.

But we still allocate the VncPalette for every thread, right? Even
if it has nothing todo with VNC.

Peter


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] Regression with commit 095497ffc66b7f031
  2016-07-15 10:07       ` [Qemu-devel] " Peter Lieven
@ 2016-07-15 10:12           ` Paolo Bonzini
  0 siblings, 0 replies; 28+ messages in thread
From: Paolo Bonzini @ 2016-07-15 10:12 UTC (permalink / raw)
  To: Peter Lieven, Juergen Gross, xen-devel, qemu-devel; +Cc: Gerd Hoffmann



On 15/07/2016 12:07, Peter Lieven wrote:
> Am 15.07.2016 um 12:02 schrieb Paolo Bonzini:
>>
>> On 15/07/2016 10:47, Juergen Gross wrote:
>>> Nothing scaring and no real difference between working and not working
>>> variant.
>>>
>>> Meanwhile I've been digging a little bit deeper and found the reason:
>>> libxenstore is setting up a reader thread which is waiting for the
>>> watch to fire. With above commit the stack size of that thread (16kB)
>>> is too small. Setting it to 32kB made qemu work again.
>> This makes very little sense (not your fault)...  The commit doesn't
>> change stack usage at all, TLS should not be on the stack.
> 
> But we still allocate the VncPalette for every thread, right? Even
> if it has nothing todo with VNC.

Yes, I'm just trying to understand the root cause.  Which is that glibc
actually does carve out TLS space from the requested stack size.  That
means that a program that has a lot of TLS variables, or has big TLS
variables, will fail in weird ways.

So that's two reasons why your patch is okay. :)

Paolo

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Regression with commit 095497ffc66b7f031
@ 2016-07-15 10:12           ` Paolo Bonzini
  0 siblings, 0 replies; 28+ messages in thread
From: Paolo Bonzini @ 2016-07-15 10:12 UTC (permalink / raw)
  To: Peter Lieven, Juergen Gross, xen-devel, qemu-devel; +Cc: Gerd Hoffmann



On 15/07/2016 12:07, Peter Lieven wrote:
> Am 15.07.2016 um 12:02 schrieb Paolo Bonzini:
>>
>> On 15/07/2016 10:47, Juergen Gross wrote:
>>> Nothing scaring and no real difference between working and not working
>>> variant.
>>>
>>> Meanwhile I've been digging a little bit deeper and found the reason:
>>> libxenstore is setting up a reader thread which is waiting for the
>>> watch to fire. With above commit the stack size of that thread (16kB)
>>> is too small. Setting it to 32kB made qemu work again.
>> This makes very little sense (not your fault)...  The commit doesn't
>> change stack usage at all, TLS should not be on the stack.
> 
> But we still allocate the VncPalette for every thread, right? Even
> if it has nothing todo with VNC.

Yes, I'm just trying to understand the root cause.  Which is that glibc
actually does carve out TLS space from the requested stack size.  That
means that a program that has a lot of TLS variables, or has big TLS
variables, will fail in weird ways.

So that's two reasons why your patch is okay. :)

Paolo

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] Regression with commit 095497ffc66b7f031
  2016-07-15 10:02     ` [Qemu-devel] " Paolo Bonzini
@ 2016-07-15 10:12         ` Gerd Hoffmann
  2016-07-15 10:07       ` [Qemu-devel] " Peter Lieven
  2016-07-15 10:12         ` Gerd Hoffmann
  2 siblings, 0 replies; 28+ messages in thread
From: Gerd Hoffmann @ 2016-07-15 10:12 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Juergen Gross, Peter Lieven, xen-devel, qemu-devel

On Fr, 2016-07-15 at 12:02 +0200, Paolo Bonzini wrote:
> 
> On 15/07/2016 10:47, Juergen Gross wrote:
> > Nothing scaring and no real difference between working and not working
> > variant.
> > 
> > Meanwhile I've been digging a little bit deeper and found the reason:
> > libxenstore is setting up a reader thread which is waiting for the
> > watch to fire. With above commit the stack size of that thread (16kB)
> > is too small. Setting it to 32kB made qemu work again.
> 
> This makes very little sense (not your fault)...  The commit doesn't
> change stack usage at all, TLS should not be on the stack.
> 
> Can you capture a backtrace where the 16K stack is exceeded?  Perhaps
> it's only due to inlining decision on the compiler, in which case
> Peter's patch from today is only a bandaid.

Hmm, I guess I hold off the vnc pull request for now ...

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Regression with commit 095497ffc66b7f031
@ 2016-07-15 10:12         ` Gerd Hoffmann
  0 siblings, 0 replies; 28+ messages in thread
From: Gerd Hoffmann @ 2016-07-15 10:12 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Juergen Gross, xen-devel, Peter Lieven, qemu-devel

On Fr, 2016-07-15 at 12:02 +0200, Paolo Bonzini wrote:
> 
> On 15/07/2016 10:47, Juergen Gross wrote:
> > Nothing scaring and no real difference between working and not working
> > variant.
> > 
> > Meanwhile I've been digging a little bit deeper and found the reason:
> > libxenstore is setting up a reader thread which is waiting for the
> > watch to fire. With above commit the stack size of that thread (16kB)
> > is too small. Setting it to 32kB made qemu work again.
> 
> This makes very little sense (not your fault)...  The commit doesn't
> change stack usage at all, TLS should not be on the stack.
> 
> Can you capture a backtrace where the 16K stack is exceeded?  Perhaps
> it's only due to inlining decision on the compiler, in which case
> Peter's patch from today is only a bandaid.

Hmm, I guess I hold off the vnc pull request for now ...

cheers,
  Gerd


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] Regression with commit 095497ffc66b7f031
  2016-07-15 10:12           ` Paolo Bonzini
  (?)
  (?)
@ 2016-07-15 10:13           ` Peter Lieven
  -1 siblings, 0 replies; 28+ messages in thread
From: Peter Lieven @ 2016-07-15 10:13 UTC (permalink / raw)
  To: Paolo Bonzini, Juergen Gross, xen-devel, qemu-devel; +Cc: Gerd Hoffmann

Am 15.07.2016 um 12:12 schrieb Paolo Bonzini:
>
> On 15/07/2016 12:07, Peter Lieven wrote:
>> Am 15.07.2016 um 12:02 schrieb Paolo Bonzini:
>>> On 15/07/2016 10:47, Juergen Gross wrote:
>>>> Nothing scaring and no real difference between working and not working
>>>> variant.
>>>>
>>>> Meanwhile I've been digging a little bit deeper and found the reason:
>>>> libxenstore is setting up a reader thread which is waiting for the
>>>> watch to fire. With above commit the stack size of that thread (16kB)
>>>> is too small. Setting it to 32kB made qemu work again.
>>> This makes very little sense (not your fault)...  The commit doesn't
>>> change stack usage at all, TLS should not be on the stack.
>> But we still allocate the VncPalette for every thread, right? Even
>> if it has nothing todo with VNC.
> Yes, I'm just trying to understand the root cause.  Which is that glibc
> actually does carve out TLS space from the requested stack size.  That
> means that a program that has a lot of TLS variables, or has big TLS
> variables, will fail in weird ways.
>
> So that's two reasons why your patch is okay. :)

Okay, then I will wait for the analysis and then resubmit that Patch
with a different commit message.

Peter

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Regression with commit 095497ffc66b7f031
  2016-07-15 10:12           ` Paolo Bonzini
  (?)
@ 2016-07-15 10:13           ` Peter Lieven
  -1 siblings, 0 replies; 28+ messages in thread
From: Peter Lieven @ 2016-07-15 10:13 UTC (permalink / raw)
  To: Paolo Bonzini, Juergen Gross, xen-devel, qemu-devel; +Cc: Gerd Hoffmann

Am 15.07.2016 um 12:12 schrieb Paolo Bonzini:
>
> On 15/07/2016 12:07, Peter Lieven wrote:
>> Am 15.07.2016 um 12:02 schrieb Paolo Bonzini:
>>> On 15/07/2016 10:47, Juergen Gross wrote:
>>>> Nothing scaring and no real difference between working and not working
>>>> variant.
>>>>
>>>> Meanwhile I've been digging a little bit deeper and found the reason:
>>>> libxenstore is setting up a reader thread which is waiting for the
>>>> watch to fire. With above commit the stack size of that thread (16kB)
>>>> is too small. Setting it to 32kB made qemu work again.
>>> This makes very little sense (not your fault)...  The commit doesn't
>>> change stack usage at all, TLS should not be on the stack.
>> But we still allocate the VncPalette for every thread, right? Even
>> if it has nothing todo with VNC.
> Yes, I'm just trying to understand the root cause.  Which is that glibc
> actually does carve out TLS space from the requested stack size.  That
> means that a program that has a lot of TLS variables, or has big TLS
> variables, will fail in weird ways.
>
> So that's two reasons why your patch is okay. :)

Okay, then I will wait for the analysis and then resubmit that Patch
with a different commit message.

Peter


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] Regression with commit 095497ffc66b7f031
  2016-07-15 10:12         ` Gerd Hoffmann
  (?)
  (?)
@ 2016-07-15 10:35         ` Paolo Bonzini
  2016-07-15 10:41           ` Juergen Gross
  2016-07-15 10:41           ` Juergen Gross
  -1 siblings, 2 replies; 28+ messages in thread
From: Paolo Bonzini @ 2016-07-15 10:35 UTC (permalink / raw)
  To: Gerd Hoffmann; +Cc: Juergen Gross, xen-devel, Peter Lieven, qemu-devel



On 15/07/2016 12:12, Gerd Hoffmann wrote:
> On Fr, 2016-07-15 at 12:02 +0200, Paolo Bonzini wrote:
>>
>> On 15/07/2016 10:47, Juergen Gross wrote:
>>> Nothing scaring and no real difference between working and not working
>>> variant.
>>>
>>> Meanwhile I've been digging a little bit deeper and found the reason:
>>> libxenstore is setting up a reader thread which is waiting for the
>>> watch to fire. With above commit the stack size of that thread (16kB)
>>> is too small. Setting it to 32kB made qemu work again.
>>
>> This makes very little sense (not your fault)...  The commit doesn't
>> change stack usage at all, TLS should not be on the stack.
>>
>> Can you capture a backtrace where the 16K stack is exceeded?  Perhaps
>> it's only due to inlining decision on the compiler, in which case
>> Peter's patch from today is only a bandaid.
> 
> Hmm, I guess I hold off the vnc pull request for now ...

Go ahead.  I looked at glibc source code and the patch is okay.

Paolo

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Regression with commit 095497ffc66b7f031
  2016-07-15 10:12         ` Gerd Hoffmann
  (?)
@ 2016-07-15 10:35         ` Paolo Bonzini
  -1 siblings, 0 replies; 28+ messages in thread
From: Paolo Bonzini @ 2016-07-15 10:35 UTC (permalink / raw)
  To: Gerd Hoffmann; +Cc: Juergen Gross, xen-devel, Peter Lieven, qemu-devel



On 15/07/2016 12:12, Gerd Hoffmann wrote:
> On Fr, 2016-07-15 at 12:02 +0200, Paolo Bonzini wrote:
>>
>> On 15/07/2016 10:47, Juergen Gross wrote:
>>> Nothing scaring and no real difference between working and not working
>>> variant.
>>>
>>> Meanwhile I've been digging a little bit deeper and found the reason:
>>> libxenstore is setting up a reader thread which is waiting for the
>>> watch to fire. With above commit the stack size of that thread (16kB)
>>> is too small. Setting it to 32kB made qemu work again.
>>
>> This makes very little sense (not your fault)...  The commit doesn't
>> change stack usage at all, TLS should not be on the stack.
>>
>> Can you capture a backtrace where the 16K stack is exceeded?  Perhaps
>> it's only due to inlining decision on the compiler, in which case
>> Peter's patch from today is only a bandaid.
> 
> Hmm, I guess I hold off the vnc pull request for now ...

Go ahead.  I looked at glibc source code and the patch is okay.

Paolo

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] Regression with commit 095497ffc66b7f031
  2016-07-15 10:35         ` [Qemu-devel] " Paolo Bonzini
@ 2016-07-15 10:41           ` Juergen Gross
  2016-07-15 12:42             ` Paolo Bonzini
  2016-07-15 12:42             ` Paolo Bonzini
  2016-07-15 10:41           ` Juergen Gross
  1 sibling, 2 replies; 28+ messages in thread
From: Juergen Gross @ 2016-07-15 10:41 UTC (permalink / raw)
  To: Paolo Bonzini, Gerd Hoffmann; +Cc: xen-devel, Peter Lieven, qemu-devel

On 15/07/16 12:35, Paolo Bonzini wrote:
> 
> 
> On 15/07/2016 12:12, Gerd Hoffmann wrote:
>> On Fr, 2016-07-15 at 12:02 +0200, Paolo Bonzini wrote:
>>>
>>> On 15/07/2016 10:47, Juergen Gross wrote:
>>>> Nothing scaring and no real difference between working and not working
>>>> variant.
>>>>
>>>> Meanwhile I've been digging a little bit deeper and found the reason:
>>>> libxenstore is setting up a reader thread which is waiting for the
>>>> watch to fire. With above commit the stack size of that thread (16kB)
>>>> is too small. Setting it to 32kB made qemu work again.
>>>
>>> This makes very little sense (not your fault)...  The commit doesn't
>>> change stack usage at all, TLS should not be on the stack.
>>>
>>> Can you capture a backtrace where the 16K stack is exceeded?  Perhaps
>>> it's only due to inlining decision on the compiler, in which case
>>> Peter's patch from today is only a bandaid.
>>
>> Hmm, I guess I hold off the vnc pull request for now ...
> 
> Go ahead.  I looked at glibc source code and the patch is okay.

Paolo, do you know of any interface to obtain the size of the TLS area
taken from the stack (before calling pthread_create() )? This would
enable me to modify libxenstore to set the stack size to a sensible
value without having to choose a magic number which might fit for qemu,
but not for other users of libxenstore in the future.


Juergen

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Regression with commit 095497ffc66b7f031
  2016-07-15 10:35         ` [Qemu-devel] " Paolo Bonzini
  2016-07-15 10:41           ` Juergen Gross
@ 2016-07-15 10:41           ` Juergen Gross
  1 sibling, 0 replies; 28+ messages in thread
From: Juergen Gross @ 2016-07-15 10:41 UTC (permalink / raw)
  To: Paolo Bonzini, Gerd Hoffmann; +Cc: xen-devel, Peter Lieven, qemu-devel

On 15/07/16 12:35, Paolo Bonzini wrote:
> 
> 
> On 15/07/2016 12:12, Gerd Hoffmann wrote:
>> On Fr, 2016-07-15 at 12:02 +0200, Paolo Bonzini wrote:
>>>
>>> On 15/07/2016 10:47, Juergen Gross wrote:
>>>> Nothing scaring and no real difference between working and not working
>>>> variant.
>>>>
>>>> Meanwhile I've been digging a little bit deeper and found the reason:
>>>> libxenstore is setting up a reader thread which is waiting for the
>>>> watch to fire. With above commit the stack size of that thread (16kB)
>>>> is too small. Setting it to 32kB made qemu work again.
>>>
>>> This makes very little sense (not your fault)...  The commit doesn't
>>> change stack usage at all, TLS should not be on the stack.
>>>
>>> Can you capture a backtrace where the 16K stack is exceeded?  Perhaps
>>> it's only due to inlining decision on the compiler, in which case
>>> Peter's patch from today is only a bandaid.
>>
>> Hmm, I guess I hold off the vnc pull request for now ...
> 
> Go ahead.  I looked at glibc source code and the patch is okay.

Paolo, do you know of any interface to obtain the size of the TLS area
taken from the stack (before calling pthread_create() )? This would
enable me to modify libxenstore to set the stack size to a sensible
value without having to choose a magic number which might fit for qemu,
but not for other users of libxenstore in the future.


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] Regression with commit 095497ffc66b7f031
  2016-07-15 10:41           ` Juergen Gross
@ 2016-07-15 12:42             ` Paolo Bonzini
  2016-07-15 13:21               ` Juergen Gross
  2016-07-15 13:21               ` [Qemu-devel] [Xen-devel] " Juergen Gross
  2016-07-15 12:42             ` Paolo Bonzini
  1 sibling, 2 replies; 28+ messages in thread
From: Paolo Bonzini @ 2016-07-15 12:42 UTC (permalink / raw)
  To: Juergen Gross, Gerd Hoffmann; +Cc: xen-devel, Peter Lieven, qemu-devel



On 15/07/2016 12:41, Juergen Gross wrote:
> On 15/07/16 12:35, Paolo Bonzini wrote:
>>
>>
>> On 15/07/2016 12:12, Gerd Hoffmann wrote:
>>> On Fr, 2016-07-15 at 12:02 +0200, Paolo Bonzini wrote:
>>>>
>>>> On 15/07/2016 10:47, Juergen Gross wrote:
>>>>> Nothing scaring and no real difference between working and not working
>>>>> variant.
>>>>>
>>>>> Meanwhile I've been digging a little bit deeper and found the reason:
>>>>> libxenstore is setting up a reader thread which is waiting for the
>>>>> watch to fire. With above commit the stack size of that thread (16kB)
>>>>> is too small. Setting it to 32kB made qemu work again.
>>>>
>>>> This makes very little sense (not your fault)...  The commit doesn't
>>>> change stack usage at all, TLS should not be on the stack.
>>>>
>>>> Can you capture a backtrace where the 16K stack is exceeded?  Perhaps
>>>> it's only due to inlining decision on the compiler, in which case
>>>> Peter's patch from today is only a bandaid.
>>>
>>> Hmm, I guess I hold off the vnc pull request for now ...
>>
>> Go ahead.  I looked at glibc source code and the patch is okay.
> 
> Paolo, do you know of any interface to obtain the size of the TLS area
> taken from the stack (before calling pthread_create() )? 

https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01643.html has a patch
that _removes_ code to do this from the golang runtime.  The comments
there say that only with glibc before version 2.16 the static TLS size
is taken out of the stack size...

What version of glibc are you using?

Paolo

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Regression with commit 095497ffc66b7f031
  2016-07-15 10:41           ` Juergen Gross
  2016-07-15 12:42             ` Paolo Bonzini
@ 2016-07-15 12:42             ` Paolo Bonzini
  1 sibling, 0 replies; 28+ messages in thread
From: Paolo Bonzini @ 2016-07-15 12:42 UTC (permalink / raw)
  To: Juergen Gross, Gerd Hoffmann; +Cc: xen-devel, Peter Lieven, qemu-devel



On 15/07/2016 12:41, Juergen Gross wrote:
> On 15/07/16 12:35, Paolo Bonzini wrote:
>>
>>
>> On 15/07/2016 12:12, Gerd Hoffmann wrote:
>>> On Fr, 2016-07-15 at 12:02 +0200, Paolo Bonzini wrote:
>>>>
>>>> On 15/07/2016 10:47, Juergen Gross wrote:
>>>>> Nothing scaring and no real difference between working and not working
>>>>> variant.
>>>>>
>>>>> Meanwhile I've been digging a little bit deeper and found the reason:
>>>>> libxenstore is setting up a reader thread which is waiting for the
>>>>> watch to fire. With above commit the stack size of that thread (16kB)
>>>>> is too small. Setting it to 32kB made qemu work again.
>>>>
>>>> This makes very little sense (not your fault)...  The commit doesn't
>>>> change stack usage at all, TLS should not be on the stack.
>>>>
>>>> Can you capture a backtrace where the 16K stack is exceeded?  Perhaps
>>>> it's only due to inlining decision on the compiler, in which case
>>>> Peter's patch from today is only a bandaid.
>>>
>>> Hmm, I guess I hold off the vnc pull request for now ...
>>
>> Go ahead.  I looked at glibc source code and the patch is okay.
> 
> Paolo, do you know of any interface to obtain the size of the TLS area
> taken from the stack (before calling pthread_create() )? 

https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01643.html has a patch
that _removes_ code to do this from the golang runtime.  The comments
there say that only with glibc before version 2.16 the static TLS size
is taken out of the stack size...

What version of glibc are you using?

Paolo

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] [Xen-devel] Regression with commit 095497ffc66b7f031
  2016-07-15 12:42             ` Paolo Bonzini
  2016-07-15 13:21               ` Juergen Gross
@ 2016-07-15 13:21               ` Juergen Gross
  1 sibling, 0 replies; 28+ messages in thread
From: Juergen Gross @ 2016-07-15 13:21 UTC (permalink / raw)
  To: Paolo Bonzini, Gerd Hoffmann; +Cc: xen-devel, Peter Lieven, qemu-devel

On 15/07/16 14:42, Paolo Bonzini wrote:
> 
> 
> On 15/07/2016 12:41, Juergen Gross wrote:
>> On 15/07/16 12:35, Paolo Bonzini wrote:
>>>
>>>
>>> On 15/07/2016 12:12, Gerd Hoffmann wrote:
>>>> On Fr, 2016-07-15 at 12:02 +0200, Paolo Bonzini wrote:
>>>>>
>>>>> On 15/07/2016 10:47, Juergen Gross wrote:
>>>>>> Nothing scaring and no real difference between working and not working
>>>>>> variant.
>>>>>>
>>>>>> Meanwhile I've been digging a little bit deeper and found the reason:
>>>>>> libxenstore is setting up a reader thread which is waiting for the
>>>>>> watch to fire. With above commit the stack size of that thread (16kB)
>>>>>> is too small. Setting it to 32kB made qemu work again.
>>>>>
>>>>> This makes very little sense (not your fault)...  The commit doesn't
>>>>> change stack usage at all, TLS should not be on the stack.
>>>>>
>>>>> Can you capture a backtrace where the 16K stack is exceeded?  Perhaps
>>>>> it's only due to inlining decision on the compiler, in which case
>>>>> Peter's patch from today is only a bandaid.
>>>>
>>>> Hmm, I guess I hold off the vnc pull request for now ...
>>>
>>> Go ahead.  I looked at glibc source code and the patch is okay.
>>
>> Paolo, do you know of any interface to obtain the size of the TLS area
>> taken from the stack (before calling pthread_create() )? 
> 
> https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01643.html has a patch
> that _removes_ code to do this from the golang runtime.  The comments
> there say that only with glibc before version 2.16 the static TLS size
> is taken out of the stack size...
> 
> What version of glibc are you using?

2.19. But according to:

https://sourceware.org/bugzilla/show_bug.cgi?id=11787

the issue is still present today.


Juergen

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Regression with commit 095497ffc66b7f031
  2016-07-15 12:42             ` Paolo Bonzini
@ 2016-07-15 13:21               ` Juergen Gross
  2016-07-15 13:21               ` [Qemu-devel] [Xen-devel] " Juergen Gross
  1 sibling, 0 replies; 28+ messages in thread
From: Juergen Gross @ 2016-07-15 13:21 UTC (permalink / raw)
  To: Paolo Bonzini, Gerd Hoffmann; +Cc: xen-devel, Peter Lieven, qemu-devel

On 15/07/16 14:42, Paolo Bonzini wrote:
> 
> 
> On 15/07/2016 12:41, Juergen Gross wrote:
>> On 15/07/16 12:35, Paolo Bonzini wrote:
>>>
>>>
>>> On 15/07/2016 12:12, Gerd Hoffmann wrote:
>>>> On Fr, 2016-07-15 at 12:02 +0200, Paolo Bonzini wrote:
>>>>>
>>>>> On 15/07/2016 10:47, Juergen Gross wrote:
>>>>>> Nothing scaring and no real difference between working and not working
>>>>>> variant.
>>>>>>
>>>>>> Meanwhile I've been digging a little bit deeper and found the reason:
>>>>>> libxenstore is setting up a reader thread which is waiting for the
>>>>>> watch to fire. With above commit the stack size of that thread (16kB)
>>>>>> is too small. Setting it to 32kB made qemu work again.
>>>>>
>>>>> This makes very little sense (not your fault)...  The commit doesn't
>>>>> change stack usage at all, TLS should not be on the stack.
>>>>>
>>>>> Can you capture a backtrace where the 16K stack is exceeded?  Perhaps
>>>>> it's only due to inlining decision on the compiler, in which case
>>>>> Peter's patch from today is only a bandaid.
>>>>
>>>> Hmm, I guess I hold off the vnc pull request for now ...
>>>
>>> Go ahead.  I looked at glibc source code and the patch is okay.
>>
>> Paolo, do you know of any interface to obtain the size of the TLS area
>> taken from the stack (before calling pthread_create() )? 
> 
> https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01643.html has a patch
> that _removes_ code to do this from the golang runtime.  The comments
> there say that only with glibc before version 2.16 the static TLS size
> is taken out of the stack size...
> 
> What version of glibc are you using?

2.19. But according to:

https://sourceware.org/bugzilla/show_bug.cgi?id=11787

the issue is still present today.


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Regression with commit 095497ffc66b7f031
@ 2016-07-15  6:32 Juergen Gross
  0 siblings, 0 replies; 28+ messages in thread
From: Juergen Gross @ 2016-07-15  6:32 UTC (permalink / raw)
  To: xen-devel, qemu-devel, Peter Lieven; +Cc: Paolo Bonzini, Gerd Hoffmann

Commit 095497ffc66b7f031ff2a17f1e50f5cb105ce588 ("vnc-enc-tight: use
thread local storage for palette") introduced a regression with Xen:
Since this commit qemu used as a device model is no longer capable
to register Xenstore watches (that's the effect visible to a user).
Reverting this commit makes qemu behave well again. I have no idea
why that commit would have this effect with Xen, may be some memory
is clobbered?


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2016-07-15 13:21 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-15  6:32 [Qemu-devel] Regression with commit 095497ffc66b7f031 Juergen Gross
2016-07-15  7:39 ` Peter Lieven
2016-07-15  7:39 ` [Qemu-devel] " Peter Lieven
2016-07-15  8:47   ` Juergen Gross
2016-07-15  9:03     ` Peter Lieven
2016-07-15  9:23       ` Juergen Gross
2016-07-15  9:23       ` Juergen Gross
2016-07-15  9:03     ` Peter Lieven
2016-07-15 10:02     ` [Qemu-devel] " Paolo Bonzini
2016-07-15 10:07       ` Peter Lieven
2016-07-15 10:07       ` [Qemu-devel] " Peter Lieven
2016-07-15 10:12         ` Paolo Bonzini
2016-07-15 10:12           ` Paolo Bonzini
2016-07-15 10:13           ` Peter Lieven
2016-07-15 10:13           ` [Qemu-devel] " Peter Lieven
2016-07-15 10:12       ` Gerd Hoffmann
2016-07-15 10:12         ` Gerd Hoffmann
2016-07-15 10:35         ` Paolo Bonzini
2016-07-15 10:35         ` [Qemu-devel] " Paolo Bonzini
2016-07-15 10:41           ` Juergen Gross
2016-07-15 12:42             ` Paolo Bonzini
2016-07-15 13:21               ` Juergen Gross
2016-07-15 13:21               ` [Qemu-devel] [Xen-devel] " Juergen Gross
2016-07-15 12:42             ` Paolo Bonzini
2016-07-15 10:41           ` Juergen Gross
2016-07-15 10:02     ` Paolo Bonzini
2016-07-15  8:47   ` Juergen Gross
2016-07-15  6:32 Juergen Gross

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.