All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] coroutine: use QEMU_DEFINE_STATIC_CO_TLS()
@ 2022-03-07 15:38 Stefan Hajnoczi
  2022-03-07 15:38 ` [PATCH 1/3] coroutine-ucontext: " Stefan Hajnoczi
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Stefan Hajnoczi @ 2022-03-07 15:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Stefan Weil, Stefan Hajnoczi

The coroutine implementation uses __thread variables internally. Compiler
optimizations may cache Thread-Local Storage values across
qemu_coroutine_yield(), leading to stale values being used after the coroutine
is re-entered from another thread.

Kevin pointed out that the coroutine implementation itself is vulnerable to
this problem. As a follow-up to my coroutine TLS patch series I'm sending these
patches to convert __thread variables to the new "qemu/coroutine-tls.h" macros
so they are safe.

Stefan Hajnoczi (3):
  coroutine-ucontext: use QEMU_DEFINE_STATIC_CO_TLS()
  coroutine: use QEMU_DEFINE_STATIC_CO_TLS()
  coroutine-win32: use QEMU_DEFINE_STATIC_CO_TLS()

 util/coroutine-ucontext.c | 38 +++++++++++++++++++++++-------------
 util/coroutine-win32.c    | 18 ++++++++++++-----
 util/qemu-coroutine.c     | 41 +++++++++++++++++++++++----------------
 3 files changed, 61 insertions(+), 36 deletions(-)

-- 
2.35.1




^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/3] coroutine-ucontext: use QEMU_DEFINE_STATIC_CO_TLS()
  2022-03-07 15:38 [PATCH 0/3] coroutine: use QEMU_DEFINE_STATIC_CO_TLS() Stefan Hajnoczi
@ 2022-03-07 15:38 ` Stefan Hajnoczi
  2022-03-07 15:38 ` [PATCH 2/3] coroutine: " Stefan Hajnoczi
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Stefan Hajnoczi @ 2022-03-07 15:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Stefan Weil, Stefan Hajnoczi

Thread-Local Storage variables cannot be used directly from coroutine
code because the compiler may optimize TLS variable accesses across
qemu_coroutine_yield() calls. When the coroutine is re-entered from
another thread the TLS variables from the old thread must no longer be
used.

Use QEMU_DEFINE_STATIC_CO_TLS() for the current and leader variables.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 util/coroutine-ucontext.c | 38 ++++++++++++++++++++++++--------------
 1 file changed, 24 insertions(+), 14 deletions(-)

diff --git a/util/coroutine-ucontext.c b/util/coroutine-ucontext.c
index 904b375192..127d5a13c8 100644
--- a/util/coroutine-ucontext.c
+++ b/util/coroutine-ucontext.c
@@ -25,6 +25,7 @@
 #include "qemu/osdep.h"
 #include <ucontext.h>
 #include "qemu/coroutine_int.h"
+#include "qemu/coroutine-tls.h"
 
 #ifdef CONFIG_VALGRIND_H
 #include <valgrind/valgrind.h>
@@ -66,8 +67,8 @@ typedef struct {
 /**
  * Per-thread coroutine bookkeeping
  */
-static __thread CoroutineUContext leader;
-static __thread Coroutine *current;
+QEMU_DEFINE_STATIC_CO_TLS(Coroutine *, current);
+QEMU_DEFINE_STATIC_CO_TLS(CoroutineUContext, leader);
 
 /*
  * va_args to makecontext() must be type 'int', so passing
@@ -97,14 +98,15 @@ static inline __attribute__((always_inline))
 void finish_switch_fiber(void *fake_stack_save)
 {
 #ifdef CONFIG_ASAN
+    CoroutineUContext *leaderp = get_ptr_leader();
     const void *bottom_old;
     size_t size_old;
 
     __sanitizer_finish_switch_fiber(fake_stack_save, &bottom_old, &size_old);
 
-    if (!leader.stack) {
-        leader.stack = (void *)bottom_old;
-        leader.stack_size = size_old;
+    if (!leaderp->stack) {
+        leaderp->stack = (void *)bottom_old;
+        leaderp->stack_size = size_old;
     }
 #endif
 #ifdef CONFIG_TSAN
@@ -161,8 +163,10 @@ static void coroutine_trampoline(int i0, int i1)
 
     /* Initialize longjmp environment and switch back the caller */
     if (!sigsetjmp(self->env, 0)) {
-        start_switch_fiber_asan(COROUTINE_YIELD, &fake_stack_save, leader.stack,
-                                leader.stack_size);
+        CoroutineUContext *leaderp = get_ptr_leader();
+
+        start_switch_fiber_asan(COROUTINE_YIELD, &fake_stack_save,
+                                leaderp->stack, leaderp->stack_size);
         start_switch_fiber_tsan(&fake_stack_save, self, true); /* true=caller */
         siglongjmp(*(sigjmp_buf *)co->entry_arg, 1);
     }
@@ -297,7 +301,7 @@ qemu_coroutine_switch(Coroutine *from_, Coroutine *to_,
     int ret;
     void *fake_stack_save = NULL;
 
-    current = to_;
+    set_current(to_);
 
     ret = sigsetjmp(from->env, 0);
     if (ret == 0) {
@@ -315,18 +319,24 @@ qemu_coroutine_switch(Coroutine *from_, Coroutine *to_,
 
 Coroutine *qemu_coroutine_self(void)
 {
-    if (!current) {
-        current = &leader.base;
+    Coroutine *self = get_current();
+    CoroutineUContext *leaderp = get_ptr_leader();
+
+    if (!self) {
+        self = &leaderp->base;
+        set_current(self);
     }
 #ifdef CONFIG_TSAN
-    if (!leader.tsan_co_fiber) {
-        leader.tsan_co_fiber = __tsan_get_current_fiber();
+    if (!leaderp->tsan_co_fiber) {
+        leaderp->tsan_co_fiber = __tsan_get_current_fiber();
     }
 #endif
-    return current;
+    return self;
 }
 
 bool qemu_in_coroutine(void)
 {
-    return current && current->caller;
+    Coroutine *self = get_current();
+
+    return self && self->caller;
 }
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/3] coroutine: use QEMU_DEFINE_STATIC_CO_TLS()
  2022-03-07 15:38 [PATCH 0/3] coroutine: use QEMU_DEFINE_STATIC_CO_TLS() Stefan Hajnoczi
  2022-03-07 15:38 ` [PATCH 1/3] coroutine-ucontext: " Stefan Hajnoczi
@ 2022-03-07 15:38 ` Stefan Hajnoczi
  2022-03-07 15:38 ` [PATCH 3/3] coroutine-win32: " Stefan Hajnoczi
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Stefan Hajnoczi @ 2022-03-07 15:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Stefan Weil, Stefan Hajnoczi

Thread-Local Storage variables cannot be used directly from coroutine
code because the compiler may optimize TLS variable accesses across
qemu_coroutine_yield() calls. When the coroutine is re-entered from
another thread the TLS variables from the old thread must no longer be
used.

Use QEMU_DEFINE_STATIC_CO_TLS() for the current and leader variables.
The alloc_pool QSLIST needs a typedef so the return value of
get_ptr_alloc_pool() can be stored in a local variable.

One example of why this code is necessary: a coroutine that yields
before calling qemu_coroutine_create() to create another coroutine is
affected by the TLS issue.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 util/qemu-coroutine.c | 41 ++++++++++++++++++++++++-----------------
 1 file changed, 24 insertions(+), 17 deletions(-)

diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
index c03b2422ff..f3e8300c8d 100644
--- a/util/qemu-coroutine.c
+++ b/util/qemu-coroutine.c
@@ -18,6 +18,7 @@
 #include "qemu/atomic.h"
 #include "qemu/coroutine.h"
 #include "qemu/coroutine_int.h"
+#include "qemu/coroutine-tls.h"
 #include "block/aio.h"
 
 /** Initial batch size is 64, and is increased on demand */
@@ -29,17 +30,20 @@ enum {
 static QSLIST_HEAD(, Coroutine) release_pool = QSLIST_HEAD_INITIALIZER(pool);
 static unsigned int pool_batch_size = POOL_INITIAL_BATCH_SIZE;
 static unsigned int release_pool_size;
-static __thread QSLIST_HEAD(, Coroutine) alloc_pool = QSLIST_HEAD_INITIALIZER(pool);
-static __thread unsigned int alloc_pool_size;
-static __thread Notifier coroutine_pool_cleanup_notifier;
+
+typedef QSLIST_HEAD(, Coroutine) CoroutineQSList;
+QEMU_DEFINE_STATIC_CO_TLS(CoroutineQSList, alloc_pool);
+QEMU_DEFINE_STATIC_CO_TLS(unsigned int, alloc_pool_size);
+QEMU_DEFINE_STATIC_CO_TLS(Notifier, coroutine_pool_cleanup_notifier);
 
 static void coroutine_pool_cleanup(Notifier *n, void *value)
 {
     Coroutine *co;
     Coroutine *tmp;
+    CoroutineQSList *alloc_pool = get_ptr_alloc_pool();
 
-    QSLIST_FOREACH_SAFE(co, &alloc_pool, pool_next, tmp) {
-        QSLIST_REMOVE_HEAD(&alloc_pool, pool_next);
+    QSLIST_FOREACH_SAFE(co, alloc_pool, pool_next, tmp) {
+        QSLIST_REMOVE_HEAD(alloc_pool, pool_next);
         qemu_coroutine_delete(co);
     }
 }
@@ -49,27 +53,30 @@ Coroutine *qemu_coroutine_create(CoroutineEntry *entry, void *opaque)
     Coroutine *co = NULL;
 
     if (CONFIG_COROUTINE_POOL) {
-        co = QSLIST_FIRST(&alloc_pool);
+        CoroutineQSList *alloc_pool = get_ptr_alloc_pool();
+
+        co = QSLIST_FIRST(alloc_pool);
         if (!co) {
             if (release_pool_size > qatomic_read(&pool_batch_size)) {
                 /* Slow path; a good place to register the destructor, too.  */
-                if (!coroutine_pool_cleanup_notifier.notify) {
-                    coroutine_pool_cleanup_notifier.notify = coroutine_pool_cleanup;
-                    qemu_thread_atexit_add(&coroutine_pool_cleanup_notifier);
+                Notifier *notifier = get_ptr_coroutine_pool_cleanup_notifier();
+                if (!notifier->notify) {
+                    notifier->notify = coroutine_pool_cleanup;
+                    qemu_thread_atexit_add(notifier);
                 }
 
                 /* This is not exact; there could be a little skew between
                  * release_pool_size and the actual size of release_pool.  But
                  * it is just a heuristic, it does not need to be perfect.
                  */
-                alloc_pool_size = qatomic_xchg(&release_pool_size, 0);
-                QSLIST_MOVE_ATOMIC(&alloc_pool, &release_pool);
-                co = QSLIST_FIRST(&alloc_pool);
+                set_alloc_pool_size(qatomic_xchg(&release_pool_size, 0));
+                QSLIST_MOVE_ATOMIC(alloc_pool, &release_pool);
+                co = QSLIST_FIRST(alloc_pool);
             }
         }
         if (co) {
-            QSLIST_REMOVE_HEAD(&alloc_pool, pool_next);
-            alloc_pool_size--;
+            QSLIST_REMOVE_HEAD(alloc_pool, pool_next);
+            set_alloc_pool_size(get_alloc_pool_size() - 1);
         }
     }
 
@@ -93,9 +100,9 @@ static void coroutine_delete(Coroutine *co)
             qatomic_inc(&release_pool_size);
             return;
         }
-        if (alloc_pool_size < qatomic_read(&pool_batch_size)) {
-            QSLIST_INSERT_HEAD(&alloc_pool, co, pool_next);
-            alloc_pool_size++;
+        if (get_alloc_pool_size() < qatomic_read(&pool_batch_size)) {
+            QSLIST_INSERT_HEAD(get_ptr_alloc_pool(), co, pool_next);
+            set_alloc_pool_size(get_alloc_pool_size() + 1);
             return;
         }
     }
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 3/3] coroutine-win32: use QEMU_DEFINE_STATIC_CO_TLS()
  2022-03-07 15:38 [PATCH 0/3] coroutine: use QEMU_DEFINE_STATIC_CO_TLS() Stefan Hajnoczi
  2022-03-07 15:38 ` [PATCH 1/3] coroutine-ucontext: " Stefan Hajnoczi
  2022-03-07 15:38 ` [PATCH 2/3] coroutine: " Stefan Hajnoczi
@ 2022-03-07 15:38 ` Stefan Hajnoczi
  2022-03-07 16:03 ` [PATCH 0/3] coroutine: " Philippe Mathieu-Daudé
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Stefan Hajnoczi @ 2022-03-07 15:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Stefan Weil, Stefan Hajnoczi

Thread-Local Storage variables cannot be used directly from coroutine
code because the compiler may optimize TLS variable accesses across
qemu_coroutine_yield() calls. When the coroutine is re-entered from
another thread the TLS variables from the old thread must no longer be
used.

Use QEMU_DEFINE_STATIC_CO_TLS() for the current and leader variables.

I think coroutine-win32.c could get away with __thread because the
variables are only used in situations where either the stale value is
correct (current) or outside coroutine context (loading leader when
current is NULL). Due to the difficulty of being sure that this is
really safe in all scenarios it seems worth converting it anyway.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 util/coroutine-win32.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/util/coroutine-win32.c b/util/coroutine-win32.c
index de6bd4fd3e..c02a62c896 100644
--- a/util/coroutine-win32.c
+++ b/util/coroutine-win32.c
@@ -25,6 +25,7 @@
 #include "qemu/osdep.h"
 #include "qemu-common.h"
 #include "qemu/coroutine_int.h"
+#include "qemu/coroutine-tls.h"
 
 typedef struct
 {
@@ -34,8 +35,8 @@ typedef struct
     CoroutineAction action;
 } CoroutineWin32;
 
-static __thread CoroutineWin32 leader;
-static __thread Coroutine *current;
+QEMU_DEFINE_STATIC_CO_TLS(CoroutineWin32, leader);
+QEMU_DEFINE_STATIC_CO_TLS(Coroutine *, current);
 
 /* This function is marked noinline to prevent GCC from inlining it
  * into coroutine_trampoline(). If we allow it to do that then it
@@ -52,7 +53,7 @@ qemu_coroutine_switch(Coroutine *from_, Coroutine *to_,
     CoroutineWin32 *from = DO_UPCAST(CoroutineWin32, base, from_);
     CoroutineWin32 *to = DO_UPCAST(CoroutineWin32, base, to_);
 
-    current = to_;
+    set_current(to_);
 
     to->action = action;
     SwitchToFiber(to->fiber);
@@ -89,14 +90,21 @@ void qemu_coroutine_delete(Coroutine *co_)
 
 Coroutine *qemu_coroutine_self(void)
 {
+    Coroutine *current = get_current();
+
     if (!current) {
-        current = &leader.base;
-        leader.fiber = ConvertThreadToFiber(NULL);
+        CoroutineWin32 *leader = get_ptr_leader();
+
+        current = &leader->base;
+        set_current(current);
+        leader->fiber = ConvertThreadToFiber(NULL);
     }
     return current;
 }
 
 bool qemu_in_coroutine(void)
 {
+    Coroutine *current = get_current();
+
     return current && current->caller;
 }
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/3] coroutine: use QEMU_DEFINE_STATIC_CO_TLS()
  2022-03-07 15:38 [PATCH 0/3] coroutine: use QEMU_DEFINE_STATIC_CO_TLS() Stefan Hajnoczi
                   ` (2 preceding siblings ...)
  2022-03-07 15:38 ` [PATCH 3/3] coroutine-win32: " Stefan Hajnoczi
@ 2022-03-07 16:03 ` Philippe Mathieu-Daudé
  2022-03-16 13:28 ` Stefan Hajnoczi
  2022-05-03 10:48 ` Kevin Wolf
  5 siblings, 0 replies; 7+ messages in thread
From: Philippe Mathieu-Daudé @ 2022-03-07 16:03 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel; +Cc: kwolf, Stefan Weil

On 7/3/22 16:38, Stefan Hajnoczi wrote:
> The coroutine implementation uses __thread variables internally. Compiler
> optimizations may cache Thread-Local Storage values across
> qemu_coroutine_yield(), leading to stale values being used after the coroutine
> is re-entered from another thread.
> 
> Kevin pointed out that the coroutine implementation itself is vulnerable to
> this problem. As a follow-up to my coroutine TLS patch series I'm sending these
> patches to convert __thread variables to the new "qemu/coroutine-tls.h" macros
> so they are safe.
> 
> Stefan Hajnoczi (3):
>    coroutine-ucontext: use QEMU_DEFINE_STATIC_CO_TLS()
>    coroutine: use QEMU_DEFINE_STATIC_CO_TLS()
>    coroutine-win32: use QEMU_DEFINE_STATIC_CO_TLS()

Series:
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/3] coroutine: use QEMU_DEFINE_STATIC_CO_TLS()
  2022-03-07 15:38 [PATCH 0/3] coroutine: use QEMU_DEFINE_STATIC_CO_TLS() Stefan Hajnoczi
                   ` (3 preceding siblings ...)
  2022-03-07 16:03 ` [PATCH 0/3] coroutine: " Philippe Mathieu-Daudé
@ 2022-03-16 13:28 ` Stefan Hajnoczi
  2022-05-03 10:48 ` Kevin Wolf
  5 siblings, 0 replies; 7+ messages in thread
From: Stefan Hajnoczi @ 2022-03-16 13:28 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Stefan Weil

[-- Attachment #1: Type: text/plain, Size: 1080 bytes --]

On Mon, Mar 07, 2022 at 03:38:50PM +0000, Stefan Hajnoczi wrote:
> The coroutine implementation uses __thread variables internally. Compiler
> optimizations may cache Thread-Local Storage values across
> qemu_coroutine_yield(), leading to stale values being used after the coroutine
> is re-entered from another thread.
> 
> Kevin pointed out that the coroutine implementation itself is vulnerable to
> this problem. As a follow-up to my coroutine TLS patch series I'm sending these
> patches to convert __thread variables to the new "qemu/coroutine-tls.h" macros
> so they are safe.
> 
> Stefan Hajnoczi (3):
>   coroutine-ucontext: use QEMU_DEFINE_STATIC_CO_TLS()
>   coroutine: use QEMU_DEFINE_STATIC_CO_TLS()
>   coroutine-win32: use QEMU_DEFINE_STATIC_CO_TLS()
> 
>  util/coroutine-ucontext.c | 38 +++++++++++++++++++++++-------------
>  util/coroutine-win32.c    | 18 ++++++++++++-----
>  util/qemu-coroutine.c     | 41 +++++++++++++++++++++++----------------
>  3 files changed, 61 insertions(+), 36 deletions(-)

Kevin: Is this what you had in mind?

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/3] coroutine: use QEMU_DEFINE_STATIC_CO_TLS()
  2022-03-07 15:38 [PATCH 0/3] coroutine: use QEMU_DEFINE_STATIC_CO_TLS() Stefan Hajnoczi
                   ` (4 preceding siblings ...)
  2022-03-16 13:28 ` Stefan Hajnoczi
@ 2022-05-03 10:48 ` Kevin Wolf
  5 siblings, 0 replies; 7+ messages in thread
From: Kevin Wolf @ 2022-05-03 10:48 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, Stefan Weil

Am 07.03.2022 um 16:38 hat Stefan Hajnoczi geschrieben:
> The coroutine implementation uses __thread variables internally. Compiler
> optimizations may cache Thread-Local Storage values across
> qemu_coroutine_yield(), leading to stale values being used after the coroutine
> is re-entered from another thread.
> 
> Kevin pointed out that the coroutine implementation itself is vulnerable to
> this problem. As a follow-up to my coroutine TLS patch series I'm sending these
> patches to convert __thread variables to the new "qemu/coroutine-tls.h" macros
> so they are safe.

Thanks, applied to the block branch.

Kevin



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-05-03 11:12 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-07 15:38 [PATCH 0/3] coroutine: use QEMU_DEFINE_STATIC_CO_TLS() Stefan Hajnoczi
2022-03-07 15:38 ` [PATCH 1/3] coroutine-ucontext: " Stefan Hajnoczi
2022-03-07 15:38 ` [PATCH 2/3] coroutine: " Stefan Hajnoczi
2022-03-07 15:38 ` [PATCH 3/3] coroutine-win32: " Stefan Hajnoczi
2022-03-07 16:03 ` [PATCH 0/3] coroutine: " Philippe Mathieu-Daudé
2022-03-16 13:28 ` Stefan Hajnoczi
2022-05-03 10:48 ` Kevin Wolf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.