All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC v3 0/4] tls: add macros for coroutine-safe TLS variables
@ 2021-12-06 14:26 Stefan Hajnoczi
  2021-12-06 14:26 ` [RFC v3 1/4] " Stefan Hajnoczi
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Stefan Hajnoczi @ 2021-12-06 14:26 UTC (permalink / raw)
  To: qemu-devel
  Cc: Fam Zheng, fweimer, thuth, Daniel Berrange, qemu-block,
	Richard Henderson, Stefan Hajnoczi, Paolo Bonzini, Kevin Wolf,
	Warner Losh, sguelton

v3:
- Added __attribute__((weak)) to get_ptr_*() [Florian]
- Replace rdfsbase with mov %%fs:0,%0 [Florian]

This patch series solves the coroutines TLS problem. Coroutines re-entered from
another thread sometimes see stale TLS values. This happens because compilers
may cache values across yield points, so a value from the previous thread will
be used when the coroutine is re-entered in another thread.

Serge Guelton developed a portable technique and Richard Henderson developed an
inline-friendly architecture-specific technique, see the first patch for
details.

I have audited all __thread variables in QEMU and converted those that can be
used from coroutines. Most actually look safe to me. This patch does not
include a checkpatch.pl rule that requires coroutine-tls.h usage, but perhaps
we should add one for block/ at least?

Stefan Hajnoczi (4):
  tls: add macros for coroutine-safe TLS variables
  util/async: replace __thread with QEMU TLS macros
  rcu: use coroutine TLS macros
  cpus: use coroutine TLS macros for iothread_locked

 include/qemu/coroutine-tls.h | 205 +++++++++++++++++++++++++++++++++++
 include/qemu/rcu.h           |   7 +-
 softmmu/cpus.c               |   8 +-
 tests/unit/rcutorture.c      |  10 +-
 tests/unit/test-rcu-list.c   |   4 +-
 util/async.c                 |  12 +-
 util/rcu.c                   |  10 +-
 7 files changed, 232 insertions(+), 24 deletions(-)
 create mode 100644 include/qemu/coroutine-tls.h

-- 
2.33.1




^ permalink raw reply	[flat|nested] 11+ messages in thread

* [RFC v3 1/4] tls: add macros for coroutine-safe TLS variables
  2021-12-06 14:26 [RFC v3 0/4] tls: add macros for coroutine-safe TLS variables Stefan Hajnoczi
@ 2021-12-06 14:26 ` Stefan Hajnoczi
  2021-12-06 14:26 ` [RFC v3 2/4] util/async: replace __thread with QEMU TLS macros Stefan Hajnoczi
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Stefan Hajnoczi @ 2021-12-06 14:26 UTC (permalink / raw)
  To: qemu-devel
  Cc: Fam Zheng, fweimer, thuth, Daniel Berrange, qemu-block,
	Richard Henderson, Stefan Hajnoczi, Paolo Bonzini, Kevin Wolf,
	Warner Losh, sguelton

Compiler optimizations can cache TLS values across coroutine yield
points, resulting in stale values from the previous thread when a
coroutine is re-entered by a new thread.

Serge Guelton developed an __attribute__((noinline)) wrapper and tested
it with clang and gcc. I formatted his idea according to QEMU's coding
style and wrote documentation.

Richard Henderson developed an alternative approach that can be inlined
by the compiler. This is included for architectures where we have inline
assembly that determines the address of a TLS variable.

These macros must be used instead of __thread from now on to prevent
coroutine TLS bugs. Here is an x86_64 TLS variable access before this patch:

  mov    %fs:-0x19c,%edx

And here is the same access using Richard's approach:

  rdfsbase %rax             # %fs contains the base address
  lea    -0x1a8(%rax),%rax  # -0x1a8 is the offset of our variable
  mov    0xc(%rax),%edx     # here we access the TLS variable via %rax

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1952483
Suggested-by: Serge Guelton <sguelton@redhat.com>
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
Richard's suggested code used a MOV instruction on x86_64 but we need
LEA semantics. LEA doesn't support %fs so I switched to RDFSBASE+LEA.
Otherwise Richard's approach is unchanged.
---
 include/qemu/coroutine-tls.h | 205 +++++++++++++++++++++++++++++++++++
 1 file changed, 205 insertions(+)
 create mode 100644 include/qemu/coroutine-tls.h

diff --git a/include/qemu/coroutine-tls.h b/include/qemu/coroutine-tls.h
new file mode 100644
index 0000000000..b87c057243
--- /dev/null
+++ b/include/qemu/coroutine-tls.h
@@ -0,0 +1,205 @@
+/*
+ * QEMU Thread Local Storage for coroutines
+ *
+ * Copyright Red Hat
+ *
+ * SPDX-License-Identifier: LGPL-2.1-or-later
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.1 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ * It is forbidden to access Thread Local Storage in coroutines because
+ * compiler optimizations may cause values to be cached across coroutine
+ * re-entry. Coroutines can run in more than one thread through the course of
+ * their life, leading bugs when stale TLS values from the wrong thread are
+ * used as a result of compiler optimization.
+ *
+ * An example is:
+ *
+ * ..code-block:: c
+ *   :caption: A coroutine that may see the wrong TLS value
+ *
+ *   static __thread AioContext *current_aio_context;
+ *   ...
+ *   static void coroutine_fn foo(void)
+ *   {
+ *       aio_notify(current_aio_context);
+ *       qemu_coroutine_yield();
+ *       aio_notify(current_aio_context); // <-- may be stale after yielding!
+ *   }
+ *
+ * This header provides macros for safely defining variables in Thread Local
+ * Storage:
+ *
+ * ..code-block:: c
+ *   :caption: A coroutine that safely uses TLS
+ *
+ *   QEMU_DEFINE_STATIC_CO_TLS(AioContext *, current_aio_context)
+ *   ...
+ *   static void coroutine_fn foo(void)
+ *   {
+ *       aio_notify(get_current_aio_context());
+ *       qemu_coroutine_yield();
+ *       aio_notify(get_current_aio_context()); // <-- safe
+ *   }
+ */
+
+#ifndef QEMU_COROUTINE_TLS_H
+#define QEMU_COROUTINE_TLS_H
+
+/*
+ * Two techniques are available to stop the compiler from caching TLS values:
+ * 1. Accessor functions with __attribute__((noinline)). This is
+ *    architecture-independent but prevents inlining optimizations.
+ * 2. TLS address-of implemented as asm volatile so it can be inlined safely.
+ *    This enables inlining optimizations but requires architecture-specific
+ *    inline assembly.
+ */
+#if defined(__aarch64__)
+#define QEMU_CO_TLS_ADDR(ret, var)                                           \
+    asm volatile("mrs %0, tpidr_el0\n\t"                                     \
+                 "add %0, %0, #:tprel_hi12:"#var", lsl #12\n\t"              \
+                 "add %0, %0, #:tprel_lo12_nc:"#var                          \
+                 : "=r"(ret))
+#elif defined(__powerpc64__)
+#define QEMU_CO_TLS_ADDR(ret, var)                                           \
+    asm volatile("addis %0,13,"#var"@tprel@ha\n\t"                           \
+                 "add   %0,%0,"#var"@tprel@l"                                \
+                 : "=r"(ret))
+#elif defined(__riscv)
+#define QEMU_CO_TLS_ADDR(ret, var)                                           \
+    asm volatile("lui  %0,%%tprel_hi("#var")\n\t"                            \
+                 "add  %0,%0,%%tprel_add("#var")\n\t"                        \
+                 "addi %0,%0,%%tprel_lo("#var")"                             \
+                 : "=r"(ret))
+#elif defined(__x86_64__)
+#define QEMU_CO_TLS_ADDR(ret, var)                                           \
+    asm volatile("movq %%fs:0, %0\n\t"                                       \
+                 "lea "#var"@tpoff(%0), %0" : "=r"(ret))
+#endif
+
+/**
+ * QEMU_DECLARE_CO_TLS:
+ * @type: the variable's C type
+ * @var: the variable name
+ *
+ * Declare an extern variable in Thread Local Storage from a header file:
+ *
+ * .. code-block:: c
+ *   :caption: Declaring an extern variable in Thread Local Storage
+ *
+ *   QEMU_DECLARE_CO_TLS(int, my_count)
+ *   ...
+ *   int c = get_my_count();
+ *   set_my_count(c + 1);
+ *   *get_ptr_my_count() = 0;
+ *
+ * Use this instead of:
+ *
+ * .. code-block:: c
+ *   :caption: Declaring a TLS variable using __thread
+ *
+ *   extern __thread int my_count;
+ *   ...
+ *   int c = my_count;
+ *   my_count = c + 1;
+ *   *(&my_count) = 0;
+ */
+#ifdef QEMU_CO_TLS_ADDR
+#define QEMU_DECLARE_CO_TLS(type, var)                                       \
+    extern __thread type co_tls_##var;                                       \
+    static inline type get_##var(void)                                       \
+    { type *p; QEMU_CO_TLS_ADDR(p, co_tls_##var); return *p; }               \
+    static inline void set_##var(type v)                                     \
+    { type *p; QEMU_CO_TLS_ADDR(p, co_tls_##var); *p = v; }                  \
+    static inline type *get_ptr_##var(void)                                  \
+    { type *p; QEMU_CO_TLS_ADDR(p, co_tls_##var); return p; }
+#else
+#define QEMU_DECLARE_CO_TLS(type, var)                                       \
+    __attribute__((noinline)) type get_##var(void);                          \
+    __attribute__((noinline)) void set_##var(type v);                        \
+    __attribute__((noinline, weak)) type *get_ptr_##var(void);
+#endif
+
+/**
+ * QEMU_DEFINE_CO_TLS:
+ * @type: the variable's C type
+ * @var: the variable name
+ *
+ * Define an variable in Thread Local Storage that was previously declared from
+ * a header file with QEMU_DECLARE_CO_TLS():
+ *
+ * .. code-block:: c
+ *   :caption: Defining a variable in Thread Local Storage
+ *
+ *   QEMU_DEFINE_CO_TLS(int, my_count)
+ *
+ * Use this instead of:
+ *
+ * .. code-block:: c
+ *   :caption: Defining a TLS variable using __thread
+ *
+ *   __thread int my_count;
+ */
+#ifdef QEMU_CO_TLS_ADDR
+#define QEMU_DEFINE_CO_TLS(type, var)                                        \
+    __thread type co_tls_##var;
+#else
+#define QEMU_DEFINE_CO_TLS(type, var)                                        \
+    static __thread type co_tls_##var;                                       \
+    type get_##var(void) { return co_tls_##var; }                            \
+    void set_##var(type v) { co_tls_##var = v; }                             \
+    type *get_ptr_##var(void) { return &co_tls_##var; }
+#endif
+
+/**
+ * QEMU_DEFINE_STATIC_CO_TLS:
+ * @type: the variable's C type
+ * @var: the variable name
+ *
+ * Define a static variable in Thread Local Storage:
+ *
+ * .. code-block:: c
+ *   :caption: Defining a static variable in Thread Local Storage
+ *
+ *   QEMU_DEFINE_STATIC_CO_TLS(int, my_count)
+ *   ...
+ *   int c = get_my_count();
+ *   set_my_count(c + 1);
+ *   *get_ptr_my_count() = 0;
+ *
+ * Use this instead of:
+ *
+ * .. code-block:: c
+ *   :caption: Defining a static TLS variable using __thread
+ *
+ *   static __thread int my_count;
+ *   ...
+ *   int c = my_count;
+ *   my_count = c + 1;
+ *   *(&my_count) = 0;
+ */
+#ifdef QEMU_CO_TLS_ADDR
+#define QEMU_DEFINE_STATIC_CO_TLS(type, var)                                 \
+    __thread type co_tls_##var;                                              \
+    static __attribute__((unused)) inline type get_##var(void)               \
+    { type *p; QEMU_CO_TLS_ADDR(p, co_tls_##var); return *p; }               \
+    static __attribute__((unused)) inline void set_##var(type v)             \
+    { type *p; QEMU_CO_TLS_ADDR(p, co_tls_##var); *p = v; }                  \
+    static __attribute__((unused)) inline type *get_ptr_##var(void)          \
+    { type *p; QEMU_CO_TLS_ADDR(p, co_tls_##var); return p; }
+#else
+#define QEMU_DEFINE_STATIC_CO_TLS(type, var)                                 \
+    static __thread type co_tls_##var;                                       \
+    static __attribute__((noinline, unused))                                 \
+    type get_##var(void)                                                     \
+    { return co_tls_##var; }                                                 \
+    static __attribute__((noinline, unused))                                 \
+    void set_##var(type v)                                                   \
+    { co_tls_##var = v; }                                                    \
+    static __attribute__((noinline, weak, unused))                           \
+    type *get_ptr_##var(void)                                                \
+    { return &co_tls_##var; }
+#endif
+
+#endif /* QEMU_COROUTINE_TLS_H */
-- 
2.33.1



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC v3 2/4] util/async: replace __thread with QEMU TLS macros
  2021-12-06 14:26 [RFC v3 0/4] tls: add macros for coroutine-safe TLS variables Stefan Hajnoczi
  2021-12-06 14:26 ` [RFC v3 1/4] " Stefan Hajnoczi
@ 2021-12-06 14:26 ` Stefan Hajnoczi
  2021-12-06 14:26 ` [RFC v3 3/4] rcu: use coroutine " Stefan Hajnoczi
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Stefan Hajnoczi @ 2021-12-06 14:26 UTC (permalink / raw)
  To: qemu-devel
  Cc: Fam Zheng, fweimer, thuth, Daniel Berrange, qemu-block,
	Richard Henderson, Stefan Hajnoczi, Paolo Bonzini, Kevin Wolf,
	Warner Losh, sguelton

QEMU TLS macros must be used to make TLS variables safe with coroutines.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 util/async.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/util/async.c b/util/async.c
index 6f6717a34b..ddd9f24419 100644
--- a/util/async.c
+++ b/util/async.c
@@ -32,6 +32,7 @@
 #include "qemu/rcu_queue.h"
 #include "block/raw-aio.h"
 #include "qemu/coroutine_int.h"
+#include "qemu/coroutine-tls.h"
 #include "trace.h"
 
 /***********************************************************/
@@ -669,12 +670,13 @@ void aio_context_release(AioContext *ctx)
     qemu_rec_mutex_unlock(&ctx->lock);
 }
 
-static __thread AioContext *my_aiocontext;
+QEMU_DEFINE_STATIC_CO_TLS(AioContext *, my_aiocontext)
 
 AioContext *qemu_get_current_aio_context(void)
 {
-    if (my_aiocontext) {
-        return my_aiocontext;
+    AioContext *ctx = get_my_aiocontext();
+    if (ctx) {
+        return ctx;
     }
     if (qemu_mutex_iothread_locked()) {
         /* Possibly in a vCPU thread.  */
@@ -685,6 +687,6 @@ AioContext *qemu_get_current_aio_context(void)
 
 void qemu_set_current_aio_context(AioContext *ctx)
 {
-    assert(!my_aiocontext);
-    my_aiocontext = ctx;
+    assert(!get_my_aiocontext());
+    set_my_aiocontext(ctx);
 }
-- 
2.33.1



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC v3 3/4] rcu: use coroutine TLS macros
  2021-12-06 14:26 [RFC v3 0/4] tls: add macros for coroutine-safe TLS variables Stefan Hajnoczi
  2021-12-06 14:26 ` [RFC v3 1/4] " Stefan Hajnoczi
  2021-12-06 14:26 ` [RFC v3 2/4] util/async: replace __thread with QEMU TLS macros Stefan Hajnoczi
@ 2021-12-06 14:26 ` Stefan Hajnoczi
  2021-12-06 14:26 ` [RFC v3 4/4] cpus: use coroutine TLS macros for iothread_locked Stefan Hajnoczi
  2021-12-06 14:34 ` [RFC v3 0/4] tls: add macros for coroutine-safe TLS variables Peter Maydell
  4 siblings, 0 replies; 11+ messages in thread
From: Stefan Hajnoczi @ 2021-12-06 14:26 UTC (permalink / raw)
  To: qemu-devel
  Cc: Fam Zheng, fweimer, thuth, Daniel Berrange, qemu-block,
	Richard Henderson, Stefan Hajnoczi, Paolo Bonzini, Kevin Wolf,
	Warner Losh, sguelton

RCU may be used from coroutines. Standard __thread variables cannot be
used by coroutines. Use the coroutine TLS macros instead.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/qemu/rcu.h         |  7 ++++---
 tests/unit/rcutorture.c    | 10 +++++-----
 tests/unit/test-rcu-list.c |  4 ++--
 util/rcu.c                 | 10 +++++-----
 4 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/include/qemu/rcu.h b/include/qemu/rcu.h
index e69efbd47f..b063c6fde8 100644
--- a/include/qemu/rcu.h
+++ b/include/qemu/rcu.h
@@ -29,6 +29,7 @@
 #include "qemu/atomic.h"
 #include "qemu/notify.h"
 #include "qemu/sys_membarrier.h"
+#include "qemu/coroutine-tls.h"
 
 #ifdef __cplusplus
 extern "C" {
@@ -76,11 +77,11 @@ struct rcu_reader_data {
     NotifierList force_rcu;
 };
 
-extern __thread struct rcu_reader_data rcu_reader;
+QEMU_DECLARE_CO_TLS(struct rcu_reader_data, rcu_reader)
 
 static inline void rcu_read_lock(void)
 {
-    struct rcu_reader_data *p_rcu_reader = &rcu_reader;
+    struct rcu_reader_data *p_rcu_reader = get_ptr_rcu_reader();
     unsigned ctr;
 
     if (p_rcu_reader->depth++ > 0) {
@@ -96,7 +97,7 @@ static inline void rcu_read_lock(void)
 
 static inline void rcu_read_unlock(void)
 {
-    struct rcu_reader_data *p_rcu_reader = &rcu_reader;
+    struct rcu_reader_data *p_rcu_reader = get_ptr_rcu_reader();
 
     assert(p_rcu_reader->depth != 0);
     if (--p_rcu_reader->depth > 0) {
diff --git a/tests/unit/rcutorture.c b/tests/unit/rcutorture.c
index de6f649058..495a4e6f42 100644
--- a/tests/unit/rcutorture.c
+++ b/tests/unit/rcutorture.c
@@ -122,7 +122,7 @@ static void *rcu_read_perf_test(void *arg)
 
     rcu_register_thread();
 
-    *(struct rcu_reader_data **)arg = &rcu_reader;
+    *(struct rcu_reader_data **)arg = get_ptr_rcu_reader();
     qatomic_inc(&nthreadsrunning);
     while (goflag == GOFLAG_INIT) {
         g_usleep(1000);
@@ -148,7 +148,7 @@ static void *rcu_update_perf_test(void *arg)
 
     rcu_register_thread();
 
-    *(struct rcu_reader_data **)arg = &rcu_reader;
+    *(struct rcu_reader_data **)arg = get_ptr_rcu_reader();
     qatomic_inc(&nthreadsrunning);
     while (goflag == GOFLAG_INIT) {
         g_usleep(1000);
@@ -253,7 +253,7 @@ static void *rcu_read_stress_test(void *arg)
 
     rcu_register_thread();
 
-    *(struct rcu_reader_data **)arg = &rcu_reader;
+    *(struct rcu_reader_data **)arg = get_ptr_rcu_reader();
     while (goflag == GOFLAG_INIT) {
         g_usleep(1000);
     }
@@ -304,7 +304,7 @@ static void *rcu_update_stress_test(void *arg)
     struct rcu_stress *cp = qatomic_read(&rcu_stress_current);
 
     rcu_register_thread();
-    *(struct rcu_reader_data **)arg = &rcu_reader;
+    *(struct rcu_reader_data **)arg = get_ptr_rcu_reader();
 
     while (goflag == GOFLAG_INIT) {
         g_usleep(1000);
@@ -347,7 +347,7 @@ static void *rcu_fake_update_stress_test(void *arg)
 {
     rcu_register_thread();
 
-    *(struct rcu_reader_data **)arg = &rcu_reader;
+    *(struct rcu_reader_data **)arg = get_ptr_rcu_reader();
     while (goflag == GOFLAG_INIT) {
         g_usleep(1000);
     }
diff --git a/tests/unit/test-rcu-list.c b/tests/unit/test-rcu-list.c
index 49641e1936..64b81ae058 100644
--- a/tests/unit/test-rcu-list.c
+++ b/tests/unit/test-rcu-list.c
@@ -171,7 +171,7 @@ static void *rcu_q_reader(void *arg)
 
     rcu_register_thread();
 
-    *(struct rcu_reader_data **)arg = &rcu_reader;
+    *(struct rcu_reader_data **)arg = get_ptr_rcu_reader();
     qatomic_inc(&nthreadsrunning);
     while (qatomic_read(&goflag) == GOFLAG_INIT) {
         g_usleep(1000);
@@ -206,7 +206,7 @@ static void *rcu_q_updater(void *arg)
     long long n_removed_local = 0;
     struct list_element *el, *prev_el;
 
-    *(struct rcu_reader_data **)arg = &rcu_reader;
+    *(struct rcu_reader_data **)arg = get_ptr_rcu_reader();
     qatomic_inc(&nthreadsrunning);
     while (qatomic_read(&goflag) == GOFLAG_INIT) {
         g_usleep(1000);
diff --git a/util/rcu.c b/util/rcu.c
index c91da9f137..b6d6c71cff 100644
--- a/util/rcu.c
+++ b/util/rcu.c
@@ -65,7 +65,7 @@ static inline int rcu_gp_ongoing(unsigned long *ctr)
 /* Written to only by each individual reader. Read by both the reader and the
  * writers.
  */
-__thread struct rcu_reader_data rcu_reader;
+QEMU_DEFINE_CO_TLS(struct rcu_reader_data, rcu_reader)
 
 /* Protected by rcu_registry_lock.  */
 typedef QLIST_HEAD(, rcu_reader_data) ThreadList;
@@ -355,23 +355,23 @@ void drain_call_rcu(void)
 
 void rcu_register_thread(void)
 {
-    assert(rcu_reader.ctr == 0);
+    assert(get_ptr_rcu_reader()->ctr == 0);
     qemu_mutex_lock(&rcu_registry_lock);
-    QLIST_INSERT_HEAD(&registry, &rcu_reader, node);
+    QLIST_INSERT_HEAD(&registry, get_ptr_rcu_reader(), node);
     qemu_mutex_unlock(&rcu_registry_lock);
 }
 
 void rcu_unregister_thread(void)
 {
     qemu_mutex_lock(&rcu_registry_lock);
-    QLIST_REMOVE(&rcu_reader, node);
+    QLIST_REMOVE(get_ptr_rcu_reader(), node);
     qemu_mutex_unlock(&rcu_registry_lock);
 }
 
 void rcu_add_force_rcu_notifier(Notifier *n)
 {
     qemu_mutex_lock(&rcu_registry_lock);
-    notifier_list_add(&rcu_reader.force_rcu, n);
+    notifier_list_add(&get_ptr_rcu_reader()->force_rcu, n);
     qemu_mutex_unlock(&rcu_registry_lock);
 }
 
-- 
2.33.1



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC v3 4/4] cpus: use coroutine TLS macros for iothread_locked
  2021-12-06 14:26 [RFC v3 0/4] tls: add macros for coroutine-safe TLS variables Stefan Hajnoczi
                   ` (2 preceding siblings ...)
  2021-12-06 14:26 ` [RFC v3 3/4] rcu: use coroutine " Stefan Hajnoczi
@ 2021-12-06 14:26 ` Stefan Hajnoczi
  2021-12-06 14:34 ` [RFC v3 0/4] tls: add macros for coroutine-safe TLS variables Peter Maydell
  4 siblings, 0 replies; 11+ messages in thread
From: Stefan Hajnoczi @ 2021-12-06 14:26 UTC (permalink / raw)
  To: qemu-devel
  Cc: Fam Zheng, fweimer, thuth, Daniel Berrange, qemu-block,
	Richard Henderson, Stefan Hajnoczi, Paolo Bonzini, Kevin Wolf,
	Warner Losh, sguelton

qemu_mutex_iothread_locked() may be used from coroutines. Standard
__thread variables cannot be used by coroutines. Use the coroutine TLS
macros instead.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 softmmu/cpus.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/softmmu/cpus.c b/softmmu/cpus.c
index 071085f840..b02d3211e1 100644
--- a/softmmu/cpus.c
+++ b/softmmu/cpus.c
@@ -474,11 +474,11 @@ bool qemu_in_vcpu_thread(void)
     return current_cpu && qemu_cpu_is_self(current_cpu);
 }
 
-static __thread bool iothread_locked = false;
+QEMU_DEFINE_STATIC_CO_TLS(bool, iothread_locked)
 
 bool qemu_mutex_iothread_locked(void)
 {
-    return iothread_locked;
+    return get_iothread_locked();
 }
 
 /*
@@ -491,13 +491,13 @@ void qemu_mutex_lock_iothread_impl(const char *file, int line)
 
     g_assert(!qemu_mutex_iothread_locked());
     bql_lock(&qemu_global_mutex, file, line);
-    iothread_locked = true;
+    set_iothread_locked(true);
 }
 
 void qemu_mutex_unlock_iothread(void)
 {
     g_assert(qemu_mutex_iothread_locked());
-    iothread_locked = false;
+    set_iothread_locked(false);
     qemu_mutex_unlock(&qemu_global_mutex);
 }
 
-- 
2.33.1



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [RFC v3 0/4] tls: add macros for coroutine-safe TLS variables
  2021-12-06 14:26 [RFC v3 0/4] tls: add macros for coroutine-safe TLS variables Stefan Hajnoczi
                   ` (3 preceding siblings ...)
  2021-12-06 14:26 ` [RFC v3 4/4] cpus: use coroutine TLS macros for iothread_locked Stefan Hajnoczi
@ 2021-12-06 14:34 ` Peter Maydell
  2021-12-07 13:34   ` Stefan Hajnoczi
  2021-12-07 13:53   ` Stefan Hajnoczi
  4 siblings, 2 replies; 11+ messages in thread
From: Peter Maydell @ 2021-12-06 14:34 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Fam Zheng, fweimer, thuth, Daniel Berrange, qemu-block,
	Richard Henderson, qemu-devel, Kevin Wolf, Paolo Bonzini,
	Warner Losh, sguelton

On Mon, 6 Dec 2021 at 14:33, Stefan Hajnoczi <stefanha@redhat.com> wrote:
>
> v3:
> - Added __attribute__((weak)) to get_ptr_*() [Florian]

Do we really need it *only* on get_ptr_*() ? If we need to
noinline the other two we probably also should use the same
attribute weak to force no optimizations at all.

-- PMM


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC v3 0/4] tls: add macros for coroutine-safe TLS variables
  2021-12-06 14:34 ` [RFC v3 0/4] tls: add macros for coroutine-safe TLS variables Peter Maydell
@ 2021-12-07 13:34   ` Stefan Hajnoczi
  2021-12-07 13:53   ` Stefan Hajnoczi
  1 sibling, 0 replies; 11+ messages in thread
From: Stefan Hajnoczi @ 2021-12-07 13:34 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Fam Zheng, fweimer, thuth, Daniel Berrange, qemu-block,
	Richard Henderson, qemu-devel, Kevin Wolf, Paolo Bonzini,
	Warner Losh, sguelton

[-- Attachment #1: Type: text/plain, Size: 494 bytes --]

On Mon, Dec 06, 2021 at 02:34:45PM +0000, Peter Maydell wrote:
> On Mon, 6 Dec 2021 at 14:33, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> >
> > v3:
> > - Added __attribute__((weak)) to get_ptr_*() [Florian]
> 
> Do we really need it *only* on get_ptr_*() ? If we need to
> noinline the other two we probably also should use the same
> attribute weak to force no optimizations at all.

I don't know but it does seem safer to use weak in all cases.

Florian and others?

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC v3 0/4] tls: add macros for coroutine-safe TLS variables
  2021-12-06 14:34 ` [RFC v3 0/4] tls: add macros for coroutine-safe TLS variables Peter Maydell
  2021-12-07 13:34   ` Stefan Hajnoczi
@ 2021-12-07 13:53   ` Stefan Hajnoczi
  2021-12-07 13:55     ` Peter Maydell
  1 sibling, 1 reply; 11+ messages in thread
From: Stefan Hajnoczi @ 2021-12-07 13:53 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Fam Zheng, fweimer, thuth, Daniel Berrange, qemu-block,
	Richard Henderson, qemu-devel, Kevin Wolf, Paolo Bonzini,
	Warner Losh, sguelton

[-- Attachment #1: Type: text/plain, Size: 1194 bytes --]

On Mon, Dec 06, 2021 at 02:34:45PM +0000, Peter Maydell wrote:
> On Mon, 6 Dec 2021 at 14:33, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> >
> > v3:
> > - Added __attribute__((weak)) to get_ptr_*() [Florian]
> 
> Do we really need it *only* on get_ptr_*() ? If we need to
> noinline the other two we probably also should use the same
> attribute weak to force no optimizations at all.

The weak attribute can't be used on static functions, so I think we need
a different approach:

In file included from ../util/async.c:35:
/builds/stefanha/qemu/include/qemu/coroutine-tls.h:201:11: error: weak declaration of 'get_ptr_my_aiocontext' must be public
     type *get_ptr_##var(void)                                                \
           ^~~~~~~~
../util/async.c:673:1: note: in expansion of macro 'QEMU_DEFINE_STATIC_CO_TLS'
 QEMU_DEFINE_STATIC_CO_TLS(AioContext *, my_aiocontext)
 ^~~~~~~~~~~~~~~~~~~~~~~~~

Adding asm volatile("") seems to work though:
https://godbolt.org/z/3hn8Gh41d

The GCC documentation mentions combining noinline with asm(""):
https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-noinline-function-attribute

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC v3 0/4] tls: add macros for coroutine-safe TLS variables
  2021-12-07 13:53   ` Stefan Hajnoczi
@ 2021-12-07 13:55     ` Peter Maydell
  2021-12-07 16:29       ` Stefan Hajnoczi
  2021-12-13 14:09       ` Stefan Hajnoczi
  0 siblings, 2 replies; 11+ messages in thread
From: Peter Maydell @ 2021-12-07 13:55 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Fam Zheng, fweimer, thuth, Daniel Berrange, qemu-block,
	Richard Henderson, qemu-devel, Kevin Wolf, Paolo Bonzini,
	Warner Losh, sguelton

On Tue, 7 Dec 2021 at 13:53, Stefan Hajnoczi <stefanha@redhat.com> wrote:
>
> On Mon, Dec 06, 2021 at 02:34:45PM +0000, Peter Maydell wrote:
> > On Mon, 6 Dec 2021 at 14:33, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > >
> > > v3:
> > > - Added __attribute__((weak)) to get_ptr_*() [Florian]
> >
> > Do we really need it *only* on get_ptr_*() ? If we need to
> > noinline the other two we probably also should use the same
> > attribute weak to force no optimizations at all.
>
> The weak attribute can't be used on static functions, so I think we need
> a different approach:
>
> In file included from ../util/async.c:35:
> /builds/stefanha/qemu/include/qemu/coroutine-tls.h:201:11: error: weak declaration of 'get_ptr_my_aiocontext' must be public
>      type *get_ptr_##var(void)                                                \
>            ^~~~~~~~
> ../util/async.c:673:1: note: in expansion of macro 'QEMU_DEFINE_STATIC_CO_TLS'
>  QEMU_DEFINE_STATIC_CO_TLS(AioContext *, my_aiocontext)
>  ^~~~~~~~~~~~~~~~~~~~~~~~~
>
> Adding asm volatile("") seems to work though:
> https://godbolt.org/z/3hn8Gh41d

You can see in the clang disassembly there that this isn't
sufficient. The compiler puts in both calls, but it ignores
the return results and always returns "true" from the function.

-- PMM


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC v3 0/4] tls: add macros for coroutine-safe TLS variables
  2021-12-07 13:55     ` Peter Maydell
@ 2021-12-07 16:29       ` Stefan Hajnoczi
  2021-12-13 14:09       ` Stefan Hajnoczi
  1 sibling, 0 replies; 11+ messages in thread
From: Stefan Hajnoczi @ 2021-12-07 16:29 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Fam Zheng, fweimer, thuth, Daniel Berrange, qemu-block,
	Richard Henderson, qemu-devel, Kevin Wolf, Paolo Bonzini,
	Warner Losh, sguelton

[-- Attachment #1: Type: text/plain, Size: 1531 bytes --]

On Tue, Dec 07, 2021 at 01:55:34PM +0000, Peter Maydell wrote:
> On Tue, 7 Dec 2021 at 13:53, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> >
> > On Mon, Dec 06, 2021 at 02:34:45PM +0000, Peter Maydell wrote:
> > > On Mon, 6 Dec 2021 at 14:33, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > > >
> > > > v3:
> > > > - Added __attribute__((weak)) to get_ptr_*() [Florian]
> > >
> > > Do we really need it *only* on get_ptr_*() ? If we need to
> > > noinline the other two we probably also should use the same
> > > attribute weak to force no optimizations at all.
> >
> > The weak attribute can't be used on static functions, so I think we need
> > a different approach:
> >
> > In file included from ../util/async.c:35:
> > /builds/stefanha/qemu/include/qemu/coroutine-tls.h:201:11: error: weak declaration of 'get_ptr_my_aiocontext' must be public
> >      type *get_ptr_##var(void)                                                \
> >            ^~~~~~~~
> > ../util/async.c:673:1: note: in expansion of macro 'QEMU_DEFINE_STATIC_CO_TLS'
> >  QEMU_DEFINE_STATIC_CO_TLS(AioContext *, my_aiocontext)
> >  ^~~~~~~~~~~~~~~~~~~~~~~~~
> >
> > Adding asm volatile("") seems to work though:
> > https://godbolt.org/z/3hn8Gh41d
> 
> You can see in the clang disassembly there that this isn't
> sufficient. The compiler puts in both calls, but it ignores
> the return results and always returns "true" from the function.

You're right! I missed that the return value of the call isn't used >_<.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC v3 0/4] tls: add macros for coroutine-safe TLS variables
  2021-12-07 13:55     ` Peter Maydell
  2021-12-07 16:29       ` Stefan Hajnoczi
@ 2021-12-13 14:09       ` Stefan Hajnoczi
  1 sibling, 0 replies; 11+ messages in thread
From: Stefan Hajnoczi @ 2021-12-13 14:09 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Fam Zheng, fweimer, thuth, Daniel Berrange, qemu-block,
	Richard Henderson, qemu-devel, Kevin Wolf, Paolo Bonzini,
	Warner Losh, sguelton

[-- Attachment #1: Type: text/plain, Size: 1725 bytes --]

On Tue, Dec 07, 2021 at 01:55:34PM +0000, Peter Maydell wrote:
> On Tue, 7 Dec 2021 at 13:53, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> >
> > On Mon, Dec 06, 2021 at 02:34:45PM +0000, Peter Maydell wrote:
> > > On Mon, 6 Dec 2021 at 14:33, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > > >
> > > > v3:
> > > > - Added __attribute__((weak)) to get_ptr_*() [Florian]
> > >
> > > Do we really need it *only* on get_ptr_*() ? If we need to
> > > noinline the other two we probably also should use the same
> > > attribute weak to force no optimizations at all.
> >
> > The weak attribute can't be used on static functions, so I think we need
> > a different approach:
> >
> > In file included from ../util/async.c:35:
> > /builds/stefanha/qemu/include/qemu/coroutine-tls.h:201:11: error: weak declaration of 'get_ptr_my_aiocontext' must be public
> >      type *get_ptr_##var(void)                                                \
> >            ^~~~~~~~
> > ../util/async.c:673:1: note: in expansion of macro 'QEMU_DEFINE_STATIC_CO_TLS'
> >  QEMU_DEFINE_STATIC_CO_TLS(AioContext *, my_aiocontext)
> >  ^~~~~~~~~~~~~~~~~~~~~~~~~
> >
> > Adding asm volatile("") seems to work though:
> > https://godbolt.org/z/3hn8Gh41d
> 
> You can see in the clang disassembly there that this isn't
> sufficient. The compiler puts in both calls, but it ignores
> the return results and always returns "true" from the function.

Specifying a input operand forces the compiler to evaluate the return
value of get_ptr_i():

  static __attribute__((noinline)) long get_ptr_i()
  {
    long l = (long)&i;
    asm volatile("" : "+rm" (l));
    return l;
  }

https://godbolt.org/z/e6ddGdPMv

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-12-13 14:11 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-06 14:26 [RFC v3 0/4] tls: add macros for coroutine-safe TLS variables Stefan Hajnoczi
2021-12-06 14:26 ` [RFC v3 1/4] " Stefan Hajnoczi
2021-12-06 14:26 ` [RFC v3 2/4] util/async: replace __thread with QEMU TLS macros Stefan Hajnoczi
2021-12-06 14:26 ` [RFC v3 3/4] rcu: use coroutine " Stefan Hajnoczi
2021-12-06 14:26 ` [RFC v3 4/4] cpus: use coroutine TLS macros for iothread_locked Stefan Hajnoczi
2021-12-06 14:34 ` [RFC v3 0/4] tls: add macros for coroutine-safe TLS variables Peter Maydell
2021-12-07 13:34   ` Stefan Hajnoczi
2021-12-07 13:53   ` Stefan Hajnoczi
2021-12-07 13:55     ` Peter Maydell
2021-12-07 16:29       ` Stefan Hajnoczi
2021-12-13 14:09       ` Stefan Hajnoczi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.