qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/19] Add Thread Sanitizer support to QEMU
@ 2020-05-22 16:07 Robert Foley
  2020-05-22 16:07 ` [PATCH 01/19] configure: add --enable-tsan flag + fiber annotations for coroutine-ucontext Robert Foley
                   ` (20 more replies)
  0 siblings, 21 replies; 51+ messages in thread
From: Robert Foley @ 2020-05-22 16:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.puhov, cota, alex.bennee, robert.foley

This patch series continues the work done by Emilio Cota and others to add
Thread Sanitizer (TSan) support to QEMU.

The starting point for this work was Emilio's branch here:
https://github.com/cota/qemu/commits/tsan
specifically this commit: 0be125fc0afd47218b34d2019abdd19b644f3199

The purpose of this patch is not to fix all the TSan warnings, but to enable
the TSan support so that QEMU developers can start using the tool.  
We found this tool useful and even ran it on our recent changes in
the cpu-locks series.
Clearly there is work to do here to clean up all the warnings. :)  
We have made a start to cleaning up these warnings by getting a VM to boot 
cleanly with no TSan warnings.  
We have also made an effort to introduce enough of the TSan suppression
mechanisms, so that others can continue this work.

This series adds support for:
- configure option for --enable-tsan.
- testing.rst has the full details on how to use TSan with docker
  and also outside of docker.
- Docker builds with TSan.
  - We added an Ubuntu 20.04 docker that supports TSan builds.
  - Something like this will build TSan
    make docker-test-build@ubuntu2004 DEBUG=1 TSAN=1
  - Testing with TSan is also supported with docker,
    although, be forwarned that test-quick currently fails.  
    See "Issues" section below for the current failures.
    make docker-test-quick@ubuntu2004 DEBUG=1 TSAN=1
  - We recommend using the DEBUG=1 option and launching the test 
   (like test-quick) from inside the docker so that when the test is done,
    you can review the warnings from inside the docker.
  - testing.rst has the full details on how to use TSan with docker.
- We added a blacklist file for files/functions
  TSan should ignore at compile time.
- And added a suppression file for TSan to suppress certain warnings at
  run time.  
  We found both of these mechanisms are needed when suppressing warnings.
- It is also worth mentioning that we were able to suppress/fix enough errors
  to allow an Ubuntu 18.04 aarch64 VM to boot with zero TSan warnings.  
  When we started this effort, there were ~300 warnings reported by 
  TSan during the same VM boot !

Issues:
- When running docker-test-quick under TSan there are several tests which hang
  - The unit tests which seem to hang under TSan:
    test-char, test-qdev-global-props, and test-qga.  
  - If we comment out those tests, check-unit finishes, albeit with 
    a couple of warnings. :)


Emilio G. Cota (7):
  cpu: convert queued work to a QSIMPLEQ
  thread: add qemu_spin_destroy
  cputlb: destroy CPUTLB with tlb_destroy
  qht: call qemu_spin_destroy for head buckets
  tcg: call qemu_spin_destroy for tb->jmp_lock
  translate-all: call qemu_spin_destroy for PageDesc
  thread: add tsan annotations to QemuSpin

Lingfeng Yang (1):
  configure: add --enable-tsan flag + fiber annotations for
    coroutine-ucontext

Robert Foley (11):
  tests/docker: Added docker build support for TSan.
  include/qemu: Added tsan.h for annotations.
  accel/tcg: Fixed tsan warnings related to parallel_cpus
  configure: added tsan support for blacklist.
  accel/tcg: Fixed tsan warnings.
  util/async: Fixed tsan warnings
  qht: Fix tsan warnings.
  util: fixed tsan warnings in thread_pool.c
  util: Added tsan annotate for thread name.
  target/arm: Fix tsan warning in cpu.c
  docs: Added details on TSan to testing.rst

 accel/tcg/cpu-exec.c                       |  4 +-
 accel/tcg/cputlb.c                         | 15 ++++
 accel/tcg/tcg-all.c                        |  4 +-
 accel/tcg/tcg-runtime.c                    |  7 +-
 accel/tcg/translate-all.c                  | 25 +++++-
 configure                                  | 40 +++++++++
 cpus-common.c                              | 25 ++----
 cpus.c                                     | 16 +++-
 docs/devel/testing.rst                     | 72 ++++++++++++++++
 exec.c                                     |  1 +
 hw/core/cpu.c                              |  3 +-
 include/exec/exec-all.h                    | 10 ++-
 include/hw/core/cpu.h                      |  6 +-
 include/qemu/thread.h                      | 38 ++++++++-
 include/qemu/tsan.h                        | 48 +++++++++++
 include/tcg/tcg.h                          |  3 +-
 linux-user/syscall.c                       |  4 +-
 target/arm/cpu.c                           |  2 +-
 tcg/tcg.c                                  | 19 ++++-
 tests/docker/Makefile.include              |  2 +
 tests/docker/common.rc                     | 19 +++++
 tests/docker/dockerfiles/ubuntu2004.docker | 65 +++++++++++++++
 tests/tsan/blacklist.tsan                  |  5 ++
 tests/tsan/suppressions.tsan               | 14 ++++
 util/async.c                               | 11 ++-
 util/coroutine-ucontext.c                  | 97 ++++++++++++++++++++--
 util/qemu-thread-posix.c                   |  2 +
 util/qht.c                                 |  4 +
 util/thread-pool.c                         |  5 +-
 29 files changed, 514 insertions(+), 52 deletions(-)
 create mode 100644 include/qemu/tsan.h
 create mode 100644 tests/docker/dockerfiles/ubuntu2004.docker
 create mode 100644 tests/tsan/blacklist.tsan
 create mode 100644 tests/tsan/suppressions.tsan

-- 
2.17.1



^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH 01/19] configure: add --enable-tsan flag + fiber annotations for coroutine-ucontext
  2020-05-22 16:07 [PATCH 00/19] Add Thread Sanitizer support to QEMU Robert Foley
@ 2020-05-22 16:07 ` Robert Foley
  2020-05-23 16:55   ` Philippe Mathieu-Daudé
  2020-06-17 14:24   ` Stefan Hajnoczi
  2020-05-22 16:07 ` [PATCH 02/19] cpu: convert queued work to a QSIMPLEQ Robert Foley
                   ` (19 subsequent siblings)
  20 siblings, 2 replies; 51+ messages in thread
From: Robert Foley @ 2020-05-22 16:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.puhov, cota, alex.bennee, robert.foley, Lingfeng Yang

From: Lingfeng Yang <lfy@google.com>

We tried running QEMU under tsan in 2016, but tsan's lack of support for
longjmp-based fibers was a blocker:
  https://groups.google.com/forum/#!topic/thread-sanitizer/se0YuzfWazw

Fortunately, thread sanitizer gained fiber support in early 2019:
  https://reviews.llvm.org/D54889

This patch brings tsan support upstream by importing the patch that annotated
QEMU's coroutines as tsan fibers in Android's QEMU fork:
  https://android-review.googlesource.com/c/platform/external/qemu/+/844675

Tested with '--enable-tsan --cc=clang-9 --cxx=clang++-9 --disable-werror'
configure flags.

Signed-off-by: Lingfeng Yang <lfy@google.com>
Signed-off-by: Emilio G. Cota <cota@braap.org>
[cota: minor modifications + configure changes]
Signed-off-by: Robert Foley <robert.foley@linaro.org>
[RF: minor changes to clean up checkpatch warnings/errors]
---
 configure                 | 39 ++++++++++++++++
 util/coroutine-ucontext.c | 97 +++++++++++++++++++++++++++++++++++----
 2 files changed, 127 insertions(+), 9 deletions(-)

diff --git a/configure b/configure
index 26084fc53a..c95c54fb48 100755
--- a/configure
+++ b/configure
@@ -395,6 +395,7 @@ gprof="no"
 debug_tcg="no"
 debug="no"
 sanitizers="no"
+tsan="no"
 fortify_source=""
 strip_opt="yes"
 tcg_interpreter="no"
@@ -1150,6 +1151,10 @@ for opt do
   ;;
   --disable-sanitizers) sanitizers="no"
   ;;
+  --enable-tsan) tsan="yes"
+  ;;
+  --disable-tsan) tsan="no"
+  ;;
   --enable-sparse) sparse="yes"
   ;;
   --disable-sparse) sparse="no"
@@ -1750,6 +1755,7 @@ Advanced options (experts only):
   --with-pkgversion=VERS   use specified string as sub-version of the package
   --enable-debug           enable common debug build options
   --enable-sanitizers      enable default sanitizers
+  --enable-tsan            enable thread sanitizer
   --disable-strip          disable stripping binaries
   --disable-werror         disable compilation abort on warning
   --disable-stack-protector disable compiler-provided stack protection
@@ -6176,6 +6182,27 @@ if test "$fuzzing" = "yes" ; then
   fi
 fi
 
+# Thread sanitizer is, for now, much noisier than the other sanitizers;
+# keep it separate until that is not the case.
+have_tsan=no
+have_tsan_iface_fiber=no
+if test "$tsan" = "yes" ; then
+  write_c_skeleton
+  if compile_prog "$CPU_CFLAGS -Werror -fsanitize=thread" "" ; then
+      have_tsan=yes
+  fi
+  cat > $TMPC << EOF
+#include <sanitizer/tsan_interface.h>
+int main(void) {
+  __tsan_create_fiber(0);
+  return 0;
+}
+EOF
+  if compile_prog "$CPU_CFLAGS -Werror -fsanitize=thread" "" ; then
+      have_tsan_iface_fiber=yes
+  fi
+fi
+
 ##########################################
 # check for libpmem
 
@@ -6277,6 +6304,14 @@ if test "$have_asan" = "yes"; then
            "Without code annotation, the report may be inferior."
   fi
 fi
+if test "$have_tsan" = "yes" ; then
+  if test "$have_tsan_iface_fiber" = "yes" ; then
+    QEMU_CFLAGS="-fsanitize=thread $QEMU_CFLAGS"
+    QEMU_LDFLAGS="-fsanitize=thread $QEMU_LDFLAGS"
+  else
+    echo "Cannot enable TSAN due to missing fiber annotation interface."
+  fi
+fi
 if test "$have_ubsan" = "yes"; then
   QEMU_CFLAGS="-fsanitize=undefined $QEMU_CFLAGS"
   QEMU_LDFLAGS="-fsanitize=undefined $QEMU_LDFLAGS"
@@ -7365,6 +7400,10 @@ if test "$have_asan_iface_fiber" = "yes" ; then
     echo "CONFIG_ASAN_IFACE_FIBER=y" >> $config_host_mak
 fi
 
+if test "$have_tsan" = "yes" && test "$have_tsan_iface_fiber" = "yes" ; then
+    echo "CONFIG_TSAN=y" >> $config_host_mak
+fi
+
 if test "$has_environ" = "yes" ; then
   echo "CONFIG_HAS_ENVIRON=y" >> $config_host_mak
 fi
diff --git a/util/coroutine-ucontext.c b/util/coroutine-ucontext.c
index bd593e61bc..a3dc78e67a 100644
--- a/util/coroutine-ucontext.c
+++ b/util/coroutine-ucontext.c
@@ -37,18 +37,33 @@
 #endif
 #endif
 
+#ifdef CONFIG_TSAN
+#include <sanitizer/tsan_interface.h>
+#endif
+
 typedef struct {
     Coroutine base;
     void *stack;
     size_t stack_size;
     sigjmp_buf env;
 
+    void *tsan_co_fiber;
+    void *tsan_caller_fiber;
+
 #ifdef CONFIG_VALGRIND_H
     unsigned int valgrind_stack_id;
 #endif
 
 } CoroutineUContext;
 
+#define UC_DEBUG 0
+#if UC_DEBUG && defined(CONFIG_TSAN)
+#define UC_TRACE(fmt, ...) fprintf(stderr, "%s:%d:%p " fmt "\n", \
+    __func__, __LINE__, __tsan_get_current_fiber(), ##__VA_ARGS__);
+#else
+#define UC_TRACE(fmt, ...)
+#endif
+
 /**
  * Per-thread coroutine bookkeeping
  */
@@ -65,7 +80,20 @@ union cc_arg {
     int i[2];
 };
 
-static void finish_switch_fiber(void *fake_stack_save)
+/* QEMU_ALWAYS_INLINE only does so if __OPTIMIZE__, so we cannot use it. */
+static inline __attribute__((always_inline))
+void on_new_fiber(CoroutineUContext *co)
+{
+#ifdef CONFIG_TSAN
+    co->tsan_co_fiber = __tsan_create_fiber(0); /* flags: sync on switch */
+    co->tsan_caller_fiber = __tsan_get_current_fiber();
+    UC_TRACE("Create new TSAN co fiber. co: %p co fiber: %p caller fiber: %p ",
+             co, co->tsan_co_fiber, co->tsan_caller_fiber);
+#endif
+}
+
+static inline __attribute__((always_inline))
+void finish_switch_fiber(void *fake_stack_save)
 {
 #ifdef CONFIG_ASAN
     const void *bottom_old;
@@ -78,18 +106,40 @@ static void finish_switch_fiber(void *fake_stack_save)
         leader.stack_size = size_old;
     }
 #endif
+#ifdef CONFIG_TSAN
+    if (fake_stack_save) {
+        __tsan_release(fake_stack_save);
+        __tsan_switch_to_fiber(fake_stack_save, 0);  /* 0=synchronize */
+    }
+#endif
 }
 
-static void start_switch_fiber(void **fake_stack_save,
-                               const void *bottom, size_t size)
+static inline __attribute__((always_inline)) void start_switch_fiber(
+    CoroutineAction action, void **fake_stack_save,
+    const void *bottom, size_t size, void *new_fiber)
 {
 #ifdef CONFIG_ASAN
-    __sanitizer_start_switch_fiber(fake_stack_save, bottom, size);
+    if (action == COROUTINE_TERMINATE) {
+        __sanitizer_start_switch_fiber(
+            action == COROUTINE_TERMINATE ? NULL : fake_stack_save,
+            bottom, size);
+    }
+#endif
+#ifdef CONFIG_TSAN
+    void *curr_fiber =
+        __tsan_get_current_fiber();
+    __tsan_acquire(curr_fiber);
+
+    UC_TRACE("Current fiber: %p.", curr_fiber);
+    *fake_stack_save = curr_fiber;
+    UC_TRACE("Switch to fiber %p", new_fiber);
+    __tsan_switch_to_fiber(new_fiber, 0);  /* 0=synchronize */
 #endif
 }
 
 static void coroutine_trampoline(int i0, int i1)
 {
+    UC_TRACE("Start trampoline");
     union cc_arg arg;
     CoroutineUContext *self;
     Coroutine *co;
@@ -104,21 +154,34 @@ static void coroutine_trampoline(int i0, int i1)
 
     /* Initialize longjmp environment and switch back the caller */
     if (!sigsetjmp(self->env, 0)) {
-        start_switch_fiber(&fake_stack_save,
-                           leader.stack, leader.stack_size);
+        UC_TRACE("Current fiber: %p. Set co %p to env 0x%lx",
+                 __tsan_get_current_fiber(), self, (unsigned long)self->env);
+        start_switch_fiber(
+            COROUTINE_YIELD,
+            &fake_stack_save,
+            leader.stack,
+            leader.stack_size,
+            self->tsan_caller_fiber);
+        UC_TRACE("Jump to co %p caller fiber %p env 0x%lx",
+                 co, self->tsan_caller_fiber, *(unsigned long *)co->entry_arg);
         siglongjmp(*(sigjmp_buf *)co->entry_arg, 1);
     }
 
+    UC_TRACE("After first siglongjmp");
+
     finish_switch_fiber(fake_stack_save);
 
     while (true) {
         co->entry(co->entry_arg);
+        UC_TRACE("switch from co %p to caller co %p fiber %p\n",
+                 co, co->caller, self->tsan_caller_fiber);
         qemu_coroutine_switch(co, co->caller, COROUTINE_TERMINATE);
     }
 }
 
 Coroutine *qemu_coroutine_new(void)
 {
+    UC_TRACE("Start new coroutine");
     CoroutineUContext *co;
     ucontext_t old_uc, uc;
     sigjmp_buf old_env;
@@ -154,12 +217,16 @@ Coroutine *qemu_coroutine_new(void)
 
     arg.p = co;
 
+    on_new_fiber(co);
     makecontext(&uc, (void (*)(void))coroutine_trampoline,
                 2, arg.i[0], arg.i[1]);
 
     /* swapcontext() in, siglongjmp() back out */
     if (!sigsetjmp(old_env, 0)) {
-        start_switch_fiber(&fake_stack_save, co->stack, co->stack_size);
+        start_switch_fiber(
+            COROUTINE_YIELD,
+            &fake_stack_save,
+            co->stack, co->stack_size, co->tsan_co_fiber);
         swapcontext(&old_uc, &uc);
     }
 
@@ -185,6 +252,7 @@ static inline void valgrind_stack_deregister(CoroutineUContext *co)
 
 void qemu_coroutine_delete(Coroutine *co_)
 {
+    UC_TRACE("Nuking co %p from orbit", co_);
     CoroutineUContext *co = DO_UPCAST(CoroutineUContext, base, co_);
 
 #ifdef CONFIG_VALGRIND_H
@@ -209,6 +277,10 @@ qemu_coroutine_switch(Coroutine *from_, Coroutine *to_,
 {
     CoroutineUContext *from = DO_UPCAST(CoroutineUContext, base, from_);
     CoroutineUContext *to = DO_UPCAST(CoroutineUContext, base, to_);
+    UC_TRACE("from to: %p %p uc: %p %p. fibers: %p %p caller fibers: %p %p\n",
+            from_, to_, from, to,
+            from->tsan_co_fiber, to->tsan_co_fiber,
+            from->tsan_caller_fiber, to->tsan_caller_fiber);
     int ret;
     void *fake_stack_save = NULL;
 
@@ -216,8 +288,8 @@ qemu_coroutine_switch(Coroutine *from_, Coroutine *to_,
 
     ret = sigsetjmp(from->env, 0);
     if (ret == 0) {
-        start_switch_fiber(action == COROUTINE_TERMINATE ?
-                           NULL : &fake_stack_save, to->stack, to->stack_size);
+        start_switch_fiber(action, &fake_stack_save,
+                           to->stack, to->stack_size, to->tsan_co_fiber);
         siglongjmp(to->env, action);
     }
 
@@ -231,6 +303,13 @@ Coroutine *qemu_coroutine_self(void)
     if (!current) {
         current = &leader.base;
     }
+#ifdef CONFIG_TSAN
+    if (!leader.tsan_co_fiber) {
+        leader.tsan_co_fiber = __tsan_get_current_fiber();
+        UC_TRACE("For co %p set leader co fiber to %p",
+                 current, leader.tsan_co_fiber);
+    }
+#endif
     return current;
 }
 
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 02/19] cpu: convert queued work to a QSIMPLEQ
  2020-05-22 16:07 [PATCH 00/19] Add Thread Sanitizer support to QEMU Robert Foley
  2020-05-22 16:07 ` [PATCH 01/19] configure: add --enable-tsan flag + fiber annotations for coroutine-ucontext Robert Foley
@ 2020-05-22 16:07 ` Robert Foley
  2020-05-24 10:20   ` Philippe Mathieu-Daudé
  2020-05-22 16:07 ` [PATCH 03/19] thread: add qemu_spin_destroy Robert Foley
                   ` (18 subsequent siblings)
  20 siblings, 1 reply; 51+ messages in thread
From: Robert Foley @ 2020-05-22 16:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.puhov, cota, alex.bennee, robert.foley

From: "Emilio G. Cota" <cota@braap.org>

Instead of open-coding it.

While at it, make sure that all accesses to the list are
performed while holding the list's lock.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Robert Foley <robert.foley@linaro.org>
---
 cpus-common.c         | 25 ++++++++-----------------
 cpus.c                | 14 ++++++++++++--
 hw/core/cpu.c         |  1 +
 include/hw/core/cpu.h |  6 +++---
 4 files changed, 24 insertions(+), 22 deletions(-)

diff --git a/cpus-common.c b/cpus-common.c
index 55d5df8923..210fc7fc39 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -97,7 +97,7 @@ void cpu_list_remove(CPUState *cpu)
 }
 
 struct qemu_work_item {
-    struct qemu_work_item *next;
+    QSIMPLEQ_ENTRY(qemu_work_item) node;
     run_on_cpu_func func;
     run_on_cpu_data data;
     bool free, exclusive, done;
@@ -106,13 +106,7 @@ struct qemu_work_item {
 static void queue_work_on_cpu(CPUState *cpu, struct qemu_work_item *wi)
 {
     qemu_mutex_lock(&cpu->work_mutex);
-    if (cpu->queued_work_first == NULL) {
-        cpu->queued_work_first = wi;
-    } else {
-        cpu->queued_work_last->next = wi;
-    }
-    cpu->queued_work_last = wi;
-    wi->next = NULL;
+    QSIMPLEQ_INSERT_TAIL(&cpu->work_list, wi, node);
     wi->done = false;
     qemu_mutex_unlock(&cpu->work_mutex);
 
@@ -306,17 +300,14 @@ void process_queued_cpu_work(CPUState *cpu)
 {
     struct qemu_work_item *wi;
 
-    if (cpu->queued_work_first == NULL) {
+    qemu_mutex_lock(&cpu->work_mutex);
+    if (QSIMPLEQ_EMPTY(&cpu->work_list)) {
+        qemu_mutex_unlock(&cpu->work_mutex);
         return;
     }
-
-    qemu_mutex_lock(&cpu->work_mutex);
-    while (cpu->queued_work_first != NULL) {
-        wi = cpu->queued_work_first;
-        cpu->queued_work_first = wi->next;
-        if (!cpu->queued_work_first) {
-            cpu->queued_work_last = NULL;
-        }
+    while (!QSIMPLEQ_EMPTY(&cpu->work_list)) {
+        wi = QSIMPLEQ_FIRST(&cpu->work_list);
+        QSIMPLEQ_REMOVE_HEAD(&cpu->work_list, node);
         qemu_mutex_unlock(&cpu->work_mutex);
         if (wi->exclusive) {
             /* Running work items outside the BQL avoids the following deadlock:
diff --git a/cpus.c b/cpus.c
index 5670c96bcf..af44027549 100644
--- a/cpus.c
+++ b/cpus.c
@@ -97,9 +97,19 @@ bool cpu_is_stopped(CPUState *cpu)
     return cpu->stopped || !runstate_is_running();
 }
 
+static inline bool cpu_work_list_empty(CPUState *cpu)
+{
+    bool ret;
+
+    qemu_mutex_lock(&cpu->work_mutex);
+    ret = QSIMPLEQ_EMPTY(&cpu->work_list);
+    qemu_mutex_unlock(&cpu->work_mutex);
+    return ret;
+}
+
 static bool cpu_thread_is_idle(CPUState *cpu)
 {
-    if (cpu->stop || cpu->queued_work_first) {
+    if (cpu->stop || !cpu_work_list_empty(cpu)) {
         return false;
     }
     if (cpu_is_stopped(cpu)) {
@@ -1498,7 +1508,7 @@ static void *qemu_tcg_rr_cpu_thread_fn(void *arg)
             cpu = first_cpu;
         }
 
-        while (cpu && !cpu->queued_work_first && !cpu->exit_request) {
+        while (cpu && cpu_work_list_empty(cpu) && !cpu->exit_request) {
 
             atomic_mb_set(&tcg_current_rr_cpu, cpu);
             current_cpu = cpu;
diff --git a/hw/core/cpu.c b/hw/core/cpu.c
index 5284d384fb..77703d62b7 100644
--- a/hw/core/cpu.c
+++ b/hw/core/cpu.c
@@ -368,6 +368,7 @@ static void cpu_common_initfn(Object *obj)
     cpu->nr_threads = 1;
 
     qemu_mutex_init(&cpu->work_mutex);
+    QSIMPLEQ_INIT(&cpu->work_list);
     QTAILQ_INIT(&cpu->breakpoints);
     QTAILQ_INIT(&cpu->watchpoints);
 
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 07f7698155..d78ff1d165 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -331,8 +331,8 @@ struct qemu_work_item;
  * @opaque: User data.
  * @mem_io_pc: Host Program Counter at which the memory was accessed.
  * @kvm_fd: vCPU file descriptor for KVM.
- * @work_mutex: Lock to prevent multiple access to queued_work_*.
- * @queued_work_first: First asynchronous work pending.
+ * @work_mutex: Lock to prevent multiple access to @work_list.
+ * @work_list: List of pending asynchronous work.
  * @trace_dstate_delayed: Delayed changes to trace_dstate (includes all changes
  *                        to @trace_dstate).
  * @trace_dstate: Dynamic tracing state of events for this vCPU (bitmask).
@@ -376,7 +376,7 @@ struct CPUState {
     sigjmp_buf jmp_env;
 
     QemuMutex work_mutex;
-    struct qemu_work_item *queued_work_first, *queued_work_last;
+    QSIMPLEQ_HEAD(, qemu_work_item) work_list;
 
     CPUAddressSpace *cpu_ases;
     int num_ases;
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 03/19] thread: add qemu_spin_destroy
  2020-05-22 16:07 [PATCH 00/19] Add Thread Sanitizer support to QEMU Robert Foley
  2020-05-22 16:07 ` [PATCH 01/19] configure: add --enable-tsan flag + fiber annotations for coroutine-ucontext Robert Foley
  2020-05-22 16:07 ` [PATCH 02/19] cpu: convert queued work to a QSIMPLEQ Robert Foley
@ 2020-05-22 16:07 ` Robert Foley
  2020-05-22 16:07 ` [PATCH 04/19] cputlb: destroy CPUTLB with tlb_destroy Robert Foley
                   ` (17 subsequent siblings)
  20 siblings, 0 replies; 51+ messages in thread
From: Robert Foley @ 2020-05-22 16:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.puhov, cota, alex.bennee, robert.foley

From: "Emilio G. Cota" <cota@braap.org>

It will be used for TSAN annotations.

Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Robert Foley <robert.foley@linaro.org>
---
 include/qemu/thread.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/qemu/thread.h b/include/qemu/thread.h
index d22848138e..e50a073889 100644
--- a/include/qemu/thread.h
+++ b/include/qemu/thread.h
@@ -215,6 +215,9 @@ static inline void qemu_spin_init(QemuSpin *spin)
     __sync_lock_release(&spin->value);
 }
 
+static inline void qemu_spin_destroy(QemuSpin *spin)
+{ }
+
 static inline void qemu_spin_lock(QemuSpin *spin)
 {
     while (unlikely(__sync_lock_test_and_set(&spin->value, true))) {
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 04/19] cputlb: destroy CPUTLB with tlb_destroy
  2020-05-22 16:07 [PATCH 00/19] Add Thread Sanitizer support to QEMU Robert Foley
                   ` (2 preceding siblings ...)
  2020-05-22 16:07 ` [PATCH 03/19] thread: add qemu_spin_destroy Robert Foley
@ 2020-05-22 16:07 ` Robert Foley
  2020-05-22 16:07 ` [PATCH 05/19] qht: call qemu_spin_destroy for head buckets Robert Foley
                   ` (16 subsequent siblings)
  20 siblings, 0 replies; 51+ messages in thread
From: Robert Foley @ 2020-05-22 16:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.puhov, cota, alex.bennee, robert.foley

From: "Emilio G. Cota" <cota@braap.org>

I was after adding qemu_spin_destroy calls, but while at
it I noticed that we are leaking some memory.

Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Robert Foley <robert.foley@linaro.org>
---
 accel/tcg/cputlb.c      | 15 +++++++++++++++
 exec.c                  |  1 +
 include/exec/exec-all.h |  8 ++++++++
 3 files changed, 24 insertions(+)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index eb2cf9de5e..1e815357c7 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -270,6 +270,21 @@ void tlb_init(CPUState *cpu)
     }
 }
 
+void tlb_destroy(CPUState *cpu)
+{
+    CPUArchState *env = cpu->env_ptr;
+    int i;
+
+    qemu_spin_destroy(&env_tlb(env)->c.lock);
+    for (i = 0; i < NB_MMU_MODES; i++) {
+        CPUTLBDesc *desc = &env_tlb(env)->d[i];
+        CPUTLBDescFast *fast = &env_tlb(env)->f[i];
+
+        g_free(fast->table);
+        g_free(desc->iotlb);
+    }
+}
+
 /* flush_all_helper: run fn across all cpus
  *
  * If the wait flag is set then the src cpu's helper will be queued as
diff --git a/exec.c b/exec.c
index 5162f0d12f..da3d60b034 100644
--- a/exec.c
+++ b/exec.c
@@ -892,6 +892,7 @@ void cpu_exec_unrealizefn(CPUState *cpu)
 {
     CPUClass *cc = CPU_GET_CLASS(cpu);
 
+    tlb_destroy(cpu);
     cpu_list_remove(cpu);
 
     if (cc->vmsd != NULL) {
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 8792bea07a..3cf88272df 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -124,6 +124,11 @@ void cpu_address_space_init(CPUState *cpu, int asidx,
  * @cpu: CPU whose TLB should be initialized
  */
 void tlb_init(CPUState *cpu);
+/**
+ * tlb_destroy - destroy a CPU's TLB
+ * @cpu: CPU whose TLB should be destroyed
+ */
+void tlb_destroy(CPUState *cpu);
 /**
  * tlb_flush_page:
  * @cpu: CPU whose TLB should be flushed
@@ -284,6 +289,9 @@ void tlb_set_page(CPUState *cpu, target_ulong vaddr,
 static inline void tlb_init(CPUState *cpu)
 {
 }
+static inline void tlb_destroy(CPUState *cpu)
+{
+}
 static inline void tlb_flush_page(CPUState *cpu, target_ulong addr)
 {
 }
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 05/19] qht: call qemu_spin_destroy for head buckets
  2020-05-22 16:07 [PATCH 00/19] Add Thread Sanitizer support to QEMU Robert Foley
                   ` (3 preceding siblings ...)
  2020-05-22 16:07 ` [PATCH 04/19] cputlb: destroy CPUTLB with tlb_destroy Robert Foley
@ 2020-05-22 16:07 ` Robert Foley
  2020-05-22 16:07 ` [PATCH 06/19] tcg: call qemu_spin_destroy for tb->jmp_lock Robert Foley
                   ` (15 subsequent siblings)
  20 siblings, 0 replies; 51+ messages in thread
From: Robert Foley @ 2020-05-22 16:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.puhov, cota, alex.bennee, robert.foley

From: "Emilio G. Cota" <cota@braap.org>

Signed-off-by: Robert Foley <robert.foley@linaro.org>
---
 util/qht.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/util/qht.c b/util/qht.c
index aa51be3c52..67e5d5b916 100644
--- a/util/qht.c
+++ b/util/qht.c
@@ -348,6 +348,7 @@ static inline void qht_chain_destroy(const struct qht_bucket *head)
     struct qht_bucket *curr = head->next;
     struct qht_bucket *prev;
 
+    qemu_spin_destroy(&head->lock);
     while (curr) {
         prev = curr;
         curr = curr->next;
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 06/19] tcg: call qemu_spin_destroy for tb->jmp_lock
  2020-05-22 16:07 [PATCH 00/19] Add Thread Sanitizer support to QEMU Robert Foley
                   ` (4 preceding siblings ...)
  2020-05-22 16:07 ` [PATCH 05/19] qht: call qemu_spin_destroy for head buckets Robert Foley
@ 2020-05-22 16:07 ` Robert Foley
  2020-05-22 16:07 ` [PATCH 07/19] translate-all: call qemu_spin_destroy for PageDesc Robert Foley
                   ` (14 subsequent siblings)
  20 siblings, 0 replies; 51+ messages in thread
From: Robert Foley @ 2020-05-22 16:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.puhov, cota, alex.bennee, robert.foley

From: "Emilio G. Cota" <cota@braap.org>

Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Robert Foley <robert.foley@linaro.org>
[RF: Minor changes to fix some checkpatch errors]
---
 accel/tcg/translate-all.c | 10 +++++++++-
 include/tcg/tcg.h         |  3 ++-
 tcg/tcg.c                 | 19 ++++++++++++++++---
 3 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 42ce1dfcff..3708aab36b 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -384,6 +384,11 @@ static int cpu_restore_state_from_tb(CPUState *cpu, TranslationBlock *tb,
     return 0;
 }
 
+static void tb_destroy(TranslationBlock *tb)
+{
+    qemu_spin_destroy(&tb->jmp_lock);
+}
+
 bool cpu_restore_state(CPUState *cpu, uintptr_t host_pc, bool will_exit)
 {
     TranslationBlock *tb;
@@ -413,6 +418,7 @@ bool cpu_restore_state(CPUState *cpu, uintptr_t host_pc, bool will_exit)
                 /* one-shot translation, invalidate it immediately */
                 tb_phys_invalidate(tb, -1);
                 tcg_tb_remove(tb);
+                tb_destroy(tb);
             }
             r = true;
         }
@@ -1230,7 +1236,7 @@ static void do_tb_flush(CPUState *cpu, run_on_cpu_data tb_flush_count)
     qht_reset_size(&tb_ctx.htable, CODE_GEN_HTABLE_SIZE);
     page_flush_tb();
 
-    tcg_region_reset_all();
+    tcg_region_reset_all(tb_destroy);
     /* XXX: flush processor icache at this point if cache flush is
        expensive */
     atomic_mb_set(&tb_ctx.tb_flush_count, tb_ctx.tb_flush_count + 1);
@@ -1886,6 +1892,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
 
         orig_aligned -= ROUND_UP(sizeof(*tb), qemu_icache_linesize);
         atomic_set(&tcg_ctx->code_gen_ptr, (void *)orig_aligned);
+        tb_destroy(tb);
         return existing_tb;
     }
     tcg_tb_insert(tb);
@@ -2235,6 +2242,7 @@ void cpu_io_recompile(CPUState *cpu, uintptr_t retaddr)
             tb_phys_invalidate(tb->orig_tb, -1);
         }
         tcg_tb_remove(tb);
+        tb_destroy(tb);
     }
 
     /* TODO: If env->pc != tb->pc (i.e. the faulting instruction was not
diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index c48bd76b0a..2c5997e14e 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -815,8 +815,9 @@ void *tcg_malloc_internal(TCGContext *s, int size);
 void tcg_pool_reset(TCGContext *s);
 TranslationBlock *tcg_tb_alloc(TCGContext *s);
 
+typedef void (*tb_destroy_func)(TranslationBlock *tb);
 void tcg_region_init(void);
-void tcg_region_reset_all(void);
+void tcg_region_reset_all(tb_destroy_func tb_destroy);
 
 size_t tcg_code_size(void);
 size_t tcg_code_capacity(void);
diff --git a/tcg/tcg.c b/tcg/tcg.c
index a2268d9db0..2680968683 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -502,7 +502,16 @@ size_t tcg_nb_tbs(void)
     return nb_tbs;
 }
 
-static void tcg_region_tree_reset_all(void)
+static gboolean tcg_region_tree_traverse(gpointer k, gpointer v, gpointer data)
+{
+    TranslationBlock *tb = v;
+    tb_destroy_func tb_destroy = data;
+
+    tb_destroy(tb);
+    return FALSE;
+}
+
+static void tcg_region_tree_reset_all(tb_destroy_func tb_destroy)
 {
     size_t i;
 
@@ -510,6 +519,10 @@ static void tcg_region_tree_reset_all(void)
     for (i = 0; i < region.n; i++) {
         struct tcg_region_tree *rt = region_trees + i * tree_size;
 
+        if (tb_destroy != NULL) {
+            g_tree_foreach(rt->tree, tcg_region_tree_traverse, tb_destroy);
+        }
+
         /* Increment the refcount first so that destroy acts as a reset */
         g_tree_ref(rt->tree);
         g_tree_destroy(rt->tree);
@@ -586,7 +599,7 @@ static inline bool tcg_region_initial_alloc__locked(TCGContext *s)
 }
 
 /* Call from a safe-work context */
-void tcg_region_reset_all(void)
+void tcg_region_reset_all(tb_destroy_func tb_destroy)
 {
     unsigned int n_ctxs = atomic_read(&n_tcg_ctxs);
     unsigned int i;
@@ -603,7 +616,7 @@ void tcg_region_reset_all(void)
     }
     qemu_mutex_unlock(&region.lock);
 
-    tcg_region_tree_reset_all();
+    tcg_region_tree_reset_all(tb_destroy);
 }
 
 #ifdef CONFIG_USER_ONLY
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 07/19] translate-all: call qemu_spin_destroy for PageDesc
  2020-05-22 16:07 [PATCH 00/19] Add Thread Sanitizer support to QEMU Robert Foley
                   ` (5 preceding siblings ...)
  2020-05-22 16:07 ` [PATCH 06/19] tcg: call qemu_spin_destroy for tb->jmp_lock Robert Foley
@ 2020-05-22 16:07 ` Robert Foley
  2020-05-22 16:07 ` [PATCH 08/19] thread: add tsan annotations to QemuSpin Robert Foley
                   ` (13 subsequent siblings)
  20 siblings, 0 replies; 51+ messages in thread
From: Robert Foley @ 2020-05-22 16:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.puhov, cota, alex.bennee, robert.foley

From: "Emilio G. Cota" <cota@braap.org>

The radix tree is append-only, but we can fail to insert
a PageDesc if the insertion races with another thread.

Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Robert Foley <robert.foley@linaro.org>
---
 accel/tcg/translate-all.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 3708aab36b..3fb71a1503 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -547,6 +547,15 @@ static PageDesc *page_find_alloc(tb_page_addr_t index, int alloc)
 #endif
         existing = atomic_cmpxchg(lp, NULL, pd);
         if (unlikely(existing)) {
+#ifndef CONFIG_USER_ONLY
+            {
+                int i;
+
+                for (i = 0; i < V_L2_SIZE; i++) {
+                    qemu_spin_destroy(&pd[i].lock);
+                }
+            }
+#endif
             g_free(pd);
             pd = existing;
         }
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 08/19] thread: add tsan annotations to QemuSpin
  2020-05-22 16:07 [PATCH 00/19] Add Thread Sanitizer support to QEMU Robert Foley
                   ` (6 preceding siblings ...)
  2020-05-22 16:07 ` [PATCH 07/19] translate-all: call qemu_spin_destroy for PageDesc Robert Foley
@ 2020-05-22 16:07 ` Robert Foley
  2020-05-22 16:07 ` [PATCH 09/19] tests/docker: Added docker build support for TSan Robert Foley
                   ` (12 subsequent siblings)
  20 siblings, 0 replies; 51+ messages in thread
From: Robert Foley @ 2020-05-22 16:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.puhov, cota, alex.bennee, robert.foley

From: "Emilio G. Cota" <cota@braap.org>

Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Robert Foley <robert.foley@linaro.org>
---
 include/qemu/thread.h | 39 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 36 insertions(+), 3 deletions(-)

diff --git a/include/qemu/thread.h b/include/qemu/thread.h
index e50a073889..43fc094b96 100644
--- a/include/qemu/thread.h
+++ b/include/qemu/thread.h
@@ -206,6 +206,10 @@ void qemu_thread_atexit_add(struct Notifier *notifier);
  */
 void qemu_thread_atexit_remove(struct Notifier *notifier);
 
+#ifdef CONFIG_TSAN
+#include <sanitizer/tsan_interface.h>
+#endif
+
 struct QemuSpin {
     int value;
 };
@@ -213,23 +217,46 @@ struct QemuSpin {
 static inline void qemu_spin_init(QemuSpin *spin)
 {
     __sync_lock_release(&spin->value);
+#ifdef CONFIG_TSAN
+    __tsan_mutex_create(spin, __tsan_mutex_not_static);
+#endif
 }
 
-static inline void qemu_spin_destroy(QemuSpin *spin)
-{ }
+/* const parameter because the only purpose here is the TSAN annotation */
+static inline void qemu_spin_destroy(const QemuSpin *spin)
+{
+#ifdef CONFIG_TSAN
+    __tsan_mutex_destroy((void *)spin, __tsan_mutex_not_static);
+#endif
+}
 
 static inline void qemu_spin_lock(QemuSpin *spin)
 {
+#ifdef CONFIG_TSAN
+    __tsan_mutex_pre_lock(spin, 0);
+#endif
     while (unlikely(__sync_lock_test_and_set(&spin->value, true))) {
         while (atomic_read(&spin->value)) {
             cpu_relax();
         }
     }
+#ifdef CONFIG_TSAN
+    __tsan_mutex_post_lock(spin, 0, 0);
+#endif
 }
 
 static inline bool qemu_spin_trylock(QemuSpin *spin)
 {
-    return __sync_lock_test_and_set(&spin->value, true);
+#ifdef CONFIG_TSAN
+    __tsan_mutex_pre_lock(spin, __tsan_mutex_try_lock);
+#endif
+    bool busy = __sync_lock_test_and_set(&spin->value, true);
+#ifdef CONFIG_TSAN
+    unsigned flags = __tsan_mutex_try_lock;
+    flags |= busy ? __tsan_mutex_try_lock_failed : 0;
+    __tsan_mutex_post_lock(spin, flags, 0);
+#endif
+    return busy;
 }
 
 static inline bool qemu_spin_locked(QemuSpin *spin)
@@ -239,7 +266,13 @@ static inline bool qemu_spin_locked(QemuSpin *spin)
 
 static inline void qemu_spin_unlock(QemuSpin *spin)
 {
+#ifdef CONFIG_TSAN
+    __tsan_mutex_pre_unlock(spin, 0);
+#endif
     __sync_lock_release(&spin->value);
+#ifdef CONFIG_TSAN
+    __tsan_mutex_post_unlock(spin, 0);
+#endif
 }
 
 struct QemuLockCnt {
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 09/19] tests/docker: Added docker build support for TSan.
  2020-05-22 16:07 [PATCH 00/19] Add Thread Sanitizer support to QEMU Robert Foley
                   ` (7 preceding siblings ...)
  2020-05-22 16:07 ` [PATCH 08/19] thread: add tsan annotations to QemuSpin Robert Foley
@ 2020-05-22 16:07 ` Robert Foley
  2020-05-22 16:07 ` [PATCH 10/19] include/qemu: Added tsan.h for annotations Robert Foley
                   ` (11 subsequent siblings)
  20 siblings, 0 replies; 51+ messages in thread
From: Robert Foley @ 2020-05-22 16:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: Fam Zheng, robert.foley, Philippe Mathieu-Daudé,
	cota, peter.puhov, alex.bennee

Added a new docker for ubuntu 20.04.
This docker has support for Thread Sanitizer
including one patch we need in one of the header files.
https://github.com/llvm/llvm-project/commit/a72dc86cd

This command will build with tsan enabled:
make docker-test-build-ubuntu2004 V=1 TSAN=1

Also added the TSAN suppresion file to disable certain
cases of TSAN warnings.

Cc: Fam Zheng <fam@euphon.net>
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Robert Foley <robert.foley@linaro.org>
---
 tests/docker/Makefile.include              |  2 +
 tests/docker/common.rc                     | 19 +++++++
 tests/docker/dockerfiles/ubuntu2004.docker | 65 ++++++++++++++++++++++
 tests/tsan/suppressions.tsan               | 14 +++++
 4 files changed, 100 insertions(+)
 create mode 100644 tests/docker/dockerfiles/ubuntu2004.docker
 create mode 100644 tests/tsan/suppressions.tsan

diff --git a/tests/docker/Makefile.include b/tests/docker/Makefile.include
index 43a8678688..e029e54b42 100644
--- a/tests/docker/Makefile.include
+++ b/tests/docker/Makefile.include
@@ -202,6 +202,7 @@ endif
 	@echo '                         (default is 1)'
 	@echo '    DEBUG=1              Stop and drop to shell in the created container'
 	@echo '                         before running the command.'
+	@echo '    TSAN=1               Enable use of tsan during the build/test.'
 	@echo '    NETWORK=1            Enable virtual network interface with default backend.'
 	@echo '    NETWORK=$$BACKEND     Enable virtual network interface with $$BACKEND.'
 	@echo '    NOUSER               Define to disable adding current user to containers passwd.'
@@ -239,6 +240,7 @@ docker-run: docker-qemu-src
 			-e EXTRA_CONFIGURE_OPTS="$(EXTRA_CONFIGURE_OPTS)" \
 			-e V=$V -e J=$J -e DEBUG=$(DEBUG)		\
 			-e SHOW_ENV=$(SHOW_ENV) 			\
+	                $(if $(TSAN),,-e TSAN=$(TSAN))		        \
 			$(if $(NOUSER),,				\
 				-e CCACHE_DIR=/var/tmp/ccache 		\
 				-v $(DOCKER_CCACHE_DIR):/var/tmp/ccache:z \
diff --git a/tests/docker/common.rc b/tests/docker/common.rc
index 02cd67a8c5..5df93c6326 100755
--- a/tests/docker/common.rc
+++ b/tests/docker/common.rc
@@ -27,6 +27,25 @@ requires()
 
 configure_qemu()
 {
+    if test -z "$TSAN"; then
+        requires clang tsan
+        echo "Including TSan Support"
+        tsan_log_dir="/tmp/qemu-test/build/tsan"
+        mkdir -p $tsan_log_dir > /dev/null || true
+        EXTRA_CONFIGURE_OPTS="${EXTRA_CONFIGURE_OPTS} --enable-tsan \
+                             --cc=clang-10 --cxx=clang++-10 \
+                             --disable-werror --extra-cflags=-O0"
+        # detect deadlocks is false currently simply because
+        # TSan crashes immediately with deadlock detecter enabled.
+        # We have maxed out the history size to get the best chance of finding
+        # warnings during testing.
+        # Note, to get tsan to fail on warning, use exitcode=66 below.
+        tsan_opts="suppressions=/tmp/qemu-test/src/tests/tsan/suppressions.tsan\
+                   detect_deadlocks=false history_size=7\
+                   halt_on_error=0 exitcode=0 verbose=5\
+                   log_path=$tsan_log_dir/tsan_warnings.txt"
+        export TSAN_OPTIONS="$tsan_opts"
+    fi
     config_opts="--enable-werror \
                  ${TARGET_LIST:+--target-list=${TARGET_LIST}} \
                  --prefix=$INSTALL_DIR \
diff --git a/tests/docker/dockerfiles/ubuntu2004.docker b/tests/docker/dockerfiles/ubuntu2004.docker
new file mode 100644
index 0000000000..6050ce7e8a
--- /dev/null
+++ b/tests/docker/dockerfiles/ubuntu2004.docker
@@ -0,0 +1,65 @@
+FROM ubuntu:20.04
+ENV PACKAGES flex bison \
+    ccache \
+    clang-10\
+    gcc \
+    gettext \
+    git \
+    glusterfs-common \
+    libaio-dev \
+    libattr1-dev \
+    libbrlapi-dev \
+    libbz2-dev \
+    libcacard-dev \
+    libcap-ng-dev \
+    libcurl4-gnutls-dev \
+    libdrm-dev \
+    libepoxy-dev \
+    libfdt-dev \
+    libgbm-dev \
+    libgtk-3-dev \
+    libibverbs-dev \
+    libiscsi-dev \
+    libjemalloc-dev \
+    libjpeg-turbo8-dev \
+    liblzo2-dev \
+    libncurses5-dev \
+    libncursesw5-dev \
+    libnfs-dev \
+    libnss3-dev \
+    libnuma-dev \
+    libpixman-1-dev \
+    librados-dev \
+    librbd-dev \
+    librdmacm-dev \
+    libsasl2-dev \
+    libsdl2-dev \
+    libseccomp-dev \
+    libsnappy-dev \
+    libspice-protocol-dev \
+    libspice-server-dev \
+    libssh-dev \
+    libusb-1.0-0-dev \
+    libusbredirhost-dev \
+    libvdeplug-dev \
+    libvte-2.91-dev \
+    libxen-dev \
+    libzstd-dev \
+    make \
+    python3-yaml \
+    python3-sphinx \
+    sparse \
+    texinfo \
+    xfslibs-dev\
+    vim
+RUN apt-get update && \
+    DEBIAN_FRONTEND=noninteractive apt-get -y install $PACKAGES
+RUN dpkg -l $PACKAGES | sort > /packages.txt
+ENV FEATURES clang tsan pyyaml sdl2
+
+# https://bugs.launchpad.net/qemu/+bug/1838763
+ENV QEMU_CONFIGURE_OPTS --disable-libssh
+
+# Apply patch https://reviews.llvm.org/D75820
+# This is required for TSan in clang-10 to compile with QEMU.
+RUN sed -i 's/^const/static const/g' /usr/lib/llvm-10/lib/clang/10.0.0/include/sanitizer/tsan_interface.h
diff --git a/tests/tsan/suppressions.tsan b/tests/tsan/suppressions.tsan
new file mode 100644
index 0000000000..975b0c2934
--- /dev/null
+++ b/tests/tsan/suppressions.tsan
@@ -0,0 +1,14 @@
+# TSan reports a double lock on RECURSIVE mutexes.
+mutex:aio_context_acquire
+mutex:pthread_mutex_lock
+
+# TSan reports a race betwen pthread_mutex_init() and
+# pthread_mutex_lock()
+race:pthread_mutex_init
+race:pthread_mutex_lock
+
+# TSan is unhappy about these load and store operations.
+race:bswap.h
+race:store_helper
+race:load_helper
+race:tb_set_jmp_target
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 10/19] include/qemu: Added tsan.h for annotations.
  2020-05-22 16:07 [PATCH 00/19] Add Thread Sanitizer support to QEMU Robert Foley
                   ` (8 preceding siblings ...)
  2020-05-22 16:07 ` [PATCH 09/19] tests/docker: Added docker build support for TSan Robert Foley
@ 2020-05-22 16:07 ` Robert Foley
  2020-05-23 17:20   ` Emilio G. Cota
  2020-05-22 16:07 ` [PATCH 11/19] accel/tcg: Fixed tsan warnings related to parallel_cpus Robert Foley
                   ` (10 subsequent siblings)
  20 siblings, 1 reply; 51+ messages in thread
From: Robert Foley @ 2020-05-22 16:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.puhov, cota, alex.bennee, robert.foley

These annotations will allow us to give tsan
additional hints.  For example, we can inform
tsan about reads/writes to ignore to silence certain
classes of warnings.
We can also annotate threads so that the proper thread
naming shows up in tsan warning results.

Signed-off-by: Robert Foley <robert.foley@linaro.org>
---
 include/qemu/tsan.h | 48 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)
 create mode 100644 include/qemu/tsan.h

diff --git a/include/qemu/tsan.h b/include/qemu/tsan.h
new file mode 100644
index 0000000000..8b7d0acf3e
--- /dev/null
+++ b/include/qemu/tsan.h
@@ -0,0 +1,48 @@
+#ifndef QEMU_TSAN_H
+#define QEMU_TSAN_H
+/*
+ * tsan.h
+ *
+ * This file defines macros used to give ThreadSanitizer
+ * additional information to help suppress warnings.
+ * This is necessary since TSan does not provide a header file
+ * for these annotations.  The standard way to include these
+ * is via the below macros.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifdef CONFIG_TSAN
+#define TSAN_ANNOTATE_HAPPENS_BEFORE(addr) \
+    AnnotateHappensBefore(__FILE__, __LINE__, (void *)(addr))
+#define TSAN_ANNOTATE_HAPPENS_AFTER(addr) \
+    AnnotateHappensAfter(__FILE__, __LINE__, (void *)(addr))
+#define TSAN_ANNOTATE_THREAD_NAME(name) \
+    AnnotateThreadName(__FILE__, __LINE__, (void *)(name))
+#define TSAN_ANNOTATE_IGNORE_READS_BEGIN() \
+    AnnotateIgnoreReadsBegin(__FILE__, __LINE__)
+#define TSAN_ANNOTATE_IGNORE_READS_END() \
+    AnnotateIgnoreReadsEnd(__FILE__, __LINE__)
+#define TSAN_ANNOTATE_IGNORE_WRITES_BEGIN() \
+    AnnotateIgnoreWritesBegin(__FILE__, __LINE__)
+#define TSAN_ANNOTATE_IGNORE_WRITES_END() \
+    AnnotateIgnoreWritesEnd(__FILE__, __LINE__)
+#else
+#define TSAN_ANNOTATE_HAPPENS_BEFORE(addr)
+#define TSAN_ANNOTATE_HAPPENS_AFTER(addr)
+#define TSAN_ANNOTATE_THREAD_NAME(name)
+#define TSAN_ANNOTATE_IGNORE_READS_BEGIN()
+#define TSAN_ANNOTATE_IGNORE_READS_END()
+#define TSAN_ANNOTATE_IGNORE_WRITES_BEGIN()
+#define TSAN_ANNOTATE_IGNORE_WRITES_END()
+#endif
+
+void AnnotateHappensBefore(const char *f, int l, void *addr);
+void AnnotateHappensAfter(const char *f, int l, void *addr);
+void AnnotateThreadName(const char *f, int l, char *name);
+void AnnotateIgnoreReadsBegin(const char *f, int l);
+void AnnotateIgnoreReadsEnd(const char *f, int l);
+void AnnotateIgnoreWritesBegin(const char *f, int l);
+void AnnotateIgnoreWritesEnd(const char *f, int l);
+#endif
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 11/19] accel/tcg: Fixed tsan warnings related to parallel_cpus
  2020-05-22 16:07 [PATCH 00/19] Add Thread Sanitizer support to QEMU Robert Foley
                   ` (9 preceding siblings ...)
  2020-05-22 16:07 ` [PATCH 10/19] include/qemu: Added tsan.h for annotations Robert Foley
@ 2020-05-22 16:07 ` Robert Foley
  2020-05-23 17:21   ` Emilio G. Cota
  2020-05-22 16:07 ` [PATCH 12/19] configure: added tsan support for blacklist Robert Foley
                   ` (9 subsequent siblings)
  20 siblings, 1 reply; 51+ messages in thread
From: Robert Foley @ 2020-05-22 16:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: robert.foley, Richard Henderson, cota, Paolo Bonzini,
	peter.puhov, alex.bennee

Fixed several tsan warnings. e.g.

WARNING: ThreadSanitizer: data race (pid=35425)
  Read of size 1 at 0x557cd83aee28 by thread T7:
    #0 curr_cflags include/exec/exec-all.h:460:13 (qemu-system-aarch64+0x4b7f27)
    #1 cpu_exec accel/tcg/cpu-exec.c:730:26 (qemu-system-aarch64+0x4b7f27)
    #2 tcg_cpu_exec cpus.c:1415:11 (qemu-system-aarch64+0x45b9b6)
    #3 qemu_tcg_cpu_thread_fn cpus.c:1723:17 (qemu-system-aarch64+0x45b9b6)
    #4 qemu_thread_start util/qemu-thread-posix.c:519:9 (qemu-system-aarch64+0xd431e0)

  Previous write of size 1 at 0x557cd83aee28 by thread T6:
    #0 cpu_exec_step_atomic accel/tcg/cpu-exec.c:254:23 (qemu-system-aarch64+0x4b6caa)
    #1 qemu_tcg_cpu_thread_fn cpus.c:1741:17 (qemu-system-aarch64+0x45baca)
    #2 qemu_thread_start util/qemu-thread-posix.c:519:9 (qemu-system-aarch64+0xd431e0)

  Location is global 'parallel_cpus' of size 1 at 0x557cd83aee28 (qemu-system-aarch64+0x000001fb3e28)

Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Robert Foley <robert.foley@linaro.org>
---
 accel/tcg/cpu-exec.c    | 4 ++--
 cpus.c                  | 2 +-
 include/exec/exec-all.h | 2 +-
 linux-user/syscall.c    | 4 ++--
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index d95c4848a4..4cbdef1373 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -250,7 +250,7 @@ void cpu_exec_step_atomic(CPUState *cpu)
         }
 
         /* Since we got here, we know that parallel_cpus must be true.  */
-        parallel_cpus = false;
+        atomic_set(&parallel_cpus, false);
         cc->cpu_exec_enter(cpu);
         /* execute the generated code */
         trace_exec_tb(tb, pc);
@@ -278,7 +278,7 @@ void cpu_exec_step_atomic(CPUState *cpu)
      * the execution.
      */
     g_assert(cpu_in_exclusive_context(cpu));
-    parallel_cpus = true;
+    atomic_set(&parallel_cpus, true);
     end_exclusive();
 }
 
diff --git a/cpus.c b/cpus.c
index af44027549..c5d04486a8 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1966,7 +1966,7 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
 
         if (qemu_tcg_mttcg_enabled()) {
             /* create a thread per vCPU with TCG (MTTCG) */
-            parallel_cpus = true;
+            atomic_set(&parallel_cpus, true);
             snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG",
                  cpu->cpu_index);
 
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 3cf88272df..3f2c0290e1 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -496,7 +496,7 @@ static inline uint32_t tb_cflags(const TranslationBlock *tb)
 /* current cflags for hashing/comparison */
 static inline uint32_t curr_cflags(void)
 {
-    return (parallel_cpus ? CF_PARALLEL : 0)
+    return (atomic_read(&parallel_cpus) ? CF_PARALLEL : 0)
          | (use_icount ? CF_USE_ICOUNT : 0);
 }
 
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 05f03919ff..8e39c09c5d 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -6022,8 +6022,8 @@ static int do_fork(CPUArchState *env, unsigned int flags, abi_ulong newsp,
         /* If this is our first additional thread, we need to ensure we
          * generate code for parallel execution and flush old translations.
          */
-        if (!parallel_cpus) {
-            parallel_cpus = true;
+        if (!atomic_read(&parallel_cpus)) {
+            atomic_set(&parallel_cpus, true);
             tb_flush(cpu);
         }
 
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 12/19] configure: added tsan support for blacklist.
  2020-05-22 16:07 [PATCH 00/19] Add Thread Sanitizer support to QEMU Robert Foley
                   ` (10 preceding siblings ...)
  2020-05-22 16:07 ` [PATCH 11/19] accel/tcg: Fixed tsan warnings related to parallel_cpus Robert Foley
@ 2020-05-22 16:07 ` Robert Foley
  2020-05-23 17:27   ` Emilio G. Cota
  2020-05-22 16:07 ` [PATCH 13/19] accel/tcg: Fixed tsan warnings Robert Foley
                   ` (8 subsequent siblings)
  20 siblings, 1 reply; 51+ messages in thread
From: Robert Foley @ 2020-05-22 16:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.puhov, cota, alex.bennee, robert.foley

Initially put several files into blacklist that were
causing the most problems, namely bitops.c and bitmap.c.

Signed-off-by: Robert Foley <robert.foley@linaro.org>
---
 configure                 | 3 ++-
 tests/tsan/blacklist.tsan | 5 +++++
 2 files changed, 7 insertions(+), 1 deletion(-)
 create mode 100644 tests/tsan/blacklist.tsan

diff --git a/configure b/configure
index c95c54fb48..8a86a0638d 100755
--- a/configure
+++ b/configure
@@ -6306,7 +6306,8 @@ if test "$have_asan" = "yes"; then
 fi
 if test "$have_tsan" = "yes" ; then
   if test "$have_tsan_iface_fiber" = "yes" ; then
-    QEMU_CFLAGS="-fsanitize=thread $QEMU_CFLAGS"
+    QEMU_CFLAGS="-fsanitize=thread -fsanitize-blacklist="\
+	        "\$(SRC_PATH)/tests/tsan/blacklist.tsan $QEMU_CFLAGS"
     QEMU_LDFLAGS="-fsanitize=thread $QEMU_LDFLAGS"
   else
     echo "Cannot enable TSAN due to missing fiber annotation interface."
diff --git a/tests/tsan/blacklist.tsan b/tests/tsan/blacklist.tsan
new file mode 100644
index 0000000000..67dd809e96
--- /dev/null
+++ b/tests/tsan/blacklist.tsan
@@ -0,0 +1,5 @@
+# TSan is not happy about setting/getting of dirty bits,
+# for example, cpu_physical_memory_set_dirty_range,
+# and cpu_physical_memory_get_dirty.
+src:bitops.c
+src:bitmap.c
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 13/19] accel/tcg: Fixed tsan warnings.
  2020-05-22 16:07 [PATCH 00/19] Add Thread Sanitizer support to QEMU Robert Foley
                   ` (11 preceding siblings ...)
  2020-05-22 16:07 ` [PATCH 12/19] configure: added tsan support for blacklist Robert Foley
@ 2020-05-22 16:07 ` Robert Foley
  2020-05-23 20:06   ` Emilio G. Cota
  2020-05-26 18:47   ` Paolo Bonzini
  2020-05-22 16:07 ` [PATCH 14/19] util/async: " Robert Foley
                   ` (7 subsequent siblings)
  20 siblings, 2 replies; 51+ messages in thread
From: Robert Foley @ 2020-05-22 16:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: robert.foley, Richard Henderson, cota, Paolo Bonzini,
	peter.puhov, alex.bennee

For example:
WARNING: ThreadSanitizer: data race (pid=35425)
  Write of size 4 at 0x7bbc000000ac by main thread (mutexes: write M875):
    #0 cpu_reset_interrupt hw/core/cpu.c:107:28 (qemu-system-aarch64+0x843790)
    #1 arm_cpu_set_irq target/arm/cpu.c (qemu-system-aarch64+0x616265)
    #2 qemu_set_irq hw/core/irq.c:44:5 (qemu-system-aarch64+0x8462ca)
  Previous atomic read of size 4 at 0x7bbc000000ac by thread T6:
    #0 __tsan_atomic32_load <null> (qemu-system-aarch64+0x394c1c)
    #1 cpu_handle_interrupt accel/tcg/cpu-exec.c:534:9 (qemu-system-aarch64+0x4b7e79)
    #2 cpu_exec accel/tcg/cpu-exec.c:720:17 (qemu-system-aarch64+0x4b7e79)
or
WARNING: ThreadSanitizer: data race (pid=25425)
  Read of size 8 at 0x7f8ad8e138d0 by thread T10:
    #0 tb_lookup_cmp accel/tcg/cpu-exec.c:307:13 (qemu-system-aarch64+0x4ac4d2)
    #1 qht_do_lookup util/qht.c:502:34 (qemu-system-aarch64+0xd05264)
  Previous write of size 8 at 0x7f8ad8e138d0 by thread T15 (mutexes: write M728311726235541804):
    #0 tb_link_page accel/tcg/translate-all.c:1625:26 (qemu-system-aarch64+0x4b0bf2)
    #1 tb_gen_code accel/tcg/translate-all.c:1865:19 (qemu-system-aarch64+0x4b0bf2)
    #2 tb_find accel/tcg/cpu-exec.c:407:14 (qemu-system-aarch64+0x4ad77c)

Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Robert Foley <robert.foley@linaro.org>
---
 accel/tcg/tcg-all.c       | 4 ++--
 accel/tcg/tcg-runtime.c   | 7 ++++++-
 accel/tcg/translate-all.c | 6 +++++-
 hw/core/cpu.c             | 2 +-
 4 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c
index 3b4fda5640..f94ea4c4b3 100644
--- a/accel/tcg/tcg-all.c
+++ b/accel/tcg/tcg-all.c
@@ -54,8 +54,8 @@ static void tcg_handle_interrupt(CPUState *cpu, int mask)
     int old_mask;
     g_assert(qemu_mutex_iothread_locked());
 
-    old_mask = cpu->interrupt_request;
-    cpu->interrupt_request |= mask;
+    old_mask = atomic_read(&cpu->interrupt_request);
+    atomic_or(&cpu->interrupt_request, mask);
 
     /*
      * If called from iothread context, wake the target cpu in
diff --git a/accel/tcg/tcg-runtime.c b/accel/tcg/tcg-runtime.c
index 446465a09a..bd0cd77450 100644
--- a/accel/tcg/tcg-runtime.c
+++ b/accel/tcg/tcg-runtime.c
@@ -31,6 +31,7 @@
 #include "disas/disas.h"
 #include "exec/log.h"
 #include "tcg/tcg.h"
+#include "qemu/tsan.h"
 
 /* 32-bit helpers */
 
@@ -151,6 +152,7 @@ void *HELPER(lookup_tb_ptr)(CPUArchState *env)
     TranslationBlock *tb;
     target_ulong cs_base, pc;
     uint32_t flags;
+    void *tc_ptr;
 
     tb = tb_lookup__cpu_state(cpu, &pc, &cs_base, &flags, curr_cflags());
     if (tb == NULL) {
@@ -161,7 +163,10 @@ void *HELPER(lookup_tb_ptr)(CPUArchState *env)
                            TARGET_FMT_lx "/" TARGET_FMT_lx "/%#x] %s\n",
                            cpu->cpu_index, tb->tc.ptr, cs_base, pc, flags,
                            lookup_symbol(pc));
-    return tb->tc.ptr;
+    TSAN_ANNOTATE_IGNORE_READS_BEGIN();
+    tc_ptr = tb->tc.ptr;
+    TSAN_ANNOTATE_IGNORE_READS_END();
+    return tc_ptr;
 }
 
 void HELPER(exit_atomic)(CPUArchState *env)
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 3fb71a1503..6c0e61994c 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -58,6 +58,7 @@
 #include "exec/log.h"
 #include "sysemu/cpus.h"
 #include "sysemu/tcg.h"
+#include "qemu/tsan.h"
 
 /* #define DEBUG_TB_INVALIDATE */
 /* #define DEBUG_TB_FLUSH */
@@ -1704,6 +1705,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
         max_insns = 1;
     }
 
+    TSAN_ANNOTATE_IGNORE_WRITES_BEGIN();
  buffer_overflow:
     tb = tcg_tb_alloc(tcg_ctx);
     if (unlikely(!tb)) {
@@ -1902,9 +1904,11 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
         orig_aligned -= ROUND_UP(sizeof(*tb), qemu_icache_linesize);
         atomic_set(&tcg_ctx->code_gen_ptr, (void *)orig_aligned);
         tb_destroy(tb);
+        TSAN_ANNOTATE_IGNORE_WRITES_END();
         return existing_tb;
     }
     tcg_tb_insert(tb);
+    TSAN_ANNOTATE_IGNORE_WRITES_END();
     return tb;
 }
 
@@ -2409,7 +2413,7 @@ void dump_opcount_info(void)
 void cpu_interrupt(CPUState *cpu, int mask)
 {
     g_assert(qemu_mutex_iothread_locked());
-    cpu->interrupt_request |= mask;
+    atomic_or(&cpu->interrupt_request, mask);
     atomic_set(&cpu_neg(cpu)->icount_decr.u16.high, -1);
 }
 
diff --git a/hw/core/cpu.c b/hw/core/cpu.c
index 77703d62b7..6c16ccc426 100644
--- a/hw/core/cpu.c
+++ b/hw/core/cpu.c
@@ -104,7 +104,7 @@ void cpu_reset_interrupt(CPUState *cpu, int mask)
     if (need_lock) {
         qemu_mutex_lock_iothread();
     }
-    cpu->interrupt_request &= ~mask;
+    atomic_and(&cpu->interrupt_request, ~mask);
     if (need_lock) {
         qemu_mutex_unlock_iothread();
     }
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 14/19] util/async: Fixed tsan warnings
  2020-05-22 16:07 [PATCH 00/19] Add Thread Sanitizer support to QEMU Robert Foley
                   ` (12 preceding siblings ...)
  2020-05-22 16:07 ` [PATCH 13/19] accel/tcg: Fixed tsan warnings Robert Foley
@ 2020-05-22 16:07 ` Robert Foley
  2020-05-23 20:12   ` Emilio G. Cota
  2020-05-26 10:32   ` Stefan Hajnoczi
  2020-05-22 16:07 ` [PATCH 15/19] qht: Fix " Robert Foley
                   ` (6 subsequent siblings)
  20 siblings, 2 replies; 51+ messages in thread
From: Robert Foley @ 2020-05-22 16:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: Fam Zheng, robert.foley, cota, Stefan Hajnoczi, peter.puhov, alex.bennee

For example:
  Atomic write of size 8 at 0x7b4800113c28 by main thread (mutexes: write M30):
    #0 __tsan_atomic64_exchange <null> (qemu-system-aarch64+0x386f85)
    #1 aio_bh_poll util/async.c:146:5 (qemu-system-aarch64+0xcd1f61)
    #2 aio_dispatch util/aio-posix.c:380:5 (qemu-system-aarch64+0xcd8abb)
    #3 aio_ctx_dispatch util/async.c:298:5 (qemu-system-aarch64+0xcd31b0)
    #4 g_main_context_dispatch <null> (libglib-2.0.so.0+0x4c416)
    #5 qemu_main_loop softmmu/vl.c:1664:9 (qemu-system-aarch64+0x5cc6d6)
    #6 main softmmu/main.c:49:5 (qemu-system-aarch64+0xc62857)

  Previous read of size 8 at 0x7b4800113c28 by thread T3 (mutexes: write M81):
    #0 aio_bh_enqueue util/async.c:81:9 (qemu-system-aarch64+0xcd2267)
    #1 qemu_bh_schedule util/async.c:181:5 (qemu-system-aarch64+0xcd2267)
    #2 worker_thread util/thread-pool.c:113:9 (qemu-system-aarch64+0xcd473c)
    #3 qemu_thread_start util/qemu-thread-posix.c:519:9 (qemu-system-aarch64+0xcde280)

Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Fam Zheng <fam@euphon.net>
Signed-off-by: Robert Foley <robert.foley@linaro.org>
---
 util/async.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/util/async.c b/util/async.c
index 1319eee3bc..51e306bf0c 100644
--- a/util/async.c
+++ b/util/async.c
@@ -33,6 +33,7 @@
 #include "block/raw-aio.h"
 #include "qemu/coroutine_int.h"
 #include "trace.h"
+#include "qemu/tsan.h"
 
 /***********************************************************/
 /* bottom halves (can be seen as timers which expire ASAP) */
@@ -76,10 +77,12 @@ static void aio_bh_enqueue(QEMUBH *bh, unsigned new_flags)
      * 2. ctx is loaded before the callback has a chance to execute and bh
      *    could be freed.
      */
+    TSAN_ANNOTATE_IGNORE_WRITES_BEGIN();
     old_flags = atomic_fetch_or(&bh->flags, BH_PENDING | new_flags);
     if (!(old_flags & BH_PENDING)) {
         QSLIST_INSERT_HEAD_ATOMIC(&ctx->bh_list, bh, next);
     }
+    TSAN_ANNOTATE_IGNORE_WRITES_END();
 
     aio_notify(ctx);
 }
@@ -143,7 +146,9 @@ int aio_bh_poll(AioContext *ctx)
     BHListSlice *s;
     int ret = 0;
 
+    TSAN_ANNOTATE_IGNORE_WRITES_BEGIN();
     QSLIST_MOVE_ATOMIC(&slice.bh_list, &ctx->bh_list);
+    TSAN_ANNOTATE_IGNORE_WRITES_END();
     QSIMPLEQ_INSERT_TAIL(&ctx->bh_slice_list, &slice, next);
 
     while ((s = QSIMPLEQ_FIRST(&ctx->bh_slice_list))) {
@@ -280,14 +285,16 @@ aio_ctx_check(GSource *source)
     aio_notify_accept(ctx);
 
     QSLIST_FOREACH_RCU(bh, &ctx->bh_list, next) {
-        if ((bh->flags & (BH_SCHEDULED | BH_DELETED)) == BH_SCHEDULED) {
+        if ((atomic_read(&bh->flags) & (BH_SCHEDULED | BH_DELETED))
+             == BH_SCHEDULED) {
             return true;
         }
     }
 
     QSIMPLEQ_FOREACH(s, &ctx->bh_slice_list, next) {
         QSLIST_FOREACH_RCU(bh, &s->bh_list, next) {
-            if ((bh->flags & (BH_SCHEDULED | BH_DELETED)) == BH_SCHEDULED) {
+            if ((atomic_read(&bh->flags) & (BH_SCHEDULED | BH_DELETED))
+                 == BH_SCHEDULED) {
                 return true;
             }
         }
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 15/19] qht: Fix tsan warnings.
  2020-05-22 16:07 [PATCH 00/19] Add Thread Sanitizer support to QEMU Robert Foley
                   ` (13 preceding siblings ...)
  2020-05-22 16:07 ` [PATCH 14/19] util/async: " Robert Foley
@ 2020-05-22 16:07 ` Robert Foley
  2020-05-23 20:44   ` Emilio G. Cota
  2020-05-22 16:07 ` [PATCH 16/19] util: fixed tsan warnings in thread_pool.c Robert Foley
                   ` (5 subsequent siblings)
  20 siblings, 1 reply; 51+ messages in thread
From: Robert Foley @ 2020-05-22 16:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.puhov, cota, alex.bennee, robert.foley

For example:
WARNING: ThreadSanitizer: data race (pid=23406)
  Atomic read of size 4 at 0x7b100003e3c8 by thread T7:
    #0 __tsan_atomic32_load <null> (qemu-system-aarch64+0x39a36c)
    #1 qht_do_lookup util/qht.c:495:17 (qemu-system-aarch64+0xd82f7a)
    #2 qht_lookup_custom util/qht.c:539:11 (qemu-system-aarch64+0xd82f7a)
  Previous write of size 8 at 0x7b100003e3c8 by thread T6 (mutexes: write M166769147697783108, write M995435858420506688):
    #0 posix_memalign <null> (qemu-system-aarch64+0x350dd1)
    #1 qemu_try_memalign util/oslib-posix.c:189:11 (qemu-system-aarch64+0xd59317)
    #2 qemu_memalign util/oslib-posix.c:205:27 (qemu-system-aarch64+0xd5943e)
    #3 qht_insert__locked util/qht.c:583:9 (qemu-system-aarch64+0xd837c5)

Signed-off-by: Robert Foley <robert.foley@linaro.org>
---
 util/qht.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/util/qht.c b/util/qht.c
index 67e5d5b916..739a53ced0 100644
--- a/util/qht.c
+++ b/util/qht.c
@@ -69,6 +69,7 @@
 #include "qemu/qht.h"
 #include "qemu/atomic.h"
 #include "qemu/rcu.h"
+#include "qemu/tsan.h"
 
 //#define QHT_DEBUG
 
@@ -580,10 +581,12 @@ static void *qht_insert__locked(const struct qht *ht, struct qht_map *map,
         b = b->next;
     } while (b);
 
+    TSAN_ANNOTATE_IGNORE_WRITES_BEGIN();
     b = qemu_memalign(QHT_BUCKET_ALIGN, sizeof(*b));
     memset(b, 0, sizeof(*b));
     new = b;
     i = 0;
+    TSAN_ANNOTATE_IGNORE_WRITES_END();
     atomic_inc(&map->n_added_buckets);
     if (unlikely(qht_map_needs_resize(map)) && needs_resize) {
         *needs_resize = true;
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 16/19] util: fixed tsan warnings in thread_pool.c
  2020-05-22 16:07 [PATCH 00/19] Add Thread Sanitizer support to QEMU Robert Foley
                   ` (14 preceding siblings ...)
  2020-05-22 16:07 ` [PATCH 15/19] qht: Fix " Robert Foley
@ 2020-05-22 16:07 ` Robert Foley
  2020-05-26 20:18   ` Paolo Bonzini
  2020-05-22 16:07 ` [PATCH 17/19] util: Added tsan annotate for thread name Robert Foley
                   ` (4 subsequent siblings)
  20 siblings, 1 reply; 51+ messages in thread
From: Robert Foley @ 2020-05-22 16:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.puhov, cota, alex.bennee, robert.foley

For example:
WARNING: ThreadSanitizer: data race (pid=14665)
  Write of size 4 at 0x7b1c00007890 by thread T99:
    #0 worker_thread util/thread-pool.c:112:20 (qemu-system-aarch64+0xd52108)
    #1 qemu_thread_start util/qemu-thread-posix.c:519:9 (qemu-system-aarch64+0xd5be30)

  Previous read of size 4 at 0x7b1c00007890 by main thread (mutexes: write M875, write M897):
    #0 thread_pool_completion_bh util/thread-pool.c:177:19 (qemu-system-aarch64+0xd51a73)
    #1 aio_bh_call util/async.c:136:5 (qemu-system-aarch64+0xd4f98e)
    #2 aio_bh_poll util/async.c:164:13 (qemu-system-aarch64+0xd4f98e)

Signed-off-by: Robert Foley <robert.foley@linaro.org>
---
 util/thread-pool.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/util/thread-pool.c b/util/thread-pool.c
index d763cea505..2403669827 100644
--- a/util/thread-pool.c
+++ b/util/thread-pool.c
@@ -21,6 +21,7 @@
 #include "trace.h"
 #include "block/thread-pool.h"
 #include "qemu/main-loop.h"
+#include "qemu/tsan.h"
 
 static void do_spawn_thread(ThreadPool *pool);
 
@@ -97,7 +98,9 @@ static void *worker_thread(void *opaque)
         }
 
         req = QTAILQ_FIRST(&pool->request_list);
+        TSAN_ANNOTATE_IGNORE_WRITES_BEGIN();
         QTAILQ_REMOVE(&pool->request_list, req, reqs);
+
         req->state = THREAD_ACTIVE;
         qemu_mutex_unlock(&pool->lock);
 
@@ -107,7 +110,7 @@ static void *worker_thread(void *opaque)
         /* Write ret before state.  */
         smp_wmb();
         req->state = THREAD_DONE;
-
+        TSAN_ANNOTATE_IGNORE_WRITES_END();
         qemu_mutex_lock(&pool->lock);
 
         qemu_bh_schedule(pool->completion_bh);
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 17/19] util: Added tsan annotate for thread name.
  2020-05-22 16:07 [PATCH 00/19] Add Thread Sanitizer support to QEMU Robert Foley
                   ` (15 preceding siblings ...)
  2020-05-22 16:07 ` [PATCH 16/19] util: fixed tsan warnings in thread_pool.c Robert Foley
@ 2020-05-22 16:07 ` Robert Foley
  2020-05-23 20:29   ` Emilio G. Cota
  2020-05-22 16:07 ` [PATCH 18/19] target/arm: Fix tsan warning in cpu.c Robert Foley
                   ` (3 subsequent siblings)
  20 siblings, 1 reply; 51+ messages in thread
From: Robert Foley @ 2020-05-22 16:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.puhov, cota, alex.bennee, robert.foley

This allows us to see the name of the thread in tsan
warning reports such as this:

  Thread T7 'CPU 1/TCG' (tid=24317, running) created by main thread at:

Signed-off-by: Robert Foley <robert.foley@linaro.org>
---
 util/qemu-thread-posix.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
index 838980aaa5..dcbc82d80f 100644
--- a/util/qemu-thread-posix.c
+++ b/util/qemu-thread-posix.c
@@ -15,6 +15,7 @@
 #include "qemu/atomic.h"
 #include "qemu/notify.h"
 #include "qemu-thread-common.h"
+#include "qemu/tsan.h"
 
 static bool name_threads;
 
@@ -513,6 +514,7 @@ static void *qemu_thread_start(void *args)
 # endif
     }
 #endif
+    TSAN_ANNOTATE_THREAD_NAME(qemu_thread_args->name);
     g_free(qemu_thread_args->name);
     g_free(qemu_thread_args);
     pthread_cleanup_push(qemu_thread_atexit_notify, NULL);
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 18/19] target/arm: Fix tsan warning in cpu.c
  2020-05-22 16:07 [PATCH 00/19] Add Thread Sanitizer support to QEMU Robert Foley
                   ` (16 preceding siblings ...)
  2020-05-22 16:07 ` [PATCH 17/19] util: Added tsan annotate for thread name Robert Foley
@ 2020-05-22 16:07 ` Robert Foley
  2020-05-22 17:44   ` Peter Maydell
  2020-05-22 16:07 ` [PATCH 19/19] docs: Added details on TSan to testing.rst Robert Foley
                   ` (2 subsequent siblings)
  20 siblings, 1 reply; 51+ messages in thread
From: Robert Foley @ 2020-05-22 16:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Maydell, robert.foley, Richard Henderson, cota,
	peter.puhov, alex.bennee

For example:
WARNING: ThreadSanitizer: data race (pid=11134)
  Atomic write of size 4 at 0x7bbc0000e0ac by main thread (mutexes: write M875):
    #0 __tsan_atomic32_store <null> (qemu-system-aarch64+0x394d84)
    #1 cpu_reset_interrupt hw/core/cpu.c:107:5 (qemu-system-aarch64+0x842f90)
    #2 arm_cpu_set_irq target/arm/cpu.c (qemu-system-aarch64+0x615a55)

  Previous read of size 4 at 0x7bbc0000e0ac by thread T7:
    #0 arm_cpu_has_work target/arm/cpu.c:78:16 (qemu-system-aarch64+0x6178ba)
    #1 cpu_has_work include/hw/core/cpu.h:700:12 (qemu-system-aarch64+0x68be2e)

Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Robert Foley <robert.foley@linaro.org>
---
 target/arm/cpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 32bec156f2..cdb90582ee 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -75,7 +75,7 @@ static bool arm_cpu_has_work(CPUState *cs)
     ARMCPU *cpu = ARM_CPU(cs);
 
     return (cpu->power_state != PSCI_OFF)
-        && cs->interrupt_request &
+        && atomic_read(&cs->interrupt_request) &
         (CPU_INTERRUPT_FIQ | CPU_INTERRUPT_HARD
          | CPU_INTERRUPT_VFIQ | CPU_INTERRUPT_VIRQ
          | CPU_INTERRUPT_EXITTB);
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 19/19] docs: Added details on TSan to testing.rst
  2020-05-22 16:07 [PATCH 00/19] Add Thread Sanitizer support to QEMU Robert Foley
                   ` (17 preceding siblings ...)
  2020-05-22 16:07 ` [PATCH 18/19] target/arm: Fix tsan warning in cpu.c Robert Foley
@ 2020-05-22 16:07 ` Robert Foley
  2020-05-23 20:29   ` Emilio G. Cota
  2020-05-22 21:07 ` [PATCH 00/19] Add Thread Sanitizer support to QEMU no-reply
  2020-05-23 21:36 ` Emilio G. Cota
  20 siblings, 1 reply; 51+ messages in thread
From: Robert Foley @ 2020-05-22 16:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.puhov, cota, alex.bennee, robert.foley

This includes details on how to build and test with TSan
both inside a docker and outside.

Signed-off-by: Robert Foley <robert.foley@linaro.org>
---
 docs/devel/testing.rst | 72 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 72 insertions(+)

diff --git a/docs/devel/testing.rst b/docs/devel/testing.rst
index 770a987ea4..5b0a828068 100644
--- a/docs/devel/testing.rst
+++ b/docs/devel/testing.rst
@@ -397,6 +397,78 @@ list is in the ``make docker`` help text. The frequently used ones are:
 * ``DEBUG=1``: enables debug. See the previous "Debugging a Docker test
   failure" section.
 
+Thread Sanitizer
+================
+TSan is currently supported in the ubuntu2004 docker.
+
+Just add the TSAN=1 argument to use TSan
+
+.. code::
+
+  make docker-test-build@ubuntu2004 TSAN=1
+
+or
+
+.. code::
+  
+  make docker-test-quick@ubuntu2004 TSAN=1
+
+The runtime behavior or TSAN is controlled by the TSAN_OPTIONS environment
+variable.  We set this variable automatically to for example, maximize
+the number of warnings TSan can find and also to specify the location of
+the files with TSan warnings.  
+
+TSan warnings are placed in files located at build/tsan/.
+
+We recommend using DEBUG=1 to allow launching the test from inside the docker,
+and to allow review of the warnings generated by TSan.
+A few important files to note are:
+
+tests/tsan/suppressions.tsan - Has TSan warnings we wish to suppress at runtime.
+In some cases we choose to put suppressions here since the resolution is
+slightly finer than the blacklist, since we can disable by warning type.
+
+tests/tsan/blacklist.tsan - Has TSan warnings we wish to disable
+at compile time.
+
+include/qemu/tsan.h - Defines various annotations which can also be used
+to give TSan more information some example uses are to allow suppressing
+TSan warnings, or annotating thread names so they show up properly in
+the TSan warnings.
+
+TSan without docker
+-------------------
+
+It is possible to build and test with TSan outside of a docker, but with a
+few additional steps required.
+These steps are normally done automatically in the docker.
+
+First, to configure the build for TSan:
+
+.. code::
+
+  ../configure --enable-tsan --cc=clang-10 --cxx=clang++-10 \
+               --disable-werror --extra-cflags="-O0"
+
+There is also a one time patch needed in clang-9 or clang-10:
+
+.. code::
+
+  sed -i 's/^const/static const/g' \
+      /usr/lib/llvm-10/lib/clang/10.0.0/include/sanitizer/tsan_interface.h
+
+When running tests, the TSAN_OPTIONS environment variable needs to be set.
+
+.. code::
+
+  export TSAN_OPTIONS=suppressions=<path to qemu>/tests/tsan/suppressions.tsan \
+         detect_deadlocks=false history_size=7 exitcode=0 \
+         log_path=<build path>/tsan/tsan_warnings.txt
+
+The above exitcode makes TSan continue without error if any warnings are found.
+This allows for running the test and then checking the warnings afterwards.
+If you want TSan to stop and exit with error on warnings, use exitcode=66.
+
 VM testing
 ==========
 
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH 18/19] target/arm: Fix tsan warning in cpu.c
  2020-05-22 16:07 ` [PATCH 18/19] target/arm: Fix tsan warning in cpu.c Robert Foley
@ 2020-05-22 17:44   ` Peter Maydell
  2020-05-22 21:33     ` Robert Foley
  0 siblings, 1 reply; 51+ messages in thread
From: Peter Maydell @ 2020-05-22 17:44 UTC (permalink / raw)
  To: Robert Foley
  Cc: peter.puhov, Richard Henderson, Emilio G. Cota, Alex Bennée,
	QEMU Developers

On Fri, 22 May 2020 at 17:15, Robert Foley <robert.foley@linaro.org> wrote:
>
> For example:
> WARNING: ThreadSanitizer: data race (pid=11134)
>   Atomic write of size 4 at 0x7bbc0000e0ac by main thread (mutexes: write M875):
>     #0 __tsan_atomic32_store <null> (qemu-system-aarch64+0x394d84)
>     #1 cpu_reset_interrupt hw/core/cpu.c:107:5 (qemu-system-aarch64+0x842f90)
>     #2 arm_cpu_set_irq target/arm/cpu.c (qemu-system-aarch64+0x615a55)
>
>   Previous read of size 4 at 0x7bbc0000e0ac by thread T7:
>     #0 arm_cpu_has_work target/arm/cpu.c:78:16 (qemu-system-aarch64+0x6178ba)
>     #1 cpu_has_work include/hw/core/cpu.h:700:12 (qemu-system-aarch64+0x68be2e)
>
> Cc: Peter Maydell <peter.maydell@linaro.org>
> Cc: Richard Henderson <richard.henderson@linaro.org>
> Signed-off-by: Robert Foley <robert.foley@linaro.org>
> ---
>  target/arm/cpu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/target/arm/cpu.c b/target/arm/cpu.c
> index 32bec156f2..cdb90582ee 100644
> --- a/target/arm/cpu.c
> +++ b/target/arm/cpu.c
> @@ -75,7 +75,7 @@ static bool arm_cpu_has_work(CPUState *cs)
>      ARMCPU *cpu = ARM_CPU(cs);
>
>      return (cpu->power_state != PSCI_OFF)
> -        && cs->interrupt_request &
> +        && atomic_read(&cs->interrupt_request) &
>          (CPU_INTERRUPT_FIQ | CPU_INTERRUPT_HARD
>           | CPU_INTERRUPT_VFIQ | CPU_INTERRUPT_VIRQ
>           | CPU_INTERRUPT_EXITTB);

Every target's has_work function seems to access
cs->interrupt_request without using atomic_read() :
why does Arm need to do something special here?

More generally, the only place that currently
uses atomic_read() on the interrupt_request field
is cpu_handle_interrupt(), so if this field needs
special precautions to access then a lot of code
needs updating.

thanks
-- PMM


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 00/19] Add Thread Sanitizer support to QEMU
  2020-05-22 16:07 [PATCH 00/19] Add Thread Sanitizer support to QEMU Robert Foley
                   ` (18 preceding siblings ...)
  2020-05-22 16:07 ` [PATCH 19/19] docs: Added details on TSan to testing.rst Robert Foley
@ 2020-05-22 21:07 ` no-reply
  2020-05-23 21:36 ` Emilio G. Cota
  20 siblings, 0 replies; 51+ messages in thread
From: no-reply @ 2020-05-22 21:07 UTC (permalink / raw)
  To: robert.foley; +Cc: peter.puhov, cota, alex.bennee, qemu-devel, robert.foley

Patchew URL: https://patchew.org/QEMU/20200522160755.886-1-robert.foley@linaro.org/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Message-id: 20200522160755.886-1-robert.foley@linaro.org
Subject: [PATCH 00/19] Add Thread Sanitizer support to QEMU
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Switched to a new branch 'test'
1737663 docs: Added details on TSan to testing.rst
82aa460 target/arm: Fix tsan warning in cpu.c
4a3bd5a util: Added tsan annotate for thread name.
f2614bb util: fixed tsan warnings in thread_pool.c
56529c7 qht: Fix tsan warnings.
128f63c util/async: Fixed tsan warnings
f86c38c accel/tcg: Fixed tsan warnings.
0d7ee16 configure: added tsan support for blacklist.
cfb2d34 accel/tcg: Fixed tsan warnings related to parallel_cpus
1ef1ed2 include/qemu: Added tsan.h for annotations.
bd287e9 tests/docker: Added docker build support for TSan.
6baf0d3 thread: add tsan annotations to QemuSpin
bbf88d9 translate-all: call qemu_spin_destroy for PageDesc
5f0a213 tcg: call qemu_spin_destroy for tb->jmp_lock
fb19649 qht: call qemu_spin_destroy for head buckets
688ca64 cputlb: destroy CPUTLB with tlb_destroy
be8d1f8 thread: add qemu_spin_destroy
2a326b6 cpu: convert queued work to a QSIMPLEQ
7fb7830 configure: add --enable-tsan flag + fiber annotations for coroutine-ucontext

=== OUTPUT BEGIN ===
1/19 Checking commit 7fb7830797be (configure: add --enable-tsan flag + fiber annotations for coroutine-ucontext)
2/19 Checking commit 2a326b6f7215 (cpu: convert queued work to a QSIMPLEQ)
3/19 Checking commit be8d1f8ff517 (thread: add qemu_spin_destroy)
4/19 Checking commit 688ca64764bf (cputlb: destroy CPUTLB with tlb_destroy)
5/19 Checking commit fb19649f7025 (qht: call qemu_spin_destroy for head buckets)
6/19 Checking commit 5f0a21365e6e (tcg: call qemu_spin_destroy for tb->jmp_lock)
7/19 Checking commit bbf88d92575a (translate-all: call qemu_spin_destroy for PageDesc)
8/19 Checking commit 6baf0d37cfdf (thread: add tsan annotations to QemuSpin)
9/19 Checking commit bd287e96cf4a (tests/docker: Added docker build support for TSan.)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#74: 
new file mode 100644

total: 0 errors, 1 warnings, 118 lines checked

Patch 9/19 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
10/19 Checking commit 1ef1ed22be4b (include/qemu: Added tsan.h for annotations.)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#18: 
new file mode 100644

total: 0 errors, 1 warnings, 48 lines checked

Patch 10/19 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
11/19 Checking commit cfb2d343b6ed (accel/tcg: Fixed tsan warnings related to parallel_cpus)
12/19 Checking commit 0d7ee16ffe83 (configure: added tsan support for blacklist.)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#28: 
new file mode 100644

total: 0 errors, 1 warnings, 14 lines checked

Patch 12/19 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
13/19 Checking commit f86c38c6aa87 (accel/tcg: Fixed tsan warnings.)
14/19 Checking commit 128f63c37b76 (util/async: Fixed tsan warnings)
15/19 Checking commit 56529c7ff837 (qht: Fix tsan warnings.)
16/19 Checking commit f2614bbafdb4 (util: fixed tsan warnings in thread_pool.c)
17/19 Checking commit 4a3bd5a64414 (util: Added tsan annotate for thread name.)
18/19 Checking commit 82aa460722b3 (target/arm: Fix tsan warning in cpu.c)
19/19 Checking commit 17376630a146 (docs: Added details on TSan to testing.rst)
ERROR: trailing whitespace
#34: FILE: docs/devel/testing.rst:413:
+  $

ERROR: trailing whitespace
#40: FILE: docs/devel/testing.rst:419:
+the files with TSan warnings.  $

total: 2 errors, 0 warnings, 78 lines checked

Patch 19/19 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20200522160755.886-1-robert.foley@linaro.org/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 18/19] target/arm: Fix tsan warning in cpu.c
  2020-05-22 17:44   ` Peter Maydell
@ 2020-05-22 21:33     ` Robert Foley
  2020-05-22 22:36       ` Peter Maydell
  0 siblings, 1 reply; 51+ messages in thread
From: Robert Foley @ 2020-05-22 21:33 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Peter Puhov, Richard Henderson, Emilio G. Cota, Alex Bennée,
	QEMU Developers

On Fri, 22 May 2020 at 13:44, Peter Maydell <peter.maydell@linaro.org> wrote:
>
> On Fri, 22 May 2020 at 17:15, Robert Foley <robert.foley@linaro.org> wrote:
> >
> > For example:
> > WARNING: ThreadSanitizer: data race (pid=11134)
> >   Atomic write of size 4 at 0x7bbc0000e0ac by main thread (mutexes: write M875):
> >     #0 __tsan_atomic32_store <null> (qemu-system-aarch64+0x394d84)
> >     #1 cpu_reset_interrupt hw/core/cpu.c:107:5 (qemu-system-aarch64+0x842f90)
> >     #2 arm_cpu_set_irq target/arm/cpu.c (qemu-system-aarch64+0x615a55)
> >
> >   Previous read of size 4 at 0x7bbc0000e0ac by thread T7:
> >     #0 arm_cpu_has_work target/arm/cpu.c:78:16 (qemu-system-aarch64+0x6178ba)
> >     #1 cpu_has_work include/hw/core/cpu.h:700:12 (qemu-system-aarch64+0x68be2e)
> >
> > Cc: Peter Maydell <peter.maydell@linaro.org>
> > Cc: Richard Henderson <richard.henderson@linaro.org>
> > Signed-off-by: Robert Foley <robert.foley@linaro.org>
> > ---
> >  target/arm/cpu.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/target/arm/cpu.c b/target/arm/cpu.c
> > index 32bec156f2..cdb90582ee 100644
> > --- a/target/arm/cpu.c
> > +++ b/target/arm/cpu.c
> > @@ -75,7 +75,7 @@ static bool arm_cpu_has_work(CPUState *cs)
> >      ARMCPU *cpu = ARM_CPU(cs);
> >
> >      return (cpu->power_state != PSCI_OFF)
> > -        && cs->interrupt_request &
> > +        && atomic_read(&cs->interrupt_request) &
> >          (CPU_INTERRUPT_FIQ | CPU_INTERRUPT_HARD
> >           | CPU_INTERRUPT_VFIQ | CPU_INTERRUPT_VIRQ
> >           | CPU_INTERRUPT_EXITTB);
>
> Every target's has_work function seems to access
> cs->interrupt_request without using atomic_read() :
> why does Arm need to do something special here?
>
> More generally, the only place that currently
> uses atomic_read() on the interrupt_request field
> is cpu_handle_interrupt(), so if this field needs
> special precautions to access then a lot of code
> needs updating.

TSan flagged this case as a potential data race. It does not mean
necessarily that there is an issue here, just that two threads were
accessing the data
without TSan detecting the synchronization.  TSan gives a few options
to silence the
warning, such as changing the locking, making it atomic, or adding
various types
of annotations to tell TSan to ignore it.  So in this case we had a
few options, such as
to change it to atomic or to simply annotate it and silence it.

We started our TSan testing using arm, and have been working to iron out the
TSan warnings there, and there alone initially.  Assuming that we are OK
with making this particular change, to silence the TSan warning,
then certainly it is a good point that we need to consider changing the
other places that access this field, since they will all see similar
TSan warnings.

Of course if we are not OK with these changes to silence the TSan tool,
that's OK too :).  In that case we can certainly just add an
annotation either in the
code or via our suppressions/blacklist and leave the code functionally
unchanged.

Thanks & Regards,
-Rob
>
> thanks
> -- PMM


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 18/19] target/arm: Fix tsan warning in cpu.c
  2020-05-22 21:33     ` Robert Foley
@ 2020-05-22 22:36       ` Peter Maydell
  2020-05-23 17:18         ` Emilio G. Cota
  0 siblings, 1 reply; 51+ messages in thread
From: Peter Maydell @ 2020-05-22 22:36 UTC (permalink / raw)
  To: Robert Foley
  Cc: Peter Puhov, Richard Henderson, Emilio G. Cota, Alex Bennée,
	QEMU Developers

On Fri, 22 May 2020 at 22:33, Robert Foley <robert.foley@linaro.org> wrote:
> On Fri, 22 May 2020 at 13:44, Peter Maydell <peter.maydell@linaro.org> wrote:
> > Every target's has_work function seems to access
> > cs->interrupt_request without using atomic_read() :
> > why does Arm need to do something special here?
> >
> > More generally, the only place that currently
> > uses atomic_read() on the interrupt_request field
> > is cpu_handle_interrupt(), so if this field needs
> > special precautions to access then a lot of code
> > needs updating.
>
> TSan flagged this case as a potential data race. It does not mean
> necessarily that there is an issue here, just that two threads were
> accessing the data
> without TSan detecting the synchronization.  TSan gives a few options
> to silence the
> warning, such as changing the locking, making it atomic, or adding
> various types
> of annotations to tell TSan to ignore it.  So in this case we had a
> few options, such as
> to change it to atomic or to simply annotate it and silence it.
>
> We started our TSan testing using arm, and have been working to iron out the
> TSan warnings there, and there alone initially.  Assuming that we are OK
> with making this particular change, to silence the TSan warning,
> then certainly it is a good point that we need to consider changing the
> other places that access this field, since they will all see similar
> TSan warnings.

So is this:
 (a) a TSan false positive, because we've analysed the use
     of this struct field and know it's not a race because
     [details], but which we're choosing to silence in this way
 (b) an actual race for which the correct fix is to make the
     accesses atomic because [details]

?

Either way, the important part is the analysis which fills
in the "[details]" part, which should be in the commit message...

thanks
-- PMM


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 01/19] configure: add --enable-tsan flag + fiber annotations for coroutine-ucontext
  2020-05-22 16:07 ` [PATCH 01/19] configure: add --enable-tsan flag + fiber annotations for coroutine-ucontext Robert Foley
@ 2020-05-23 16:55   ` Philippe Mathieu-Daudé
  2020-05-26 12:38     ` Robert Foley
  2020-06-17 14:24   ` Stefan Hajnoczi
  1 sibling, 1 reply; 51+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-05-23 16:55 UTC (permalink / raw)
  To: Robert Foley, qemu-devel; +Cc: peter.puhov, cota, alex.bennee, Lingfeng Yang

Hi Robert,

On 5/22/20 6:07 PM, Robert Foley wrote:
> From: Lingfeng Yang <lfy@google.com>
> 
> We tried running QEMU under tsan in 2016, but tsan's lack of support for
> longjmp-based fibers was a blocker:
>   https://groups.google.com/forum/#!topic/thread-sanitizer/se0YuzfWazw
> 
> Fortunately, thread sanitizer gained fiber support in early 2019:
>   https://reviews.llvm.org/D54889
> 
> This patch brings tsan support upstream by importing the patch that annotated
> QEMU's coroutines as tsan fibers in Android's QEMU fork:
>   https://android-review.googlesource.com/c/platform/external/qemu/+/844675
> 
> Tested with '--enable-tsan --cc=clang-9 --cxx=clang++-9 --disable-werror'
> configure flags.
> 
> Signed-off-by: Lingfeng Yang <lfy@google.com>
> Signed-off-by: Emilio G. Cota <cota@braap.org>
> [cota: minor modifications + configure changes]
> Signed-off-by: Robert Foley <robert.foley@linaro.org>
> [RF: minor changes to clean up checkpatch warnings/errors]
> ---
>  configure                 | 39 ++++++++++++++++
>  util/coroutine-ucontext.c | 97 +++++++++++++++++++++++++++++++++++----
>  2 files changed, 127 insertions(+), 9 deletions(-)
> 
> diff --git a/configure b/configure
> index 26084fc53a..c95c54fb48 100755
> --- a/configure
> +++ b/configure
> @@ -395,6 +395,7 @@ gprof="no"
>  debug_tcg="no"
>  debug="no"
>  sanitizers="no"
> +tsan="no"
>  fortify_source=""
>  strip_opt="yes"
>  tcg_interpreter="no"
> @@ -1150,6 +1151,10 @@ for opt do
>    ;;
>    --disable-sanitizers) sanitizers="no"
>    ;;
> +  --enable-tsan) tsan="yes"
> +  ;;
> +  --disable-tsan) tsan="no"
> +  ;;
>    --enable-sparse) sparse="yes"
>    ;;
>    --disable-sparse) sparse="no"
> @@ -1750,6 +1755,7 @@ Advanced options (experts only):
>    --with-pkgversion=VERS   use specified string as sub-version of the package
>    --enable-debug           enable common debug build options
>    --enable-sanitizers      enable default sanitizers
> +  --enable-tsan            enable thread sanitizer
>    --disable-strip          disable stripping binaries
>    --disable-werror         disable compilation abort on warning
>    --disable-stack-protector disable compiler-provided stack protection
> @@ -6176,6 +6182,27 @@ if test "$fuzzing" = "yes" ; then
>    fi
>  fi
>  
> +# Thread sanitizer is, for now, much noisier than the other sanitizers;
> +# keep it separate until that is not the case.
> +have_tsan=no
> +have_tsan_iface_fiber=no
> +if test "$tsan" = "yes" ; then
> +  write_c_skeleton
> +  if compile_prog "$CPU_CFLAGS -Werror -fsanitize=thread" "" ; then
> +      have_tsan=yes
> +  fi
> +  cat > $TMPC << EOF
> +#include <sanitizer/tsan_interface.h>
> +int main(void) {
> +  __tsan_create_fiber(0);
> +  return 0;
> +}
> +EOF
> +  if compile_prog "$CPU_CFLAGS -Werror -fsanitize=thread" "" ; then
> +      have_tsan_iface_fiber=yes
> +  fi
> +fi
> +
>  ##########################################
>  # check for libpmem
>  
> @@ -6277,6 +6304,14 @@ if test "$have_asan" = "yes"; then
>             "Without code annotation, the report may be inferior."
>    fi
>  fi
> +if test "$have_tsan" = "yes" ; then
> +  if test "$have_tsan_iface_fiber" = "yes" ; then
> +    QEMU_CFLAGS="-fsanitize=thread $QEMU_CFLAGS"
> +    QEMU_LDFLAGS="-fsanitize=thread $QEMU_LDFLAGS"
> +  else
> +    echo "Cannot enable TSAN due to missing fiber annotation interface."

I tried your series and there were no changes anywhere, then I looked at
how TSan work, started to debug, to finally realize my build was not
using TSan (clang8). Please use to something such:

     if test "$tsan" = "yes" ; then
        error_exit "Cannot enable TSAN due to missing fiber" \
                   "annotation interface."
     fi

> +  fi
> +fi
>  if test "$have_ubsan" = "yes"; then
>    QEMU_CFLAGS="-fsanitize=undefined $QEMU_CFLAGS"
>    QEMU_LDFLAGS="-fsanitize=undefined $QEMU_LDFLAGS"
> @@ -7365,6 +7400,10 @@ if test "$have_asan_iface_fiber" = "yes" ; then
>      echo "CONFIG_ASAN_IFACE_FIBER=y" >> $config_host_mak
>  fi
>  
> +if test "$have_tsan" = "yes" && test "$have_tsan_iface_fiber" = "yes" ; then
> +    echo "CONFIG_TSAN=y" >> $config_host_mak
> +fi
> +
>  if test "$has_environ" = "yes" ; then
>    echo "CONFIG_HAS_ENVIRON=y" >> $config_host_mak
>  fi
[...]



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 18/19] target/arm: Fix tsan warning in cpu.c
  2020-05-22 22:36       ` Peter Maydell
@ 2020-05-23 17:18         ` Emilio G. Cota
  2020-05-26 14:01           ` Robert Foley
  0 siblings, 1 reply; 51+ messages in thread
From: Emilio G. Cota @ 2020-05-23 17:18 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Peter Puhov, Richard Henderson, Alex Bennée, Robert Foley,
	QEMU Developers

On Fri, May 22, 2020 at 23:36:18 +0100, Peter Maydell wrote:
> So is this:
>  (a) a TSan false positive, because we've analysed the use
>      of this struct field and know it's not a race because
>      [details], but which we're choosing to silence in this way
>  (b) an actual race for which the correct fix is to make the
>      accesses atomic because [details]
> 
> ?

It is (b), as shown in the per-cpu lock series. In particular,
see these two patches:
- [PATCH v9 33/74] cpu: define cpu_interrupt_request helpers
https://lists.nongnu.org/archive/html/qemu-devel/2020-05/msg06322.html
- [PATCH v9 39/74] arm: convert to cpu_interrupt_request
https://lists.nongnu.org/archive/html/qemu-devel/2020-05/msg06328.html

Since a more thorough fix is included in that other series, I think this
patch should be dropped from this series -- I'll post a reply to patch
00/19 with more details.

Thanks,

		Emilio


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 10/19] include/qemu: Added tsan.h for annotations.
  2020-05-22 16:07 ` [PATCH 10/19] include/qemu: Added tsan.h for annotations Robert Foley
@ 2020-05-23 17:20   ` Emilio G. Cota
  2020-05-23 21:37     ` Emilio G. Cota
  0 siblings, 1 reply; 51+ messages in thread
From: Emilio G. Cota @ 2020-05-23 17:20 UTC (permalink / raw)
  To: Robert Foley; +Cc: peter.puhov, alex.bennee, qemu-devel

On Fri, May 22, 2020 at 12:07:46 -0400, Robert Foley wrote:
> These annotations will allow us to give tsan
> additional hints.  For example, we can inform
> tsan about reads/writes to ignore to silence certain
> classes of warnings.
> We can also annotate threads so that the proper thread
> naming shows up in tsan warning results.
> 
> Signed-off-by: Robert Foley <robert.foley@linaro.org>

Reviewed-by: Emilio G. Cota <cota@braap.org>

		E.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 11/19] accel/tcg: Fixed tsan warnings related to parallel_cpus
  2020-05-22 16:07 ` [PATCH 11/19] accel/tcg: Fixed tsan warnings related to parallel_cpus Robert Foley
@ 2020-05-23 17:21   ` Emilio G. Cota
  0 siblings, 0 replies; 51+ messages in thread
From: Emilio G. Cota @ 2020-05-23 17:21 UTC (permalink / raw)
  To: Robert Foley
  Cc: peter.puhov, Richard Henderson, alex.bennee, qemu-devel, Paolo Bonzini

On Fri, May 22, 2020 at 12:07:47 -0400, Robert Foley wrote:
> Fixed several tsan warnings. e.g.
> 
> WARNING: ThreadSanitizer: data race (pid=35425)
>   Read of size 1 at 0x557cd83aee28 by thread T7:
>     #0 curr_cflags include/exec/exec-all.h:460:13 (qemu-system-aarch64+0x4b7f27)
>     #1 cpu_exec accel/tcg/cpu-exec.c:730:26 (qemu-system-aarch64+0x4b7f27)
>     #2 tcg_cpu_exec cpus.c:1415:11 (qemu-system-aarch64+0x45b9b6)
>     #3 qemu_tcg_cpu_thread_fn cpus.c:1723:17 (qemu-system-aarch64+0x45b9b6)
>     #4 qemu_thread_start util/qemu-thread-posix.c:519:9 (qemu-system-aarch64+0xd431e0)
> 
>   Previous write of size 1 at 0x557cd83aee28 by thread T6:
>     #0 cpu_exec_step_atomic accel/tcg/cpu-exec.c:254:23 (qemu-system-aarch64+0x4b6caa)
>     #1 qemu_tcg_cpu_thread_fn cpus.c:1741:17 (qemu-system-aarch64+0x45baca)
>     #2 qemu_thread_start util/qemu-thread-posix.c:519:9 (qemu-system-aarch64+0xd431e0)
> 
>   Location is global 'parallel_cpus' of size 1 at 0x557cd83aee28 (qemu-system-aarch64+0x000001fb3e28)
> 
> Cc: Richard Henderson <richard.henderson@linaro.org>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Robert Foley <robert.foley@linaro.org>

Reviewed-by: Emilio G. Cota <cota@braap.org>

		E.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 12/19] configure: added tsan support for blacklist.
  2020-05-22 16:07 ` [PATCH 12/19] configure: added tsan support for blacklist Robert Foley
@ 2020-05-23 17:27   ` Emilio G. Cota
  2020-05-26 14:07     ` Robert Foley
  0 siblings, 1 reply; 51+ messages in thread
From: Emilio G. Cota @ 2020-05-23 17:27 UTC (permalink / raw)
  To: Robert Foley; +Cc: peter.puhov, alex.bennee, qemu-devel

On Fri, May 22, 2020 at 12:07:48 -0400, Robert Foley wrote:
> Initially put several files into blacklist that were
> causing the most problems, namely bitops.c and bitmap.c.
> 
> Signed-off-by: Robert Foley <robert.foley@linaro.org>
> ---
>  configure                 | 3 ++-
>  tests/tsan/blacklist.tsan | 5 +++++
>  2 files changed, 7 insertions(+), 1 deletion(-)
>  create mode 100644 tests/tsan/blacklist.tsan
> 
> diff --git a/configure b/configure
> index c95c54fb48..8a86a0638d 100755
> --- a/configure
> +++ b/configure
> @@ -6306,7 +6306,8 @@ if test "$have_asan" = "yes"; then
>  fi
>  if test "$have_tsan" = "yes" ; then
>    if test "$have_tsan_iface_fiber" = "yes" ; then
> -    QEMU_CFLAGS="-fsanitize=thread $QEMU_CFLAGS"
> +    QEMU_CFLAGS="-fsanitize=thread -fsanitize-blacklist="\
> +	        "\$(SRC_PATH)/tests/tsan/blacklist.tsan $QEMU_CFLAGS"

I presume the goal here is to fix these races later (my default assumption
is that warnings == races, since most warnings are indeed races). If so,
please consider making the suppression optional (via
"--extra-cflags=-fsanitize-blacklist=path-to-this-file"), since that
way the reports are likely to get more eyeballs.

Thanks,

		E.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 13/19] accel/tcg: Fixed tsan warnings.
  2020-05-22 16:07 ` [PATCH 13/19] accel/tcg: Fixed tsan warnings Robert Foley
@ 2020-05-23 20:06   ` Emilio G. Cota
  2020-05-26 15:14     ` Robert Foley
  2020-05-26 18:47   ` Paolo Bonzini
  1 sibling, 1 reply; 51+ messages in thread
From: Emilio G. Cota @ 2020-05-23 20:06 UTC (permalink / raw)
  To: Robert Foley
  Cc: peter.puhov, Richard Henderson, alex.bennee, qemu-devel, Paolo Bonzini

On Fri, May 22, 2020 at 12:07:49 -0400, Robert Foley wrote:
> For example:
> WARNING: ThreadSanitizer: data race (pid=35425)
>   Write of size 4 at 0x7bbc000000ac by main thread (mutexes: write M875):
>     #0 cpu_reset_interrupt hw/core/cpu.c:107:28 (qemu-system-aarch64+0x843790)
>     #1 arm_cpu_set_irq target/arm/cpu.c (qemu-system-aarch64+0x616265)
>     #2 qemu_set_irq hw/core/irq.c:44:5 (qemu-system-aarch64+0x8462ca)
>   Previous atomic read of size 4 at 0x7bbc000000ac by thread T6:
>     #0 __tsan_atomic32_load <null> (qemu-system-aarch64+0x394c1c)
>     #1 cpu_handle_interrupt accel/tcg/cpu-exec.c:534:9 (qemu-system-aarch64+0x4b7e79)
>     #2 cpu_exec accel/tcg/cpu-exec.c:720:17 (qemu-system-aarch64+0x4b7e79)
> or
> WARNING: ThreadSanitizer: data race (pid=25425)
>   Read of size 8 at 0x7f8ad8e138d0 by thread T10:
>     #0 tb_lookup_cmp accel/tcg/cpu-exec.c:307:13 (qemu-system-aarch64+0x4ac4d2)
>     #1 qht_do_lookup util/qht.c:502:34 (qemu-system-aarch64+0xd05264)
>   Previous write of size 8 at 0x7f8ad8e138d0 by thread T15 (mutexes: write M728311726235541804):
>     #0 tb_link_page accel/tcg/translate-all.c:1625:26 (qemu-system-aarch64+0x4b0bf2)
>     #1 tb_gen_code accel/tcg/translate-all.c:1865:19 (qemu-system-aarch64+0x4b0bf2)
>     #2 tb_find accel/tcg/cpu-exec.c:407:14 (qemu-system-aarch64+0x4ad77c)

I see you're working through the warnings in this file, but I think it would
be better to forget about files and focus on the data itself.
Therefore this patch should be split in two: (1) cpu-<interrupt_request
and (2) gen_code_buf. (1) requires a lot of changes with a proper audit;
the per-cpu-lock series has a possible solution for that, so I will
ignore those hunks and just comment on (2) below.

> Cc: Richard Henderson <richard.henderson@linaro.org>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Robert Foley <robert.foley@linaro.org>
> ---
>  accel/tcg/tcg-all.c       | 4 ++--
>  accel/tcg/tcg-runtime.c   | 7 ++++++-
>  accel/tcg/translate-all.c | 6 +++++-
>  hw/core/cpu.c             | 2 +-
>  4 files changed, 14 insertions(+), 5 deletions(-)
> 
(snip)
> diff --git a/accel/tcg/tcg-runtime.c b/accel/tcg/tcg-runtime.c
> index 446465a09a..bd0cd77450 100644
> --- a/accel/tcg/tcg-runtime.c
> +++ b/accel/tcg/tcg-runtime.c
> @@ -31,6 +31,7 @@
>  #include "disas/disas.h"
>  #include "exec/log.h"
>  #include "tcg/tcg.h"
> +#include "qemu/tsan.h"
>  
>  /* 32-bit helpers */
>  
> @@ -151,6 +152,7 @@ void *HELPER(lookup_tb_ptr)(CPUArchState *env)
>      TranslationBlock *tb;
>      target_ulong cs_base, pc;
>      uint32_t flags;
> +    void *tc_ptr;
>  
>      tb = tb_lookup__cpu_state(cpu, &pc, &cs_base, &flags, curr_cflags());
>      if (tb == NULL) {
> @@ -161,7 +163,10 @@ void *HELPER(lookup_tb_ptr)(CPUArchState *env)
>                             TARGET_FMT_lx "/" TARGET_FMT_lx "/%#x] %s\n",
>                             cpu->cpu_index, tb->tc.ptr, cs_base, pc, flags,
>                             lookup_symbol(pc));
> -    return tb->tc.ptr;
> +    TSAN_ANNOTATE_IGNORE_READS_BEGIN();
> +    tc_ptr = tb->tc.ptr;
> +    TSAN_ANNOTATE_IGNORE_READS_END();
> +    return tc_ptr;

I'm not sure these are needed. At least after applying all other patches
in this series, I don't get a warning here.

>  }
>  
>  void HELPER(exit_atomic)(CPUArchState *env)
> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
> index 3fb71a1503..6c0e61994c 100644
> --- a/accel/tcg/translate-all.c
> +++ b/accel/tcg/translate-all.c
> @@ -58,6 +58,7 @@
>  #include "exec/log.h"
>  #include "sysemu/cpus.h"
>  #include "sysemu/tcg.h"
> +#include "qemu/tsan.h"
>  
>  /* #define DEBUG_TB_INVALIDATE */
>  /* #define DEBUG_TB_FLUSH */
> @@ -1704,6 +1705,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
>          max_insns = 1;
>      }
>  
> +    TSAN_ANNOTATE_IGNORE_WRITES_BEGIN();

Same here, I don't get a warning in this hunk if I remove these,
except for:
---
WARNING: ThreadSanitizer: data race (pid=445867)
  Atomic read of size 1 at 0x7f906e050158 by thread T7:
    #0 __tsan_mutex_post_lock <null> (qemu-system-aarch64+0x481721)
    #1 qemu_spin_lock /home/cota/src/qemu/include/qemu/thread.h:244:5 (qemu-system-aarch64+0x5578e9)
    #2 tb_add_jump /home/cota/src/qemu/accel/tcg/cpu-exec.c:363:5 (qemu-system-aarch64+0x5578e9)
    #3 tb_find /home/cota/src/qemu/accel/tcg/cpu-exec.c:423:9 (qemu-system-aarch64+0x5578e9)

  Previous write of size 1 at 0x7f906e050158 by thread T8:
    #0 __tsan_mutex_create <null> (qemu-system-aarch64+0x481589)
    #1 qemu_spin_init /home/cota/src/qemu/include/qemu/thread.h:221:5 (qemu-system-aarch64+0x559a71)
    #2 tb_gen_code /home/cota/src/qemu/accel/tcg/translate-all.c:1875:5 (qemu-system-aarch64+0x559a71)

  Thread T7 'CPU 0/TCG' (tid=445875, running) created by main thread at:
    #0 pthread_create <null> (qemu-system-aarch64+0x43915b)
    #1 qemu_thread_create /home/cota/src/qemu/util/qemu-thread-posix.c:558:11 (qemu-system-aarch64+0xaf91ff)

  Thread T8 'CPU 1/TCG' (tid=445876, running) created by main thread at:
    #0 pthread_create <null> (qemu-system-aarch64+0x43915b)
    #1 qemu_thread_create /home/cota/src/qemu/util/qemu-thread-posix.c:558:11 (qemu-system-aarch64+0xaf91ff)

SUMMARY: ThreadSanitizer: data race (/home/cota/src/qemu/build/aarch64-softmmu/qemu-system-aarch64+0x481721) in __tsan_mutex_post_lock
---

Seems like tsan is confusing itself here.

Thanks,
		E.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 14/19] util/async: Fixed tsan warnings
  2020-05-22 16:07 ` [PATCH 14/19] util/async: " Robert Foley
@ 2020-05-23 20:12   ` Emilio G. Cota
  2020-05-26 15:19     ` Robert Foley
  2020-05-26 10:32   ` Stefan Hajnoczi
  1 sibling, 1 reply; 51+ messages in thread
From: Emilio G. Cota @ 2020-05-23 20:12 UTC (permalink / raw)
  To: Robert Foley
  Cc: Fam Zheng, peter.puhov, alex.bennee, qemu-devel, Stefan Hajnoczi

On Fri, May 22, 2020 at 12:07:50 -0400, Robert Foley wrote:
> For example:
>   Atomic write of size 8 at 0x7b4800113c28 by main thread (mutexes: write M30):
>     #0 __tsan_atomic64_exchange <null> (qemu-system-aarch64+0x386f85)
>     #1 aio_bh_poll util/async.c:146:5 (qemu-system-aarch64+0xcd1f61)
>     #2 aio_dispatch util/aio-posix.c:380:5 (qemu-system-aarch64+0xcd8abb)
>     #3 aio_ctx_dispatch util/async.c:298:5 (qemu-system-aarch64+0xcd31b0)
>     #4 g_main_context_dispatch <null> (libglib-2.0.so.0+0x4c416)
>     #5 qemu_main_loop softmmu/vl.c:1664:9 (qemu-system-aarch64+0x5cc6d6)
>     #6 main softmmu/main.c:49:5 (qemu-system-aarch64+0xc62857)
> 
>   Previous read of size 8 at 0x7b4800113c28 by thread T3 (mutexes: write M81):
>     #0 aio_bh_enqueue util/async.c:81:9 (qemu-system-aarch64+0xcd2267)
>     #1 qemu_bh_schedule util/async.c:181:5 (qemu-system-aarch64+0xcd2267)
>     #2 worker_thread util/thread-pool.c:113:9 (qemu-system-aarch64+0xcd473c)
>     #3 qemu_thread_start util/qemu-thread-posix.c:519:9 (qemu-system-aarch64+0xcde280)
> 
> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> Cc: Fam Zheng <fam@euphon.net>
> Signed-off-by: Robert Foley <robert.foley@linaro.org>
> ---
>  util/async.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/util/async.c b/util/async.c
> index 1319eee3bc..51e306bf0c 100644
> --- a/util/async.c
> +++ b/util/async.c
> @@ -33,6 +33,7 @@
>  #include "block/raw-aio.h"
>  #include "qemu/coroutine_int.h"
>  #include "trace.h"
> +#include "qemu/tsan.h"
>  
>  /***********************************************************/
>  /* bottom halves (can be seen as timers which expire ASAP) */
> @@ -76,10 +77,12 @@ static void aio_bh_enqueue(QEMUBH *bh, unsigned new_flags)
>       * 2. ctx is loaded before the callback has a chance to execute and bh
>       *    could be freed.
>       */
> +    TSAN_ANNOTATE_IGNORE_WRITES_BEGIN();

Why do we need these annotations here? It looks like the fix for the
race in the commit message is below (i.e. atomic_read).

In general, I'd expect annotations to come with a comment, since
they should be used sparingly. That is, the assumption is that
tsan is almost always right.

>      old_flags = atomic_fetch_or(&bh->flags, BH_PENDING | new_flags);
>      if (!(old_flags & BH_PENDING)) {
>          QSLIST_INSERT_HEAD_ATOMIC(&ctx->bh_list, bh, next);
>      }
> +    TSAN_ANNOTATE_IGNORE_WRITES_END();
>  
>      aio_notify(ctx);
>  }
> @@ -143,7 +146,9 @@ int aio_bh_poll(AioContext *ctx)
>      BHListSlice *s;
>      int ret = 0;
>  
> +    TSAN_ANNOTATE_IGNORE_WRITES_BEGIN();
>      QSLIST_MOVE_ATOMIC(&slice.bh_list, &ctx->bh_list);
> +    TSAN_ANNOTATE_IGNORE_WRITES_END();

ditto

>      QSIMPLEQ_INSERT_TAIL(&ctx->bh_slice_list, &slice, next);
>  
>      while ((s = QSIMPLEQ_FIRST(&ctx->bh_slice_list))) {
> @@ -280,14 +285,16 @@ aio_ctx_check(GSource *source)
>      aio_notify_accept(ctx);
>  
>      QSLIST_FOREACH_RCU(bh, &ctx->bh_list, next) {
> -        if ((bh->flags & (BH_SCHEDULED | BH_DELETED)) == BH_SCHEDULED) {
> +        if ((atomic_read(&bh->flags) & (BH_SCHEDULED | BH_DELETED))
> +             == BH_SCHEDULED) {
>              return true;
>          }
>      }
>  
>      QSIMPLEQ_FOREACH(s, &ctx->bh_slice_list, next) {
>          QSLIST_FOREACH_RCU(bh, &s->bh_list, next) {
> -            if ((bh->flags & (BH_SCHEDULED | BH_DELETED)) == BH_SCHEDULED) {
> +            if ((atomic_read(&bh->flags) & (BH_SCHEDULED | BH_DELETED))
> +                 == BH_SCHEDULED) {

This hunk like the real fix. Also, I'd put "fix race" in the commit
title as opposed to "fix warning" since fixing races is the goal, not
fixing warnings.

Thanks,

		Emilio


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 17/19] util: Added tsan annotate for thread name.
  2020-05-22 16:07 ` [PATCH 17/19] util: Added tsan annotate for thread name Robert Foley
@ 2020-05-23 20:29   ` Emilio G. Cota
  0 siblings, 0 replies; 51+ messages in thread
From: Emilio G. Cota @ 2020-05-23 20:29 UTC (permalink / raw)
  To: Robert Foley; +Cc: peter.puhov, alex.bennee, qemu-devel

On Fri, May 22, 2020 at 12:07:53 -0400, Robert Foley wrote:
> This allows us to see the name of the thread in tsan
> warning reports such as this:
> 
>   Thread T7 'CPU 1/TCG' (tid=24317, running) created by main thread at:
> 
> Signed-off-by: Robert Foley <robert.foley@linaro.org>

Reviewed-by: Emilio G. Cota <cota@braap.org>

		E.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 19/19] docs: Added details on TSan to testing.rst
  2020-05-22 16:07 ` [PATCH 19/19] docs: Added details on TSan to testing.rst Robert Foley
@ 2020-05-23 20:29   ` Emilio G. Cota
  0 siblings, 0 replies; 51+ messages in thread
From: Emilio G. Cota @ 2020-05-23 20:29 UTC (permalink / raw)
  To: Robert Foley; +Cc: peter.puhov, alex.bennee, qemu-devel

On Fri, May 22, 2020 at 12:07:55 -0400, Robert Foley wrote:
> This includes details on how to build and test with TSan
> both inside a docker and outside.
> 
> Signed-off-by: Robert Foley <robert.foley@linaro.org>

Reviewed-by: Emilio G. Cota <cota@braap.org>

		E.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 15/19] qht: Fix tsan warnings.
  2020-05-22 16:07 ` [PATCH 15/19] qht: Fix " Robert Foley
@ 2020-05-23 20:44   ` Emilio G. Cota
  0 siblings, 0 replies; 51+ messages in thread
From: Emilio G. Cota @ 2020-05-23 20:44 UTC (permalink / raw)
  To: Robert Foley; +Cc: peter.puhov, alex.bennee, qemu-devel

On Fri, May 22, 2020 at 12:07:51 -0400, Robert Foley wrote:
> For example:
> WARNING: ThreadSanitizer: data race (pid=23406)
>   Atomic read of size 4 at 0x7b100003e3c8 by thread T7:
>     #0 __tsan_atomic32_load <null> (qemu-system-aarch64+0x39a36c)
>     #1 qht_do_lookup util/qht.c:495:17 (qemu-system-aarch64+0xd82f7a)
>     #2 qht_lookup_custom util/qht.c:539:11 (qemu-system-aarch64+0xd82f7a)
>   Previous write of size 8 at 0x7b100003e3c8 by thread T6 (mutexes: write M166769147697783108, write M995435858420506688):
>     #0 posix_memalign <null> (qemu-system-aarch64+0x350dd1)
>     #1 qemu_try_memalign util/oslib-posix.c:189:11 (qemu-system-aarch64+0xd59317)
>     #2 qemu_memalign util/oslib-posix.c:205:27 (qemu-system-aarch64+0xd5943e)
>     #3 qht_insert__locked util/qht.c:583:9 (qemu-system-aarch64+0xd837c5)
> 
> Signed-off-by: Robert Foley <robert.foley@linaro.org>
> ---
>  util/qht.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/util/qht.c b/util/qht.c
> index 67e5d5b916..739a53ced0 100644
> --- a/util/qht.c
> +++ b/util/qht.c
> @@ -69,6 +69,7 @@
>  #include "qemu/qht.h"
>  #include "qemu/atomic.h"
>  #include "qemu/rcu.h"
> +#include "qemu/tsan.h"
>  
>  //#define QHT_DEBUG
>  
> @@ -580,10 +581,12 @@ static void *qht_insert__locked(const struct qht *ht, struct qht_map *map,
>          b = b->next;
>      } while (b);
>  
> +    TSAN_ANNOTATE_IGNORE_WRITES_BEGIN();
>      b = qemu_memalign(QHT_BUCKET_ALIGN, sizeof(*b));
>      memset(b, 0, sizeof(*b));
>      new = b;
>      i = 0;
> +    TSAN_ANNOTATE_IGNORE_WRITES_END();

I cannot reproduce this warning post-series with detect_deadlocks=0
but my hypothesis is that this is a side effect of tsan not understanding
the seqlock: tsan sees that below we "publish" this piece of memory with
an atomic write (in atomic_rcu_set), and does not see that with
seqlock_write_begin we have a write memory barrier. I wonder if
what we need instead is to annotate the seqlock functions, not the
callers.

Thanks,

		E.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 00/19] Add Thread Sanitizer support to QEMU
  2020-05-22 16:07 [PATCH 00/19] Add Thread Sanitizer support to QEMU Robert Foley
                   ` (19 preceding siblings ...)
  2020-05-22 21:07 ` [PATCH 00/19] Add Thread Sanitizer support to QEMU no-reply
@ 2020-05-23 21:36 ` Emilio G. Cota
  2020-05-26 15:18   ` Robert Foley
  20 siblings, 1 reply; 51+ messages in thread
From: Emilio G. Cota @ 2020-05-23 21:36 UTC (permalink / raw)
  To: Robert Foley; +Cc: peter.puhov, alex.bennee, qemu-devel

On Fri, May 22, 2020 at 12:07:36 -0400, Robert Foley wrote:
> This patch series continues the work done by Emilio Cota and others to add
> Thread Sanitizer (TSan) support to QEMU.
> 
> The starting point for this work was Emilio's branch here:
> https://github.com/cota/qemu/commits/tsan
> specifically this commit: 0be125fc0afd47218b34d2019abdd19b644f3199
> 
> The purpose of this patch is not to fix all the TSan warnings, but to enable
> the TSan support so that QEMU developers can start using the tool.  
> We found this tool useful and even ran it on our recent changes in
> the cpu-locks series.
> Clearly there is work to do here to clean up all the warnings. :)  
> We have made a start to cleaning up these warnings by getting a VM to boot 
> cleanly with no TSan warnings.  
> We have also made an effort to introduce enough of the TSan suppression
> mechanisms, so that others can continue this work.

Thank you for doing this work! Great to see this finally coming along.

What are your plans wrt the per-cpu-lock branch? Given that some of
the races reported here are fixed there, I think it would make sense to
defer those patches to the per-cpu-lock series (i.e. patches 2/19, parts
of 13/19, and 18/19) so that this series goes in first (it is a lot
less likely to break anything).

Also, I would not worry too much about rushing to bring warnings to
0; what's important is that with the warnings we now know where to
focus on, and then we can carefully fix races. In particular I think
all annotations we add must have a comment, since otherwise we are
at the risk of blindlessly silencing warnings of real races.

> Issues:
> - When running docker-test-quick under TSan there are several tests which hang
>   - The unit tests which seem to hang under TSan:
>     test-char, test-qdev-global-props, and test-qga.  
>   - If we comment out those tests, check-unit finishes, albeit with 
>     a couple of warnings. :)

I've noticed another issue (that I did not notice on my previous
machine), which is that tsan blows up when in qht we lock all
of the bucket's locks. We then get this assert from tsan, since
it has a static limit of 64 mutexes held at any given time:

FATAL: ThreadSanitizer CHECK failed: /build/llvm-toolchain-10-yegZYJ/llvm-toolchain-10-10.0.0/compiler-rt/lib/sanitizer_common/sanitizer_deadlock_detector.h:67 "((n_all_locks_)) < (((sizeof(all_locks_with_contexts_)/sizeof((all_locks_with_contexts_)[0]))))" (0x40, 0x40)
    #0 __tsan::TsanCheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) <null> (qemu-system-aarch64+0x49f9f5)
    #1 __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) <null> (qemu-system-aarch64+0x4b651f)
    #2 __sanitizer::DeadlockDetectorTLS<__sanitizer::TwoLevelBitVector<1ul, __sanitizer::BasicBitVector<unsigned long> > >::addLock(unsigned long, unsigned long, unsigned int) <null> (qemu-system-aarch64+0x4aacbc)
    #3 __sanitizer::DD::MutexAfterLock(__sanitizer::DDCallback*, __sanitizer::DDMutex*, bool, bool) <null> (qemu-system-aarch64+0x4aa22e)
    #4 __tsan::MutexPostLock(__tsan::ThreadState*, unsigned long, unsigned long, unsigned int, int) <null> (qemu-system-aarch64+0x49ded8)
    #5 __tsan_mutex_post_lock <null> (qemu-system-aarch64+0x48175a)
    #6 qemu_spin_lock /home/cota/src/qemu/include/qemu/thread.h:245:5 (qemu-system-aarch64+0xb1283b)
    #7 qht_map_lock_buckets /home/cota/src/qemu/util/qht.c:253:9 (qemu-system-aarch64+0xb1283b)

A workaround for now is to run qemu with TSAN_OPTIONS=detect_deadlocks=0,
although selectively disabling tsan for qht_map_lock/unlock_buckets would
be ideal (not sure if it's possible).

Another warning I'm reliably seen is:
WARNING: ThreadSanitizer: double lock of a mutex (pid=489006)
    #0 pthread_mutex_lock <null> (qemu-system-aarch64+0x457596)
    #1 qemu_mutex_lock_impl /home/cota/src/qemu/util/qemu-thread-posix.c:79:11 (qemu-system-aarch64+0xaf7e3c)

  Location is heap block of size 328 at 0x7b4800114900 allocated by main thread:
    #0 calloc <null> (qemu-system-aarch64+0x438b80)
    #1 g_malloc0 <null> (libglib-2.0.so.0+0x57d30)

  Mutex M57 (0x7b4800114960) created at:
    #0 pthread_mutex_init <null> (qemu-system-aarch64+0x43b74d)
    #1 qemu_rec_mutex_init /home/cota/src/qemu/util/qemu-thread-posix.c:119:11 (qemu-system-aarch64+0xaf815b)

But this one seems safe to ignore.

Thanks,
		E.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 10/19] include/qemu: Added tsan.h for annotations.
  2020-05-23 17:20   ` Emilio G. Cota
@ 2020-05-23 21:37     ` Emilio G. Cota
  0 siblings, 0 replies; 51+ messages in thread
From: Emilio G. Cota @ 2020-05-23 21:37 UTC (permalink / raw)
  To: Robert Foley; +Cc: peter.puhov, alex.bennee, qemu-devel

On Sat, May 23, 2020 at 13:20:15 -0400, Emilio G. Cota wrote:
> On Fri, May 22, 2020 at 12:07:46 -0400, Robert Foley wrote:
> > These annotations will allow us to give tsan
> > additional hints.  For example, we can inform
> > tsan about reads/writes to ignore to silence certain
> > classes of warnings.
> > We can also annotate threads so that the proper thread
> > naming shows up in tsan warning results.
> > 
> > Signed-off-by: Robert Foley <robert.foley@linaro.org>
> 
> Reviewed-by: Emilio G. Cota <cota@braap.org>

Although consider prefixing the constants with QEMU_.

Thanks,

		E.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 02/19] cpu: convert queued work to a QSIMPLEQ
  2020-05-22 16:07 ` [PATCH 02/19] cpu: convert queued work to a QSIMPLEQ Robert Foley
@ 2020-05-24 10:20   ` Philippe Mathieu-Daudé
  2020-05-26 15:01     ` Robert Foley
  0 siblings, 1 reply; 51+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-05-24 10:20 UTC (permalink / raw)
  To: Robert Foley, qemu-devel; +Cc: peter.puhov, cota, alex.bennee

On 5/22/20 6:07 PM, Robert Foley wrote:
> From: "Emilio G. Cota" <cota@braap.org>
> 
> Instead of open-coding it.

Please use a full sentence (repeating the patch subject):

"Convert queued work to a QSIMPLEQ instead of open-coding it."

(Not all email clients display the subject preceding the content).

Otherwise:
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

> 
> While at it, make sure that all accesses to the list are
> performed while holding the list's lock.
> 
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
> Signed-off-by: Emilio G. Cota <cota@braap.org>
> Signed-off-by: Robert Foley <robert.foley@linaro.org>
> ---
>  cpus-common.c         | 25 ++++++++-----------------
>  cpus.c                | 14 ++++++++++++--
>  hw/core/cpu.c         |  1 +
>  include/hw/core/cpu.h |  6 +++---
>  4 files changed, 24 insertions(+), 22 deletions(-)
> 
> diff --git a/cpus-common.c b/cpus-common.c
> index 55d5df8923..210fc7fc39 100644
> --- a/cpus-common.c
> +++ b/cpus-common.c
> @@ -97,7 +97,7 @@ void cpu_list_remove(CPUState *cpu)
>  }
>  
>  struct qemu_work_item {
> -    struct qemu_work_item *next;
> +    QSIMPLEQ_ENTRY(qemu_work_item) node;
>      run_on_cpu_func func;
>      run_on_cpu_data data;
>      bool free, exclusive, done;
> @@ -106,13 +106,7 @@ struct qemu_work_item {
>  static void queue_work_on_cpu(CPUState *cpu, struct qemu_work_item *wi)
>  {
>      qemu_mutex_lock(&cpu->work_mutex);
> -    if (cpu->queued_work_first == NULL) {
> -        cpu->queued_work_first = wi;
> -    } else {
> -        cpu->queued_work_last->next = wi;
> -    }
> -    cpu->queued_work_last = wi;
> -    wi->next = NULL;
> +    QSIMPLEQ_INSERT_TAIL(&cpu->work_list, wi, node);
>      wi->done = false;
>      qemu_mutex_unlock(&cpu->work_mutex);
>  
> @@ -306,17 +300,14 @@ void process_queued_cpu_work(CPUState *cpu)
>  {
>      struct qemu_work_item *wi;
>  
> -    if (cpu->queued_work_first == NULL) {
> +    qemu_mutex_lock(&cpu->work_mutex);
> +    if (QSIMPLEQ_EMPTY(&cpu->work_list)) {
> +        qemu_mutex_unlock(&cpu->work_mutex);
>          return;
>      }
> -
> -    qemu_mutex_lock(&cpu->work_mutex);
> -    while (cpu->queued_work_first != NULL) {
> -        wi = cpu->queued_work_first;
> -        cpu->queued_work_first = wi->next;
> -        if (!cpu->queued_work_first) {
> -            cpu->queued_work_last = NULL;
> -        }
> +    while (!QSIMPLEQ_EMPTY(&cpu->work_list)) {
> +        wi = QSIMPLEQ_FIRST(&cpu->work_list);
> +        QSIMPLEQ_REMOVE_HEAD(&cpu->work_list, node);
>          qemu_mutex_unlock(&cpu->work_mutex);
>          if (wi->exclusive) {
>              /* Running work items outside the BQL avoids the following deadlock:
> diff --git a/cpus.c b/cpus.c
> index 5670c96bcf..af44027549 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -97,9 +97,19 @@ bool cpu_is_stopped(CPUState *cpu)
>      return cpu->stopped || !runstate_is_running();
>  }
>  
> +static inline bool cpu_work_list_empty(CPUState *cpu)
> +{
> +    bool ret;
> +
> +    qemu_mutex_lock(&cpu->work_mutex);
> +    ret = QSIMPLEQ_EMPTY(&cpu->work_list);
> +    qemu_mutex_unlock(&cpu->work_mutex);
> +    return ret;
> +}
> +
>  static bool cpu_thread_is_idle(CPUState *cpu)
>  {
> -    if (cpu->stop || cpu->queued_work_first) {
> +    if (cpu->stop || !cpu_work_list_empty(cpu)) {
>          return false;
>      }
>      if (cpu_is_stopped(cpu)) {
> @@ -1498,7 +1508,7 @@ static void *qemu_tcg_rr_cpu_thread_fn(void *arg)
>              cpu = first_cpu;
>          }
>  
> -        while (cpu && !cpu->queued_work_first && !cpu->exit_request) {
> +        while (cpu && cpu_work_list_empty(cpu) && !cpu->exit_request) {
>  
>              atomic_mb_set(&tcg_current_rr_cpu, cpu);
>              current_cpu = cpu;
> diff --git a/hw/core/cpu.c b/hw/core/cpu.c
> index 5284d384fb..77703d62b7 100644
> --- a/hw/core/cpu.c
> +++ b/hw/core/cpu.c
> @@ -368,6 +368,7 @@ static void cpu_common_initfn(Object *obj)
>      cpu->nr_threads = 1;
>  
>      qemu_mutex_init(&cpu->work_mutex);
> +    QSIMPLEQ_INIT(&cpu->work_list);
>      QTAILQ_INIT(&cpu->breakpoints);
>      QTAILQ_INIT(&cpu->watchpoints);
>  
> diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> index 07f7698155..d78ff1d165 100644
> --- a/include/hw/core/cpu.h
> +++ b/include/hw/core/cpu.h
> @@ -331,8 +331,8 @@ struct qemu_work_item;
>   * @opaque: User data.
>   * @mem_io_pc: Host Program Counter at which the memory was accessed.
>   * @kvm_fd: vCPU file descriptor for KVM.
> - * @work_mutex: Lock to prevent multiple access to queued_work_*.
> - * @queued_work_first: First asynchronous work pending.
> + * @work_mutex: Lock to prevent multiple access to @work_list.
> + * @work_list: List of pending asynchronous work.
>   * @trace_dstate_delayed: Delayed changes to trace_dstate (includes all changes
>   *                        to @trace_dstate).
>   * @trace_dstate: Dynamic tracing state of events for this vCPU (bitmask).
> @@ -376,7 +376,7 @@ struct CPUState {
>      sigjmp_buf jmp_env;
>  
>      QemuMutex work_mutex;
> -    struct qemu_work_item *queued_work_first, *queued_work_last;
> +    QSIMPLEQ_HEAD(, qemu_work_item) work_list;
>  
>      CPUAddressSpace *cpu_ases;
>      int num_ases;
> 



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 14/19] util/async: Fixed tsan warnings
  2020-05-22 16:07 ` [PATCH 14/19] util/async: " Robert Foley
  2020-05-23 20:12   ` Emilio G. Cota
@ 2020-05-26 10:32   ` Stefan Hajnoczi
  2020-05-26 15:21     ` Robert Foley
  1 sibling, 1 reply; 51+ messages in thread
From: Stefan Hajnoczi @ 2020-05-26 10:32 UTC (permalink / raw)
  To: Robert Foley; +Cc: Fam Zheng, peter.puhov, cota, alex.bennee, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 1816 bytes --]

On Fri, May 22, 2020 at 12:07:50PM -0400, Robert Foley wrote:
> For example:
>   Atomic write of size 8 at 0x7b4800113c28 by main thread (mutexes: write M30):
>     #0 __tsan_atomic64_exchange <null> (qemu-system-aarch64+0x386f85)
>     #1 aio_bh_poll util/async.c:146:5 (qemu-system-aarch64+0xcd1f61)
>     #2 aio_dispatch util/aio-posix.c:380:5 (qemu-system-aarch64+0xcd8abb)
>     #3 aio_ctx_dispatch util/async.c:298:5 (qemu-system-aarch64+0xcd31b0)
>     #4 g_main_context_dispatch <null> (libglib-2.0.so.0+0x4c416)
>     #5 qemu_main_loop softmmu/vl.c:1664:9 (qemu-system-aarch64+0x5cc6d6)
>     #6 main softmmu/main.c:49:5 (qemu-system-aarch64+0xc62857)
> 
>   Previous read of size 8 at 0x7b4800113c28 by thread T3 (mutexes: write M81):
>     #0 aio_bh_enqueue util/async.c:81:9 (qemu-system-aarch64+0xcd2267)
>     #1 qemu_bh_schedule util/async.c:181:5 (qemu-system-aarch64+0xcd2267)
>     #2 worker_thread util/thread-pool.c:113:9 (qemu-system-aarch64+0xcd473c)
>     #3 qemu_thread_start util/qemu-thread-posix.c:519:9 (qemu-system-aarch64+0xcde280)
> 
> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> Cc: Fam Zheng <fam@euphon.net>
> Signed-off-by: Robert Foley <robert.foley@linaro.org>
> ---
>  util/async.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)

Please explain why these warnings should be ignored in the commit
description. I have been CCed on this patch, am unfamiliar with TSAN
annotations, and now need to figure out:
1. Is ignoring the warning safe or should async.c be fixed somehow?
2. How do the annotation macros work and are they being used correectly?

It's likely that people coming across this commit in the future would
also benefit from information in the commit description that makes these
things clear.

Thanks,
Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 01/19] configure: add --enable-tsan flag + fiber annotations for coroutine-ucontext
  2020-05-23 16:55   ` Philippe Mathieu-Daudé
@ 2020-05-26 12:38     ` Robert Foley
  0 siblings, 0 replies; 51+ messages in thread
From: Robert Foley @ 2020-05-26 12:38 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Peter Puhov, Emilio G. Cota, Alex Bennée, QEMU Developers,
	Lingfeng Yang

On Sat, 23 May 2020 at 12:55, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
>
> Hi Robert,
>
> On 5/22/20 6:07 PM, Robert Foley wrote:
> > From: Lingfeng Yang <lfy@google.com>
> >
> > We tried running QEMU under tsan in 2016, but tsan's lack of support for
> > longjmp-based fibers was a blocker:
<snip>
> > @@ -6277,6 +6304,14 @@ if test "$have_asan" = "yes"; then
> >             "Without code annotation, the report may be inferior."
> >    fi
> >  fi
> > +if test "$have_tsan" = "yes" ; then
> > +  if test "$have_tsan_iface_fiber" = "yes" ; then
> > +    QEMU_CFLAGS="-fsanitize=thread $QEMU_CFLAGS"
> > +    QEMU_LDFLAGS="-fsanitize=thread $QEMU_LDFLAGS"
> > +  else
> > +    echo "Cannot enable TSAN due to missing fiber annotation interface."
>
> I tried your series and there were no changes anywhere, then I looked at
> how TSan work, started to debug, to finally realize my build was not
> using TSan (clang8). Please use to something such:
>
>      if test "$tsan" = "yes" ; then
>         error_exit "Cannot enable TSAN due to missing fiber" \
>                    "annotation interface."
>      fi

This is a good point.  Will make these changes.

Thanks & Regards,
-Rob


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 18/19] target/arm: Fix tsan warning in cpu.c
  2020-05-23 17:18         ` Emilio G. Cota
@ 2020-05-26 14:01           ` Robert Foley
  0 siblings, 0 replies; 51+ messages in thread
From: Robert Foley @ 2020-05-26 14:01 UTC (permalink / raw)
  To: Emilio G. Cota
  Cc: Peter Maydell, Richard Henderson, Alex Bennée,
	QEMU Developers, Peter Puhov

On Sat, 23 May 2020 at 13:18, Emilio G. Cota <cota@braap.org> wrote:
>
> On Fri, May 22, 2020 at 23:36:18 +0100, Peter Maydell wrote:
> > So is this:
> >  (a) a TSan false positive, because we've analysed the use
> >      of this struct field and know it's not a race because
> >      [details], but which we're choosing to silence in this way
> >  (b) an actual race for which the correct fix is to make the
> >      accesses atomic because [details]
> >
> > ?
>
> It is (b), as shown in the per-cpu lock series. In particular,
> see these two patches:
> - [PATCH v9 33/74] cpu: define cpu_interrupt_request helpers
> https://lists.nongnu.org/archive/html/qemu-devel/2020-05/msg06322.html
> - [PATCH v9 39/74] arm: convert to cpu_interrupt_request
> https://lists.nongnu.org/archive/html/qemu-devel/2020-05/msg06328.html
>
> Since a more thorough fix is included in that other series, I think this
> patch should be dropped from this series -- I'll post a reply to patch
> 00/19 with more details.

I agree, we will re-focus the patch series a bit and drop this patch
from the series.

Thanks & Regards,
-Rob
> Thanks,
>
>                 Emilio


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 12/19] configure: added tsan support for blacklist.
  2020-05-23 17:27   ` Emilio G. Cota
@ 2020-05-26 14:07     ` Robert Foley
  0 siblings, 0 replies; 51+ messages in thread
From: Robert Foley @ 2020-05-26 14:07 UTC (permalink / raw)
  To: Emilio G. Cota; +Cc: Peter Puhov, Alex Bennée, QEMU Developers

On Sat, 23 May 2020 at 13:27, Emilio G. Cota <cota@braap.org> wrote:
>
> On Fri, May 22, 2020 at 12:07:48 -0400, Robert Foley wrote:
> > Initially put several files into blacklist that were
> > causing the most problems, namely bitops.c and bitmap.c.
> >
> > Signed-off-by: Robert Foley <robert.foley@linaro.org>
> > ---
> >  configure                 | 3 ++-
> >  tests/tsan/blacklist.tsan | 5 +++++
> >  2 files changed, 7 insertions(+), 1 deletion(-)
> >  create mode 100644 tests/tsan/blacklist.tsan
> >
> > diff --git a/configure b/configure
> > index c95c54fb48..8a86a0638d 100755
> > --- a/configure
> > +++ b/configure
> > @@ -6306,7 +6306,8 @@ if test "$have_asan" = "yes"; then
> >  fi
> >  if test "$have_tsan" = "yes" ; then
> >    if test "$have_tsan_iface_fiber" = "yes" ; then
> > -    QEMU_CFLAGS="-fsanitize=thread $QEMU_CFLAGS"
> > +    QEMU_CFLAGS="-fsanitize=thread -fsanitize-blacklist="\
> > +             "\$(SRC_PATH)/tests/tsan/blacklist.tsan $QEMU_CFLAGS"
>
> I presume the goal here is to fix these races later (my default assumption
> is that warnings == races, since most warnings are indeed races). If so,
> please consider making the suppression optional (via
> "--extra-cflags=-fsanitize-blacklist=path-to-this-file"), since that
> way the reports are likely to get more eyeballs.

Yes, the goal is to fix these later.  Will add an explanation of this in
blacklist.tsan.

We will make the blacklist optional, and also add some documentation
on how to use the blacklist in the TSan section of testing.rst.

Thanks & Regards,
-Rob

>
> Thanks,
>
>                 E.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 02/19] cpu: convert queued work to a QSIMPLEQ
  2020-05-24 10:20   ` Philippe Mathieu-Daudé
@ 2020-05-26 15:01     ` Robert Foley
  0 siblings, 0 replies; 51+ messages in thread
From: Robert Foley @ 2020-05-26 15:01 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Peter Puhov, Emilio G. Cota, Alex Bennée, QEMU Developers

On Sun, 24 May 2020 at 06:21, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
>
> On 5/22/20 6:07 PM, Robert Foley wrote:
> > From: "Emilio G. Cota" <cota@braap.org>
> >
> > Instead of open-coding it.
>
> Please use a full sentence (repeating the patch subject):
>
> "Convert queued work to a QSIMPLEQ instead of open-coding it."
>
> (Not all email clients display the subject preceding the content).

OK, I will update this commit message.

Thanks & Regards,
-Rob
>
> Otherwise:
> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
>
> >
> > While at it, make sure that all accesses to the list are
> > performed while holding the list's lock.
> >
> > Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> > Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
> > Signed-off-by: Emilio G. Cota <cota@braap.org>
> > Signed-off-by: Robert Foley <robert.foley@linaro.org>
> > ---
> >  cpus-common.c         | 25 ++++++++-----------------
> >  cpus.c                | 14 ++++++++++++--
> >  hw/core/cpu.c         |  1 +
> >  include/hw/core/cpu.h |  6 +++---
> >  4 files changed, 24 insertions(+), 22 deletions(-)
> >
> > diff --git a/cpus-common.c b/cpus-common.c
> > index 55d5df8923..210fc7fc39 100644
> > --- a/cpus-common.c
> > +++ b/cpus-common.c
> > @@ -97,7 +97,7 @@ void cpu_list_remove(CPUState *cpu)
> >  }
> >
> >  struct qemu_work_item {
> > -    struct qemu_work_item *next;
> > +    QSIMPLEQ_ENTRY(qemu_work_item) node;
> >      run_on_cpu_func func;
> >      run_on_cpu_data data;
> >      bool free, exclusive, done;
> > @@ -106,13 +106,7 @@ struct qemu_work_item {
> >  static void queue_work_on_cpu(CPUState *cpu, struct qemu_work_item *wi)
> >  {
> >      qemu_mutex_lock(&cpu->work_mutex);
> > -    if (cpu->queued_work_first == NULL) {
> > -        cpu->queued_work_first = wi;
> > -    } else {
> > -        cpu->queued_work_last->next = wi;
> > -    }
> > -    cpu->queued_work_last = wi;
> > -    wi->next = NULL;
> > +    QSIMPLEQ_INSERT_TAIL(&cpu->work_list, wi, node);
> >      wi->done = false;
> >      qemu_mutex_unlock(&cpu->work_mutex);
> >
> > @@ -306,17 +300,14 @@ void process_queued_cpu_work(CPUState *cpu)
> >  {
> >      struct qemu_work_item *wi;
> >
> > -    if (cpu->queued_work_first == NULL) {
> > +    qemu_mutex_lock(&cpu->work_mutex);
> > +    if (QSIMPLEQ_EMPTY(&cpu->work_list)) {
> > +        qemu_mutex_unlock(&cpu->work_mutex);
> >          return;
> >      }
> > -
> > -    qemu_mutex_lock(&cpu->work_mutex);
> > -    while (cpu->queued_work_first != NULL) {
> > -        wi = cpu->queued_work_first;
> > -        cpu->queued_work_first = wi->next;
> > -        if (!cpu->queued_work_first) {
> > -            cpu->queued_work_last = NULL;
> > -        }
> > +    while (!QSIMPLEQ_EMPTY(&cpu->work_list)) {
> > +        wi = QSIMPLEQ_FIRST(&cpu->work_list);
> > +        QSIMPLEQ_REMOVE_HEAD(&cpu->work_list, node);
> >          qemu_mutex_unlock(&cpu->work_mutex);
> >          if (wi->exclusive) {
> >              /* Running work items outside the BQL avoids the following deadlock:
> > diff --git a/cpus.c b/cpus.c
> > index 5670c96bcf..af44027549 100644
> > --- a/cpus.c
> > +++ b/cpus.c
> > @@ -97,9 +97,19 @@ bool cpu_is_stopped(CPUState *cpu)
> >      return cpu->stopped || !runstate_is_running();
> >  }
> >
> > +static inline bool cpu_work_list_empty(CPUState *cpu)
> > +{
> > +    bool ret;
> > +
> > +    qemu_mutex_lock(&cpu->work_mutex);
> > +    ret = QSIMPLEQ_EMPTY(&cpu->work_list);
> > +    qemu_mutex_unlock(&cpu->work_mutex);
> > +    return ret;
> > +}
> > +
> >  static bool cpu_thread_is_idle(CPUState *cpu)
> >  {
> > -    if (cpu->stop || cpu->queued_work_first) {
> > +    if (cpu->stop || !cpu_work_list_empty(cpu)) {
> >          return false;
> >      }
> >      if (cpu_is_stopped(cpu)) {
> > @@ -1498,7 +1508,7 @@ static void *qemu_tcg_rr_cpu_thread_fn(void *arg)
> >              cpu = first_cpu;
> >          }
> >
> > -        while (cpu && !cpu->queued_work_first && !cpu->exit_request) {
> > +        while (cpu && cpu_work_list_empty(cpu) && !cpu->exit_request) {
> >
> >              atomic_mb_set(&tcg_current_rr_cpu, cpu);
> >              current_cpu = cpu;
> > diff --git a/hw/core/cpu.c b/hw/core/cpu.c
> > index 5284d384fb..77703d62b7 100644
> > --- a/hw/core/cpu.c
> > +++ b/hw/core/cpu.c
> > @@ -368,6 +368,7 @@ static void cpu_common_initfn(Object *obj)
> >      cpu->nr_threads = 1;
> >
> >      qemu_mutex_init(&cpu->work_mutex);
> > +    QSIMPLEQ_INIT(&cpu->work_list);
> >      QTAILQ_INIT(&cpu->breakpoints);
> >      QTAILQ_INIT(&cpu->watchpoints);
> >
> > diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> > index 07f7698155..d78ff1d165 100644
> > --- a/include/hw/core/cpu.h
> > +++ b/include/hw/core/cpu.h
> > @@ -331,8 +331,8 @@ struct qemu_work_item;
> >   * @opaque: User data.
> >   * @mem_io_pc: Host Program Counter at which the memory was accessed.
> >   * @kvm_fd: vCPU file descriptor for KVM.
> > - * @work_mutex: Lock to prevent multiple access to queued_work_*.
> > - * @queued_work_first: First asynchronous work pending.
> > + * @work_mutex: Lock to prevent multiple access to @work_list.
> > + * @work_list: List of pending asynchronous work.
> >   * @trace_dstate_delayed: Delayed changes to trace_dstate (includes all changes
> >   *                        to @trace_dstate).
> >   * @trace_dstate: Dynamic tracing state of events for this vCPU (bitmask).
> > @@ -376,7 +376,7 @@ struct CPUState {
> >      sigjmp_buf jmp_env;
> >
> >      QemuMutex work_mutex;
> > -    struct qemu_work_item *queued_work_first, *queued_work_last;
> > +    QSIMPLEQ_HEAD(, qemu_work_item) work_list;
> >
> >      CPUAddressSpace *cpu_ases;
> >      int num_ases;
> >
>


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 13/19] accel/tcg: Fixed tsan warnings.
  2020-05-23 20:06   ` Emilio G. Cota
@ 2020-05-26 15:14     ` Robert Foley
  0 siblings, 0 replies; 51+ messages in thread
From: Robert Foley @ 2020-05-26 15:14 UTC (permalink / raw)
  To: Emilio G. Cota
  Cc: Peter Puhov, Richard Henderson, Alex Bennée,
	QEMU Developers, Paolo Bonzini

On Sat, 23 May 2020 at 16:06, Emilio G. Cota <cota@braap.org> wrote:
>
> On Fri, May 22, 2020 at 12:07:49 -0400, Robert Foley wrote:
> > For example:
> > WARNING: ThreadSanitizer: data race (pid=35425)
> >   Write of size 4 at 0x7bbc000000ac by main thread (mutexes: write M875):
> >     #0 cpu_reset_interrupt hw/core/cpu.c:107:28 (qemu-system-aarch64+0x843790)
> >     #1 arm_cpu_set_irq target/arm/cpu.c (qemu-system-aarch64+0x616265)
> >     #2 qemu_set_irq hw/core/irq.c:44:5 (qemu-system-aarch64+0x8462ca)
> >   Previous atomic read of size 4 at 0x7bbc000000ac by thread T6:
> >     #0 __tsan_atomic32_load <null> (qemu-system-aarch64+0x394c1c)
> >     #1 cpu_handle_interrupt accel/tcg/cpu-exec.c:534:9 (qemu-system-aarch64+0x4b7e79)
> >     #2 cpu_exec accel/tcg/cpu-exec.c:720:17 (qemu-system-aarch64+0x4b7e79)
> > or
> > WARNING: ThreadSanitizer: data race (pid=25425)
> >   Read of size 8 at 0x7f8ad8e138d0 by thread T10:
> >     #0 tb_lookup_cmp accel/tcg/cpu-exec.c:307:13 (qemu-system-aarch64+0x4ac4d2)
> >     #1 qht_do_lookup util/qht.c:502:34 (qemu-system-aarch64+0xd05264)
> >   Previous write of size 8 at 0x7f8ad8e138d0 by thread T15 (mutexes: write M728311726235541804):
> >     #0 tb_link_page accel/tcg/translate-all.c:1625:26 (qemu-system-aarch64+0x4b0bf2)
> >     #1 tb_gen_code accel/tcg/translate-all.c:1865:19 (qemu-system-aarch64+0x4b0bf2)
> >     #2 tb_find accel/tcg/cpu-exec.c:407:14 (qemu-system-aarch64+0x4ad77c)
>
> I see you're working through the warnings in this file, but I think it would
> be better to forget about files and focus on the data itself.
> Therefore this patch should be split in two: (1) cpu-<interrupt_request
> and (2) gen_code_buf. (1) requires a lot of changes with a proper audit;
> the per-cpu-lock series has a possible solution for that, so I will
> ignore those hunks and just comment on (2) below.

We will be dropping the changes in this file which overlap with
the per-cpu-locks patch.
>
> > Cc: Richard Henderson <richard.henderson@linaro.org>
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Signed-off-by: Robert Foley <robert.foley@linaro.org>
> > ---
> >  accel/tcg/tcg-all.c       | 4 ++--
> >  accel/tcg/tcg-runtime.c   | 7 ++++++-
> >  accel/tcg/translate-all.c | 6 +++++-
> >  hw/core/cpu.c             | 2 +-
> >  4 files changed, 14 insertions(+), 5 deletions(-)
> >
> (snip)
> > diff --git a/accel/tcg/tcg-runtime.c b/accel/tcg/tcg-runtime.c
> > index 446465a09a..bd0cd77450 100644
> > --- a/accel/tcg/tcg-runtime.c
> > +++ b/accel/tcg/tcg-runtime.c
> > @@ -31,6 +31,7 @@
> >  #include "disas/disas.h"
> >  #include "exec/log.h"
> >  #include "tcg/tcg.h"
> > +#include "qemu/tsan.h"
> >
> >  /* 32-bit helpers */
> >
> > @@ -151,6 +152,7 @@ void *HELPER(lookup_tb_ptr)(CPUArchState *env)
> >      TranslationBlock *tb;
> >      target_ulong cs_base, pc;
> >      uint32_t flags;
> > +    void *tc_ptr;
> >
> >      tb = tb_lookup__cpu_state(cpu, &pc, &cs_base, &flags, curr_cflags());
> >      if (tb == NULL) {
> > @@ -161,7 +163,10 @@ void *HELPER(lookup_tb_ptr)(CPUArchState *env)
> >                             TARGET_FMT_lx "/" TARGET_FMT_lx "/%#x] %s\n",
> >                             cpu->cpu_index, tb->tc.ptr, cs_base, pc, flags,
> >                             lookup_symbol(pc));
> > -    return tb->tc.ptr;
> > +    TSAN_ANNOTATE_IGNORE_READS_BEGIN();
> > +    tc_ptr = tb->tc.ptr;
> > +    TSAN_ANNOTATE_IGNORE_READS_END();
> > +    return tc_ptr;
>
> I'm not sure these are needed. At least after applying all other patches
> in this series, I don't get a warning here.

I think we will also be dropping the ANNOTATE calls here as we re-focus
the patch series.

Thanks & Regards,
-Rob

>
> >  }
> >
> >  void HELPER(exit_atomic)(CPUArchState *env)
> > diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
> > index 3fb71a1503..6c0e61994c 100644
> > --- a/accel/tcg/translate-all.c
> > +++ b/accel/tcg/translate-all.c
> > @@ -58,6 +58,7 @@
> >  #include "exec/log.h"
> >  #include "sysemu/cpus.h"
> >  #include "sysemu/tcg.h"
> > +#include "qemu/tsan.h"
> >
> >  /* #define DEBUG_TB_INVALIDATE */
> >  /* #define DEBUG_TB_FLUSH */
> > @@ -1704,6 +1705,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
> >          max_insns = 1;
> >      }
> >
> > +    TSAN_ANNOTATE_IGNORE_WRITES_BEGIN();
>
> Same here, I don't get a warning in this hunk if I remove these,
> except for:
> ---
> WARNING: ThreadSanitizer: data race (pid=445867)
>   Atomic read of size 1 at 0x7f906e050158 by thread T7:
>     #0 __tsan_mutex_post_lock <null> (qemu-system-aarch64+0x481721)
>     #1 qemu_spin_lock /home/cota/src/qemu/include/qemu/thread.h:244:5 (qemu-system-aarch64+0x5578e9)
>     #2 tb_add_jump /home/cota/src/qemu/accel/tcg/cpu-exec.c:363:5 (qemu-system-aarch64+0x5578e9)
>     #3 tb_find /home/cota/src/qemu/accel/tcg/cpu-exec.c:423:9 (qemu-system-aarch64+0x5578e9)
>
>   Previous write of size 1 at 0x7f906e050158 by thread T8:
>     #0 __tsan_mutex_create <null> (qemu-system-aarch64+0x481589)
>     #1 qemu_spin_init /home/cota/src/qemu/include/qemu/thread.h:221:5 (qemu-system-aarch64+0x559a71)
>     #2 tb_gen_code /home/cota/src/qemu/accel/tcg/translate-all.c:1875:5 (qemu-system-aarch64+0x559a71)
>
>   Thread T7 'CPU 0/TCG' (tid=445875, running) created by main thread at:
>     #0 pthread_create <null> (qemu-system-aarch64+0x43915b)
>     #1 qemu_thread_create /home/cota/src/qemu/util/qemu-thread-posix.c:558:11 (qemu-system-aarch64+0xaf91ff)
>
>   Thread T8 'CPU 1/TCG' (tid=445876, running) created by main thread at:
>     #0 pthread_create <null> (qemu-system-aarch64+0x43915b)
>     #1 qemu_thread_create /home/cota/src/qemu/util/qemu-thread-posix.c:558:11 (qemu-system-aarch64+0xaf91ff)
>
> SUMMARY: ThreadSanitizer: data race (/home/cota/src/qemu/build/aarch64-softmmu/qemu-system-aarch64+0x481721) in __tsan_mutex_post_lock
> ---
>
> Seems like tsan is confusing itself here.
>
> Thanks,
>                 E.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 00/19] Add Thread Sanitizer support to QEMU
  2020-05-23 21:36 ` Emilio G. Cota
@ 2020-05-26 15:18   ` Robert Foley
  0 siblings, 0 replies; 51+ messages in thread
From: Robert Foley @ 2020-05-26 15:18 UTC (permalink / raw)
  To: Emilio G. Cota; +Cc: Peter Puhov, Alex Bennée, QEMU Developers

On Sat, 23 May 2020 at 17:36, Emilio G. Cota <cota@braap.org> wrote:
>
> On Fri, May 22, 2020 at 12:07:36 -0400, Robert Foley wrote:
> > This patch series continues the work done by Emilio Cota and others to add
> > Thread Sanitizer (TSan) support to QEMU.
> >
> > The starting point for this work was Emilio's branch here:
> > https://github.com/cota/qemu/commits/tsan
> > specifically this commit: 0be125fc0afd47218b34d2019abdd19b644f3199
> >
> > The purpose of this patch is not to fix all the TSan warnings, but to enable
> > the TSan support so that QEMU developers can start using the tool.
> > We found this tool useful and even ran it on our recent changes in
> > the cpu-locks series.
> > Clearly there is work to do here to clean up all the warnings. :)
> > We have made a start to cleaning up these warnings by getting a VM to boot
> > cleanly with no TSan warnings.
> > We have also made an effort to introduce enough of the TSan suppression
> > mechanisms, so that others can continue this work.
>
> Thank you for doing this work! Great to see this finally coming along.
>
> What are your plans wrt the per-cpu-lock branch? Given that some of
> the races reported here are fixed there, I think it would make sense to
> defer those patches to the per-cpu-lock series (i.e. patches 2/19, parts
> of 13/19, and 18/19) so that this series goes in first (it is a lot
> less likely to break anything).

I agree with you that we should defer my patches which overlap with
the per-cpu-lock patch series.

>
> Also, I would not worry too much about rushing to bring warnings to
> 0; what's important is that with the warnings we now know where to
> focus on, and then we can carefully fix races. In particular I think
> all annotations we add must have a comment, since otherwise we are
> at the risk of blindlessly silencing warnings of real races.

In order to re-focus the series a bit, we are planning to drop the
annotations from the series.  This allows for a bit more focus on
enabling TSan and less on bringing the warnings to 0 as you mentioned.

At the same time, we're also going to add in a bit more details to
testing.rst on how to use the various suppression mechanisms,
annotations, blacklist and suppressions.txt.

>
> > Issues:
> > - When running docker-test-quick under TSan there are several tests which hang
> >   - The unit tests which seem to hang under TSan:
> >     test-char, test-qdev-global-props, and test-qga.
> >   - If we comment out those tests, check-unit finishes, albeit with
> >     a couple of warnings. :)
>
> I've noticed another issue (that I did not notice on my previous
> machine), which is that tsan blows up when in qht we lock all
> of the bucket's locks. We then get this assert from tsan, since
> it has a static limit of 64 mutexes held at any given time:
>
> FATAL: ThreadSanitizer CHECK failed: /build/llvm-toolchain-10-yegZYJ/llvm-toolchain-10-10.0.0/compiler-rt/lib/sanitizer_common/sanitizer_deadlock_detector.h:67 "((n_all_locks_)) < (((sizeof(all_locks_with_contexts_)/sizeof((all_locks_with_contexts_)[0]))))" (0x40, 0x40)
<snip>

> A workaround for now is to run qemu with TSAN_OPTIONS=detect_deadlocks=0,
> although selectively disabling tsan for qht_map_lock/unlock_buckets would
> be ideal (not sure if it's possible).

We have been using detect_deadlocks=0 to avoid this.
We will see if it is possible to disable tsan for just
qht_map_lock/unlock_buckets

>
> Another warning I'm reliably seen is:
> WARNING: ThreadSanitizer: double lock of a mutex (pid=489006)
>     #0 pthread_mutex_lock <null> (qemu-system-aarch64+0x457596)
>     #1 qemu_mutex_lock_impl /home/cota/src/qemu/util/qemu-thread-posix.c:79:11 (qemu-system-aarch64+0xaf7e3c)
>
>   Location is heap block of size 328 at 0x7b4800114900 allocated by main thread:
>     #0 calloc <null> (qemu-system-aarch64+0x438b80)
>     #1 g_malloc0 <null> (libglib-2.0.so.0+0x57d30)
>
>   Mutex M57 (0x7b4800114960) created at:
>     #0 pthread_mutex_init <null> (qemu-system-aarch64+0x43b74d)
>     #1 qemu_rec_mutex_init /home/cota/src/qemu/util/qemu-thread-posix.c:119:11 (qemu-system-aarch64+0xaf815b)
>
> But this one seems safe to ignore.
>

This one we disabled in the suppressions.txt file.
TSan reports a double lock even when the mutex is flagged as recursive.


Thanks & Regards,
-Rob
> Thanks,
>                 E.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 14/19] util/async: Fixed tsan warnings
  2020-05-23 20:12   ` Emilio G. Cota
@ 2020-05-26 15:19     ` Robert Foley
  0 siblings, 0 replies; 51+ messages in thread
From: Robert Foley @ 2020-05-26 15:19 UTC (permalink / raw)
  To: Emilio G. Cota
  Cc: Fam Zheng, Peter Puhov, Alex Bennée, QEMU Developers,
	Stefan Hajnoczi

On Sat, 23 May 2020 at 16:12, Emilio G. Cota <cota@braap.org> wrote:
>
> On Fri, May 22, 2020 at 12:07:50 -0400, Robert Foley wrote:
<snip>
>
> >      QSIMPLEQ_INSERT_TAIL(&ctx->bh_slice_list, &slice, next);
> >
> >      while ((s = QSIMPLEQ_FIRST(&ctx->bh_slice_list))) {
> > @@ -280,14 +285,16 @@ aio_ctx_check(GSource *source)
> >      aio_notify_accept(ctx);
> >
> >      QSLIST_FOREACH_RCU(bh, &ctx->bh_list, next) {
> > -        if ((bh->flags & (BH_SCHEDULED | BH_DELETED)) == BH_SCHEDULED) {
> > +        if ((atomic_read(&bh->flags) & (BH_SCHEDULED | BH_DELETED))
> > +             == BH_SCHEDULED) {
> >              return true;
> >          }
> >      }
> >
> >      QSIMPLEQ_FOREACH(s, &ctx->bh_slice_list, next) {
> >          QSLIST_FOREACH_RCU(bh, &s->bh_list, next) {
> > -            if ((bh->flags & (BH_SCHEDULED | BH_DELETED)) == BH_SCHEDULED) {
> > +            if ((atomic_read(&bh->flags) & (BH_SCHEDULED | BH_DELETED))
> > +                 == BH_SCHEDULED) {
>
> This hunk like the real fix. Also, I'd put "fix race" in the commit
> title as opposed to "fix warning" since fixing races is the goal, not
> fixing warnings.

Good point, will update the commit.

Thanks & Regards,
-Rob
>
> Thanks,
>
>                 Emilio


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 14/19] util/async: Fixed tsan warnings
  2020-05-26 10:32   ` Stefan Hajnoczi
@ 2020-05-26 15:21     ` Robert Foley
  0 siblings, 0 replies; 51+ messages in thread
From: Robert Foley @ 2020-05-26 15:21 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Fam Zheng, Peter Puhov, Emilio G. Cota, Alex Bennée,
	QEMU Developers

Hi Stefan,

On Tue, 26 May 2020 at 06:32, Stefan Hajnoczi <stefanha@redhat.com> wrote:
>
> On Fri, May 22, 2020 at 12:07:50PM -0400, Robert Foley wrote:
> > For example:
> >   Atomic write of size 8 at 0x7b4800113c28 by main thread (mutexes: write M30):
> >     #0 __tsan_atomic64_exchange <null> (qemu-system-aarch64+0x386f85)
> >     #1 aio_bh_poll util/async.c:146:5 (qemu-system-aarch64+0xcd1f61)
> >     #2 aio_dispatch util/aio-posix.c:380:5 (qemu-system-aarch64+0xcd8abb)
> >     #3 aio_ctx_dispatch util/async.c:298:5 (qemu-system-aarch64+0xcd31b0)
> >     #4 g_main_context_dispatch <null> (libglib-2.0.so.0+0x4c416)
> >     #5 qemu_main_loop softmmu/vl.c:1664:9 (qemu-system-aarch64+0x5cc6d6)
> >     #6 main softmmu/main.c:49:5 (qemu-system-aarch64+0xc62857)
> >
> >   Previous read of size 8 at 0x7b4800113c28 by thread T3 (mutexes: write M81):
> >     #0 aio_bh_enqueue util/async.c:81:9 (qemu-system-aarch64+0xcd2267)
> >     #1 qemu_bh_schedule util/async.c:181:5 (qemu-system-aarch64+0xcd2267)
> >     #2 worker_thread util/thread-pool.c:113:9 (qemu-system-aarch64+0xcd473c)
> >     #3 qemu_thread_start util/qemu-thread-posix.c:519:9 (qemu-system-aarch64+0xcde280)
> >
> > Cc: Stefan Hajnoczi <stefanha@redhat.com>
> > Cc: Fam Zheng <fam@euphon.net>
> > Signed-off-by: Robert Foley <robert.foley@linaro.org>
> > ---
> >  util/async.c | 11 +++++++++--
> >  1 file changed, 9 insertions(+), 2 deletions(-)
>
> Please explain why these warnings should be ignored in the commit
> description. I have been CCed on this patch, am unfamiliar with TSAN
> annotations, and now need to figure out:
> 1. Is ignoring the warning safe or should async.c be fixed somehow?
> 2. How do the annotation macros work and are they being used correectly?
>
> It's likely that people coming across this commit in the future would
> also benefit from information in the commit description that makes these
> things clear.
>
All good points.

I think we need to add a whole lot more details on TSan and its use
along with the use of the annotation macros among other things.
Will CC you on the updates to testing.rst, which will contain this
information.
This will hopefully provide the context, information and resources needed to
make the kinds of determinations we're looking for in 1. and 2.

As for the annotations, I think we are going to re-focus the patch and
remove the annotations for now.  However, we will plan to update
this commit description with more details on the changes to use atomic reads.

Thanks & Regards,
-Rob
> Thanks,
> Stefan


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 13/19] accel/tcg: Fixed tsan warnings.
  2020-05-22 16:07 ` [PATCH 13/19] accel/tcg: Fixed tsan warnings Robert Foley
  2020-05-23 20:06   ` Emilio G. Cota
@ 2020-05-26 18:47   ` Paolo Bonzini
  1 sibling, 0 replies; 51+ messages in thread
From: Paolo Bonzini @ 2020-05-26 18:47 UTC (permalink / raw)
  To: Robert Foley, qemu-devel
  Cc: peter.puhov, Richard Henderson, cota, alex.bennee

On 22/05/20 18:07, Robert Foley wrote:
> For example:
> WARNING: ThreadSanitizer: data race (pid=35425)
>   Write of size 4 at 0x7bbc000000ac by main thread (mutexes: write M875):
>     #0 cpu_reset_interrupt hw/core/cpu.c:107:28 (qemu-system-aarch64+0x843790)
>     #1 arm_cpu_set_irq target/arm/cpu.c (qemu-system-aarch64+0x616265)
>     #2 qemu_set_irq hw/core/irq.c:44:5 (qemu-system-aarch64+0x8462ca)
>   Previous atomic read of size 4 at 0x7bbc000000ac by thread T6:
>     #0 __tsan_atomic32_load <null> (qemu-system-aarch64+0x394c1c)
>     #1 cpu_handle_interrupt accel/tcg/cpu-exec.c:534:9 (qemu-system-aarch64+0x4b7e79)
>     #2 cpu_exec accel/tcg/cpu-exec.c:720:17 (qemu-system-aarch64+0x4b7e79)
> or
> WARNING: ThreadSanitizer: data race (pid=25425)
>   Read of size 8 at 0x7f8ad8e138d0 by thread T10:
>     #0 tb_lookup_cmp accel/tcg/cpu-exec.c:307:13 (qemu-system-aarch64+0x4ac4d2)
>     #1 qht_do_lookup util/qht.c:502:34 (qemu-system-aarch64+0xd05264)
>   Previous write of size 8 at 0x7f8ad8e138d0 by thread T15 (mutexes: write M728311726235541804):
>     #0 tb_link_page accel/tcg/translate-all.c:1625:26 (qemu-system-aarch64+0x4b0bf2)
>     #1 tb_gen_code accel/tcg/translate-all.c:1865:19 (qemu-system-aarch64+0x4b0bf2)
>     #2 tb_find accel/tcg/cpu-exec.c:407:14 (qemu-system-aarch64+0x4ad77c)
> 
> Cc: Richard Henderson <richard.henderson@linaro.org>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Robert Foley <robert.foley@linaro.org>
> ---
>  accel/tcg/tcg-all.c       | 4 ++--
>  accel/tcg/tcg-runtime.c   | 7 ++++++-
>  accel/tcg/translate-all.c | 6 +++++-
>  hw/core/cpu.c             | 2 +-
>  4 files changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c
> index 3b4fda5640..f94ea4c4b3 100644
> --- a/accel/tcg/tcg-all.c
> +++ b/accel/tcg/tcg-all.c
> @@ -54,8 +54,8 @@ static void tcg_handle_interrupt(CPUState *cpu, int mask)
>      int old_mask;
>      g_assert(qemu_mutex_iothread_locked());
>  
> -    old_mask = cpu->interrupt_request;
> -    cpu->interrupt_request |= mask;
> +    old_mask = atomic_read(&cpu->interrupt_request);
> +    atomic_or(&cpu->interrupt_request, mask);

You can use atomic_fetch_or here.

Paolo



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 16/19] util: fixed tsan warnings in thread_pool.c
  2020-05-22 16:07 ` [PATCH 16/19] util: fixed tsan warnings in thread_pool.c Robert Foley
@ 2020-05-26 20:18   ` Paolo Bonzini
  0 siblings, 0 replies; 51+ messages in thread
From: Paolo Bonzini @ 2020-05-26 20:18 UTC (permalink / raw)
  To: Robert Foley, qemu-devel; +Cc: peter.puhov, cota, alex.bennee

On 22/05/20 18:07, Robert Foley wrote:
>  #include "trace.h"
>  #include "block/thread-pool.h"
>  #include "qemu/main-loop.h"
> +#include "qemu/tsan.h"
>  
>  static void do_spawn_thread(ThreadPool *pool);
>  
> @@ -97,7 +98,9 @@ static void *worker_thread(void *opaque)
>          }
>  
>          req = QTAILQ_FIRST(&pool->request_list);
> +        TSAN_ANNOTATE_IGNORE_WRITES_BEGIN();
>          QTAILQ_REMOVE(&pool->request_list, req, reqs);
> +
>          req->state = THREAD_ACTIVE;
>          qemu_mutex_unlock(&pool->lock);
>  
> @@ -107,7 +110,7 @@ static void *worker_thread(void *opaque)
>          /* Write ret before state.  */
>          smp_wmb();
>          req->state = THREAD_DONE;
> -
> +        TSAN_ANNOTATE_IGNORE_WRITES_END();

You should instead use atomic_read/set for req->state and req->ret.

Paolo



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 01/19] configure: add --enable-tsan flag + fiber annotations for coroutine-ucontext
  2020-05-22 16:07 ` [PATCH 01/19] configure: add --enable-tsan flag + fiber annotations for coroutine-ucontext Robert Foley
  2020-05-23 16:55   ` Philippe Mathieu-Daudé
@ 2020-06-17 14:24   ` Stefan Hajnoczi
  2020-06-17 15:56     ` Alex Bennée
  2020-06-17 21:27     ` Robert Foley
  1 sibling, 2 replies; 51+ messages in thread
From: Stefan Hajnoczi @ 2020-06-17 14:24 UTC (permalink / raw)
  To: Robert Foley; +Cc: peter.puhov, cota, alex.bennee, qemu-devel, Lingfeng Yang

[-- Attachment #1: Type: text/plain, Size: 4358 bytes --]

On Fri, May 22, 2020 at 12:07:37PM -0400, Robert Foley wrote:
> +#define UC_DEBUG 0
> +#if UC_DEBUG && defined(CONFIG_TSAN)
> +#define UC_TRACE(fmt, ...) fprintf(stderr, "%s:%d:%p " fmt "\n", \
> +    __func__, __LINE__, __tsan_get_current_fiber(), ##__VA_ARGS__);
> +#else
> +#define UC_TRACE(fmt, ...)
> +#endif

QEMU has tracing support, see docs/devel/tracing.txt. These fprintfs
should be trace events defined in the util/trace-events file.

> +
>  /**
>   * Per-thread coroutine bookkeeping
>   */
> @@ -65,7 +80,20 @@ union cc_arg {
>      int i[2];
>  };
>  
> -static void finish_switch_fiber(void *fake_stack_save)
> +/* QEMU_ALWAYS_INLINE only does so if __OPTIMIZE__, so we cannot use it. */
> +static inline __attribute__((always_inline))

Please document why always_inline is necessary here and in other
functions. Is it for performance or because the __tsan_*() functions
need to be called from a the parent function?

> +void on_new_fiber(CoroutineUContext *co)
> +{
> +#ifdef CONFIG_TSAN
> +    co->tsan_co_fiber = __tsan_create_fiber(0); /* flags: sync on switch */
> +    co->tsan_caller_fiber = __tsan_get_current_fiber();
> +    UC_TRACE("Create new TSAN co fiber. co: %p co fiber: %p caller fiber: %p ",
> +             co, co->tsan_co_fiber, co->tsan_caller_fiber);
> +#endif
> +}
> +
> +static inline __attribute__((always_inline))
> +void finish_switch_fiber(void *fake_stack_save)
>  {
>  #ifdef CONFIG_ASAN
>      const void *bottom_old;
> @@ -78,18 +106,40 @@ static void finish_switch_fiber(void *fake_stack_save)
>          leader.stack_size = size_old;
>      }
>  #endif
> +#ifdef CONFIG_TSAN
> +    if (fake_stack_save) {
> +        __tsan_release(fake_stack_save);
> +        __tsan_switch_to_fiber(fake_stack_save, 0);  /* 0=synchronize */
> +    }
> +#endif
>  }
>  
> -static void start_switch_fiber(void **fake_stack_save,
> -                               const void *bottom, size_t size)
> +static inline __attribute__((always_inline)) void start_switch_fiber(
> +    CoroutineAction action, void **fake_stack_save,
> +    const void *bottom, size_t size, void *new_fiber)
>  {
>  #ifdef CONFIG_ASAN
> -    __sanitizer_start_switch_fiber(fake_stack_save, bottom, size);
> +    if (action == COROUTINE_TERMINATE) {
> +        __sanitizer_start_switch_fiber(
> +            action == COROUTINE_TERMINATE ? NULL : fake_stack_save,

The if statement already checks action == COROUTINE_TERMINATE, why is it
being checked again?

I think the old behavior can be retained by dropping the if statement
like this:

  __sanitizer_start_switch_fiber(action == COROUTINE_TERMINATE ?
                                 NULL : fake_stack_save,
				 bottom, size);

> +            bottom, size);
> +    }
> +#endif
> +#ifdef CONFIG_TSAN
> +    void *curr_fiber =
> +        __tsan_get_current_fiber();
> +    __tsan_acquire(curr_fiber);
> +
> +    UC_TRACE("Current fiber: %p.", curr_fiber);
> +    *fake_stack_save = curr_fiber;
> +    UC_TRACE("Switch to fiber %p", new_fiber);
> +    __tsan_switch_to_fiber(new_fiber, 0);  /* 0=synchronize */
>  #endif
>  }

Please split start_switch_fiber() into two functions:
start_switch_fiber_asan() and start_switch_fiber_tsan(). That way the
asan- and tsan-specific arguments can be kept separate and the
co->tsan_* fields only need to be compiled in when CONFIG_TSAN is
defined.

For example:

  static inline __attribute__((always_inline))
  void start_switch_fiber_tsan(void **fake_stack_save,
                               CoroutineUContext *co,
  			     bool caller)
  {
  #ifdef CONFIG_TSAN
      void *new_fiber = caller ?
                        co->tsan_caller_fiber :
                        co->tsan_co_fiber;
      void *curr_fiber = __tsan_get_current_fiber();
      __tsan_acquire(curr_fiber);

      UC_TRACE("Current fiber: %p.", curr_fiber);
      *fake_stack_save = curr_fiber;
      UC_TRACE("Switch to fiber %p", new_fiber);
      __tsan_switch_to_fiber(new_fiber, 0);  /* 0=synchronize */
  #endif
  }

This does two things:
1. Unrelated ASAN and TSAN code is separate and each function only
   has arguments that are actually needed.
2. The co->tsan_caller_fiber and co->tsan_co_fiber fields are only
   access from within #ifdef CONFIG_TSAN.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 01/19] configure: add --enable-tsan flag + fiber annotations for coroutine-ucontext
  2020-06-17 14:24   ` Stefan Hajnoczi
@ 2020-06-17 15:56     ` Alex Bennée
  2020-06-17 21:27     ` Robert Foley
  1 sibling, 0 replies; 51+ messages in thread
From: Alex Bennée @ 2020-06-17 15:56 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: peter.puhov, Lingfeng Yang, cota, Robert Foley, qemu-devel


Stefan Hajnoczi <stefanha@gmail.com> writes:

> On Fri, May 22, 2020 at 12:07:37PM -0400, Robert Foley wrote:
>> +#define UC_DEBUG 0
>> +#if UC_DEBUG && defined(CONFIG_TSAN)
>> +#define UC_TRACE(fmt, ...) fprintf(stderr, "%s:%d:%p " fmt "\n", \
>> +    __func__, __LINE__, __tsan_get_current_fiber(), ##__VA_ARGS__);
>> +#else
>> +#define UC_TRACE(fmt, ...)
>> +#endif
>
> QEMU has tracing support, see docs/devel/tracing.txt. These fprintfs
> should be trace events defined in the util/trace-events file.

These got dropped on the final version. You might want to check against
the version I pulled:

  Subject: [PATCH v3 00/13] Add Thread Sanitizer support to QEMU
  Date: Tue,  9 Jun 2020 16:07:25 -0400
  Message-Id: <20200609200738.445-1-robert.foley@linaro.org>

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 01/19] configure: add --enable-tsan flag + fiber annotations for coroutine-ucontext
  2020-06-17 14:24   ` Stefan Hajnoczi
  2020-06-17 15:56     ` Alex Bennée
@ 2020-06-17 21:27     ` Robert Foley
  1 sibling, 0 replies; 51+ messages in thread
From: Robert Foley @ 2020-06-17 21:27 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Peter Puhov, Emilio G. Cota, Alex Bennée, QEMU Developers,
	Lingfeng Yang

Hi,

On Wed, 17 Jun 2020 at 10:24, Stefan Hajnoczi <stefanha@gmail.com> wrote:
>
> On Fri, May 22, 2020 at 12:07:37PM -0400, Robert Foley wrote:
> > +#define UC_DEBUG 0
> > +#if UC_DEBUG && defined(CONFIG_TSAN)
> > +#define UC_TRACE(fmt, ...) fprintf(stderr, "%s:%d:%p " fmt "\n", \
> > +    __func__, __LINE__, __tsan_get_current_fiber(), ##__VA_ARGS__);
> > +#else
> > +#define UC_TRACE(fmt, ...)
> > +#endif
>
> QEMU has tracing support, see docs/devel/tracing.txt. These fprintfs
> should be trace events defined in the util/trace-events file.

Thanks for the details.  We removed this tracing in a later patch.
https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg02506.html
>
> > +
> >  /**
> >   * Per-thread coroutine bookkeeping
> >   */
> > @@ -65,7 +80,20 @@ union cc_arg {
> >      int i[2];
> >  };
> >
> > -static void finish_switch_fiber(void *fake_stack_save)
> > +/* QEMU_ALWAYS_INLINE only does so if __OPTIMIZE__, so we cannot use it. */
> > +static inline __attribute__((always_inline))
>
> Please document why always_inline is necessary here and in other
> functions. Is it for performance or because the __tsan_*() functions
> need to be called from a the parent function?

We will look into this and add documentation here or (if it is no
longer needed),
remove the inline.

<snip>
> > -static void start_switch_fiber(void **fake_stack_save,
> > -                               const void *bottom, size_t size)
> > +static inline __attribute__((always_inline)) void start_switch_fiber(
> > +    CoroutineAction action, void **fake_stack_save,
> > +    const void *bottom, size_t size, void *new_fiber)
> >  {
> >  #ifdef CONFIG_ASAN
> > -    __sanitizer_start_switch_fiber(fake_stack_save, bottom, size);
> > +    if (action == COROUTINE_TERMINATE) {
> > +        __sanitizer_start_switch_fiber(
> > +            action == COROUTINE_TERMINATE ? NULL : fake_stack_save,
>
> The if statement already checks action == COROUTINE_TERMINATE, why is it
> being checked again?
>
> I think the old behavior can be retained by dropping the if statement
> like this:
>
>   __sanitizer_start_switch_fiber(action == COROUTINE_TERMINATE ?
>                                  NULL : fake_stack_save,
>                                  bottom, size);
>
> > +            bottom, size);

Good point.  We did change this by dropping the if in a later patch.
https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg02506.html


> > +    }
> > +#endif
> > +#ifdef CONFIG_TSAN
> > +    void *curr_fiber =
> > +        __tsan_get_current_fiber();
> > +    __tsan_acquire(curr_fiber);
> > +
> > +    UC_TRACE("Current fiber: %p.", curr_fiber);
> > +    *fake_stack_save = curr_fiber;
> > +    UC_TRACE("Switch to fiber %p", new_fiber);
> > +    __tsan_switch_to_fiber(new_fiber, 0);  /* 0=synchronize */
> >  #endif
> >  }
>
> Please split start_switch_fiber() into two functions:
> start_switch_fiber_asan() and start_switch_fiber_tsan(). That way the
> asan- and tsan-specific arguments can be kept separate and the
> co->tsan_* fields only need to be compiled in when CONFIG_TSAN is
> defined.
>
> For example:
>
>   static inline __attribute__((always_inline))
>   void start_switch_fiber_tsan(void **fake_stack_save,
>                                CoroutineUContext *co,
>                              bool caller)
>   {
>   #ifdef CONFIG_TSAN
>       void *new_fiber = caller ?
>                         co->tsan_caller_fiber :
>                         co->tsan_co_fiber;
>       void *curr_fiber = __tsan_get_current_fiber();
>       __tsan_acquire(curr_fiber);
>
>       UC_TRACE("Current fiber: %p.", curr_fiber);
>       *fake_stack_save = curr_fiber;
>       UC_TRACE("Switch to fiber %p", new_fiber);
>       __tsan_switch_to_fiber(new_fiber, 0);  /* 0=synchronize */
>   #endif
>   }
>
> This does two things:
> 1. Unrelated ASAN and TSAN code is separate and each function only
>    has arguments that are actually needed.
> 2. The co->tsan_caller_fiber and co->tsan_co_fiber fields are only
>    access from within #ifdef CONFIG_TSAN.

Makes sense, we will make these changes and submit a cleanup patch.

Thanks & Regards,
-Rob


^ permalink raw reply	[flat|nested] 51+ messages in thread

end of thread, other threads:[~2020-06-17 22:04 UTC | newest]

Thread overview: 51+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-22 16:07 [PATCH 00/19] Add Thread Sanitizer support to QEMU Robert Foley
2020-05-22 16:07 ` [PATCH 01/19] configure: add --enable-tsan flag + fiber annotations for coroutine-ucontext Robert Foley
2020-05-23 16:55   ` Philippe Mathieu-Daudé
2020-05-26 12:38     ` Robert Foley
2020-06-17 14:24   ` Stefan Hajnoczi
2020-06-17 15:56     ` Alex Bennée
2020-06-17 21:27     ` Robert Foley
2020-05-22 16:07 ` [PATCH 02/19] cpu: convert queued work to a QSIMPLEQ Robert Foley
2020-05-24 10:20   ` Philippe Mathieu-Daudé
2020-05-26 15:01     ` Robert Foley
2020-05-22 16:07 ` [PATCH 03/19] thread: add qemu_spin_destroy Robert Foley
2020-05-22 16:07 ` [PATCH 04/19] cputlb: destroy CPUTLB with tlb_destroy Robert Foley
2020-05-22 16:07 ` [PATCH 05/19] qht: call qemu_spin_destroy for head buckets Robert Foley
2020-05-22 16:07 ` [PATCH 06/19] tcg: call qemu_spin_destroy for tb->jmp_lock Robert Foley
2020-05-22 16:07 ` [PATCH 07/19] translate-all: call qemu_spin_destroy for PageDesc Robert Foley
2020-05-22 16:07 ` [PATCH 08/19] thread: add tsan annotations to QemuSpin Robert Foley
2020-05-22 16:07 ` [PATCH 09/19] tests/docker: Added docker build support for TSan Robert Foley
2020-05-22 16:07 ` [PATCH 10/19] include/qemu: Added tsan.h for annotations Robert Foley
2020-05-23 17:20   ` Emilio G. Cota
2020-05-23 21:37     ` Emilio G. Cota
2020-05-22 16:07 ` [PATCH 11/19] accel/tcg: Fixed tsan warnings related to parallel_cpus Robert Foley
2020-05-23 17:21   ` Emilio G. Cota
2020-05-22 16:07 ` [PATCH 12/19] configure: added tsan support for blacklist Robert Foley
2020-05-23 17:27   ` Emilio G. Cota
2020-05-26 14:07     ` Robert Foley
2020-05-22 16:07 ` [PATCH 13/19] accel/tcg: Fixed tsan warnings Robert Foley
2020-05-23 20:06   ` Emilio G. Cota
2020-05-26 15:14     ` Robert Foley
2020-05-26 18:47   ` Paolo Bonzini
2020-05-22 16:07 ` [PATCH 14/19] util/async: " Robert Foley
2020-05-23 20:12   ` Emilio G. Cota
2020-05-26 15:19     ` Robert Foley
2020-05-26 10:32   ` Stefan Hajnoczi
2020-05-26 15:21     ` Robert Foley
2020-05-22 16:07 ` [PATCH 15/19] qht: Fix " Robert Foley
2020-05-23 20:44   ` Emilio G. Cota
2020-05-22 16:07 ` [PATCH 16/19] util: fixed tsan warnings in thread_pool.c Robert Foley
2020-05-26 20:18   ` Paolo Bonzini
2020-05-22 16:07 ` [PATCH 17/19] util: Added tsan annotate for thread name Robert Foley
2020-05-23 20:29   ` Emilio G. Cota
2020-05-22 16:07 ` [PATCH 18/19] target/arm: Fix tsan warning in cpu.c Robert Foley
2020-05-22 17:44   ` Peter Maydell
2020-05-22 21:33     ` Robert Foley
2020-05-22 22:36       ` Peter Maydell
2020-05-23 17:18         ` Emilio G. Cota
2020-05-26 14:01           ` Robert Foley
2020-05-22 16:07 ` [PATCH 19/19] docs: Added details on TSan to testing.rst Robert Foley
2020-05-23 20:29   ` Emilio G. Cota
2020-05-22 21:07 ` [PATCH 00/19] Add Thread Sanitizer support to QEMU no-reply
2020-05-23 21:36 ` Emilio G. Cota
2020-05-26 15:18   ` Robert Foley

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).