* [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement
@ 2017-02-24 11:20 Alex Bennée
  2017-02-24 11:20 ` [Qemu-devel] [PULL 01/24] docs: new design document multi-thread-tcg.txt Alex Bennée
                   ` (25 more replies)
  0 siblings, 26 replies; 55+ messages in thread
From: Alex Bennée @ 2017-02-24 11:20 UTC (permalink / raw)
  To: peter.maydell; +Cc: qemu-devel, Alex Bennée

The following changes since commit 2d896b454a0e19ec4c1ddbb0e0b65b7e54fcedf3:

  Revert "hw/mips: MIPS Boston board support" (2017-02-23 18:04:45 +0000)

are available in the git repository at:

  https://github.com/stsquad/qemu.git tags/pull-mttcg-240217-1

for you to fetch changes up to ca759f9e387db87e1719911f019bc60c74be9ed8:

  tcg: enable MTTCG by default for ARM on x86 hosts (2017-02-24 10:32:46 +0000)

----------------------------------------------------------------
This is the MTTCG pull-request as posted yesterday.

----------------------------------------------------------------
Alex Bennée (18):
      docs: new design document multi-thread-tcg.txt
      tcg: move TCG_MO/BAR types into own file
      tcg: add kick timer for single-threaded vCPU emulation
      tcg: rename tcg_current_cpu to tcg_current_rr_cpu
      tcg: remove global exit_request
      tcg: enable tb_lock() for SoftMMU
      tcg: enable thread-per-vCPU
      cputlb: add assert_cpu_is_self checks
      cputlb: tweak qemu_ram_addr_from_host_nofail reporting
      cputlb and arm/sparc targets: convert mmuidx flushes from varg to bitmap
      cputlb: add tlb_flush_by_mmuidx async routines
      cputlb: atomically update tlb fields used by tlb_reset_dirty
      cputlb: introduce tlb_flush_*_all_cpus[_synced]
      target-arm/powerctl: defer cpu reset work to CPU context
      target-arm: don't generate WFE/YIELD calls for MTTCG
      target-arm: ensure all cross vCPUs TLB flushes complete
      hw/misc/imx6_src: defer clearing of SRC_SCR reset bits
      tcg: enable MTTCG by default for ARM on x86 hosts

Jan Kiszka (1):
      tcg: drop global lock during TCG code execution

KONRAD Frederic (2):
      tcg: add options for enabling MTTCG
      cputlb: introduce tlb_flush_* async work.

Pranith Kumar (3):
      mttcg: translate-all: Enable locking debug in a debug build
      mttcg: Add missing tb_lock/unlock() in cpu_exec_step()
      tcg: handle EXCP_ATOMIC exception for system emulation

 configure                  |   6 +
 cpu-exec-common.c          |   3 -
 cpu-exec.c                 |  89 ++++++---
 cpus.c                     | 345 ++++++++++++++++++++++++++-------
 cputlb.c                   | 463 +++++++++++++++++++++++++++++++++++++--------
 docs/multi-thread-tcg.txt  | 350 ++++++++++++++++++++++++++++++++++
 exec.c                     |  12 +-
 hw/core/irq.c              |   1 +
 hw/i386/kvmvapic.c         |   4 +-
 hw/intc/arm_gicv3_cpuif.c  |   3 +
 hw/misc/imx6_src.c         |  58 +++++-
 hw/ppc/ppc.c               |  16 +-
 hw/ppc/spapr.c             |   3 +
 include/exec/cputlb.h      |   2 -
 include/exec/exec-all.h    | 132 +++++++++++--
 include/qom/cpu.h          |  16 ++
 include/sysemu/cpus.h      |   2 +
 memory.c                   |   2 +
 qemu-options.hx            |  20 ++
 qom/cpu.c                  |  10 +
 target/arm/arm-powerctl.c  | 202 +++++++++++++-------
 target/arm/arm-powerctl.h  |   2 +
 target/arm/cpu.c           |   4 +-
 target/arm/cpu.h           |  18 +-
 target/arm/helper.c        | 219 ++++++++++-----------
 target/arm/kvm.c           |   7 +-
 target/arm/machine.c       |  41 +++-
 target/arm/op_helper.c     |  50 ++++-
 target/arm/psci.c          |   4 +-
 target/arm/translate-a64.c |   8 +-
 target/arm/translate.c     |  20 +-
 target/i386/smm_helper.c   |   7 +
 target/s390x/misc_helper.c |   5 +-
 target/sparc/ldst_helper.c |   8 +-
 tcg/i386/tcg-target.h      |  11 ++
 tcg/tcg-mo.h               |  48 +++++
 tcg/tcg.h                  |  27 +--
 translate-all.c            |  66 ++-----
 translate-common.c         |  21 +-
 vl.c                       |  49 ++++-
 40 files changed, 1878 insertions(+), 476 deletions(-)
 create mode 100644 docs/multi-thread-tcg.txt
 create mode 100644 tcg/tcg-mo.h


-- 
2.11.0

^ permalink raw reply	[flat|nested] 55+ messages in thread

* [Qemu-devel] [PULL 01/24] docs: new design document multi-thread-tcg.txt
  2017-02-24 11:20 [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Alex Bennée
@ 2017-02-24 11:20 ` Alex Bennée
  2017-02-24 11:20 ` [Qemu-devel] [PULL 02/24] mttcg: translate-all: Enable locking debug in a debug build Alex Bennée
                   ` (24 subsequent siblings)
  25 siblings, 0 replies; 55+ messages in thread
From: Alex Bennée @ 2017-02-24 11:20 UTC (permalink / raw)
  To: peter.maydell; +Cc: qemu-devel, Alex Bennée

This documents the current design for upgrading TCG emulation to take
advantage of modern CPUs by running a thread-per-CPU. The document goes
through the various areas of the code affected by such a change and
proposes design requirements for each part of the solution.

The text marked with (Current solution[s]) documents the approaches
currently being used.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 docs/multi-thread-tcg.txt | 350 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 350 insertions(+)
 create mode 100644 docs/multi-thread-tcg.txt

diff --git a/docs/multi-thread-tcg.txt b/docs/multi-thread-tcg.txt
new file mode 100644
index 0000000000..a99b4564c6
--- /dev/null
+++ b/docs/multi-thread-tcg.txt
@@ -0,0 +1,350 @@
+Copyright (c) 2015-2016 Linaro Ltd.
+
+This work is licensed under the terms of the GNU GPL, version 2 or
+later. See the COPYING file in the top-level directory.
+
+Introduction
+============
+
+This document outlines the design for multi-threaded TCG system-mode
+emulation. The current user-mode emulation mirrors the thread
+structure of the translated executable. Some of the work will be
+applicable to both system and linux-user emulation.
+
+The original system-mode TCG implementation was single threaded and
+dealt with multiple CPUs with simple round-robin scheduling. This
+simplified a lot of things but became increasingly limited as systems
+being emulated gained additional cores and per-core performance gains
+for host systems started to level off.
+
+vCPU Scheduling
+===============
+
+We introduce a new running mode where each vCPU will run on its own
+user-space thread. This will be enabled by default for all FE/BE
+combinations that have had the required work done to support this
+safely.
+
+In the general case of running translated code there should be no
+inter-vCPU dependencies and all vCPUs should be able to run at full
+speed. Synchronisation will only be required while accessing internal
+shared data structures or when the emulated architecture requires a
+coherent representation of the emulated machine state.
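The scheduling model above can be sketched with plain pthreads. This is an illustrative toy, not QEMU's actual vCPU loop: `vcpu_thread_fn`, `run_vcpus` and the iteration count are invented stand-ins for the real per-vCPU execution loop.

```c
#include <pthread.h>
#include <stdatomic.h>

#define NR_VCPUS 4              /* hypothetical machine */

static atomic_int executed;     /* stand-in for "work done by vCPUs" */

/* Each vCPU gets its own host thread running an independent execution
 * loop -- the "thread-per-vCPU" mode described above.  Real QEMU loops
 * dispatch translation blocks; here we just count iterations. */
static void *vcpu_thread_fn(void *arg)
{
    (void)arg;
    for (int i = 0; i < 1000; i++) {
        atomic_fetch_add(&executed, 1);   /* stand-in for cpu_exec() */
    }
    return NULL;
}

int run_vcpus(void)
{
    pthread_t th[NR_VCPUS];
    for (int i = 0; i < NR_VCPUS; i++) {
        pthread_create(&th[i], NULL, vcpu_thread_fn, NULL);
    }
    for (int i = 0; i < NR_VCPUS; i++) {
        pthread_join(th[i], NULL);
    }
    return atomic_load(&executed);
}
```

Because the threads share no guest state in this sketch, they run at full speed with no synchronisation beyond the counter, mirroring the "no inter-vCPU dependencies" common case.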
+
+Shared Data Structures
+======================
+
+Main Run Loop
+-------------
+
+Even when there is no code being generated there are a number of
+structures associated with the hot-path through the main run-loop,
+namely those used to look up the next translation block to execute.
+These include:
+
+    tb_jmp_cache (per-vCPU, cache of recent jumps)
+    tb_ctx.htable (global hash table, phys address->tb lookup)
+
+As TB linking only occurs between blocks in the same page, this code
+is critical to performance: looking up the next TB to execute is the
+most common reason to exit the generated code.
+
+DESIGN REQUIREMENT: Make access to lookup structures safe with
+multiple reader/writer threads. Minimise any lock contention to do it.
+
+The hot-path avoids using locks where possible. The tb_jmp_cache is
+updated with atomic accesses to ensure consistent results. The
+fall-back QHT-based hash table is also designed for lockless lookups.
+Locks are only taken when code generation is required or when
+TranslationBlocks have their block-to-block jumps patched.
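The atomic-update pattern described for tb_jmp_cache can be sketched with C11 atomics. The names, the cache size and the TB structure below are illustrative, not QEMU's real definitions:

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define JMP_CACHE_SIZE 1024               /* hypothetical; not QEMU's value */

typedef struct TB {
    uintptr_t pc;                         /* guest PC this block starts at */
} TB;

/* Per-vCPU cache of recently executed blocks.  Readers and writers use
 * acquire/release atomics so a concurrent lookup never observes a torn
 * pointer, without taking any lock. */
static _Atomic(TB *) jmp_cache[JMP_CACHE_SIZE];

static size_t jc_hash(uintptr_t pc)
{
    return pc & (JMP_CACHE_SIZE - 1);
}

TB *jmp_cache_lookup(uintptr_t pc)
{
    TB *tb = atomic_load_explicit(&jmp_cache[jc_hash(pc)],
                                  memory_order_acquire);
    /* A miss (or hash collision) falls back to the QHT hash table. */
    return (tb && tb->pc == pc) ? tb : NULL;
}

void jmp_cache_insert(TB *tb)
{
    atomic_store_explicit(&jmp_cache[jc_hash(tb->pc)], tb,
                          memory_order_release);
}
```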
+
+Global TCG State
+----------------
+
+We need to protect the entire code generation cycle including any post
+generation patching of the translated code. This also implies a shared
+translation buffer which contains code running on all cores. Any
+execution path that comes to the main run loop will need to hold a
+mutex for code generation. This also includes times when we need to
+flush code or entries from any shared lookups/caches. Structures held on a
+per-vCPU basis won't need locking unless other vCPUs will need to
+modify them.
+
+DESIGN REQUIREMENT: Add locking around all code generation and TB
+patching.
+
+(Current solution)
+
+Mainly as part of the linux-user work, all code generation is
+serialised with a tb_lock(). For SoftMMU, tb_lock() also takes the
+place of the mmap_lock() used in linux-user mode.
+
+Translation Blocks
+------------------
+
+Currently the whole system shares a single code generation buffer
+which, when full, will force a flush of all translations and a restart
+from scratch. Some operations also force a full flush of translations,
+including:
+
+  - debugging operations (breakpoint insertion/removal)
+  - some CPU helper functions
+
+This is done with the async_safe_run_on_cpu() mechanism to ensure all
+vCPUs are quiescent when changes are being made to shared global
+structures.
+
+More granular translation invalidation events are typically due
+to a change of the state of a physical page:
+
+  - code modification (self-modifying code, code patching)
+  - page changes (new page mapping in linux-user mode)
+
+While setting the invalid flag in a TranslationBlock will stop it
+being used when looked up in the hot-path, there are a number of other
+book-keeping structures that need to be safely cleared.
+
+Any TranslationBlocks which have been patched to jump directly to a
+now-invalid block need those jump patches reverted so they will return
+to the C code.
+
+There are a number of look-up caches that need to be properly updated,
+including:
+
+  - jump lookup cache
+  - the physical-to-tb lookup hash table
+  - the global page table
+
+The global page table (l1_map) provides a multi-level look-up for
+PageDesc structures, which contain pointers to the start of a linked
+list of all TranslationBlocks in that page (see page_next).
+
+Both the jump patching and the page cache involve linked lists that
+the invalidated TranslationBlock needs to be removed from.
+
+DESIGN REQUIREMENT: Safely handle invalidation of TBs
+                      - safely patch/revert direct jumps
+                      - remove central PageDesc lookup entries
+                      - ensure lookup caches/hashes are safely updated
+
+(Current solution)
+
+The direct jumps themselves are updated atomically by the TCG
+tb_set_jmp_target() code. Modifications to the linked lists that allow
+searching for linked pages are done under the protection of the
+tb_lock().
+
+The global page table is protected by the tb_lock() in system-mode and
+mmap_lock() in linux-user mode.
+
+The lookup caches are updated atomically and the lookup hash uses QHT,
+which is designed for safe concurrent lookups.
+
+
+Memory maps and TLBs
+--------------------
+
+The memory handling code is fairly critical to the speed of memory
+access in the emulated system. The SoftMMU code is designed so the
+hot-path can be handled entirely within translated code. This is
+handled with a per-vCPU TLB structure which, once populated, will
+allow a series of accesses to the page to occur without exiting the
+translated code. It is possible to set flags in the TLB address which
+will ensure the slow-path is taken for each access. This can be done
+to support:
+
+  - Memory regions (dividing up access to PIO, MMIO and RAM)
+  - Dirty page tracking (for code gen, SMC detection, migration and display)
+  - Virtual TLB (for translating guest address->real address)
+
+When a vCPU's TLB tables are updated by a thread other than its own
+we need to ensure it is done in a safe way so that no inconsistent
+state is seen by the owning vCPU thread.
+
+Some operations require updating a number of vCPUs' TLBs at the same
+time in a synchronised manner.
+
+DESIGN REQUIREMENTS:
+
+  - TLB Flush All/Page
+    - can be across-vCPUs
+    - a cross-vCPU TLB flush may need other vCPUs brought to a halt
+    - change may need to be visible to the calling vCPU immediately
+  - TLB Flag Update
+    - usually cross-vCPU
+    - want change to be visible as soon as possible
+  - TLB Update (update a CPUTLBEntry, via tlb_set_page_with_attrs)
+    - This is a per-vCPU table - by definition can't race
+    - updated by its own thread when the slow-path is forced
+
+(Current solution)
+
+We have updated cputlb.c to defer cross-vCPU operations with
+async_run_on_cpu(), which ensures each vCPU sees a coherent state when
+it next runs its work (in a few instructions' time).
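The deferral idea can be sketched as a per-vCPU work queue drained by the owning thread between translation blocks. All names here are simplified illustrations, not QEMU's real async_run_on_cpu() API:

```c
#include <pthread.h>
#include <stdlib.h>

/* Work for a vCPU is queued under a lock and run later by that vCPU's
 * own thread, so the vCPU only ever observes its TLB changing between
 * executed blocks, never mid-access. */
typedef void (*WorkFn)(void *opaque);

typedef struct WorkItem {
    WorkFn fn;
    void *opaque;
    struct WorkItem *next;
} WorkItem;

typedef struct VCPUSketch {
    pthread_mutex_t lock;
    WorkItem *queue;
} VCPUSketch;

void async_run_sketch(VCPUSketch *cpu, WorkFn fn, void *opaque)
{
    WorkItem *wi = malloc(sizeof(*wi));
    wi->fn = fn;
    wi->opaque = opaque;
    pthread_mutex_lock(&cpu->lock);
    wi->next = cpu->queue;          /* push; ordering not important here */
    cpu->queue = wi;
    pthread_mutex_unlock(&cpu->lock);
}

/* Called by the vCPU thread itself between translation blocks. */
void process_queued_work(VCPUSketch *cpu)
{
    pthread_mutex_lock(&cpu->lock);
    WorkItem *wi = cpu->queue;
    cpu->queue = NULL;
    pthread_mutex_unlock(&cpu->lock);
    while (wi) {
        WorkItem *next = wi->next;
        wi->fn(wi->opaque);         /* runs in the owning vCPU's context */
        free(wi);
        wi = next;
    }
}
```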
+
+A new set of operations (tlb_flush_*_all_cpus) takes an additional
+flag which, when set, will force synchronisation by queuing the source
+vCPU's work as "safe work" and exiting the cpu run loop. This ensures
+that by the time execution restarts all flush operations have
+completed.
+
+TLB flag updates are all done atomically and are also protected by the
+tb_lock() which is used by the functions that update the TLB in bulk.
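The atomic flag update can be illustrated in miniature. The structure and flag value below are hypothetical, standing in for QEMU's CPUTLBEntry and its TLB_* flags:

```c
#include <stdatomic.h>
#include <stdint.h>

#define TLB_SLOW_PATH 0x1   /* hypothetical flag bit in the low address bits */

typedef struct CPUTLBEntryExample {
    _Atomic uintptr_t addr_write;   /* page address plus low-bit flags */
} CPUTLBEntryExample;

/* Set a flag so the next store to this page takes the slow path.  A plain
 * read-modify-write would race with the owning vCPU refilling the entry;
 * atomic_fetch_or makes the update safe from any thread. */
static void tlb_set_slow_path(CPUTLBEntryExample *e)
{
    atomic_fetch_or_explicit(&e->addr_write, TLB_SLOW_PATH,
                             memory_order_acq_rel);
}
```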
+
+(Known limitation)
+
+Not really a limitation, but the wait mechanism is overly strict for
+some architectures which only need flushes completed by a barrier
+instruction. This could be a future optimisation.
+
+Emulated hardware state
+-----------------------
+
+Currently, thanks to KVM work, any access to IO memory is
+automatically protected by the global iothread mutex, also known as
+the BQL (Big QEMU Lock). Any IO region that doesn't use the global
+mutex is expected to do its own locking.
+
+However, IO memory isn't the only way emulated hardware state can be
+modified. Some architectures have model-specific registers that
+trigger hardware emulation features. Generally, any translation helper
+that needs to update more than a single vCPU's state should take the
+BQL.
+
+As the BQL, or global iothread mutex, is shared across the system we
+push the use of the lock as far down into the TCG code as possible to
+minimise contention.
+
+(Current solution)
+
+MMIO access automatically serialises hardware emulation by way of the
+BQL. Currently ARM targets serialise all ARM_CP_IO register accesses
+and also defer the reset/startup of vCPUs to the vCPU context by way
+of async_run_on_cpu().
+
+Updates to interrupt state are also protected by the BQL, as they can
+often be cross-vCPU.
+
+Memory Consistency
+==================
+
+Between emulated guests and host systems there is a range of memory
+consistency models. Even emulating weakly ordered systems on strongly
+ordered hosts needs to ensure things like store-after-load re-ordering
+can be prevented when the guest wants to.
+
+Memory Barriers
+---------------
+
+Barriers (sometimes known as fences) provide a mechanism for software
+to enforce a particular ordering of memory operations from the point
+of view of external observers (e.g. another processor core). They can
+apply to all memory operations or to just loads or stores.
+
+The Linux kernel has an excellent write-up on the various forms of
+memory barrier and the guarantees they can provide [1].
+
+Barriers are often wrapped around synchronisation primitives to
+provide explicit memory ordering semantics. However they can be used
+by themselves to provide safe lockless access by ensuring for example
+a change to a signal flag will only be visible once the changes to
+payload are.
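The signal-flag/payload pattern mentioned above maps directly onto C11 acquire/release atomics; a minimal sketch:

```c
#include <stdatomic.h>

static int payload;          /* plain data published via the flag */
static atomic_int ready;     /* publication flag */

void producer(void)
{
    payload = 42;
    /* Release semantics: the payload write cannot be reordered after the
     * flag write, so a consumer that sees ready == 1 also sees the
     * payload. */
    atomic_store_explicit(&ready, 1, memory_order_release);
}

int consumer(void)
{
    /* Acquire semantics: if the flag is observed set, the matching
     * payload write is guaranteed visible. */
    if (atomic_load_explicit(&ready, memory_order_acquire)) {
        return payload;
    }
    return -1;   /* not published yet */
}
```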
+
+DESIGN REQUIREMENT: Add a new tcg_memory_barrier op
+
+This would enforce a strong load/store ordering so all loads/stores
+complete at the memory barrier. On single-core non-SMP strongly
+ordered backends this could become a NOP.
+
+Aside from explicit standalone memory barrier instructions there are
+also implicit memory ordering semantics which come with each guest
+memory access instruction. For example all x86 loads/stores come with
+fairly strong guarantees of sequential consistency, whereas ARM has
+special variants of load/store instructions that imply acquire/release
+semantics.
+
+In the case of a strongly ordered guest architecture being emulated on
+a weakly ordered host the scope for a heavy performance impact is
+quite high.
+
+DESIGN REQUIREMENTS: Be efficient with use of memory barriers
+       - host systems with stronger implied guarantees can skip some barriers
+       - merge consecutive barriers to the strongest one
+
+(Current solution)
+
+The system currently has a tcg_gen_mb() which will add memory barrier
+operations if code generation is being done in a parallel context. The
+tcg_optimize() function attempts to merge barriers up to their
+strongest form before any load/store operations. The solution was
+originally developed and tested for linux-user based systems. All
+backends have been converted to emit fences when required. So far the
+following front-ends have been updated to emit fences when required:
+
+    - target-i386
+    - target-arm
+    - target-aarch64
+    - target-alpha
+    - target-mips
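The merging step can be illustrated with the TCG_MO/TCG_BAR flag encoding introduced by this series: because the flags form a bitmask, two back-to-back barriers order at least as strongly as one barrier carrying the union of their flags. This is a sketch of the idea only, not the actual tcg_optimize() implementation:

```c
/* Flag values as defined in tcg/tcg-mo.h by this series. */
enum {
    TCG_MO_LD_LD = 0x01, TCG_MO_ST_LD = 0x02,
    TCG_MO_LD_ST = 0x04, TCG_MO_ST_ST = 0x08,
    TCG_MO_ALL   = 0x0F,
    TCG_BAR_LDAQ = 0x10, TCG_BAR_STRL = 0x20, TCG_BAR_SC = 0x30,
};

/* Folding two adjacent barrier ops into one saves a host fence while
 * preserving every ordering guarantee either barrier provided. */
static int merge_barriers(int a, int b)
{
    return a | b;
}
```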
+
+Memory Control and Maintenance
+------------------------------
+
+This includes a class of instructions for controlling system cache
+behaviour. While QEMU doesn't model cache behaviour these instructions
+are often seen when code modification has taken place to ensure the
+changes take effect.
+
+Synchronisation Primitives
+--------------------------
+
+There are two broad types of synchronisation primitives found in
+modern ISAs: atomic instructions and exclusive regions.
+
+The first type offers a simple atomic instruction which guarantees
+that some sort of test-and-conditional-store will be truly atomic
+w.r.t. other cores sharing access to the memory. The classic example
+is the x86 cmpxchg instruction.
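In C11 terms the first type corresponds to a compare-and-exchange; a minimal sketch of the semantics:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* A cmpxchg-style operation: store `desired` only if *addr still holds
 * `expected`, atomically with respect to all other threads. */
bool emulated_cmpxchg(atomic_int *addr, int expected, int desired)
{
    /* On failure C11 overwrites `expected` with the current value; we
     * discard that here and simply report success or failure. */
    return atomic_compare_exchange_strong(addr, &expected, desired);
}
```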
+
+The second type offers a pair of load/store instructions which
+guarantee that a region of memory has not been touched between the
+load and store instructions. An example of this is ARM's ldrex/strex
+pair, where the strex instruction will return a flag indicating a
+successful store only if no other CPU has accessed the memory region
+since the ldrex.
+
+Traditionally TCG has generated a series of operations that work
+because they are within the context of a single translation block, so
+they will have completed before another CPU is scheduled. However with
+the ability to have multiple threads running to emulate multiple CPUs
+we will need to explicitly expose these semantics.
+
+DESIGN REQUIREMENTS:
+  - Support classic atomic instructions
+  - Support load/store exclusive (or load link/store conditional) pairs
+  - Generic enough infrastructure to support all guest architectures
+CURRENT OPEN QUESTIONS:
+  - How problematic is the ABA problem in general?
+
+(Current solution)
+
+The TCG provides a number of atomic helpers (tcg_gen_atomic_*) which
+can be used directly or combined to emulate other instructions like
+ARM's ldrex/strex. While they are susceptible to the ABA problem, so
+far common guests have not implemented patterns where this may be an
+issue - they typically present a locking ABI which assumes
+cmpxchg-like semantics.
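A toy model of emulating an ldrex/strex pair with compare-and-swap shows both the mechanism and why the ABA problem arises: the monitor structure below is invented for illustration and an A->B->A sequence by another CPU between the load and the store goes undetected.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-vCPU exclusive-monitor state. */
typedef struct {
    atomic_int *addr;   /* address marked exclusive */
    int val;            /* value observed by the load-exclusive */
} ExclusiveMonitor;

int load_exclusive(ExclusiveMonitor *m, atomic_int *addr)
{
    m->addr = addr;
    m->val = atomic_load(addr);   /* remember address and value */
    return m->val;
}

/* The store succeeds only if the location still holds the value seen by
 * load_exclusive -- the cmpxchg-based scheme the text describes.  A
 * write that changes the value and changes it back would wrongly pass. */
bool store_exclusive(ExclusiveMonitor *m, int newval)
{
    int expected = m->val;
    return atomic_compare_exchange_strong(m->addr, &expected, newval);
}
```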
+
+The code also includes a fall-back for cases where multi-threaded TCG
+ops can't work (e.g. guest atomic width > host atomic width). In this
+case an EXCP_ATOMIC exit occurs and the instruction is emulated with
+an exclusive lock which ensures all emulation is serialised.
+
+While the atomic helpers look good enough for now, there may be a
+need to look at solutions that can more closely model the guest
+architecture's semantics.
+
+==========
+
+[1] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/plain/Documentation/memory-barriers.txt
-- 
2.11.0


* [Qemu-devel] [PULL 02/24] mttcg: translate-all: Enable locking debug in a debug build
  2017-02-24 11:20 [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Alex Bennée
  2017-02-24 11:20 ` [Qemu-devel] [PULL 01/24] docs: new design document multi-thread-tcg.txt Alex Bennée
@ 2017-02-24 11:20 ` Alex Bennée
  2017-02-24 11:20 ` [Qemu-devel] [PULL 03/24] mttcg: Add missing tb_lock/unlock() in cpu_exec_step() Alex Bennée
                   ` (23 subsequent siblings)
  25 siblings, 0 replies; 55+ messages in thread
From: Alex Bennée @ 2017-02-24 11:20 UTC (permalink / raw)
  To: peter.maydell
  Cc: qemu-devel, Pranith Kumar, Richard Henderson, Alex Bennée,
	Paolo Bonzini, Peter Crosthwaite

From: Pranith Kumar <bobby.prani@gmail.com>

Enable tcg lock debug asserts in a debug build by default instead of
relying on DEBUG_LOCKING. None of the other DEBUG_* macros have
asserts, so this patch removes DEBUG_LOCKING and enables these asserts
in a debug build.

CC: Richard Henderson <rth@twiddle.net>
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
[AJB: tweak ifdefs so can be early in series]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 translate-all.c | 52 ++++++++++++++++------------------------------------
 1 file changed, 16 insertions(+), 36 deletions(-)

diff --git a/translate-all.c b/translate-all.c
index 5f44ec844e..8a861cb583 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -59,7 +59,6 @@
 
 /* #define DEBUG_TB_INVALIDATE */
 /* #define DEBUG_TB_FLUSH */
-/* #define DEBUG_LOCKING */
 /* make various TB consistency checks */
 /* #define DEBUG_TB_CHECK */
 
@@ -74,20 +73,10 @@
  * access to the memory related structures are protected with the
  * mmap_lock.
  */
-#ifdef DEBUG_LOCKING
-#define DEBUG_MEM_LOCKS 1
-#else
-#define DEBUG_MEM_LOCKS 0
-#endif
-
 #ifdef CONFIG_SOFTMMU
 #define assert_memory_lock() do { /* nothing */ } while (0)
 #else
-#define assert_memory_lock() do {               \
-        if (DEBUG_MEM_LOCKS) {                  \
-            g_assert(have_mmap_lock());         \
-        }                                       \
-    } while (0)
+#define assert_memory_lock() tcg_debug_assert(have_mmap_lock())
 #endif
 
 #define SMC_BITMAP_USE_THRESHOLD 10
@@ -169,10 +158,18 @@ static void page_table_config_init(void)
     assert(v_l2_levels >= 0);
 }
 
+#ifdef CONFIG_USER_ONLY
+#define assert_tb_locked() tcg_debug_assert(have_tb_lock)
+#define assert_tb_unlocked() tcg_debug_assert(!have_tb_lock)
+#else
+#define assert_tb_locked()  do { /* nothing */ } while (0)
+#define assert_tb_unlocked()  do { /* nothing */ } while (0)
+#endif
+
 void tb_lock(void)
 {
 #ifdef CONFIG_USER_ONLY
-    assert(!have_tb_lock);
+    assert_tb_unlocked();
     qemu_mutex_lock(&tcg_ctx.tb_ctx.tb_lock);
     have_tb_lock++;
 #endif
@@ -181,7 +178,7 @@ void tb_lock(void)
 void tb_unlock(void)
 {
 #ifdef CONFIG_USER_ONLY
-    assert(have_tb_lock);
+    assert_tb_locked();
     have_tb_lock--;
     qemu_mutex_unlock(&tcg_ctx.tb_ctx.tb_lock);
 #endif
@@ -197,23 +194,6 @@ void tb_lock_reset(void)
 #endif
 }
 
-#ifdef DEBUG_LOCKING
-#define DEBUG_TB_LOCKS 1
-#else
-#define DEBUG_TB_LOCKS 0
-#endif
-
-#ifdef CONFIG_SOFTMMU
-#define assert_tb_lock() do { /* nothing */ } while (0)
-#else
-#define assert_tb_lock() do {               \
-        if (DEBUG_TB_LOCKS) {               \
-            g_assert(have_tb_lock);         \
-        }                                   \
-    } while (0)
-#endif
-
-
 static TranslationBlock *tb_find_pc(uintptr_t tc_ptr);
 
 void cpu_gen_init(void)
@@ -847,7 +827,7 @@ static TranslationBlock *tb_alloc(target_ulong pc)
 {
     TranslationBlock *tb;
 
-    assert_tb_lock();
+    assert_tb_locked();
 
     if (tcg_ctx.tb_ctx.nb_tbs >= tcg_ctx.code_gen_max_blocks) {
         return NULL;
@@ -862,7 +842,7 @@ static TranslationBlock *tb_alloc(target_ulong pc)
 /* Called with tb_lock held.  */
 void tb_free(TranslationBlock *tb)
 {
-    assert_tb_lock();
+    assert_tb_locked();
 
     /* In practice this is mostly used for single use temporary TB
        Ignore the hard cases and just back up if this TB happens to
@@ -1104,7 +1084,7 @@ void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr)
     uint32_t h;
     tb_page_addr_t phys_pc;
 
-    assert_tb_lock();
+    assert_tb_locked();
 
     atomic_set(&tb->invalid, true);
 
@@ -1421,7 +1401,7 @@ static void tb_invalidate_phys_range_1(tb_page_addr_t start, tb_page_addr_t end)
 #ifdef CONFIG_SOFTMMU
 void tb_invalidate_phys_range(tb_page_addr_t start, tb_page_addr_t end)
 {
-    assert_tb_lock();
+    assert_tb_locked();
     tb_invalidate_phys_range_1(start, end);
 }
 #else
@@ -1464,7 +1444,7 @@ void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end,
 #endif /* TARGET_HAS_PRECISE_SMC */
 
     assert_memory_lock();
-    assert_tb_lock();
+    assert_tb_locked();
 
     p = page_find(start >> TARGET_PAGE_BITS);
     if (!p) {
-- 
2.11.0


* [Qemu-devel] [PULL 03/24] mttcg: Add missing tb_lock/unlock() in cpu_exec_step()
  2017-02-24 11:20 [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Alex Bennée
  2017-02-24 11:20 ` [Qemu-devel] [PULL 01/24] docs: new design document multi-thread-tcg.txt Alex Bennée
  2017-02-24 11:20 ` [Qemu-devel] [PULL 02/24] mttcg: translate-all: Enable locking debug in a debug build Alex Bennée
@ 2017-02-24 11:20 ` Alex Bennée
  2017-02-24 11:20 ` [Qemu-devel] [PULL 04/24] tcg: move TCG_MO/BAR types into own file Alex Bennée
                   ` (22 subsequent siblings)
  25 siblings, 0 replies; 55+ messages in thread
From: Alex Bennée @ 2017-02-24 11:20 UTC (permalink / raw)
  To: peter.maydell
  Cc: qemu-devel, Pranith Kumar, Alex Bennée, Paolo Bonzini,
	Peter Crosthwaite, Richard Henderson

From: Pranith Kumar <bobby.prani@gmail.com>

The recent patch enabling lock assertions uncovered missing lock
acquisitions in cpu_exec_step(). This patch adds them.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 cpu-exec.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/cpu-exec.c b/cpu-exec.c
index 142a5862fc..ec84fdb3d7 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -233,14 +233,18 @@ static void cpu_exec_step(CPUState *cpu)
     uint32_t flags;
 
     cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
+    tb_lock();
     tb = tb_gen_code(cpu, pc, cs_base, flags,
                      1 | CF_NOCACHE | CF_IGNORE_ICOUNT);
     tb->orig_tb = NULL;
+    tb_unlock();
     /* execute the generated code */
     trace_exec_tb_nocache(tb, pc);
     cpu_tb_exec(cpu, tb);
+    tb_lock();
     tb_phys_invalidate(tb, -1);
     tb_free(tb);
+    tb_unlock();
 }
 
 void cpu_exec_step_atomic(CPUState *cpu)
-- 
2.11.0


* [Qemu-devel] [PULL 04/24] tcg: move TCG_MO/BAR types into own file
  2017-02-24 11:20 [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Alex Bennée
                   ` (2 preceding siblings ...)
  2017-02-24 11:20 ` [Qemu-devel] [PULL 03/24] mttcg: Add missing tb_lock/unlock() in cpu_exec_step() Alex Bennée
@ 2017-02-24 11:20 ` Alex Bennée
  2017-02-24 11:20 ` [Qemu-devel] [PULL 05/24] tcg: add options for enabling MTTCG Alex Bennée
                   ` (21 subsequent siblings)
  25 siblings, 0 replies; 55+ messages in thread
From: Alex Bennée @ 2017-02-24 11:20 UTC (permalink / raw)
  To: peter.maydell; +Cc: qemu-devel, Alex Bennée, Richard Henderson

We'll be using the memory ordering definitions to define values for
both the host and guest. To avoid fighting with circular header
dependencies just move these types into their own minimal header.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 tcg/tcg-mo.h | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 tcg/tcg.h    | 18 +-----------------
 2 files changed, 49 insertions(+), 17 deletions(-)
 create mode 100644 tcg/tcg-mo.h

diff --git a/tcg/tcg-mo.h b/tcg/tcg-mo.h
new file mode 100644
index 0000000000..c2c55704e1
--- /dev/null
+++ b/tcg/tcg-mo.h
@@ -0,0 +1,48 @@
+/*
+ * Tiny Code Generator for QEMU
+ *
+ * Copyright (c) 2008 Fabrice Bellard
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#ifndef TCG_MO_H
+#define TCG_MO_H
+
+typedef enum {
+    /* Used to indicate the type of accesses on which ordering
+       is to be ensured.  Modeled after SPARC barriers.
+
+       This is of the form TCG_MO_A_B where A is before B in program order.
+    */
+    TCG_MO_LD_LD  = 0x01,
+    TCG_MO_ST_LD  = 0x02,
+    TCG_MO_LD_ST  = 0x04,
+    TCG_MO_ST_ST  = 0x08,
+    TCG_MO_ALL    = 0x0F,  /* OR of the above */
+
+    /* Used to indicate the kind of ordering which is to be ensured by the
+       instruction.  These types are derived from x86/aarch64 instructions.
+       It should be noted that these are different from C11 semantics.  */
+    TCG_BAR_LDAQ  = 0x10,  /* Following ops will not come forward */
+    TCG_BAR_STRL  = 0x20,  /* Previous ops will not be delayed */
+    TCG_BAR_SC    = 0x30,  /* No ops cross barrier; OR of the above */
+} TCGBar;
+
+#endif /* TCG_MO_H */
diff --git a/tcg/tcg.h b/tcg/tcg.h
index 631c6f69b1..f946452049 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -29,6 +29,7 @@
 #include "cpu.h"
 #include "exec/tb-context.h"
 #include "qemu/bitops.h"
+#include "tcg-mo.h"
 #include "tcg-target.h"
 
 /* XXX: make safe guess about sizes */
@@ -498,23 +499,6 @@ static inline intptr_t QEMU_ARTIFICIAL GET_TCGV_PTR(TCGv_ptr t)
 #define TCG_CALL_DUMMY_TCGV     MAKE_TCGV_I32(-1)
 #define TCG_CALL_DUMMY_ARG      ((TCGArg)(-1))
 
-typedef enum {
-    /* Used to indicate the type of accesses on which ordering
-       is to be ensured.  Modeled after SPARC barriers.  */
-    TCG_MO_LD_LD  = 0x01,
-    TCG_MO_ST_LD  = 0x02,
-    TCG_MO_LD_ST  = 0x04,
-    TCG_MO_ST_ST  = 0x08,
-    TCG_MO_ALL    = 0x0F,  /* OR of the above */
-
-    /* Used to indicate the kind of ordering which is to be ensured by the
-       instruction.  These types are derived from x86/aarch64 instructions.
-       It should be noted that these are different from C11 semantics.  */
-    TCG_BAR_LDAQ  = 0x10,  /* Following ops will not come forward */
-    TCG_BAR_STRL  = 0x20,  /* Previous ops will not be delayed */
-    TCG_BAR_SC    = 0x30,  /* No ops cross barrier; OR of the above */
-} TCGBar;
-
 /* Conditions.  Note that these are laid out for easy manipulation by
    the functions below:
      bit 0 is used for inverting;
-- 
2.11.0


* [Qemu-devel] [PULL 05/24] tcg: add options for enabling MTTCG
  2017-02-24 11:20 [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Alex Bennée
                   ` (3 preceding siblings ...)
  2017-02-24 11:20 ` [Qemu-devel] [PULL 04/24] tcg: move TCG_MO/BAR types into own file Alex Bennée
@ 2017-02-24 11:20 ` Alex Bennée
  2017-02-24 11:20 ` [Qemu-devel] [PULL 06/24] tcg: add kick timer for single-threaded vCPU emulation Alex Bennée
                   ` (20 subsequent siblings)
  25 siblings, 0 replies; 55+ messages in thread
From: Alex Bennée @ 2017-02-24 11:20 UTC (permalink / raw)
  To: peter.maydell
  Cc: qemu-devel, KONRAD Frederic, Alex Bennée, Paolo Bonzini,
	Peter Crosthwaite, Richard Henderson

From: KONRAD Frederic <fred.konrad@greensocs.com>

We know there will be cases where MTTCG won't work until additional work
is done in the front/back ends to support it. It will, however, be
useful to be able to turn it on.

As a result, MTTCG will default to off unless the combination is
supported. However, the user can turn it on for the sake of testing.
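The "is the combination supported" decision reduces to a subset test on the SPARC-style memory-ordering bitmask (see check_tcg_memory_orders_compatible() in the diff below): every ordering the guest relies on must also be provided by the host. A minimal sketch of that test, with hypothetical names standing in for the TCG_GUEST_DEFAULT_MO/TCG_TARGET_DEFAULT_MO macros:

```c
#include <assert.h>
#include <stdbool.h>

/* SPARC-style ordering bits, mirroring the TCG_MO_* values in tcg-mo.h */
enum {
    MO_LD_LD = 0x01,
    MO_ST_LD = 0x02,
    MO_LD_ST = 0x04,
    MO_ST_ST = 0x08,
    MO_ALL   = 0x0F,
};

/* MTTCG defaulting on requires the guest's required ordering bits to be
 * a subset of the host's default ordering bits. */
bool mo_compatible(int guest_mo, int host_mo)
{
    return (guest_mo & ~host_mo) == 0;
}
```

A strongly-ordered host (all bits set) can run any guest, while a strongly-ordered guest on a weakly-ordered host fails the test and falls back to single-threaded TCG.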

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
[AJB: move to -accel tcg,thread=multi|single, defaults]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 cpus.c                | 73 +++++++++++++++++++++++++++++++++++++++++++++++++++
 include/qom/cpu.h     |  9 +++++++
 include/sysemu/cpus.h |  2 ++
 qemu-options.hx       | 20 ++++++++++++++
 tcg/tcg.h             |  9 +++++++
 vl.c                  | 49 +++++++++++++++++++++++++++++++++-
 6 files changed, 161 insertions(+), 1 deletion(-)

diff --git a/cpus.c b/cpus.c
index 0bcb5b50b6..c94b3307e5 100644
--- a/cpus.c
+++ b/cpus.c
@@ -25,6 +25,7 @@
 /* Needed early for CONFIG_BSD etc. */
 #include "qemu/osdep.h"
 #include "qemu-common.h"
+#include "qemu/config-file.h"
 #include "cpu.h"
 #include "monitor/monitor.h"
 #include "qapi/qmp/qerror.h"
@@ -45,6 +46,7 @@
 #include "qemu/main-loop.h"
 #include "qemu/bitmap.h"
 #include "qemu/seqlock.h"
+#include "tcg.h"
 #include "qapi-event.h"
 #include "hw/nmi.h"
 #include "sysemu/replay.h"
@@ -150,6 +152,77 @@ typedef struct TimersState {
 } TimersState;
 
 static TimersState timers_state;
+bool mttcg_enabled;
+
+/*
+ * We default to false if we know other options have been enabled
+ * which are currently incompatible with MTTCG. Otherwise when each
+ * guest (target) has been updated to support:
+ *   - atomic instructions
+ *   - memory ordering primitives (barriers)
+ * they can set the appropriate CONFIG flags in ${target}-softmmu.mak
+ *
+ * Once a guest architecture has been converted to the new primitives
+ * there are two remaining limitations to check.
+ *
+ * - The guest can't be oversized (e.g. 64 bit guest on 32 bit host)
+ * - The host must have a stronger memory order than the guest
+ *
+ * It may be possible in future to support strong guests on weak hosts
+ * but that will require tagging all load/stores in a guest with their
+ * implicit memory order requirements which would likely slow things
+ * down a lot.
+ */
+
+static bool check_tcg_memory_orders_compatible(void)
+{
+#if defined(TCG_GUEST_DEFAULT_MO) && defined(TCG_TARGET_DEFAULT_MO)
+    return (TCG_GUEST_DEFAULT_MO & ~TCG_TARGET_DEFAULT_MO) == 0;
+#else
+    return false;
+#endif
+}
+
+static bool default_mttcg_enabled(void)
+{
+    QemuOpts *icount_opts = qemu_find_opts_singleton("icount");
+    const char *rr = qemu_opt_get(icount_opts, "rr");
+
+    if (rr || TCG_OVERSIZED_GUEST) {
+        return false;
+    } else {
+#ifdef TARGET_SUPPORTS_MTTCG
+        return check_tcg_memory_orders_compatible();
+#else
+        return false;
+#endif
+    }
+}
+
+void qemu_tcg_configure(QemuOpts *opts, Error **errp)
+{
+    const char *t = qemu_opt_get(opts, "thread");
+    if (t) {
+        if (strcmp(t, "multi") == 0) {
+            if (TCG_OVERSIZED_GUEST) {
+                error_setg(errp, "No MTTCG when guest word size > hosts");
+            } else {
+                if (!check_tcg_memory_orders_compatible()) {
+                    error_report("Guest expects a stronger memory ordering "
+                                 "than the host provides");
+                    error_printf("This may cause strange/hard to debug errors");
+                }
+                mttcg_enabled = true;
+            }
+        } else if (strcmp(t, "single") == 0) {
+            mttcg_enabled = false;
+        } else {
+            error_setg(errp, "Invalid 'thread' setting %s", t);
+        }
+    } else {
+        mttcg_enabled = default_mttcg_enabled();
+    }
+}
 
 int64_t cpu_get_icount_raw(void)
 {
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index f69b2407ea..2cf4ecf144 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -416,6 +416,15 @@ extern struct CPUTailQ cpus;
 extern __thread CPUState *current_cpu;
 
 /**
+ * qemu_tcg_mttcg_enabled:
+ * Check whether we are running MultiThread TCG or not.
+ *
+ * Returns: %true if we are in MTTCG mode, %false otherwise.
+ */
+extern bool mttcg_enabled;
+#define qemu_tcg_mttcg_enabled() (mttcg_enabled)
+
+/**
  * cpu_paging_enabled:
  * @cpu: The CPU whose state is to be inspected.
  *
diff --git a/include/sysemu/cpus.h b/include/sysemu/cpus.h
index 3728a1ea7e..a73b5d4bce 100644
--- a/include/sysemu/cpus.h
+++ b/include/sysemu/cpus.h
@@ -36,4 +36,6 @@ extern int smp_threads;
 
 void list_cpus(FILE *f, fprintf_function cpu_fprintf, const char *optarg);
 
+void qemu_tcg_configure(QemuOpts *opts, Error **errp);
+
 #endif
diff --git a/qemu-options.hx b/qemu-options.hx
index 9936cf38f3..bf458f83c3 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -95,6 +95,26 @@ STEXI
 Select CPU model (@code{-cpu help} for list and additional feature selection)
 ETEXI
 
+DEF("accel", HAS_ARG, QEMU_OPTION_accel,
+    "-accel [accel=]accelerator[,thread=single|multi]\n"
+    "               select accelerator ('-accel help' for list)\n"
+    "               thread=single|multi (enable multi-threaded TCG)", QEMU_ARCH_ALL)
+STEXI
+@item -accel @var{name}[,prop=@var{value}[,...]]
+@findex -accel
+This is used to enable an accelerator. Depending on the target architecture,
+kvm, xen, or tcg can be available. By default, tcg is used. If there is more
+than one accelerator specified, the next one is used if the previous one fails
+to initialize.
+@table @option
+@item thread=single|multi
+Controls the number of TCG threads. When TCG is multi-threaded there will be one
+thread per vCPU, therefore taking advantage of additional host cores. The default
+is to enable multi-threading where both the back-end and front-ends support it and
+no incompatible TCG features have been enabled (e.g. icount/replay).
+@end table
+ETEXI
+
 DEF("smp", HAS_ARG, QEMU_OPTION_smp,
     "-smp [cpus=]n[,maxcpus=cpus][,cores=cores][,threads=threads][,sockets=sockets]\n"
     "                set the number of CPUs to 'n' [default=1]\n"
diff --git a/tcg/tcg.h b/tcg/tcg.h
index f946452049..4c7f258220 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -80,6 +80,15 @@ typedef uint64_t tcg_target_ulong;
 #error unsupported
 #endif
 
+/* Oversized TCG guests make things like MTTCG hard
+ * as we can't use atomics for cputlb updates.
+ */
+#if TARGET_LONG_BITS > TCG_TARGET_REG_BITS
+#define TCG_OVERSIZED_GUEST 1
+#else
+#define TCG_OVERSIZED_GUEST 0
+#endif
+
 #if TCG_TARGET_NB_REGS <= 32
 typedef uint32_t TCGRegSet;
 #elif TCG_TARGET_NB_REGS <= 64
diff --git a/vl.c b/vl.c
index b5d0a19811..ea7f4320af 100644
--- a/vl.c
+++ b/vl.c
@@ -300,6 +300,26 @@ static QemuOptsList qemu_machine_opts = {
     },
 };
 
+static QemuOptsList qemu_accel_opts = {
+    .name = "accel",
+    .implied_opt_name = "accel",
+    .head = QTAILQ_HEAD_INITIALIZER(qemu_accel_opts.head),
+    .merge_lists = true,
+    .desc = {
+        {
+            .name = "accel",
+            .type = QEMU_OPT_STRING,
+            .help = "Select the type of accelerator",
+        },
+        {
+            .name = "thread",
+            .type = QEMU_OPT_STRING,
+            .help = "Enable/disable multi-threaded TCG",
+        },
+        { /* end of list */ }
+    },
+};
+
 static QemuOptsList qemu_boot_opts = {
     .name = "boot-opts",
     .implied_opt_name = "order",
@@ -2928,7 +2948,8 @@ int main(int argc, char **argv, char **envp)
     const char *boot_once = NULL;
     DisplayState *ds;
     int cyls, heads, secs, translation;
-    QemuOpts *hda_opts = NULL, *opts, *machine_opts, *icount_opts = NULL;
+    QemuOpts *opts, *machine_opts;
+    QemuOpts *hda_opts = NULL, *icount_opts = NULL, *accel_opts = NULL;
     QemuOptsList *olist;
     int optind;
     const char *optarg;
@@ -2983,6 +3004,7 @@ int main(int argc, char **argv, char **envp)
     qemu_add_opts(&qemu_trace_opts);
     qemu_add_opts(&qemu_option_rom_opts);
     qemu_add_opts(&qemu_machine_opts);
+    qemu_add_opts(&qemu_accel_opts);
     qemu_add_opts(&qemu_mem_opts);
     qemu_add_opts(&qemu_smp_opts);
     qemu_add_opts(&qemu_boot_opts);
@@ -3675,6 +3697,26 @@ int main(int argc, char **argv, char **envp)
                 qdev_prop_register_global(&kvm_pit_lost_tick_policy);
                 break;
             }
+            case QEMU_OPTION_accel:
+                accel_opts = qemu_opts_parse_noisily(qemu_find_opts("accel"),
+                                                     optarg, true);
+                optarg = qemu_opt_get(accel_opts, "accel");
+
+                olist = qemu_find_opts("machine");
+                if (strcmp("kvm", optarg) == 0) {
+                    qemu_opts_parse_noisily(olist, "accel=kvm", false);
+                } else if (strcmp("xen", optarg) == 0) {
+                    qemu_opts_parse_noisily(olist, "accel=xen", false);
+                } else if (strcmp("tcg", optarg) == 0) {
+                    qemu_opts_parse_noisily(olist, "accel=tcg", false);
+                } else {
+                    if (!is_help_option(optarg)) {
+                        error_printf("Unknown accelerator: %s", optarg);
+                    }
+                    error_printf("Supported accelerators: kvm, xen, tcg\n");
+                    exit(1);
+                }
+                break;
             case QEMU_OPTION_usb:
                 olist = qemu_find_opts("machine");
                 qemu_opts_parse_noisily(olist, "usb=on", false);
@@ -3983,6 +4025,8 @@ int main(int argc, char **argv, char **envp)
 
     replay_configure(icount_opts);
 
+    qemu_tcg_configure(accel_opts, &error_fatal);
+
     machine_class = select_machine();
 
     set_memory_options(&ram_slots, &maxram_size, machine_class);
@@ -4349,6 +4393,9 @@ int main(int argc, char **argv, char **envp)
         if (!tcg_enabled()) {
             error_report("-icount is not allowed with hardware virtualization");
             exit(1);
+        } else if (qemu_tcg_mttcg_enabled()) {
+            error_report("-icount does not currently work with MTTCG");
+            exit(1);
         }
         configure_icount(icount_opts, &error_abort);
         qemu_opts_del(icount_opts);
-- 
2.11.0


* [Qemu-devel] [PULL 06/24] tcg: add kick timer for single-threaded vCPU emulation
  2017-02-24 11:20 [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Alex Bennée
                   ` (4 preceding siblings ...)
  2017-02-24 11:20 ` [Qemu-devel] [PULL 05/24] tcg: add options for enabling MTTCG Alex Bennée
@ 2017-02-24 11:20 ` Alex Bennée
  2017-02-24 11:20 ` [Qemu-devel] [PULL 07/24] tcg: rename tcg_current_cpu to tcg_current_rr_cpu Alex Bennée
                   ` (19 subsequent siblings)
  25 siblings, 0 replies; 55+ messages in thread
From: Alex Bennée @ 2017-02-24 11:20 UTC (permalink / raw)
  To: peter.maydell
  Cc: qemu-devel, Alex Bennée, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson

Currently we rely on the side effect of the main loop grabbing the
iothread_mutex to give any long-running basic block chains a kick to
ensure the next vCPU is scheduled. As this code is being re-factored
and rationalised, we now do it explicitly here.
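The arithmetic behind the timer is simple: the callback re-arms itself one period ahead, and the timer is only wanted at all when there is a second vCPU to schedule. A sketch (the period constant matches TCG_KICK_PERIOD in the patch; the function names are hypothetical):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NS_PER_SEC  1000000000LL
#define KICK_PERIOD (NS_PER_SEC / 10)   /* as TCG_KICK_PERIOD: 10 kicks/sec */

/* Each timer callback re-arms itself one period into the future before
 * kicking the current vCPU, giving a steady round-robin cadence. */
int64_t next_kick_deadline(int64_t now_ns)
{
    return now_ns + KICK_PERIOD;
}

/* The timer only exists when there is another vCPU to move on to and
 * the vCPUs are not all idle (it is stopped across idle periods). */
bool want_kick_timer(int nr_vcpus, bool all_idle)
{
    return nr_vcpus > 1 && !all_idle;
}
```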

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
---
 cpus.c | 61 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 61 insertions(+)

diff --git a/cpus.c b/cpus.c
index c94b3307e5..9fe56c9a76 100644
--- a/cpus.c
+++ b/cpus.c
@@ -768,6 +768,53 @@ void configure_icount(QemuOpts *opts, Error **errp)
 }
 
 /***********************************************************/
+/* TCG vCPU kick timer
+ *
+ * The kick timer is responsible for moving single threaded vCPU
+ * emulation on to the next vCPU. If more than one vCPU is running, a
+ * timer event will force a cpu->exit so the next vCPU can get
+ * scheduled.
+ *
+ * The timer is removed while all vCPUs are idle and restarted again
+ * once any of them becomes runnable.
+ */
+
+static QEMUTimer *tcg_kick_vcpu_timer;
+
+static void qemu_cpu_kick_no_halt(void);
+
+#define TCG_KICK_PERIOD (NANOSECONDS_PER_SECOND / 10)
+
+static inline int64_t qemu_tcg_next_kick(void)
+{
+    return qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + TCG_KICK_PERIOD;
+}
+
+static void kick_tcg_thread(void *opaque)
+{
+    timer_mod(tcg_kick_vcpu_timer, qemu_tcg_next_kick());
+    qemu_cpu_kick_no_halt();
+}
+
+static void start_tcg_kick_timer(void)
+{
+    if (!tcg_kick_vcpu_timer && CPU_NEXT(first_cpu)) {
+        tcg_kick_vcpu_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL,
+                                           kick_tcg_thread, NULL);
+        timer_mod(tcg_kick_vcpu_timer, qemu_tcg_next_kick());
+    }
+}
+
+static void stop_tcg_kick_timer(void)
+{
+    if (tcg_kick_vcpu_timer) {
+        timer_del(tcg_kick_vcpu_timer);
+        tcg_kick_vcpu_timer = NULL;
+    }
+}
+
+
+/***********************************************************/
 void hw_error(const char *fmt, ...)
 {
     va_list ap;
@@ -1021,9 +1068,12 @@ static void qemu_wait_io_event_common(CPUState *cpu)
 static void qemu_tcg_wait_io_event(CPUState *cpu)
 {
     while (all_cpu_threads_idle()) {
+        stop_tcg_kick_timer();
         qemu_cond_wait(cpu->halt_cond, &qemu_global_mutex);
     }
 
+    start_tcg_kick_timer();
+
     while (iothread_requesting_mutex) {
         qemu_cond_wait(&qemu_io_proceeded_cond, &qemu_global_mutex);
     }
@@ -1223,6 +1273,15 @@ static void deal_with_unplugged_cpus(void)
     }
 }
 
+/* Single-threaded TCG
+ *
+ * In the single-threaded case each vCPU is simulated in turn. If
+ * there is more than a single vCPU we create a simple timer to kick
+ * the vCPU and ensure we don't get stuck in a tight loop in one vCPU.
+ * This is done explicitly rather than relying on side-effects
+ * elsewhere.
+ */
+
 static void *qemu_tcg_cpu_thread_fn(void *arg)
 {
     CPUState *cpu = arg;
@@ -1249,6 +1308,8 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
         }
     }
 
+    start_tcg_kick_timer();
+
     /* process any pending work */
     atomic_mb_set(&exit_request, 1);
 
-- 
2.11.0


* [Qemu-devel] [PULL 07/24] tcg: rename tcg_current_cpu to tcg_current_rr_cpu
  2017-02-24 11:20 [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Alex Bennée
                   ` (5 preceding siblings ...)
  2017-02-24 11:20 ` [Qemu-devel] [PULL 06/24] tcg: add kick timer for single-threaded vCPU emulation Alex Bennée
@ 2017-02-24 11:20 ` Alex Bennée
  2017-02-24 11:20 ` [Qemu-devel] [PULL 08/24] tcg: drop global lock during TCG code execution Alex Bennée
                   ` (18 subsequent siblings)
  25 siblings, 0 replies; 55+ messages in thread
From: Alex Bennée @ 2017-02-24 11:20 UTC (permalink / raw)
  To: peter.maydell
  Cc: qemu-devel, Alex Bennée, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson

...and make the definition local to cpus.c. In preparation for MTTCG,
the concept of a global tcg_current_cpu will no longer make sense.
However, we still need to keep track of it in the single-threaded case
to be able to exit quickly when required.

qemu_cpu_kick_no_halt() moves and becomes qemu_cpu_kick_rr_cpu() to
emphasise its use-case. qemu_cpu_kick now kicks the relevant cpu as
well as calling qemu_cpu_kick_rr_cpu(), which will become a no-op in
MTTCG.

For the time being the setting of the global exit_request remains.
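The subtle point in the new kick function is that the round-robin loop can switch to another vCPU between our read of tcg_current_rr_cpu and the kick landing; the patch closes that window by re-reading the pointer and retrying until it is stable. A simplified, single-threaded sketch of that retry pattern (the CPU type and helpers are hypothetical stand-ins for CPUState/cpu_exit()):

```c
#include <assert.h>
#include <stddef.h>

typedef struct CPU { int exit_requested; } CPU;  /* stand-in for CPUState */

/* Kick whichever vCPU is currently scheduled.  Re-read the current
 * pointer after the kick: if the scheduler moved on in the meantime,
 * loop and kick the new vCPU too, so no kick can be lost in the
 * read-then-kick window. */
void kick_rr_cpu(CPU * volatile *current_rr_cpu)
{
    CPU *cpu;
    do {
        cpu = *current_rr_cpu;        /* atomic_mb_read() in QEMU */
        if (cpu) {
            cpu->exit_requested = 1;  /* cpu_exit() in QEMU */
        }
    } while (cpu != *current_rr_cpu);
}

/* Trivial single-threaded exercise of the loop. */
int demo_kick(void)
{
    CPU c = { 0 };
    CPU * volatile cur = &c;
    kick_rr_cpu(&cur);
    return c.exit_requested;
}
```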

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
---
 cpu-exec-common.c       |  1 -
 cpu-exec.c              |  3 ---
 cpus.c                  | 41 ++++++++++++++++++++++-------------------
 include/exec/exec-all.h |  1 -
 4 files changed, 22 insertions(+), 24 deletions(-)

diff --git a/cpu-exec-common.c b/cpu-exec-common.c
index 767d9c6f0c..e2bc053372 100644
--- a/cpu-exec-common.c
+++ b/cpu-exec-common.c
@@ -24,7 +24,6 @@
 #include "exec/memory-internal.h"
 
 bool exit_request;
-CPUState *tcg_current_cpu;
 
 /* exit the current TB, but without causing any exception to be raised */
 void cpu_loop_exit_noexc(CPUState *cpu)
diff --git a/cpu-exec.c b/cpu-exec.c
index ec84fdb3d7..06a6b25564 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -612,7 +612,6 @@ int cpu_exec(CPUState *cpu)
         return EXCP_HALTED;
     }
 
-    atomic_mb_set(&tcg_current_cpu, cpu);
     rcu_read_lock();
 
     if (unlikely(atomic_mb_read(&exit_request))) {
@@ -666,7 +665,5 @@ int cpu_exec(CPUState *cpu)
     /* fail safe : never use current_cpu outside cpu_exec() */
     current_cpu = NULL;
 
-    /* Does not need atomic_mb_set because a spurious wakeup is okay.  */
-    atomic_set(&tcg_current_cpu, NULL);
     return ret;
 }
diff --git a/cpus.c b/cpus.c
index 9fe56c9a76..860034a794 100644
--- a/cpus.c
+++ b/cpus.c
@@ -780,8 +780,7 @@ void configure_icount(QemuOpts *opts, Error **errp)
  */
 
 static QEMUTimer *tcg_kick_vcpu_timer;
-
-static void qemu_cpu_kick_no_halt(void);
+static CPUState *tcg_current_rr_cpu;
 
 #define TCG_KICK_PERIOD (NANOSECONDS_PER_SECOND / 10)
 
@@ -790,10 +789,23 @@ static inline int64_t qemu_tcg_next_kick(void)
     return qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + TCG_KICK_PERIOD;
 }
 
+/* Kick the currently round-robin scheduled vCPU */
+static void qemu_cpu_kick_rr_cpu(void)
+{
+    CPUState *cpu;
+    atomic_mb_set(&exit_request, 1);
+    do {
+        cpu = atomic_mb_read(&tcg_current_rr_cpu);
+        if (cpu) {
+            cpu_exit(cpu);
+        }
+    } while (cpu != atomic_mb_read(&tcg_current_rr_cpu));
+}
+
 static void kick_tcg_thread(void *opaque)
 {
     timer_mod(tcg_kick_vcpu_timer, qemu_tcg_next_kick());
-    qemu_cpu_kick_no_halt();
+    qemu_cpu_kick_rr_cpu();
 }
 
 static void start_tcg_kick_timer(void)
@@ -813,7 +825,6 @@ static void stop_tcg_kick_timer(void)
     }
 }
 
-
 /***********************************************************/
 void hw_error(const char *fmt, ...)
 {
@@ -1324,6 +1335,7 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
         }
 
         for (; cpu != NULL && !exit_request; cpu = CPU_NEXT(cpu)) {
+            atomic_mb_set(&tcg_current_rr_cpu, cpu);
 
             qemu_clock_enable(QEMU_CLOCK_VIRTUAL,
                               (cpu->singlestep_enabled & SSTEP_NOTIMER) == 0);
@@ -1343,6 +1355,8 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
             }
 
         } /* for cpu.. */
+        /* Does not need atomic_mb_set because a spurious wakeup is okay.  */
+        atomic_set(&tcg_current_rr_cpu, NULL);
 
         /* Pairs with smp_wmb in qemu_cpu_kick.  */
         atomic_mb_set(&exit_request, 0);
@@ -1421,24 +1435,13 @@ static void qemu_cpu_kick_thread(CPUState *cpu)
 #endif
 }
 
-static void qemu_cpu_kick_no_halt(void)
-{
-    CPUState *cpu;
-    /* Ensure whatever caused the exit has reached the CPU threads before
-     * writing exit_request.
-     */
-    atomic_mb_set(&exit_request, 1);
-    cpu = atomic_mb_read(&tcg_current_cpu);
-    if (cpu) {
-        cpu_exit(cpu);
-    }
-}
-
 void qemu_cpu_kick(CPUState *cpu)
 {
     qemu_cond_broadcast(cpu->halt_cond);
     if (tcg_enabled()) {
-        qemu_cpu_kick_no_halt();
+        cpu_exit(cpu);
+        /* Also ensure current RR cpu is kicked */
+        qemu_cpu_kick_rr_cpu();
     } else {
         if (hax_enabled()) {
             /*
@@ -1486,7 +1489,7 @@ void qemu_mutex_lock_iothread(void)
         atomic_dec(&iothread_requesting_mutex);
     } else {
         if (qemu_mutex_trylock(&qemu_global_mutex)) {
-            qemu_cpu_kick_no_halt();
+            qemu_cpu_kick_rr_cpu();
             qemu_mutex_lock(&qemu_global_mutex);
         }
         atomic_dec(&iothread_requesting_mutex);
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 21ab7bf3fd..4e34fc4cc1 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -405,7 +405,6 @@ bool memory_region_is_unassigned(MemoryRegion *mr);
 extern int singlestep;
 
 /* cpu-exec.c, accessed with atomic_mb_read/atomic_mb_set */
-extern CPUState *tcg_current_cpu;
 extern bool exit_request;
 
 #endif
-- 
2.11.0


* [Qemu-devel] [PULL 08/24] tcg: drop global lock during TCG code execution
  2017-02-24 11:20 [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Alex Bennée
                   ` (6 preceding siblings ...)
  2017-02-24 11:20 ` [Qemu-devel] [PULL 07/24] tcg: rename tcg_current_cpu to tcg_current_rr_cpu Alex Bennée
@ 2017-02-24 11:20 ` Alex Bennée
  2017-02-27 12:48   ` Laurent Desnogues
  2017-02-24 11:20 ` [Qemu-devel] [PULL 09/24] tcg: remove global exit_request Alex Bennée
                   ` (17 subsequent siblings)
  25 siblings, 1 reply; 55+ messages in thread
From: Alex Bennée @ 2017-02-24 11:20 UTC (permalink / raw)
  To: peter.maydell
  Cc: qemu-devel, Jan Kiszka, KONRAD Frederic, Emilio G . Cota,
	Alex Bennée, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson, Eduardo Habkost, Michael S. Tsirkin,
	David Gibson, Alexander Graf, open list:ARM cores,
	open list:PowerPC

From: Jan Kiszka <jan.kiszka@siemens.com>

This finally allows TCG to benefit from the iothread introduction: Drop
the global mutex while running pure TCG CPU code. Reacquire the lock
when entering MMIO or PIO emulation, or when leaving the TCG loop.

We have to revert a few optimizations for the current TCG threading
model, namely kicking the TCG thread in qemu_mutex_lock_iothread and not
kicking it in qemu_cpu_kick. We also need to disable RAM block
reordering until we have a more efficient locking mechanism at hand.

Still, a Linux x86 UP guest and my Musicpal ARM model boot fine here.
These numbers demonstrate where we gain something:

20338 jan       20   0  331m  75m 6904 R   99  0.9   0:50.95 qemu-system-arm
20337 jan       20   0  331m  75m 6904 S   20  0.9   0:26.50 qemu-system-arm

The guest CPU was fully loaded, but the iothread could still run mostly
independently on a second core. Without the patch we don't get beyond

32206 jan       20   0  330m  73m 7036 R   82  0.9   1:06.00 qemu-system-arm
32204 jan       20   0  330m  73m 7036 S   21  0.9   0:17.03 qemu-system-arm

We don't benefit significantly, though, when the guest is not fully
loading a host CPU.
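The resulting locking discipline is: the vCPU thread holds the BQL only outside cpu_exec(), dropping it for the duration of generated code and retaking it (via io_readx()/io_writex()) only around device access. A toy sketch of that drop/reacquire pattern (lock type and helper names are hypothetical, not the QEMU API):

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { bool held; } BigLock;  /* stand-in for qemu_global_mutex */

void bql_lock(BigLock *l)   { l->held = true;  }
void bql_unlock(BigLock *l) { l->held = false; }

/* Model of tcg_cpu_exec(): called with the lock held, drops it while
 * "guest code" runs, retakes it before returning to the main loop.
 * Returns whether the lock was (incorrectly) held during execution. */
bool run_vcpu_slice(BigLock *bql)
{
    bql_unlock(bql);           /* drop BQL before entering TCG code */
    bool held_during_exec = bql->held;
    /* ...generated code runs here; an MMIO/PIO access to a device with
     * global_locking set would briefly retake and release the lock... */
    bql_lock(bql);             /* reacquire before touching devices */
    return held_during_exec;
}

/* Check the invariant: unlocked during execution, locked again after. */
int demo_bql(void)
{
    BigLock l = { .held = true };
    bool held = run_vcpu_slice(&l);
    return !held && l.held;
}
```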

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Message-Id: <1439220437-23957-10-git-send-email-fred.konrad@greensocs.com>
[FK: Rebase, fix qemu_devices_reset deadlock, rm address_space_* mutex]
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
[EGC: fixed iothread lock for cpu-exec IRQ handling]
Signed-off-by: Emilio G. Cota <cota@braap.org>
[AJB: -smp single-threaded fix, clean commit msg, BQL fixes]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
[PM: target-arm changes]
Acked-by: Peter Maydell <peter.maydell@linaro.org>
---
 cpu-exec.c                 | 23 +++++++++++++++++++++--
 cpus.c                     | 28 +++++-----------------------
 cputlb.c                   | 21 ++++++++++++++++++++-
 exec.c                     | 12 +++++++++---
 hw/core/irq.c              |  1 +
 hw/i386/kvmvapic.c         |  4 ++--
 hw/intc/arm_gicv3_cpuif.c  |  3 +++
 hw/ppc/ppc.c               | 16 +++++++++++++++-
 hw/ppc/spapr.c             |  3 +++
 include/qom/cpu.h          |  1 +
 memory.c                   |  2 ++
 qom/cpu.c                  | 10 ++++++++++
 target/arm/helper.c        |  6 ++++++
 target/arm/op_helper.c     | 43 +++++++++++++++++++++++++++++++++++++++----
 target/i386/smm_helper.c   |  7 +++++++
 target/s390x/misc_helper.c |  5 ++++-
 translate-all.c            |  9 +++++++--
 translate-common.c         | 21 +++++++++++----------
 18 files changed, 166 insertions(+), 49 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index 06a6b25564..1bd3d72002 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -29,6 +29,7 @@
 #include "qemu/rcu.h"
 #include "exec/tb-hash.h"
 #include "exec/log.h"
+#include "qemu/main-loop.h"
 #if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY)
 #include "hw/i386/apic.h"
 #endif
@@ -388,8 +389,10 @@ static inline bool cpu_handle_halt(CPUState *cpu)
         if ((cpu->interrupt_request & CPU_INTERRUPT_POLL)
             && replay_interrupt()) {
             X86CPU *x86_cpu = X86_CPU(cpu);
+            qemu_mutex_lock_iothread();
             apic_poll_irq(x86_cpu->apic_state);
             cpu_reset_interrupt(cpu, CPU_INTERRUPT_POLL);
+            qemu_mutex_unlock_iothread();
         }
 #endif
         if (!cpu_has_work(cpu)) {
@@ -443,7 +446,9 @@ static inline bool cpu_handle_exception(CPUState *cpu, int *ret)
 #else
             if (replay_exception()) {
                 CPUClass *cc = CPU_GET_CLASS(cpu);
+                qemu_mutex_lock_iothread();
                 cc->do_interrupt(cpu);
+                qemu_mutex_unlock_iothread();
                 cpu->exception_index = -1;
             } else if (!replay_has_interrupt()) {
                 /* give a chance to iothread in replay mode */
@@ -469,9 +474,11 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
                                         TranslationBlock **last_tb)
 {
     CPUClass *cc = CPU_GET_CLASS(cpu);
-    int interrupt_request = cpu->interrupt_request;
 
-    if (unlikely(interrupt_request)) {
+    if (unlikely(atomic_read(&cpu->interrupt_request))) {
+        int interrupt_request;
+        qemu_mutex_lock_iothread();
+        interrupt_request = cpu->interrupt_request;
         if (unlikely(cpu->singlestep_enabled & SSTEP_NOIRQ)) {
             /* Mask out external interrupts for this step. */
             interrupt_request &= ~CPU_INTERRUPT_SSTEP_MASK;
@@ -479,6 +486,7 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
         if (interrupt_request & CPU_INTERRUPT_DEBUG) {
             cpu->interrupt_request &= ~CPU_INTERRUPT_DEBUG;
             cpu->exception_index = EXCP_DEBUG;
+            qemu_mutex_unlock_iothread();
             return true;
         }
         if (replay_mode == REPLAY_MODE_PLAY && !replay_has_interrupt()) {
@@ -488,6 +496,7 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
             cpu->interrupt_request &= ~CPU_INTERRUPT_HALT;
             cpu->halted = 1;
             cpu->exception_index = EXCP_HLT;
+            qemu_mutex_unlock_iothread();
             return true;
         }
 #if defined(TARGET_I386)
@@ -498,12 +507,14 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
             cpu_svm_check_intercept_param(env, SVM_EXIT_INIT, 0, 0);
             do_cpu_init(x86_cpu);
             cpu->exception_index = EXCP_HALTED;
+            qemu_mutex_unlock_iothread();
             return true;
         }
 #else
         else if (interrupt_request & CPU_INTERRUPT_RESET) {
             replay_interrupt();
             cpu_reset(cpu);
+            qemu_mutex_unlock_iothread();
             return true;
         }
 #endif
@@ -526,7 +537,12 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
                the program flow was changed */
             *last_tb = NULL;
         }
+
+        /* If we exit via cpu_loop_exit/longjmp it is reset in cpu_exec */
+        qemu_mutex_unlock_iothread();
     }
+
+
     if (unlikely(atomic_read(&cpu->exit_request) || replay_has_interrupt())) {
         atomic_set(&cpu->exit_request, 0);
         cpu->exception_index = EXCP_INTERRUPT;
@@ -643,6 +659,9 @@ int cpu_exec(CPUState *cpu)
 #endif /* buggy compiler */
         cpu->can_do_io = 1;
         tb_lock_reset();
+        if (qemu_mutex_iothread_locked()) {
+            qemu_mutex_unlock_iothread();
+        }
     }
 
     /* if an exception is pending, we execute it here */
diff --git a/cpus.c b/cpus.c
index 860034a794..0ae8f69be5 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1027,8 +1027,6 @@ static void qemu_kvm_init_cpu_signals(CPUState *cpu)
 #endif /* _WIN32 */
 
 static QemuMutex qemu_global_mutex;
-static QemuCond qemu_io_proceeded_cond;
-static unsigned iothread_requesting_mutex;
 
 static QemuThread io_thread;
 
@@ -1042,7 +1040,6 @@ void qemu_init_cpu_loop(void)
     qemu_init_sigbus();
     qemu_cond_init(&qemu_cpu_cond);
     qemu_cond_init(&qemu_pause_cond);
-    qemu_cond_init(&qemu_io_proceeded_cond);
     qemu_mutex_init(&qemu_global_mutex);
 
     qemu_thread_get_self(&io_thread);
@@ -1085,10 +1082,6 @@ static void qemu_tcg_wait_io_event(CPUState *cpu)
 
     start_tcg_kick_timer();
 
-    while (iothread_requesting_mutex) {
-        qemu_cond_wait(&qemu_io_proceeded_cond, &qemu_global_mutex);
-    }
-
     CPU_FOREACH(cpu) {
         qemu_wait_io_event_common(cpu);
     }
@@ -1249,9 +1242,11 @@ static int tcg_cpu_exec(CPUState *cpu)
         cpu->icount_decr.u16.low = decr;
         cpu->icount_extra = count;
     }
+    qemu_mutex_unlock_iothread();
     cpu_exec_start(cpu);
     ret = cpu_exec(cpu);
     cpu_exec_end(cpu);
+    qemu_mutex_lock_iothread();
 #ifdef CONFIG_PROFILER
     tcg_time += profile_getclock() - ti;
 #endif
@@ -1479,27 +1474,14 @@ bool qemu_mutex_iothread_locked(void)
 
 void qemu_mutex_lock_iothread(void)
 {
-    atomic_inc(&iothread_requesting_mutex);
-    /* In the simple case there is no need to bump the VCPU thread out of
-     * TCG code execution.
-     */
-    if (!tcg_enabled() || qemu_in_vcpu_thread() ||
-        !first_cpu || !first_cpu->created) {
-        qemu_mutex_lock(&qemu_global_mutex);
-        atomic_dec(&iothread_requesting_mutex);
-    } else {
-        if (qemu_mutex_trylock(&qemu_global_mutex)) {
-            qemu_cpu_kick_rr_cpu();
-            qemu_mutex_lock(&qemu_global_mutex);
-        }
-        atomic_dec(&iothread_requesting_mutex);
-        qemu_cond_broadcast(&qemu_io_proceeded_cond);
-    }
+    g_assert(!qemu_mutex_iothread_locked());
+    qemu_mutex_lock(&qemu_global_mutex);
     iothread_locked = true;
 }
 
 void qemu_mutex_unlock_iothread(void)
 {
+    g_assert(qemu_mutex_iothread_locked());
     iothread_locked = false;
     qemu_mutex_unlock(&qemu_global_mutex);
 }
diff --git a/cputlb.c b/cputlb.c
index 6c39927455..1cc9d9da51 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -18,6 +18,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/main-loop.h"
 #include "cpu.h"
 #include "exec/exec-all.h"
 #include "exec/memory.h"
@@ -495,6 +496,7 @@ static uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
     hwaddr physaddr = iotlbentry->addr;
     MemoryRegion *mr = iotlb_to_region(cpu, physaddr, iotlbentry->attrs);
     uint64_t val;
+    bool locked = false;
 
     physaddr = (physaddr & TARGET_PAGE_MASK) + addr;
     cpu->mem_io_pc = retaddr;
@@ -503,7 +505,16 @@ static uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
     }
 
     cpu->mem_io_vaddr = addr;
+
+    if (mr->global_locking) {
+        qemu_mutex_lock_iothread();
+        locked = true;
+    }
     memory_region_dispatch_read(mr, physaddr, &val, size, iotlbentry->attrs);
+    if (locked) {
+        qemu_mutex_unlock_iothread();
+    }
+
     return val;
 }
 
@@ -514,15 +525,23 @@ static void io_writex(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
     CPUState *cpu = ENV_GET_CPU(env);
     hwaddr physaddr = iotlbentry->addr;
     MemoryRegion *mr = iotlb_to_region(cpu, physaddr, iotlbentry->attrs);
+    bool locked = false;
 
     physaddr = (physaddr & TARGET_PAGE_MASK) + addr;
     if (mr != &io_mem_rom && mr != &io_mem_notdirty && !cpu->can_do_io) {
         cpu_io_recompile(cpu, retaddr);
     }
-
     cpu->mem_io_vaddr = addr;
     cpu->mem_io_pc = retaddr;
+
+    if (mr->global_locking) {
+        qemu_mutex_lock_iothread();
+        locked = true;
+    }
     memory_region_dispatch_write(mr, physaddr, val, size, iotlbentry->attrs);
+    if (locked) {
+        qemu_mutex_unlock_iothread();
+    }
 }
 
 /* Return true if ADDR is present in the victim tlb, and has been copied
diff --git a/exec.c b/exec.c
index 865a1e8295..3adf2b1861 100644
--- a/exec.c
+++ b/exec.c
@@ -2134,9 +2134,9 @@ static void check_watchpoint(int offset, int len, MemTxAttrs attrs, int flags)
                 }
                 cpu->watchpoint_hit = wp;
 
-                /* The tb_lock will be reset when cpu_loop_exit or
-                 * cpu_loop_exit_noexc longjmp back into the cpu_exec
-                 * main loop.
+                /* Both tb_lock and iothread_mutex will be reset when
+                 * cpu_loop_exit or cpu_loop_exit_noexc longjmp
+                 * back into the cpu_exec main loop.
                  */
                 tb_lock();
                 tb_check_watchpoint(cpu);
@@ -2371,8 +2371,14 @@ static void io_mem_init(void)
     memory_region_init_io(&io_mem_rom, NULL, &unassigned_mem_ops, NULL, NULL, UINT64_MAX);
     memory_region_init_io(&io_mem_unassigned, NULL, &unassigned_mem_ops, NULL,
                           NULL, UINT64_MAX);
+
+    /* io_mem_notdirty calls tb_invalidate_phys_page_fast,
+     * which can be called without the iothread mutex.
+     */
     memory_region_init_io(&io_mem_notdirty, NULL, &notdirty_mem_ops, NULL,
                           NULL, UINT64_MAX);
+    memory_region_clear_global_locking(&io_mem_notdirty);
+
     memory_region_init_io(&io_mem_watch, NULL, &watch_mem_ops, NULL,
                           NULL, UINT64_MAX);
 }
diff --git a/hw/core/irq.c b/hw/core/irq.c
index 49ff2e64fe..b98d1d69f5 100644
--- a/hw/core/irq.c
+++ b/hw/core/irq.c
@@ -22,6 +22,7 @@
  * THE SOFTWARE.
  */
 #include "qemu/osdep.h"
+#include "qemu/main-loop.h"
 #include "qemu-common.h"
 #include "hw/irq.h"
 #include "qom/object.h"
diff --git a/hw/i386/kvmvapic.c b/hw/i386/kvmvapic.c
index 7135633863..82a49556af 100644
--- a/hw/i386/kvmvapic.c
+++ b/hw/i386/kvmvapic.c
@@ -457,8 +457,8 @@ static void patch_instruction(VAPICROMState *s, X86CPU *cpu, target_ulong ip)
     resume_all_vcpus();
 
     if (!kvm_enabled()) {
-        /* tb_lock will be reset when cpu_loop_exit_noexc longjmps
-         * back into the cpu_exec loop. */
+        /* Both tb_lock and iothread_mutex will be reset when
+         * cpu_loop_exit_noexc longjmps back into the cpu_exec loop. */
         tb_lock();
         tb_gen_code(cs, current_pc, current_cs_base, current_flags, 1);
         cpu_loop_exit_noexc(cs);
diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
index c25ee03556..f775aba507 100644
--- a/hw/intc/arm_gicv3_cpuif.c
+++ b/hw/intc/arm_gicv3_cpuif.c
@@ -14,6 +14,7 @@
 
 #include "qemu/osdep.h"
 #include "qemu/bitops.h"
+#include "qemu/main-loop.h"
 #include "trace.h"
 #include "gicv3_internal.h"
 #include "cpu.h"
@@ -733,6 +734,8 @@ void gicv3_cpuif_update(GICv3CPUState *cs)
     ARMCPU *cpu = ARM_CPU(cs->cpu);
     CPUARMState *env = &cpu->env;
 
+    g_assert(qemu_mutex_iothread_locked());
+
     trace_gicv3_cpuif_update(gicv3_redist_affid(cs), cs->hppi.irq,
                              cs->hppi.grp, cs->hppi.prio);
 
diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
index d171e60b5c..5f93083d4a 100644
--- a/hw/ppc/ppc.c
+++ b/hw/ppc/ppc.c
@@ -62,7 +62,16 @@ void ppc_set_irq(PowerPCCPU *cpu, int n_IRQ, int level)
 {
     CPUState *cs = CPU(cpu);
     CPUPPCState *env = &cpu->env;
-    unsigned int old_pending = env->pending_interrupts;
+    unsigned int old_pending;
+    bool locked = false;
+
+    /* We may already have the BQL if coming from the reset path */
+    if (!qemu_mutex_iothread_locked()) {
+        locked = true;
+        qemu_mutex_lock_iothread();
+    }
+
+    old_pending = env->pending_interrupts;
 
     if (level) {
         env->pending_interrupts |= 1 << n_IRQ;
@@ -80,9 +89,14 @@ void ppc_set_irq(PowerPCCPU *cpu, int n_IRQ, int level)
 #endif
     }
 
+
     LOG_IRQ("%s: %p n_IRQ %d level %d => pending %08" PRIx32
                 "req %08x\n", __func__, env, n_IRQ, level,
                 env->pending_interrupts, CPU(cpu)->interrupt_request);
+
+    if (locked) {
+        qemu_mutex_unlock_iothread();
+    }
 }
 
 /* PowerPC 6xx / 7xx internal IRQ controller */
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index e465d7ac98..b1e374f3f9 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1010,6 +1010,9 @@ static void emulate_spapr_hypercall(PPCVirtualHypervisor *vhyp,
 {
     CPUPPCState *env = &cpu->env;
 
+    /* The TCG path should also be holding the BQL at this point */
+    g_assert(qemu_mutex_iothread_locked());
+
     if (msr_pr) {
         hcall_dprintf("Hypercall made with MSR[PR]=1\n");
         env->gpr[3] = H_PRIVILEGE;
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 2cf4ecf144..10db89b16a 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -329,6 +329,7 @@ struct CPUState {
     bool unplug;
     bool crash_occurred;
     bool exit_request;
+    /* updates protected by BQL */
     uint32_t interrupt_request;
     int singlestep_enabled;
     int64_t icount_extra;
diff --git a/memory.c b/memory.c
index ed8b5aa83e..d61caee867 100644
--- a/memory.c
+++ b/memory.c
@@ -917,6 +917,8 @@ void memory_region_transaction_commit(void)
     AddressSpace *as;
 
     assert(memory_region_transaction_depth);
+    assert(qemu_mutex_iothread_locked());
+
     --memory_region_transaction_depth;
     if (!memory_region_transaction_depth) {
         if (memory_region_update_pending) {
diff --git a/qom/cpu.c b/qom/cpu.c
index ed87c50cea..58784bcbea 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -113,9 +113,19 @@ static void cpu_common_get_memory_mapping(CPUState *cpu,
     error_setg(errp, "Obtaining memory mappings is unsupported on this CPU.");
 }
 
+/* Resetting the IRQ comes from across the code base so we take the
+ * BQL here if we need to.  cpu_interrupt assumes it is held.*/
 void cpu_reset_interrupt(CPUState *cpu, int mask)
 {
+    bool need_lock = !qemu_mutex_iothread_locked();
+
+    if (need_lock) {
+        qemu_mutex_lock_iothread();
+    }
     cpu->interrupt_request &= ~mask;
+    if (need_lock) {
+        qemu_mutex_unlock_iothread();
+    }
 }
 
 void cpu_exit(CPUState *cpu)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 47250bcf16..753a69d40d 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -6769,6 +6769,12 @@ void arm_cpu_do_interrupt(CPUState *cs)
         arm_cpu_do_interrupt_aarch32(cs);
     }
 
+    /* Hooks may change global state so BQL should be held, also the
+     * BQL needs to be held for any modification of
+     * cs->interrupt_request.
+     */
+    g_assert(qemu_mutex_iothread_locked());
+
     arm_call_el_change_hook(cpu);
 
     if (!kvm_enabled()) {
diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
index fb366fdc35..5f3e3bdae2 100644
--- a/target/arm/op_helper.c
+++ b/target/arm/op_helper.c
@@ -18,6 +18,7 @@
  */
 #include "qemu/osdep.h"
 #include "qemu/log.h"
+#include "qemu/main-loop.h"
 #include "cpu.h"
 #include "exec/helper-proto.h"
 #include "internals.h"
@@ -487,7 +488,9 @@ void HELPER(cpsr_write_eret)(CPUARMState *env, uint32_t val)
      */
     env->regs[15] &= (env->thumb ? ~1 : ~3);
 
+    qemu_mutex_lock_iothread();
     arm_call_el_change_hook(arm_env_get_cpu(env));
+    qemu_mutex_unlock_iothread();
 }
 
 /* Access to user mode registers from privileged modes.  */
@@ -735,28 +738,58 @@ void HELPER(set_cp_reg)(CPUARMState *env, void *rip, uint32_t value)
 {
     const ARMCPRegInfo *ri = rip;
 
-    ri->writefn(env, ri, value);
+    if (ri->type & ARM_CP_IO) {
+        qemu_mutex_lock_iothread();
+        ri->writefn(env, ri, value);
+        qemu_mutex_unlock_iothread();
+    } else {
+        ri->writefn(env, ri, value);
+    }
 }
 
 uint32_t HELPER(get_cp_reg)(CPUARMState *env, void *rip)
 {
     const ARMCPRegInfo *ri = rip;
+    uint32_t res;
 
-    return ri->readfn(env, ri);
+    if (ri->type & ARM_CP_IO) {
+        qemu_mutex_lock_iothread();
+        res = ri->readfn(env, ri);
+        qemu_mutex_unlock_iothread();
+    } else {
+        res = ri->readfn(env, ri);
+    }
+
+    return res;
 }
 
 void HELPER(set_cp_reg64)(CPUARMState *env, void *rip, uint64_t value)
 {
     const ARMCPRegInfo *ri = rip;
 
-    ri->writefn(env, ri, value);
+    if (ri->type & ARM_CP_IO) {
+        qemu_mutex_lock_iothread();
+        ri->writefn(env, ri, value);
+        qemu_mutex_unlock_iothread();
+    } else {
+        ri->writefn(env, ri, value);
+    }
 }
 
 uint64_t HELPER(get_cp_reg64)(CPUARMState *env, void *rip)
 {
     const ARMCPRegInfo *ri = rip;
+    uint64_t res;
+
+    if (ri->type & ARM_CP_IO) {
+        qemu_mutex_lock_iothread();
+        res = ri->readfn(env, ri);
+        qemu_mutex_unlock_iothread();
+    } else {
+        res = ri->readfn(env, ri);
+    }
 
-    return ri->readfn(env, ri);
+    return res;
 }
 
 void HELPER(msr_i_pstate)(CPUARMState *env, uint32_t op, uint32_t imm)
@@ -989,7 +1022,9 @@ void HELPER(exception_return)(CPUARMState *env)
                       cur_el, new_el, env->pc);
     }
 
+    qemu_mutex_lock_iothread();
     arm_call_el_change_hook(arm_env_get_cpu(env));
+    qemu_mutex_unlock_iothread();
 
     return;
 
diff --git a/target/i386/smm_helper.c b/target/i386/smm_helper.c
index 4dd6a2c544..f051a77c4a 100644
--- a/target/i386/smm_helper.c
+++ b/target/i386/smm_helper.c
@@ -18,6 +18,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/main-loop.h"
 #include "cpu.h"
 #include "exec/helper-proto.h"
 #include "exec/log.h"
@@ -42,11 +43,14 @@ void helper_rsm(CPUX86State *env)
 #define SMM_REVISION_ID 0x00020000
 #endif
 
+/* Called with iothread lock taken */
 void cpu_smm_update(X86CPU *cpu)
 {
     CPUX86State *env = &cpu->env;
     bool smm_enabled = (env->hflags & HF_SMM_MASK);
 
+    g_assert(qemu_mutex_iothread_locked());
+
     if (cpu->smram) {
         memory_region_set_enabled(cpu->smram, smm_enabled);
     }
@@ -333,7 +337,10 @@ void helper_rsm(CPUX86State *env)
     }
     env->hflags2 &= ~HF2_SMM_INSIDE_NMI_MASK;
     env->hflags &= ~HF_SMM_MASK;
+
+    qemu_mutex_lock_iothread();
     cpu_smm_update(cpu);
+    qemu_mutex_unlock_iothread();
 
     qemu_log_mask(CPU_LOG_INT, "SMM: after RSM\n");
     log_cpu_state_mask(CPU_LOG_INT, CPU(cpu), CPU_DUMP_CCOP);
diff --git a/target/s390x/misc_helper.c b/target/s390x/misc_helper.c
index c9604ea9c7..3cb942e8bb 100644
--- a/target/s390x/misc_helper.c
+++ b/target/s390x/misc_helper.c
@@ -25,6 +25,7 @@
 #include "exec/helper-proto.h"
 #include "sysemu/kvm.h"
 #include "qemu/timer.h"
+#include "qemu/main-loop.h"
 #include "exec/address-spaces.h"
 #ifdef CONFIG_KVM
 #include <linux/kvm.h>
@@ -109,11 +110,13 @@ void program_interrupt(CPUS390XState *env, uint32_t code, int ilen)
 /* SCLP service call */
 uint32_t HELPER(servc)(CPUS390XState *env, uint64_t r1, uint64_t r2)
 {
+    qemu_mutex_lock_iothread();
     int r = sclp_service_call(env, r1, r2);
     if (r < 0) {
         program_interrupt(env, -r, 4);
-        return 0;
+        r = 0;
     }
+    qemu_mutex_unlock_iothread();
     return r;
 }
 
diff --git a/translate-all.c b/translate-all.c
index 8a861cb583..f810259c41 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -55,6 +55,7 @@
 #include "translate-all.h"
 #include "qemu/bitmap.h"
 #include "qemu/timer.h"
+#include "qemu/main-loop.h"
 #include "exec/log.h"
 
 /* #define DEBUG_TB_INVALIDATE */
@@ -1523,7 +1524,7 @@ void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end,
 #ifdef CONFIG_SOFTMMU
 /* len must be <= 8 and start must be a multiple of len.
  * Called via softmmu_template.h when code areas are written to with
- * tb_lock held.
+ * iothread mutex not held.
  */
 void tb_invalidate_phys_page_fast(tb_page_addr_t start, int len)
 {
@@ -1725,7 +1726,10 @@ void tb_check_watchpoint(CPUState *cpu)
 
 #ifndef CONFIG_USER_ONLY
 /* in deterministic execution mode, instructions doing device I/Os
-   must be at the end of the TB */
+ * must be at the end of the TB.
+ *
+ * Called by softmmu_template.h, with iothread mutex not held.
+ */
 void cpu_io_recompile(CPUState *cpu, uintptr_t retaddr)
 {
 #if defined(TARGET_MIPS) || defined(TARGET_SH4)
@@ -1937,6 +1941,7 @@ void dump_opcount_info(FILE *f, fprintf_function cpu_fprintf)
 
 void cpu_interrupt(CPUState *cpu, int mask)
 {
+    g_assert(qemu_mutex_iothread_locked());
     cpu->interrupt_request |= mask;
     cpu->tcg_exit_req = 1;
 }
diff --git a/translate-common.c b/translate-common.c
index 5e989cdf70..d504dd0d33 100644
--- a/translate-common.c
+++ b/translate-common.c
@@ -21,6 +21,7 @@
 #include "qemu-common.h"
 #include "qom/cpu.h"
 #include "sysemu/cpus.h"
+#include "qemu/main-loop.h"
 
 uintptr_t qemu_real_host_page_size;
 intptr_t qemu_real_host_page_mask;
@@ -30,6 +31,7 @@ intptr_t qemu_real_host_page_mask;
 static void tcg_handle_interrupt(CPUState *cpu, int mask)
 {
     int old_mask;
+    g_assert(qemu_mutex_iothread_locked());
 
     old_mask = cpu->interrupt_request;
     cpu->interrupt_request |= mask;
@@ -40,17 +42,16 @@ static void tcg_handle_interrupt(CPUState *cpu, int mask)
      */
     if (!qemu_cpu_is_self(cpu)) {
         qemu_cpu_kick(cpu);
-        return;
-    }
-
-    if (use_icount) {
-        cpu->icount_decr.u16.high = 0xffff;
-        if (!cpu->can_do_io
-            && (mask & ~old_mask) != 0) {
-            cpu_abort(cpu, "Raised interrupt while not in I/O function");
-        }
     } else {
-        cpu->tcg_exit_req = 1;
+        if (use_icount) {
+            cpu->icount_decr.u16.high = 0xffff;
+            if (!cpu->can_do_io
+                && (mask & ~old_mask) != 0) {
+                cpu_abort(cpu, "Raised interrupt while not in I/O function");
+            }
+        } else {
+            cpu->tcg_exit_req = 1;
+        }
     }
 }
 
-- 
2.11.0

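For reference, the conditional BQL pattern that recurs throughout this patch (in ppc_set_irq(), helper_rsm() and cpu_reset_interrupt()) can be sketched outside QEMU roughly as below. All names here (bql_*, mini_irq_raise, pending_interrupts) are illustrative stand-ins for qemu_global_mutex and friends, not QEMU APIs:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Stand-ins for qemu_global_mutex and the iothread_locked flag. */
static pthread_mutex_t bql = PTHREAD_MUTEX_INITIALIZER;
static __thread bool bql_held;

static bool bql_locked(void) { return bql_held; }
static void bql_lock(void)   { pthread_mutex_lock(&bql); bql_held = true; }
static void bql_unlock(void) { bql_held = false; pthread_mutex_unlock(&bql); }

static unsigned pending_interrupts;

/* The pattern: callers may arrive with or without the BQL already held
 * (e.g. when coming from a reset path), so take it only if needed and
 * release only what we ourselves took. */
static void mini_irq_raise(unsigned n_irq)
{
    bool locked = false;

    if (!bql_locked()) {
        bql_lock();
        locked = true;
    }
    pending_interrupts |= 1u << n_irq;
    if (locked) {
        bql_unlock();
    }
}
```

The key property is that the function is safe to call from both locked and unlocked contexts, which is exactly what the `locked = false; if (!qemu_mutex_iothread_locked()) ...` hunks above arrange.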
^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [Qemu-devel] [PULL 09/24] tcg: remove global exit_request
  2017-02-24 11:20 [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Alex Bennée
                   ` (7 preceding siblings ...)
  2017-02-24 11:20 ` [Qemu-devel] [PULL 08/24] tcg: drop global lock during TCG code execution Alex Bennée
@ 2017-02-24 11:20 ` Alex Bennée
  2017-02-24 11:20 ` [Qemu-devel] [PULL 10/24] tcg: enable tb_lock() for SoftMMU Alex Bennée
                   ` (16 subsequent siblings)
  25 siblings, 0 replies; 55+ messages in thread
From: Alex Bennée @ 2017-02-24 11:20 UTC (permalink / raw)
  To: peter.maydell
  Cc: qemu-devel, Alex Bennée, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson

There are now only two uses of the global exit_request left.

The first ensures we exit the run_loop when we first start to process
pending work and in the kick handler. This is just as easily done by
setting the first_cpu->exit_request flag.

The second use is in the round robin kick routine. The global
exit_request ensured every vCPU would set its local exit_request and
cause a full exit of the loop. Now that the iothread mutex isn't held
while running we can just rely on the kick handler to push us out as
intended.

We lightly re-factor the main vCPU thread to ensure cpu->exit_request
causes us to exit the main loop and process any IO requests that might
come along. As cpu->exit_request may legitimately get squashed while
processing the EXCP_INTERRUPT exception we also check
cpu->queued_work_first to ensure queued work is expedited as soon as
possible.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 cpu-exec-common.c       |  2 --
 cpu-exec.c              | 20 +++++++-------------
 cpus.c                  | 19 +++++++++++--------
 include/exec/exec-all.h |  3 ---
 4 files changed, 18 insertions(+), 26 deletions(-)

diff --git a/cpu-exec-common.c b/cpu-exec-common.c
index e2bc053372..0504a9457b 100644
--- a/cpu-exec-common.c
+++ b/cpu-exec-common.c
@@ -23,8 +23,6 @@
 #include "exec/exec-all.h"
 #include "exec/memory-internal.h"
 
-bool exit_request;
-
 /* exit the current TB, but without causing any exception to be raised */
 void cpu_loop_exit_noexc(CPUState *cpu)
 {
diff --git a/cpu-exec.c b/cpu-exec.c
index 1bd3d72002..85f14d4194 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -568,15 +568,13 @@ static inline void cpu_loop_exec_tb(CPUState *cpu, TranslationBlock *tb,
     *tb_exit = ret & TB_EXIT_MASK;
     switch (*tb_exit) {
     case TB_EXIT_REQUESTED:
-        /* Something asked us to stop executing
-         * chained TBs; just continue round the main
-         * loop. Whatever requested the exit will also
-         * have set something else (eg exit_request or
-         * interrupt_request) which we will handle
-         * next time around the loop.  But we need to
-         * ensure the zeroing of tcg_exit_req (see cpu_tb_exec)
-         * comes before the next read of cpu->exit_request
-         * or cpu->interrupt_request.
+        /* Something asked us to stop executing chained TBs; just
+         * continue round the main loop. Whatever requested the exit
+         * will also have set something else (eg interrupt_request)
+         * which we will handle next time around the loop.  But we
+         * need to ensure the tcg_exit_req read in generated code
+         * comes before the next read of cpu->exit_request or
+         * cpu->interrupt_request.
          */
         smp_mb();
         *last_tb = NULL;
@@ -630,10 +628,6 @@ int cpu_exec(CPUState *cpu)
 
     rcu_read_lock();
 
-    if (unlikely(atomic_mb_read(&exit_request))) {
-        cpu->exit_request = 1;
-    }
-
     cc->cpu_exec_enter(cpu);
 
     /* Calculate difference between guest clock and host clock.
diff --git a/cpus.c b/cpus.c
index 0ae8f69be5..e165d18785 100644
--- a/cpus.c
+++ b/cpus.c
@@ -793,7 +793,6 @@ static inline int64_t qemu_tcg_next_kick(void)
 static void qemu_cpu_kick_rr_cpu(void)
 {
     CPUState *cpu;
-    atomic_mb_set(&exit_request, 1);
     do {
         cpu = atomic_mb_read(&tcg_current_rr_cpu);
         if (cpu) {
@@ -1316,11 +1315,11 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
 
     start_tcg_kick_timer();
 
-    /* process any pending work */
-    atomic_mb_set(&exit_request, 1);
-
     cpu = first_cpu;
 
+    /* process any pending work */
+    cpu->exit_request = 1;
+
     while (1) {
         /* Account partial waits to QEMU_CLOCK_VIRTUAL.  */
         qemu_account_warp_timer();
@@ -1329,7 +1328,8 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
             cpu = first_cpu;
         }
 
-        for (; cpu != NULL && !exit_request; cpu = CPU_NEXT(cpu)) {
+        while (cpu && !cpu->queued_work_first && !cpu->exit_request) {
+
             atomic_mb_set(&tcg_current_rr_cpu, cpu);
 
             qemu_clock_enable(QEMU_CLOCK_VIRTUAL,
@@ -1349,12 +1349,15 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
                 break;
             }
 
-        } /* for cpu.. */
+            cpu = CPU_NEXT(cpu);
+        } /* while (cpu && !cpu->exit_request).. */
+
         /* Does not need atomic_mb_set because a spurious wakeup is okay.  */
         atomic_set(&tcg_current_rr_cpu, NULL);
 
-        /* Pairs with smp_wmb in qemu_cpu_kick.  */
-        atomic_mb_set(&exit_request, 0);
+        if (cpu && cpu->exit_request) {
+            atomic_mb_set(&cpu->exit_request, 0);
+        }
 
         handle_icount_deadline();
 
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 4e34fc4cc1..82f0e12327 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -404,7 +404,4 @@ bool memory_region_is_unassigned(MemoryRegion *mr);
 /* vl.c */
 extern int singlestep;
 
-/* cpu-exec.c, accessed with atomic_mb_read/atomic_mb_set */
-extern bool exit_request;
-
 #endif
-- 
2.11.0


* [Qemu-devel] [PULL 10/24] tcg: enable tb_lock() for SoftMMU
  2017-02-24 11:20 [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Alex Bennée
                   ` (8 preceding siblings ...)
  2017-02-24 11:20 ` [Qemu-devel] [PULL 09/24] tcg: remove global exit_request Alex Bennée
@ 2017-02-24 11:20 ` Alex Bennée
  2017-02-24 11:20 ` [Qemu-devel] [PULL 11/24] tcg: enable thread-per-vCPU Alex Bennée
                   ` (15 subsequent siblings)
  25 siblings, 0 replies; 55+ messages in thread
From: Alex Bennée @ 2017-02-24 11:20 UTC (permalink / raw)
  To: peter.maydell
  Cc: qemu-devel, Alex Bennée, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson

tb_lock() has long been used for linux-user mode to protect code
generation. By enabling it now we prepare for MTTCG and ensure all code
generation is serialised by this lock. The other major structures that
need protecting are the l1_map and its PageDesc structures. For the
SoftMMU case we also use tb_lock() to protect these structures instead
of the linux-user mmap_lock(), which, as the name suggests, serialises
updates to these structures as a result of guest mmap operations.
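The locking scheme this patch unifies, a mutex paired with a per-thread ownership counter so debug builds can assert the lock state, can be sketched with pthreads as a stand-in for QEMU's qemu_mutex API. have_tb_lock mirrors the variable in translate-all.c; the mutex here stands in for tcg_ctx.tb_ctx.tb_lock:

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t tb_lock_mutex = PTHREAD_MUTEX_INITIALIZER;
static __thread int have_tb_lock;    /* per-thread ownership counter */

void tb_lock(void)
{
    assert(!have_tb_lock);           /* assert_tb_unlocked() */
    pthread_mutex_lock(&tb_lock_mutex);
    have_tb_lock++;
}

void tb_unlock(void)
{
    assert(have_tb_lock);            /* assert_tb_locked() */
    have_tb_lock--;
    pthread_mutex_unlock(&tb_lock_mutex);
}

/* Called on the longjmp paths out of the exec loop: drop the lock if
 * this thread happens to hold it, otherwise do nothing. */
void tb_lock_reset(void)
{
    if (have_tb_lock) {
        pthread_mutex_unlock(&tb_lock_mutex);
        have_tb_lock = 0;
    }
}
```

Because the counter is thread-local, the asserts catch both double-locking and unlocking a mutex the thread never took, without any extra atomics.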

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 translate-all.c | 15 +--------------
 1 file changed, 1 insertion(+), 14 deletions(-)

diff --git a/translate-all.c b/translate-all.c
index f810259c41..9bac061c9b 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -75,7 +75,7 @@
  * mmap_lock.
  */
 #ifdef CONFIG_SOFTMMU
-#define assert_memory_lock() do { /* nothing */ } while (0)
+#define assert_memory_lock() tcg_debug_assert(have_tb_lock)
 #else
 #define assert_memory_lock() tcg_debug_assert(have_mmap_lock())
 #endif
@@ -135,9 +135,7 @@ TCGContext tcg_ctx;
 bool parallel_cpus;
 
 /* translation block context */
-#ifdef CONFIG_USER_ONLY
 __thread int have_tb_lock;
-#endif
 
 static void page_table_config_init(void)
 {
@@ -159,40 +157,29 @@ static void page_table_config_init(void)
     assert(v_l2_levels >= 0);
 }
 
-#ifdef CONFIG_USER_ONLY
 #define assert_tb_locked() tcg_debug_assert(have_tb_lock)
 #define assert_tb_unlocked() tcg_debug_assert(!have_tb_lock)
-#else
-#define assert_tb_locked()  do { /* nothing */ } while (0)
-#define assert_tb_unlocked()  do { /* nothing */ } while (0)
-#endif
 
 void tb_lock(void)
 {
-#ifdef CONFIG_USER_ONLY
     assert_tb_unlocked();
     qemu_mutex_lock(&tcg_ctx.tb_ctx.tb_lock);
     have_tb_lock++;
-#endif
 }
 
 void tb_unlock(void)
 {
-#ifdef CONFIG_USER_ONLY
     assert_tb_locked();
     have_tb_lock--;
     qemu_mutex_unlock(&tcg_ctx.tb_ctx.tb_lock);
-#endif
 }
 
 void tb_lock_reset(void)
 {
-#ifdef CONFIG_USER_ONLY
     if (have_tb_lock) {
         qemu_mutex_unlock(&tcg_ctx.tb_ctx.tb_lock);
         have_tb_lock = 0;
     }
-#endif
 }
 
 static TranslationBlock *tb_find_pc(uintptr_t tc_ptr);
-- 
2.11.0


* [Qemu-devel] [PULL 11/24] tcg: enable thread-per-vCPU
  2017-02-24 11:20 [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Alex Bennée
                   ` (9 preceding siblings ...)
  2017-02-24 11:20 ` [Qemu-devel] [PULL 10/24] tcg: enable tb_lock() for SoftMMU Alex Bennée
@ 2017-02-24 11:20 ` Alex Bennée
  2017-02-27 12:48   ` Laurent Vivier
  2017-02-24 11:20 ` [Qemu-devel] [PULL 12/24] tcg: handle EXCP_ATOMIC exception for system emulation Alex Bennée
                   ` (14 subsequent siblings)
  25 siblings, 1 reply; 55+ messages in thread
From: Alex Bennée @ 2017-02-24 11:20 UTC (permalink / raw)
  To: peter.maydell
  Cc: qemu-devel, Alex Bennée, KONRAD Frederic, Paolo Bonzini,
	Peter Crosthwaite, Richard Henderson

There are a couple of changes that occur at the same time here:

  - introduce a single vCPU qemu_tcg_cpu_thread_fn

  One of these is spawned per vCPU with its own Thread and Condition
  variables. qemu_tcg_rr_cpu_thread_fn is the new name for the old
  single threaded function.

  - the TLS current_cpu variable is now live for the lifetime of MTTCG
    vCPU threads. This is for future work where async jobs need to know
    the vCPU context they are operating in.

The user can now switch on multi-thread behaviour and spawn a thread
per-vCPU. For a simple kvm-unit-test like:

  ./arm/run ./arm/locking-test.flat -smp 4 -accel tcg,thread=multi

Will now use 4 vCPU threads and have an expected FAIL (instead of the
unexpected PASS) as the default mode of the test has no protection when
incrementing a shared variable.

We enable the parallel_cpus flag to ensure we generate correct barrier
and atomic code if supported by the front and backends. This doesn't
automatically enable MTTCG until default_mttcg_enabled() is updated to
check the configuration is supported.
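The sleep-condition split this patch introduces (qemu_tcg_should_sleep in the diff below) can be illustrated with a minimal model; VCPU, all_idle and should_sleep are illustrative names for the sketch, not QEMU's:

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { bool halted; } VCPU;

static bool all_idle(const VCPU *cpus, int n)
{
    for (int i = 0; i < n; i++) {
        if (!cpus[i].halted) {
            return false;
        }
    }
    return true;
}

/* MTTCG: each vCPU thread sleeps on its own idle state alone.
 * Single-threaded RR: the one TCG thread sleeps only when every vCPU
 * is idle, since it services all of them. */
static bool should_sleep(const VCPU *cpus, int n, int self, bool mttcg)
{
    return mttcg ? cpus[self].halted : all_idle(cpus, n);
}
```

This is why the MTTCG thread function can wait on its own halt_cond while the RR thread must keep iterating as long as any vCPU has work.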

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[AJB: Some fixes, conditionally, commit rewording]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 cpu-exec.c |   4 --
 cpus.c     | 134 +++++++++++++++++++++++++++++++++++++++++++++++--------------
 2 files changed, 103 insertions(+), 35 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index 85f14d4194..2edd26e823 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -396,7 +396,6 @@ static inline bool cpu_handle_halt(CPUState *cpu)
         }
 #endif
         if (!cpu_has_work(cpu)) {
-            current_cpu = NULL;
             return true;
         }
 
@@ -675,8 +674,5 @@ int cpu_exec(CPUState *cpu)
     cc->cpu_exec_exit(cpu);
     rcu_read_unlock();
 
-    /* fail safe : never use current_cpu outside cpu_exec() */
-    current_cpu = NULL;
-
     return ret;
 }
diff --git a/cpus.c b/cpus.c
index e165d18785..bfee326d30 100644
--- a/cpus.c
+++ b/cpus.c
@@ -809,7 +809,7 @@ static void kick_tcg_thread(void *opaque)
 
 static void start_tcg_kick_timer(void)
 {
-    if (!tcg_kick_vcpu_timer && CPU_NEXT(first_cpu)) {
+    if (!mttcg_enabled && !tcg_kick_vcpu_timer && CPU_NEXT(first_cpu)) {
         tcg_kick_vcpu_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL,
                                            kick_tcg_thread, NULL);
         timer_mod(tcg_kick_vcpu_timer, qemu_tcg_next_kick());
@@ -1063,27 +1063,34 @@ static void qemu_tcg_destroy_vcpu(CPUState *cpu)
 
 static void qemu_wait_io_event_common(CPUState *cpu)
 {
+    atomic_mb_set(&cpu->thread_kicked, false);
     if (cpu->stop) {
         cpu->stop = false;
         cpu->stopped = true;
         qemu_cond_broadcast(&qemu_pause_cond);
     }
     process_queued_cpu_work(cpu);
-    cpu->thread_kicked = false;
+}
+
+static bool qemu_tcg_should_sleep(CPUState *cpu)
+{
+    if (mttcg_enabled) {
+        return cpu_thread_is_idle(cpu);
+    } else {
+        return all_cpu_threads_idle();
+    }
 }
 
 static void qemu_tcg_wait_io_event(CPUState *cpu)
 {
-    while (all_cpu_threads_idle()) {
+    while (qemu_tcg_should_sleep(cpu)) {
         stop_tcg_kick_timer();
         qemu_cond_wait(cpu->halt_cond, &qemu_global_mutex);
     }
 
     start_tcg_kick_timer();
 
-    CPU_FOREACH(cpu) {
-        qemu_wait_io_event_common(cpu);
-    }
+    qemu_wait_io_event_common(cpu);
 }
 
 static void qemu_kvm_wait_io_event(CPUState *cpu)
@@ -1154,6 +1161,7 @@ static void *qemu_dummy_cpu_thread_fn(void *arg)
     qemu_thread_get_self(cpu->thread);
     cpu->thread_id = qemu_get_thread_id();
     cpu->can_do_io = 1;
+    current_cpu = cpu;
 
     sigemptyset(&waitset);
     sigaddset(&waitset, SIG_IPI);
@@ -1162,9 +1170,7 @@ static void *qemu_dummy_cpu_thread_fn(void *arg)
     cpu->created = true;
     qemu_cond_signal(&qemu_cpu_cond);
 
-    current_cpu = cpu;
     while (1) {
-        current_cpu = NULL;
         qemu_mutex_unlock_iothread();
         do {
             int sig;
@@ -1175,7 +1181,6 @@ static void *qemu_dummy_cpu_thread_fn(void *arg)
             exit(1);
         }
         qemu_mutex_lock_iothread();
-        current_cpu = cpu;
         qemu_wait_io_event_common(cpu);
     }
 
@@ -1287,7 +1292,7 @@ static void deal_with_unplugged_cpus(void)
  * elsewhere.
  */
 
-static void *qemu_tcg_cpu_thread_fn(void *arg)
+static void *qemu_tcg_rr_cpu_thread_fn(void *arg)
 {
     CPUState *cpu = arg;
 
@@ -1309,6 +1314,7 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
 
         /* process any pending work */
         CPU_FOREACH(cpu) {
+            current_cpu = cpu;
             qemu_wait_io_event_common(cpu);
         }
     }
@@ -1331,6 +1337,7 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
         while (cpu && !cpu->queued_work_first && !cpu->exit_request) {
 
             atomic_mb_set(&tcg_current_rr_cpu, cpu);
+            current_cpu = cpu;
 
             qemu_clock_enable(QEMU_CLOCK_VIRTUAL,
                               (cpu->singlestep_enabled & SSTEP_NOTIMER) == 0);
@@ -1342,7 +1349,7 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
                     cpu_handle_guest_debug(cpu);
                     break;
                 }
-            } else if (cpu->stop || cpu->stopped) {
+            } else if (cpu->stop) {
                 if (cpu->unplug) {
                     cpu = CPU_NEXT(cpu);
                 }
@@ -1361,7 +1368,7 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
 
         handle_icount_deadline();
 
-        qemu_tcg_wait_io_event(QTAILQ_FIRST(&cpus));
+        qemu_tcg_wait_io_event(cpu ? cpu : QTAILQ_FIRST(&cpus));
         deal_with_unplugged_cpus();
     }
 
@@ -1408,6 +1415,64 @@ static void CALLBACK dummy_apc_func(ULONG_PTR unused)
 }
 #endif
 
+/* Multi-threaded TCG
+ *
+ * In the multi-threaded case each vCPU has its own thread. The TLS
+ * variable current_cpu can be used deep in the code to find the
+ * current CPUState for a given thread.
+ */
+
+static void *qemu_tcg_cpu_thread_fn(void *arg)
+{
+    CPUState *cpu = arg;
+
+    rcu_register_thread();
+
+    qemu_mutex_lock_iothread();
+    qemu_thread_get_self(cpu->thread);
+
+    cpu->thread_id = qemu_get_thread_id();
+    cpu->created = true;
+    cpu->can_do_io = 1;
+    current_cpu = cpu;
+    qemu_cond_signal(&qemu_cpu_cond);
+
+    /* process any pending work */
+    cpu->exit_request = 1;
+
+    while (1) {
+        if (cpu_can_run(cpu)) {
+            int r;
+            r = tcg_cpu_exec(cpu);
+            switch (r) {
+            case EXCP_DEBUG:
+                cpu_handle_guest_debug(cpu);
+                break;
+            case EXCP_HALTED:
+                /* during start-up the vCPU is reset and the thread is
+                 * kicked several times. If we don't ensure we go back
+                 * to sleep in the halted state we won't cleanly
+                 * start-up when the vCPU is enabled.
+                 *
+                 * cpu->halted should ensure we sleep in wait_io_event
+                 */
+                g_assert(cpu->halted);
+                break;
+            default:
+                /* Ignore everything else? */
+                break;
+            }
+        }
+
+        handle_icount_deadline();
+
+        atomic_mb_set(&cpu->exit_request, 0);
+        qemu_tcg_wait_io_event(cpu);
+    }
+
+    return NULL;
+}
+
 static void qemu_cpu_kick_thread(CPUState *cpu)
 {
 #ifndef _WIN32
@@ -1438,7 +1503,7 @@ void qemu_cpu_kick(CPUState *cpu)
     qemu_cond_broadcast(cpu->halt_cond);
     if (tcg_enabled()) {
         cpu_exit(cpu);
-        /* Also ensure current RR cpu is kicked */
+        /* NOP unless doing single-thread RR */
         qemu_cpu_kick_rr_cpu();
     } else {
         if (hax_enabled()) {
@@ -1514,13 +1579,6 @@ void pause_all_vcpus(void)
 
     if (qemu_in_vcpu_thread()) {
         cpu_stop_current();
-        if (!kvm_enabled()) {
-            CPU_FOREACH(cpu) {
-                cpu->stop = false;
-                cpu->stopped = true;
-            }
-            return;
-        }
     }
 
     while (!all_vcpus_paused()) {
@@ -1569,29 +1627,43 @@ void cpu_remove_sync(CPUState *cpu)
 static void qemu_tcg_init_vcpu(CPUState *cpu)
 {
     char thread_name[VCPU_THREAD_NAME_SIZE];
-    static QemuCond *tcg_halt_cond;
-    static QemuThread *tcg_cpu_thread;
+    static QemuCond *single_tcg_halt_cond;
+    static QemuThread *single_tcg_cpu_thread;
 
-    /* share a single thread for all cpus with TCG */
-    if (!tcg_cpu_thread) {
+    if (qemu_tcg_mttcg_enabled() || !single_tcg_cpu_thread) {
         cpu->thread = g_malloc0(sizeof(QemuThread));
         cpu->halt_cond = g_malloc0(sizeof(QemuCond));
         qemu_cond_init(cpu->halt_cond);
-        tcg_halt_cond = cpu->halt_cond;
-        snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG",
+
+        if (qemu_tcg_mttcg_enabled()) {
+            /* create a thread per vCPU with TCG (MTTCG) */
+            parallel_cpus = true;
+            snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG",
                  cpu->cpu_index);
-        qemu_thread_create(cpu->thread, thread_name, qemu_tcg_cpu_thread_fn,
-                           cpu, QEMU_THREAD_JOINABLE);
+
+            qemu_thread_create(cpu->thread, thread_name, qemu_tcg_cpu_thread_fn,
+                               cpu, QEMU_THREAD_JOINABLE);
+
+        } else {
+            /* share a single thread for all cpus with TCG */
+            snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "ALL CPUs/TCG");
+            qemu_thread_create(cpu->thread, thread_name,
+                               qemu_tcg_rr_cpu_thread_fn,
+                               cpu, QEMU_THREAD_JOINABLE);
+
+            single_tcg_halt_cond = cpu->halt_cond;
+            single_tcg_cpu_thread = cpu->thread;
+        }
 #ifdef _WIN32
         cpu->hThread = qemu_thread_get_handle(cpu->thread);
 #endif
         while (!cpu->created) {
             qemu_cond_wait(&qemu_cpu_cond, &qemu_global_mutex);
         }
-        tcg_cpu_thread = cpu->thread;
     } else {
-        cpu->thread = tcg_cpu_thread;
-        cpu->halt_cond = tcg_halt_cond;
+        /* For non-MTTCG cases we share the thread */
+        cpu->thread = single_tcg_cpu_thread;
+        cpu->halt_cond = single_tcg_halt_cond;
     }
 }
 
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [Qemu-devel] [PULL 12/24] tcg: handle EXCP_ATOMIC exception for system emulation
  2017-02-24 11:20 [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Alex Bennée
                   ` (10 preceding siblings ...)
  2017-02-24 11:20 ` [Qemu-devel] [PULL 11/24] tcg: enable thread-per-vCPU Alex Bennée
@ 2017-02-24 11:20 ` Alex Bennée
  2017-02-24 11:20 ` [Qemu-devel] [PULL 13/24] cputlb: add assert_cpu_is_self checks Alex Bennée
                   ` (13 subsequent siblings)
  25 siblings, 0 replies; 55+ messages in thread
From: Alex Bennée @ 2017-02-24 11:20 UTC (permalink / raw)
  To: peter.maydell
  Cc: qemu-devel, Pranith Kumar, Alex Bennée, Paolo Bonzini,
	Peter Crosthwaite, Richard Henderson

From: Pranith Kumar <bobby.prani@gmail.com>

The patch enables handling of atomic code in the guest. This should
preferably be done in cpu_handle_exception(), but the current
assumptions regarding when we can execute atomic sections cause a
deadlock.

The current mechanism discards the flags which were set during atomic
execution. We ensure they are properly saved by calling the
cc->cpu_exec_enter/exit() functions around the loop.

As we are running cpu_exec_step_atomic() from the outermost loop we
need to avoid an abort() when single-stepping over atomic code, since
the debug exception longjmp will point to the sigsetjmp in
cpu_exec(). We do this by setting a new jmp_env so that it jumps back
here on an exception.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
[AJB: tweak title, merge with new patches, add mmap_lock]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
CC: Paolo Bonzini <pbonzini@redhat.com>
---
 cpu-exec.c | 43 +++++++++++++++++++++++++++++++------------
 cpus.c     |  9 +++++++++
 2 files changed, 40 insertions(+), 12 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index 2edd26e823..1a5ad4889d 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -228,24 +228,43 @@ static void cpu_exec_nocache(CPUState *cpu, int max_cycles,
 
 static void cpu_exec_step(CPUState *cpu)
 {
+    CPUClass *cc = CPU_GET_CLASS(cpu);
     CPUArchState *env = (CPUArchState *)cpu->env_ptr;
     TranslationBlock *tb;
     target_ulong cs_base, pc;
     uint32_t flags;
 
     cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
-    tb_lock();
-    tb = tb_gen_code(cpu, pc, cs_base, flags,
-                     1 | CF_NOCACHE | CF_IGNORE_ICOUNT);
-    tb->orig_tb = NULL;
-    tb_unlock();
-    /* execute the generated code */
-    trace_exec_tb_nocache(tb, pc);
-    cpu_tb_exec(cpu, tb);
-    tb_lock();
-    tb_phys_invalidate(tb, -1);
-    tb_free(tb);
-    tb_unlock();
+    if (sigsetjmp(cpu->jmp_env, 0) == 0) {
+        mmap_lock();
+        tb_lock();
+        tb = tb_gen_code(cpu, pc, cs_base, flags,
+                         1 | CF_NOCACHE | CF_IGNORE_ICOUNT);
+        tb->orig_tb = NULL;
+        tb_unlock();
+        mmap_unlock();
+
+        cc->cpu_exec_enter(cpu);
+        /* execute the generated code */
+        trace_exec_tb_nocache(tb, pc);
+        cpu_tb_exec(cpu, tb);
+        cc->cpu_exec_exit(cpu);
+
+        tb_lock();
+        tb_phys_invalidate(tb, -1);
+        tb_free(tb);
+        tb_unlock();
+    } else {
+        /* We may have exited due to another problem here, so we need
+         * to reset any tb_locks we may have taken but didn't release.
+         * The mmap_lock is dropped by tb_gen_code if it runs out of
+         * memory.
+         */
+#ifndef CONFIG_SOFTMMU
+        tcg_debug_assert(!have_mmap_lock());
+#endif
+        tb_lock_reset();
+    }
 }
 
 void cpu_exec_step_atomic(CPUState *cpu)
diff --git a/cpus.c b/cpus.c
index bfee326d30..8200ac6b75 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1348,6 +1348,11 @@ static void *qemu_tcg_rr_cpu_thread_fn(void *arg)
                 if (r == EXCP_DEBUG) {
                     cpu_handle_guest_debug(cpu);
                     break;
+                } else if (r == EXCP_ATOMIC) {
+                    qemu_mutex_unlock_iothread();
+                    cpu_exec_step_atomic(cpu);
+                    qemu_mutex_lock_iothread();
+                    break;
                 }
             } else if (cpu->stop) {
                 if (cpu->unplug) {
@@ -1458,6 +1463,10 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
                  */
                 g_assert(cpu->halted);
                 break;
+            case EXCP_ATOMIC:
+                qemu_mutex_unlock_iothread();
+                cpu_exec_step_atomic(cpu);
+                qemu_mutex_lock_iothread();
             default:
                 /* Ignore everything else? */
                 break;
-- 
2.11.0


* [Qemu-devel] [PULL 13/24] cputlb: add assert_cpu_is_self checks
  2017-02-24 11:20 [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Alex Bennée
                   ` (11 preceding siblings ...)
  2017-02-24 11:20 ` [Qemu-devel] [PULL 12/24] tcg: handle EXCP_ATOMIC exception for system emulation Alex Bennée
@ 2017-02-24 11:20 ` Alex Bennée
  2017-02-24 11:20 ` [Qemu-devel] [PULL 14/24] cputlb: tweak qemu_ram_addr_from_host_nofail reporting Alex Bennée
                   ` (12 subsequent siblings)
  25 siblings, 0 replies; 55+ messages in thread
From: Alex Bennée @ 2017-02-24 11:20 UTC (permalink / raw)
  To: peter.maydell
  Cc: qemu-devel, Alex Bennée, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson

For SoftMMU the TLB flushes are an example of a task that can be
triggered on one vCPU by another. To deal with this correctly we need
to use safe work to defer such changes to the vCPU that owns the
TLB. The new assert can be enabled while debugging to catch these
cases.
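The shape of such a debug-gated assert can be sketched standalone; `DEBUG_GATE`, `struct fake_cpu` and `current_tid` here are invented stand-ins for DEBUG_TLB_GATE and the qemu_cpu_is_self() check. When the gate is 0 the compiler folds the check away, so release builds pay nothing.

```c
#include <assert.h>

#define DEBUG_GATE 1  /* flip to 0 to compile the check out */

/* Invented stand-in for per-CPU ownership state. */
struct fake_cpu {
    int owner_tid;
    int created;
};

static int current_tid = 42;

/* Only enforce the ownership rule when the debug gate is on; the
 * do/while (0) wrapper keeps the macro safe in if/else bodies. */
#define assert_cpu_is_self(cpu) do {                                    \
        if (DEBUG_GATE) {                                               \
            assert(!(cpu)->created || (cpu)->owner_tid == current_tid); \
        }                                                               \
    } while (0)

int touch_cpu_state(struct fake_cpu *cpu)
{
    assert_cpu_is_self(cpu);
    return 0; /* would mutate per-CPU state (e.g. the TLB) here */
}
```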

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 cputlb.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/cputlb.c b/cputlb.c
index 1cc9d9da51..af0e65cd2c 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -58,6 +58,12 @@
     } \
 } while (0)
 
+#define assert_cpu_is_self(this_cpu) do {                         \
+        if (DEBUG_TLB_GATE) {                                     \
+            g_assert(!cpu->created || qemu_cpu_is_self(cpu));     \
+        }                                                         \
+    } while (0)
+
 /* statistics */
 int tlb_flush_count;
 
@@ -70,6 +76,9 @@ void tlb_flush(CPUState *cpu)
 {
     CPUArchState *env = cpu->env_ptr;
 
+    assert_cpu_is_self(cpu);
+    tlb_debug("(count: %d)\n", tlb_flush_count++);
+
     memset(env->tlb_table, -1, sizeof(env->tlb_table));
     memset(env->tlb_v_table, -1, sizeof(env->tlb_v_table));
     memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache));
@@ -77,13 +86,13 @@ void tlb_flush(CPUState *cpu)
     env->vtlb_index = 0;
     env->tlb_flush_addr = -1;
     env->tlb_flush_mask = 0;
-    tlb_flush_count++;
 }
 
 static inline void v_tlb_flush_by_mmuidx(CPUState *cpu, va_list argp)
 {
     CPUArchState *env = cpu->env_ptr;
 
+    assert_cpu_is_self(cpu);
     tlb_debug("start\n");
 
     for (;;) {
@@ -128,6 +137,7 @@ void tlb_flush_page(CPUState *cpu, target_ulong addr)
     int i;
     int mmu_idx;
 
+    assert_cpu_is_self(cpu);
     tlb_debug("page :" TARGET_FMT_lx "\n", addr);
 
     /* Check if we need to flush due to large pages.  */
@@ -165,6 +175,7 @@ void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, ...)
 
     va_start(argp, addr);
 
+    assert_cpu_is_self(cpu);
     tlb_debug("addr "TARGET_FMT_lx"\n", addr);
 
     /* Check if we need to flush due to large pages.  */
@@ -253,6 +264,8 @@ void tlb_reset_dirty(CPUState *cpu, ram_addr_t start1, ram_addr_t length)
 
     int mmu_idx;
 
+    assert_cpu_is_self(cpu);
+
     env = cpu->env_ptr;
     for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
         unsigned int i;
@@ -284,6 +297,8 @@ void tlb_set_dirty(CPUState *cpu, target_ulong vaddr)
     int i;
     int mmu_idx;
 
+    assert_cpu_is_self(cpu);
+
     vaddr &= TARGET_PAGE_MASK;
     i = (vaddr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
     for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
@@ -343,6 +358,7 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
     unsigned vidx = env->vtlb_index++ % CPU_VTLB_SIZE;
     int asidx = cpu_asidx_from_attrs(cpu, attrs);
 
+    assert_cpu_is_self(cpu);
     assert(size >= TARGET_PAGE_SIZE);
     if (size != TARGET_PAGE_SIZE) {
         tlb_add_large_page(env, vaddr, size);
-- 
2.11.0


* [Qemu-devel] [PULL 14/24] cputlb: tweak qemu_ram_addr_from_host_nofail reporting
  2017-02-24 11:20 [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Alex Bennée
                   ` (12 preceding siblings ...)
  2017-02-24 11:20 ` [Qemu-devel] [PULL 13/24] cputlb: add assert_cpu_is_self checks Alex Bennée
@ 2017-02-24 11:20 ` Alex Bennée
  2017-02-24 11:21 ` [Qemu-devel] [PULL 15/24] cputlb: introduce tlb_flush_* async work Alex Bennée
                   ` (11 subsequent siblings)
  25 siblings, 0 replies; 55+ messages in thread
From: Alex Bennée @ 2017-02-24 11:20 UTC (permalink / raw)
  To: peter.maydell
  Cc: qemu-devel, Alex Bennée, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson

This moves the helper function closer to where it is called and updates
the error message to report via error_report instead of the deprecated
fprintf.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 cputlb.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/cputlb.c b/cputlb.c
index af0e65cd2c..94fa9977c5 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -246,18 +246,6 @@ void tlb_reset_dirty_range(CPUTLBEntry *tlb_entry, uintptr_t start,
     }
 }
 
-static inline ram_addr_t qemu_ram_addr_from_host_nofail(void *ptr)
-{
-    ram_addr_t ram_addr;
-
-    ram_addr = qemu_ram_addr_from_host(ptr);
-    if (ram_addr == RAM_ADDR_INVALID) {
-        fprintf(stderr, "Bad ram pointer %p\n", ptr);
-        abort();
-    }
-    return ram_addr;
-}
-
 void tlb_reset_dirty(CPUState *cpu, ram_addr_t start1, ram_addr_t length)
 {
     CPUArchState *env;
@@ -469,6 +457,18 @@ static void report_bad_exec(CPUState *cpu, target_ulong addr)
     log_cpu_state_mask(LOG_GUEST_ERROR, cpu, CPU_DUMP_FPU | CPU_DUMP_CCOP);
 }
 
+static inline ram_addr_t qemu_ram_addr_from_host_nofail(void *ptr)
+{
+    ram_addr_t ram_addr;
+
+    ram_addr = qemu_ram_addr_from_host(ptr);
+    if (ram_addr == RAM_ADDR_INVALID) {
+        error_report("Bad ram pointer %p", ptr);
+        abort();
+    }
+    return ram_addr;
+}
+
 /* NOTE: this function can trigger an exception */
 /* NOTE2: the returned address is not exactly the physical address: it
  * is actually a ram_addr_t (in system mode; the user mode emulation
-- 
2.11.0


* [Qemu-devel] [PULL 15/24] cputlb: introduce tlb_flush_* async work.
  2017-02-24 11:20 [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Alex Bennée
                   ` (13 preceding siblings ...)
  2017-02-24 11:20 ` [Qemu-devel] [PULL 14/24] cputlb: tweak qemu_ram_addr_from_host_nofail reporting Alex Bennée
@ 2017-02-24 11:21 ` Alex Bennée
  2017-02-24 11:21 ` [Qemu-devel] [PULL 16/24] cputlb and arm/sparc targets: convert mmuidx flushes from varg to bitmap Alex Bennée
                   ` (10 subsequent siblings)
  25 siblings, 0 replies; 55+ messages in thread
From: Alex Bennée @ 2017-02-24 11:21 UTC (permalink / raw)
  To: peter.maydell
  Cc: qemu-devel, KONRAD Frederic, Alex Bennée, Paolo Bonzini,
	Peter Crosthwaite, Richard Henderson

From: KONRAD Frederic <fred.konrad@greensocs.com>

Some architectures allow flushing the TLB of other vCPUs. This is not a
problem when we have only one thread for all vCPUs but it definitely
needs to be done as asynchronous work when we are in true multithreaded
mode.

We take the tb_lock() when doing this to avoid racing with other threads
which may be invalidating TBs at the same time. The alternative would
be to use proper atomic primitives to clear the tlb entries en masse.

This patch doesn't do anything to protect other cputlb functions being
called in MTTCG mode making cross vCPU changes.
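The dispatch rule — flush directly when running on the vCPU's own thread, otherwise queue the flush as deferred work for that vCPU to drain — can be sketched with an invented single-slot work queue (`vcpu_t`, `request_tlb_flush` and `drain_work` are illustrative stand-ins for CPUState, tlb_flush() and the async_run_on_cpu() machinery):

```c
/* Minimal stand-in for a vCPU with a one-slot deferred work queue. */
typedef struct vcpu {
    int flushes;                      /* how many times the TLB was flushed */
    void (*pending)(struct vcpu *);   /* deferred work, 0 if none */
    int is_current_thread;            /* stand-in for qemu_cpu_is_self() */
} vcpu_t;

static void do_tlb_flush(vcpu_t *cpu)
{
    cpu->flushes++;   /* a real flush would memset the TLB tables here */
}

/* Flush now if we own the vCPU, otherwise defer to its thread. */
void request_tlb_flush(vcpu_t *cpu)
{
    if (cpu->is_current_thread) {
        do_tlb_flush(cpu);
    } else {
        cpu->pending = do_tlb_flush;  /* picked up by the vCPU's own loop */
    }
}

/* Called from the vCPU's own thread loop to drain deferred work. */
void drain_work(vcpu_t *cpu)
{
    if (cpu->pending) {
        cpu->pending(cpu);
        cpu->pending = 0;
    }
}
```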

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
[AJB: remove need for g_malloc on defer, make check fixes, tb_lock]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 cputlb.c                | 66 +++++++++++++++++++++++++++++++++++++++++++++++--
 include/exec/exec-all.h |  1 +
 include/qom/cpu.h       |  6 +++++
 3 files changed, 71 insertions(+), 2 deletions(-)

diff --git a/cputlb.c b/cputlb.c
index 94fa9977c5..5dfd3c3ba9 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -64,6 +64,10 @@
         }                                                         \
     } while (0)
 
+/* run_on_cpu_data.target_ptr should always be big enough for a
+ * target_ulong even on 32 bit builds */
+QEMU_BUILD_BUG_ON(sizeof(target_ulong) > sizeof(run_on_cpu_data));
+
 /* statistics */
 int tlb_flush_count;
 
@@ -72,13 +76,22 @@ int tlb_flush_count;
  * flushing more entries than required is only an efficiency issue,
  * not a correctness issue.
  */
-void tlb_flush(CPUState *cpu)
+static void tlb_flush_nocheck(CPUState *cpu)
 {
     CPUArchState *env = cpu->env_ptr;
 
+    /* The QOM tests will trigger tlb_flushes without setting up TCG
+     * so we bug out here in that case.
+     */
+    if (!tcg_enabled()) {
+        return;
+    }
+
     assert_cpu_is_self(cpu);
     tlb_debug("(count: %d)\n", tlb_flush_count++);
 
+    tb_lock();
+
     memset(env->tlb_table, -1, sizeof(env->tlb_table));
     memset(env->tlb_v_table, -1, sizeof(env->tlb_v_table));
     memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache));
@@ -86,6 +99,27 @@ void tlb_flush(CPUState *cpu)
     env->vtlb_index = 0;
     env->tlb_flush_addr = -1;
     env->tlb_flush_mask = 0;
+
+    tb_unlock();
+
+    atomic_mb_set(&cpu->pending_tlb_flush, false);
+}
+
+static void tlb_flush_global_async_work(CPUState *cpu, run_on_cpu_data data)
+{
+    tlb_flush_nocheck(cpu);
+}
+
+void tlb_flush(CPUState *cpu)
+{
+    if (cpu->created && !qemu_cpu_is_self(cpu)) {
+        if (atomic_cmpxchg(&cpu->pending_tlb_flush, false, true) == true) {
+            async_run_on_cpu(cpu, tlb_flush_global_async_work,
+                             RUN_ON_CPU_NULL);
+        }
+    } else {
+        tlb_flush_nocheck(cpu);
+    }
 }
 
 static inline void v_tlb_flush_by_mmuidx(CPUState *cpu, va_list argp)
@@ -95,6 +129,8 @@ static inline void v_tlb_flush_by_mmuidx(CPUState *cpu, va_list argp)
     assert_cpu_is_self(cpu);
     tlb_debug("start\n");
 
+    tb_lock();
+
     for (;;) {
         int mmu_idx = va_arg(argp, int);
 
@@ -109,6 +145,8 @@ static inline void v_tlb_flush_by_mmuidx(CPUState *cpu, va_list argp)
     }
 
     memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache));
+
+    tb_unlock();
 }
 
 void tlb_flush_by_mmuidx(CPUState *cpu, ...)
@@ -131,13 +169,15 @@ static inline void tlb_flush_entry(CPUTLBEntry *tlb_entry, target_ulong addr)
     }
 }
 
-void tlb_flush_page(CPUState *cpu, target_ulong addr)
+static void tlb_flush_page_async_work(CPUState *cpu, run_on_cpu_data data)
 {
     CPUArchState *env = cpu->env_ptr;
+    target_ulong addr = (target_ulong) data.target_ptr;
     int i;
     int mmu_idx;
 
     assert_cpu_is_self(cpu);
+
     tlb_debug("page :" TARGET_FMT_lx "\n", addr);
 
     /* Check if we need to flush due to large pages.  */
@@ -167,6 +207,18 @@ void tlb_flush_page(CPUState *cpu, target_ulong addr)
     tb_flush_jmp_cache(cpu, addr);
 }
 
+void tlb_flush_page(CPUState *cpu, target_ulong addr)
+{
+    tlb_debug("page :" TARGET_FMT_lx "\n", addr);
+
+    if (!qemu_cpu_is_self(cpu)) {
+        async_run_on_cpu(cpu, tlb_flush_page_async_work,
+                         RUN_ON_CPU_TARGET_PTR(addr));
+    } else {
+        tlb_flush_page_async_work(cpu, RUN_ON_CPU_TARGET_PTR(addr));
+    }
+}
+
 void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, ...)
 {
     CPUArchState *env = cpu->env_ptr;
@@ -213,6 +265,16 @@ void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, ...)
     tb_flush_jmp_cache(cpu, addr);
 }
 
+void tlb_flush_page_all(target_ulong addr)
+{
+    CPUState *cpu;
+
+    CPU_FOREACH(cpu) {
+        async_run_on_cpu(cpu, tlb_flush_page_async_work,
+                         RUN_ON_CPU_TARGET_PTR(addr));
+    }
+}
+
 /* update the TLBs so that writes to code in the virtual page 'addr'
    can be detected */
 void tlb_protect_code(ram_addr_t ram_addr)
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 82f0e12327..c694e3482b 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -158,6 +158,7 @@ void tlb_set_page(CPUState *cpu, target_ulong vaddr,
 void tb_invalidate_phys_addr(AddressSpace *as, hwaddr addr);
 void probe_write(CPUArchState *env, target_ulong addr, int mmu_idx,
                  uintptr_t retaddr);
+void tlb_flush_page_all(target_ulong addr);
 #else
 static inline void tlb_flush_page(CPUState *cpu, target_ulong addr)
 {
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 10db89b16a..e80bf7a64a 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -402,6 +402,12 @@ struct CPUState {
 
     bool hax_vcpu_dirty;
     struct hax_vcpu_state *hax_vcpu;
+
+    /* The pending_tlb_flush flag is set and cleared atomically to
+     * avoid potential races. The aim of the flag is to avoid
+     * unnecessary flushes.
+     */
+    bool pending_tlb_flush;
 };
 
 QTAILQ_HEAD(CPUTailQ, CPUState);
-- 
2.11.0


* [Qemu-devel] [PULL 16/24] cputlb and arm/sparc targets: convert mmuidx flushes from varg to bitmap
  2017-02-24 11:20 [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Alex Bennée
                   ` (14 preceding siblings ...)
  2017-02-24 11:21 ` [Qemu-devel] [PULL 15/24] cputlb: introduce tlb_flush_* async work Alex Bennée
@ 2017-02-24 11:21 ` Alex Bennée
  2017-02-24 11:21 ` [Qemu-devel] [PULL 17/24] cputlb: add tlb_flush_by_mmuidx async routines Alex Bennée
                   ` (9 subsequent siblings)
  25 siblings, 0 replies; 55+ messages in thread
From: Alex Bennée @ 2017-02-24 11:21 UTC (permalink / raw)
  To: peter.maydell
  Cc: qemu-devel, Alex Bennée, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson, Mark Cave-Ayland, Artyom Tarasenko,
	open list:ARM

While the varargs approach was flexible, the original MTTCG ended up
having to munge the bits into a bitmap so the data could be used in
deferred work helpers. Instead of hiding that in cputlb we push the
change into the API to make it take a bitmap of MMU indexes instead.

For ARM some of the resulting flushes end up being quite long, so to aid
readability I've tended to move the index shifting onto a new line so
all the bits being OR-ed together line up nicely, for example:

    tlb_flush_page_by_mmuidx(other_cs, pageaddr,
                             (1 << ARMMMUIdx_S1SE1) |
                             (1 << ARMMMUIdx_S1SE0));
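The bitmap convention is easy to exercise standalone; this sketch uses a plain shift-and-mask test in place of the test_bit() helper, and the enum values are invented placeholders for the real ARMMMUIdx constants:

```c
#include <stdint.h>

enum { IDX_S1SE0 = 0, IDX_S1SE1 = 1, IDX_S2NS = 2, NB_MMU_MODES = 3 };

/* Count which MMU indexes a flush request touches: every set bit in
 * idxmap names one index, mirroring tlb_flush_by_mmuidx(). */
int count_flushed_indexes(uint16_t idxmap)
{
    int mmu_idx, n = 0;

    for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
        if (idxmap & (1u << mmu_idx)) {
            n++;  /* the real code would memset this index's TLB table */
        }
    }
    return n;
}
```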

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
[AT: SPARC parts only]
Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
[PM: ARM parts only]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
---
 cputlb.c                   |  60 +++++++++--------------
 include/exec/exec-all.h    |  13 ++---
 target/arm/helper.c        | 116 ++++++++++++++++++++++++++++-----------------
 target/sparc/ldst_helper.c |   8 ++--
 4 files changed, 107 insertions(+), 90 deletions(-)

diff --git a/cputlb.c b/cputlb.c
index 5dfd3c3ba9..97e5c12de8 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -122,26 +122,25 @@ void tlb_flush(CPUState *cpu)
     }
 }
 
-static inline void v_tlb_flush_by_mmuidx(CPUState *cpu, va_list argp)
+static inline void v_tlb_flush_by_mmuidx(CPUState *cpu, uint16_t idxmap)
 {
     CPUArchState *env = cpu->env_ptr;
+    unsigned long mmu_idx_bitmask = idxmap;
+    int mmu_idx;
 
     assert_cpu_is_self(cpu);
     tlb_debug("start\n");
 
     tb_lock();
 
-    for (;;) {
-        int mmu_idx = va_arg(argp, int);
-
-        if (mmu_idx < 0) {
-            break;
-        }
+    for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
 
-        tlb_debug("%d\n", mmu_idx);
+        if (test_bit(mmu_idx, &mmu_idx_bitmask)) {
+            tlb_debug("%d\n", mmu_idx);
 
-        memset(env->tlb_table[mmu_idx], -1, sizeof(env->tlb_table[0]));
-        memset(env->tlb_v_table[mmu_idx], -1, sizeof(env->tlb_v_table[0]));
+            memset(env->tlb_table[mmu_idx], -1, sizeof(env->tlb_table[0]));
+            memset(env->tlb_v_table[mmu_idx], -1, sizeof(env->tlb_v_table[0]));
+        }
     }
 
     memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache));
@@ -149,12 +148,9 @@ static inline void v_tlb_flush_by_mmuidx(CPUState *cpu, va_list argp)
     tb_unlock();
 }
 
-void tlb_flush_by_mmuidx(CPUState *cpu, ...)
+void tlb_flush_by_mmuidx(CPUState *cpu, uint16_t idxmap)
 {
-    va_list argp;
-    va_start(argp, cpu);
-    v_tlb_flush_by_mmuidx(cpu, argp);
-    va_end(argp);
+    v_tlb_flush_by_mmuidx(cpu, idxmap);
 }
 
 static inline void tlb_flush_entry(CPUTLBEntry *tlb_entry, target_ulong addr)
@@ -219,13 +215,11 @@ void tlb_flush_page(CPUState *cpu, target_ulong addr)
     }
 }
 
-void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, ...)
+void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, uint16_t idxmap)
 {
     CPUArchState *env = cpu->env_ptr;
-    int i, k;
-    va_list argp;
-
-    va_start(argp, addr);
+    unsigned long mmu_idx_bitmap = idxmap;
+    int i, page, mmu_idx;
 
     assert_cpu_is_self(cpu);
     tlb_debug("addr "TARGET_FMT_lx"\n", addr);
@@ -236,31 +230,23 @@ void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, ...)
                   TARGET_FMT_lx "/" TARGET_FMT_lx ")\n",
                   env->tlb_flush_addr, env->tlb_flush_mask);
 
-        v_tlb_flush_by_mmuidx(cpu, argp);
-        va_end(argp);
+        v_tlb_flush_by_mmuidx(cpu, idxmap);
         return;
     }
 
     addr &= TARGET_PAGE_MASK;
-    i = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
-
-    for (;;) {
-        int mmu_idx = va_arg(argp, int);
+    page = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
 
-        if (mmu_idx < 0) {
-            break;
-        }
-
-        tlb_debug("idx %d\n", mmu_idx);
-
-        tlb_flush_entry(&env->tlb_table[mmu_idx][i], addr);
+    for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
+        if (test_bit(mmu_idx, &mmu_idx_bitmap)) {
+            tlb_flush_entry(&env->tlb_table[mmu_idx][page], addr);
 
-        /* check whether there are vltb entries that need to be flushed */
-        for (k = 0; k < CPU_VTLB_SIZE; k++) {
-            tlb_flush_entry(&env->tlb_v_table[mmu_idx][k], addr);
+            /* check whether there are vltb entries that need to be flushed */
+            for (i = 0; i < CPU_VTLB_SIZE; i++) {
+                tlb_flush_entry(&env->tlb_v_table[mmu_idx][i], addr);
+            }
         }
     }
-    va_end(argp);
 
     tb_flush_jmp_cache(cpu, addr);
 }
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index c694e3482b..e94e6849dd 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -106,21 +106,22 @@ void tlb_flush(CPUState *cpu);
  * tlb_flush_page_by_mmuidx:
  * @cpu: CPU whose TLB should be flushed
  * @addr: virtual address of page to be flushed
- * @...: list of MMU indexes to flush, terminated by a negative value
+ * @idxmap: bitmap of MMU indexes to flush
  *
  * Flush one page from the TLB of the specified CPU, for the specified
  * MMU indexes.
  */
-void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, ...);
+void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr,
+                              uint16_t idxmap);
 /**
  * tlb_flush_by_mmuidx:
  * @cpu: CPU whose TLB should be flushed
- * @...: list of MMU indexes to flush, terminated by a negative value
+ * @idxmap: bitmap of MMU indexes to flush
  *
  * Flush all entries from the TLB of the specified CPU, for the specified
  * MMU indexes.
  */
-void tlb_flush_by_mmuidx(CPUState *cpu, ...);
+void tlb_flush_by_mmuidx(CPUState *cpu, uint16_t idxmap);
 /**
  * tlb_set_page_with_attrs:
  * @cpu: CPU to add this TLB entry for
@@ -169,11 +170,11 @@ static inline void tlb_flush(CPUState *cpu)
 }
 
 static inline void tlb_flush_page_by_mmuidx(CPUState *cpu,
-                                            target_ulong addr, ...)
+                                            target_ulong addr, uint16_t idxmap)
 {
 }
 
-static inline void tlb_flush_by_mmuidx(CPUState *cpu, ...)
+static inline void tlb_flush_by_mmuidx(CPUState *cpu, uint16_t idxmap)
 {
 }
 #endif
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 753a69d40d..b41d0494d1 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -578,8 +578,10 @@ static void tlbiall_nsnh_write(CPUARMState *env, const ARMCPRegInfo *ri,
 {
     CPUState *cs = ENV_GET_CPU(env);
 
-    tlb_flush_by_mmuidx(cs, ARMMMUIdx_S12NSE1, ARMMMUIdx_S12NSE0,
-                        ARMMMUIdx_S2NS, -1);
+    tlb_flush_by_mmuidx(cs,
+                        (1 << ARMMMUIdx_S12NSE1) |
+                        (1 << ARMMMUIdx_S12NSE0) |
+                        (1 << ARMMMUIdx_S2NS));
 }
 
 static void tlbiall_nsnh_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -588,8 +590,10 @@ static void tlbiall_nsnh_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
     CPUState *other_cs;
 
     CPU_FOREACH(other_cs) {
-        tlb_flush_by_mmuidx(other_cs, ARMMMUIdx_S12NSE1,
-                            ARMMMUIdx_S12NSE0, ARMMMUIdx_S2NS, -1);
+        tlb_flush_by_mmuidx(other_cs,
+                            (1 << ARMMMUIdx_S12NSE1) |
+                            (1 << ARMMMUIdx_S12NSE0) |
+                            (1 << ARMMMUIdx_S2NS));
     }
 }
 
@@ -611,7 +615,7 @@ static void tlbiipas2_write(CPUARMState *env, const ARMCPRegInfo *ri,
 
     pageaddr = sextract64(value << 12, 0, 40);
 
-    tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdx_S2NS, -1);
+    tlb_flush_page_by_mmuidx(cs, pageaddr, (1 << ARMMMUIdx_S2NS));
 }
 
 static void tlbiipas2_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -627,7 +631,7 @@ static void tlbiipas2_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
     pageaddr = sextract64(value << 12, 0, 40);
 
     CPU_FOREACH(other_cs) {
-        tlb_flush_page_by_mmuidx(other_cs, pageaddr, ARMMMUIdx_S2NS, -1);
+        tlb_flush_page_by_mmuidx(other_cs, pageaddr, (1 << ARMMMUIdx_S2NS));
     }
 }
 
@@ -636,7 +640,7 @@ static void tlbiall_hyp_write(CPUARMState *env, const ARMCPRegInfo *ri,
 {
     CPUState *cs = ENV_GET_CPU(env);
 
-    tlb_flush_by_mmuidx(cs, ARMMMUIdx_S1E2, -1);
+    tlb_flush_by_mmuidx(cs, (1 << ARMMMUIdx_S1E2));
 }
 
 static void tlbiall_hyp_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -645,7 +649,7 @@ static void tlbiall_hyp_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
     CPUState *other_cs;
 
     CPU_FOREACH(other_cs) {
-        tlb_flush_by_mmuidx(other_cs, ARMMMUIdx_S1E2, -1);
+        tlb_flush_by_mmuidx(other_cs, (1 << ARMMMUIdx_S1E2));
     }
 }
 
@@ -655,7 +659,7 @@ static void tlbimva_hyp_write(CPUARMState *env, const ARMCPRegInfo *ri,
     CPUState *cs = ENV_GET_CPU(env);
     uint64_t pageaddr = value & ~MAKE_64BIT_MASK(0, 12);
 
-    tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdx_S1E2, -1);
+    tlb_flush_page_by_mmuidx(cs, pageaddr, (1 << ARMMMUIdx_S1E2));
 }
 
 static void tlbimva_hyp_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -665,7 +669,7 @@ static void tlbimva_hyp_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
     uint64_t pageaddr = value & ~MAKE_64BIT_MASK(0, 12);
 
     CPU_FOREACH(other_cs) {
-        tlb_flush_page_by_mmuidx(other_cs, pageaddr, ARMMMUIdx_S1E2, -1);
+        tlb_flush_page_by_mmuidx(other_cs, pageaddr, (1 << ARMMMUIdx_S1E2));
     }
 }
 
@@ -2542,8 +2546,10 @@ static void vttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
 
     /* Accesses to VTTBR may change the VMID so we must flush the TLB.  */
     if (raw_read(env, ri) != value) {
-        tlb_flush_by_mmuidx(cs, ARMMMUIdx_S12NSE1, ARMMMUIdx_S12NSE0,
-                            ARMMMUIdx_S2NS, -1);
+        tlb_flush_by_mmuidx(cs,
+                            (1 << ARMMMUIdx_S12NSE1) |
+                            (1 << ARMMMUIdx_S12NSE0) |
+                            (1 << ARMMMUIdx_S2NS));
         raw_write(env, ri, value);
     }
 }
@@ -2902,9 +2908,13 @@ static void tlbi_aa64_vmalle1_write(CPUARMState *env, const ARMCPRegInfo *ri,
     CPUState *cs = CPU(cpu);
 
     if (arm_is_secure_below_el3(env)) {
-        tlb_flush_by_mmuidx(cs, ARMMMUIdx_S1SE1, ARMMMUIdx_S1SE0, -1);
+        tlb_flush_by_mmuidx(cs,
+                            (1 << ARMMMUIdx_S1SE1) |
+                            (1 << ARMMMUIdx_S1SE0));
     } else {
-        tlb_flush_by_mmuidx(cs, ARMMMUIdx_S12NSE1, ARMMMUIdx_S12NSE0, -1);
+        tlb_flush_by_mmuidx(cs,
+                            (1 << ARMMMUIdx_S12NSE1) |
+                            (1 << ARMMMUIdx_S12NSE0));
     }
 }
 
@@ -2916,10 +2926,13 @@ static void tlbi_aa64_vmalle1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
 
     CPU_FOREACH(other_cs) {
         if (sec) {
-            tlb_flush_by_mmuidx(other_cs, ARMMMUIdx_S1SE1, ARMMMUIdx_S1SE0, -1);
+            tlb_flush_by_mmuidx(other_cs,
+                                (1 << ARMMMUIdx_S1SE1) |
+                                (1 << ARMMMUIdx_S1SE0));
         } else {
-            tlb_flush_by_mmuidx(other_cs, ARMMMUIdx_S12NSE1,
-                                ARMMMUIdx_S12NSE0, -1);
+            tlb_flush_by_mmuidx(other_cs,
+                                (1 << ARMMMUIdx_S12NSE1) |
+                                (1 << ARMMMUIdx_S12NSE0));
         }
     }
 }
@@ -2935,13 +2948,19 @@ static void tlbi_aa64_alle1_write(CPUARMState *env, const ARMCPRegInfo *ri,
     CPUState *cs = CPU(cpu);
 
     if (arm_is_secure_below_el3(env)) {
-        tlb_flush_by_mmuidx(cs, ARMMMUIdx_S1SE1, ARMMMUIdx_S1SE0, -1);
+        tlb_flush_by_mmuidx(cs,
+                            (1 << ARMMMUIdx_S1SE1) |
+                            (1 << ARMMMUIdx_S1SE0));
     } else {
         if (arm_feature(env, ARM_FEATURE_EL2)) {
-            tlb_flush_by_mmuidx(cs, ARMMMUIdx_S12NSE1, ARMMMUIdx_S12NSE0,
-                                ARMMMUIdx_S2NS, -1);
+            tlb_flush_by_mmuidx(cs,
+                                (1 << ARMMMUIdx_S12NSE1) |
+                                (1 << ARMMMUIdx_S12NSE0) |
+                                (1 << ARMMMUIdx_S2NS));
         } else {
-            tlb_flush_by_mmuidx(cs, ARMMMUIdx_S12NSE1, ARMMMUIdx_S12NSE0, -1);
+            tlb_flush_by_mmuidx(cs,
+                                (1 << ARMMMUIdx_S12NSE1) |
+                                (1 << ARMMMUIdx_S12NSE0));
         }
     }
 }
@@ -2952,7 +2971,7 @@ static void tlbi_aa64_alle2_write(CPUARMState *env, const ARMCPRegInfo *ri,
     ARMCPU *cpu = arm_env_get_cpu(env);
     CPUState *cs = CPU(cpu);
 
-    tlb_flush_by_mmuidx(cs, ARMMMUIdx_S1E2, -1);
+    tlb_flush_by_mmuidx(cs, (1 << ARMMMUIdx_S1E2));
 }
 
 static void tlbi_aa64_alle3_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -2961,7 +2980,7 @@ static void tlbi_aa64_alle3_write(CPUARMState *env, const ARMCPRegInfo *ri,
     ARMCPU *cpu = arm_env_get_cpu(env);
     CPUState *cs = CPU(cpu);
 
-    tlb_flush_by_mmuidx(cs, ARMMMUIdx_S1E3, -1);
+    tlb_flush_by_mmuidx(cs, (1 << ARMMMUIdx_S1E3));
 }
 
 static void tlbi_aa64_alle1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -2977,13 +2996,18 @@ static void tlbi_aa64_alle1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
 
     CPU_FOREACH(other_cs) {
         if (sec) {
-            tlb_flush_by_mmuidx(other_cs, ARMMMUIdx_S1SE1, ARMMMUIdx_S1SE0, -1);
+            tlb_flush_by_mmuidx(other_cs,
+                                (1 << ARMMMUIdx_S1SE1) |
+                                (1 << ARMMMUIdx_S1SE0));
         } else if (has_el2) {
-            tlb_flush_by_mmuidx(other_cs, ARMMMUIdx_S12NSE1,
-                                ARMMMUIdx_S12NSE0, ARMMMUIdx_S2NS, -1);
+            tlb_flush_by_mmuidx(other_cs,
+                                (1 << ARMMMUIdx_S12NSE1) |
+                                (1 << ARMMMUIdx_S12NSE0) |
+                                (1 << ARMMMUIdx_S2NS));
         } else {
-            tlb_flush_by_mmuidx(other_cs, ARMMMUIdx_S12NSE1,
-                                ARMMMUIdx_S12NSE0, -1);
+            tlb_flush_by_mmuidx(other_cs,
+                                (1 << ARMMMUIdx_S12NSE1) |
+                                (1 << ARMMMUIdx_S12NSE0));
         }
     }
 }
@@ -2994,7 +3018,7 @@ static void tlbi_aa64_alle2is_write(CPUARMState *env, const ARMCPRegInfo *ri,
     CPUState *other_cs;
 
     CPU_FOREACH(other_cs) {
-        tlb_flush_by_mmuidx(other_cs, ARMMMUIdx_S1E2, -1);
+        tlb_flush_by_mmuidx(other_cs, (1 << ARMMMUIdx_S1E2));
     }
 }
 
@@ -3004,7 +3028,7 @@ static void tlbi_aa64_alle3is_write(CPUARMState *env, const ARMCPRegInfo *ri,
     CPUState *other_cs;
 
     CPU_FOREACH(other_cs) {
-        tlb_flush_by_mmuidx(other_cs, ARMMMUIdx_S1E3, -1);
+        tlb_flush_by_mmuidx(other_cs, (1 << ARMMMUIdx_S1E3));
     }
 }
 
@@ -3021,11 +3045,13 @@ static void tlbi_aa64_vae1_write(CPUARMState *env, const ARMCPRegInfo *ri,
     uint64_t pageaddr = sextract64(value << 12, 0, 56);
 
     if (arm_is_secure_below_el3(env)) {
-        tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdx_S1SE1,
-                                 ARMMMUIdx_S1SE0, -1);
+        tlb_flush_page_by_mmuidx(cs, pageaddr,
+                                 (1 << ARMMMUIdx_S1SE1) |
+                                 (1 << ARMMMUIdx_S1SE0));
     } else {
-        tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdx_S12NSE1,
-                                 ARMMMUIdx_S12NSE0, -1);
+        tlb_flush_page_by_mmuidx(cs, pageaddr,
+                                 (1 << ARMMMUIdx_S12NSE1) |
+                                 (1 << ARMMMUIdx_S12NSE0));
     }
 }
 
@@ -3040,7 +3066,7 @@ static void tlbi_aa64_vae2_write(CPUARMState *env, const ARMCPRegInfo *ri,
     CPUState *cs = CPU(cpu);
     uint64_t pageaddr = sextract64(value << 12, 0, 56);
 
-    tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdx_S1E2, -1);
+    tlb_flush_page_by_mmuidx(cs, pageaddr, (1 << ARMMMUIdx_S1E2));
 }
 
 static void tlbi_aa64_vae3_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -3054,7 +3080,7 @@ static void tlbi_aa64_vae3_write(CPUARMState *env, const ARMCPRegInfo *ri,
     CPUState *cs = CPU(cpu);
     uint64_t pageaddr = sextract64(value << 12, 0, 56);
 
-    tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdx_S1E3, -1);
+    tlb_flush_page_by_mmuidx(cs, pageaddr, (1 << ARMMMUIdx_S1E3));
 }
 
 static void tlbi_aa64_vae1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -3066,11 +3092,13 @@ static void tlbi_aa64_vae1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
 
     CPU_FOREACH(other_cs) {
         if (sec) {
-            tlb_flush_page_by_mmuidx(other_cs, pageaddr, ARMMMUIdx_S1SE1,
-                                     ARMMMUIdx_S1SE0, -1);
+            tlb_flush_page_by_mmuidx(other_cs, pageaddr,
+                                     (1 << ARMMMUIdx_S1SE1) |
+                                     (1 << ARMMMUIdx_S1SE0));
         } else {
-            tlb_flush_page_by_mmuidx(other_cs, pageaddr, ARMMMUIdx_S12NSE1,
-                                     ARMMMUIdx_S12NSE0, -1);
+            tlb_flush_page_by_mmuidx(other_cs, pageaddr,
+                                     (1 << ARMMMUIdx_S12NSE1) |
+                                     (1 << ARMMMUIdx_S12NSE0));
         }
     }
 }
@@ -3082,7 +3110,7 @@ static void tlbi_aa64_vae2is_write(CPUARMState *env, const ARMCPRegInfo *ri,
     uint64_t pageaddr = sextract64(value << 12, 0, 56);
 
     CPU_FOREACH(other_cs) {
-        tlb_flush_page_by_mmuidx(other_cs, pageaddr, ARMMMUIdx_S1E2, -1);
+        tlb_flush_page_by_mmuidx(other_cs, pageaddr, (1 << ARMMMUIdx_S1E2));
     }
 }
 
@@ -3093,7 +3121,7 @@ static void tlbi_aa64_vae3is_write(CPUARMState *env, const ARMCPRegInfo *ri,
     uint64_t pageaddr = sextract64(value << 12, 0, 56);
 
     CPU_FOREACH(other_cs) {
-        tlb_flush_page_by_mmuidx(other_cs, pageaddr, ARMMMUIdx_S1E3, -1);
+        tlb_flush_page_by_mmuidx(other_cs, pageaddr, (1 << ARMMMUIdx_S1E3));
     }
 }
 
@@ -3116,7 +3144,7 @@ static void tlbi_aa64_ipas2e1_write(CPUARMState *env, const ARMCPRegInfo *ri,
 
     pageaddr = sextract64(value << 12, 0, 48);
 
-    tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdx_S2NS, -1);
+    tlb_flush_page_by_mmuidx(cs, pageaddr, (1 << ARMMMUIdx_S2NS));
 }
 
 static void tlbi_aa64_ipas2e1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -3132,7 +3160,7 @@ static void tlbi_aa64_ipas2e1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
     pageaddr = sextract64(value << 12, 0, 48);
 
     CPU_FOREACH(other_cs) {
-        tlb_flush_page_by_mmuidx(other_cs, pageaddr, ARMMMUIdx_S2NS, -1);
+        tlb_flush_page_by_mmuidx(other_cs, pageaddr, (1 << ARMMMUIdx_S2NS));
     }
 }
 
diff --git a/target/sparc/ldst_helper.c b/target/sparc/ldst_helper.c
index 2c05d6af75..57968d9143 100644
--- a/target/sparc/ldst_helper.c
+++ b/target/sparc/ldst_helper.c
@@ -1768,13 +1768,15 @@ void helper_st_asi(CPUSPARCState *env, target_ulong addr, target_ulong val,
           case 1:
               env->dmmu.mmu_primary_context = val;
               env->immu.mmu_primary_context = val;
-              tlb_flush_by_mmuidx(CPU(cpu), MMU_USER_IDX, MMU_KERNEL_IDX, -1);
+              tlb_flush_by_mmuidx(CPU(cpu),
+                                  (1 << MMU_USER_IDX) | (1 << MMU_KERNEL_IDX));
               break;
           case 2:
               env->dmmu.mmu_secondary_context = val;
               env->immu.mmu_secondary_context = val;
-              tlb_flush_by_mmuidx(CPU(cpu), MMU_USER_SECONDARY_IDX,
-                                  MMU_KERNEL_SECONDARY_IDX, -1);
+              tlb_flush_by_mmuidx(CPU(cpu),
+                                  (1 << MMU_USER_SECONDARY_IDX) |
+                                  (1 << MMU_KERNEL_SECONDARY_IDX));
               break;
           default:
               cpu_unassigned_access(cs, addr, true, false, 1, size);
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [Qemu-devel] [PULL 17/24] cputlb: add tlb_flush_by_mmuidx async routines
  2017-02-24 11:20 [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Alex Bennée
                   ` (15 preceding siblings ...)
  2017-02-24 11:21 ` [Qemu-devel] [PULL 16/24] cputlb and arm/sparc targets: convert mmuidx flushes from varg to bitmap Alex Bennée
@ 2017-02-24 11:21 ` Alex Bennée
  2017-02-24 11:21 ` [Qemu-devel] [PULL 18/24] cputlb: atomically update tlb fields used by tlb_reset_dirty Alex Bennée
                   ` (8 subsequent siblings)
  25 siblings, 0 replies; 55+ messages in thread
From: Alex Bennée @ 2017-02-24 11:21 UTC (permalink / raw)
  To: peter.maydell
  Cc: qemu-devel, Alex Bennée, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson

This converts the remaining TLB flush routines to use async work when
detecting a cross-vCPU flush. The only minor complication is having to
serialise the variable list of MMU indexes into a form that can be
passed to an asynchronous job.

The pending_tlb_flush field on QOM's CPU structure also becomes a
bitmask rather than a boolean.
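
The bitmask handling can be illustrated with a minimal stand-alone
sketch. Note this is not QEMU's API: `MMUIDX_BIT` and `mmuidx_count`
are illustrative names, mirroring the `(1 << ARMMMUIdx_Sx) | ...`
pattern the series converts callers to.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative only: pack individual MMU index numbers into a 16-bit
 * map, mirroring the (1 << idx) | ... pattern used by the patch. */
#define MMUIDX_BIT(idx) ((uint16_t)(1u << (idx)))

/* Count how many MMU indexes a flush request covers, i.e. how many
 * per-index TLB flushes the async job will perform. */
static int mmuidx_count(uint16_t idxmap)
{
    int n = 0;

    while (idxmap) {
        n += idxmap & 1;
        idxmap >>= 1;
    }
    return n;
}
```

A caller would then request, say, a flush of indexes 2 and 5 with
`MMUIDX_BIT(2) | MMUIDX_BIT(5)` and the worker would iterate over the
set bits, as `tlb_flush_by_mmuidx_async_work` does in the patch.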

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 cputlb.c          | 110 +++++++++++++++++++++++++++++++++++++++++++-----------
 include/qom/cpu.h |   2 +-
 2 files changed, 89 insertions(+), 23 deletions(-)

diff --git a/cputlb.c b/cputlb.c
index 97e5c12de8..c50254be26 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -68,6 +68,11 @@
  * target_ulong even on 32 bit builds */
 QEMU_BUILD_BUG_ON(sizeof(target_ulong) > sizeof(run_on_cpu_data));
 
+/* We currently can't handle more than 16 bits in the MMUIDX bitmask.
+ */
+QEMU_BUILD_BUG_ON(NB_MMU_MODES > 16);
+#define ALL_MMUIDX_BITS ((1 << NB_MMU_MODES) - 1)
+
 /* statistics */
 int tlb_flush_count;
 
@@ -102,7 +107,7 @@ static void tlb_flush_nocheck(CPUState *cpu)
 
     tb_unlock();
 
-    atomic_mb_set(&cpu->pending_tlb_flush, false);
+    atomic_mb_set(&cpu->pending_tlb_flush, 0);
 }
 
 static void tlb_flush_global_async_work(CPUState *cpu, run_on_cpu_data data)
@@ -113,7 +118,8 @@ static void tlb_flush_global_async_work(CPUState *cpu, run_on_cpu_data data)
 void tlb_flush(CPUState *cpu)
 {
     if (cpu->created && !qemu_cpu_is_self(cpu)) {
-        if (atomic_cmpxchg(&cpu->pending_tlb_flush, false, true) == true) {
+        if (atomic_mb_read(&cpu->pending_tlb_flush) != ALL_MMUIDX_BITS) {
+            atomic_mb_set(&cpu->pending_tlb_flush, ALL_MMUIDX_BITS);
             async_run_on_cpu(cpu, tlb_flush_global_async_work,
                              RUN_ON_CPU_NULL);
         }
@@ -122,17 +128,18 @@ void tlb_flush(CPUState *cpu)
     }
 }
 
-static inline void v_tlb_flush_by_mmuidx(CPUState *cpu, uint16_t idxmap)
+static void tlb_flush_by_mmuidx_async_work(CPUState *cpu, run_on_cpu_data data)
 {
     CPUArchState *env = cpu->env_ptr;
-    unsigned long mmu_idx_bitmask = idxmap;
+    unsigned long mmu_idx_bitmask = data.host_int;
     int mmu_idx;
 
     assert_cpu_is_self(cpu);
-    tlb_debug("start\n");
 
     tb_lock();
 
+    tlb_debug("start: mmu_idx:0x%04lx\n", mmu_idx_bitmask);
+
     for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
 
         if (test_bit(mmu_idx, &mmu_idx_bitmask)) {
@@ -145,12 +152,30 @@ static inline void v_tlb_flush_by_mmuidx(CPUState *cpu, uint16_t idxmap)
 
     memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache));
 
+    tlb_debug("done\n");
+
     tb_unlock();
 }
 
 void tlb_flush_by_mmuidx(CPUState *cpu, uint16_t idxmap)
 {
-    v_tlb_flush_by_mmuidx(cpu, idxmap);
+    tlb_debug("mmu_idx: 0x%" PRIx16 "\n", idxmap);
+
+    if (!qemu_cpu_is_self(cpu)) {
+        uint16_t pending_flushes = idxmap;
+        pending_flushes &= ~atomic_mb_read(&cpu->pending_tlb_flush);
+
+        if (pending_flushes) {
+            tlb_debug("reduced mmu_idx: 0x%" PRIx16 "\n", pending_flushes);
+
+            atomic_or(&cpu->pending_tlb_flush, pending_flushes);
+            async_run_on_cpu(cpu, tlb_flush_by_mmuidx_async_work,
+                             RUN_ON_CPU_HOST_INT(pending_flushes));
+        }
+    } else {
+        tlb_flush_by_mmuidx_async_work(cpu,
+                                       RUN_ON_CPU_HOST_INT(idxmap));
+    }
 }
 
 static inline void tlb_flush_entry(CPUTLBEntry *tlb_entry, target_ulong addr)
@@ -215,27 +240,26 @@ void tlb_flush_page(CPUState *cpu, target_ulong addr)
     }
 }
 
-void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, uint16_t idxmap)
+/* As we are going to hijack the bottom bits of the page address for an
+ * mmuidx bit mask, we need to fail the build if we can't do that.
+ */
+QEMU_BUILD_BUG_ON(NB_MMU_MODES > TARGET_PAGE_BITS_MIN);
+
+static void tlb_flush_page_by_mmuidx_async_work(CPUState *cpu,
+                                                run_on_cpu_data data)
 {
     CPUArchState *env = cpu->env_ptr;
-    unsigned long mmu_idx_bitmap = idxmap;
-    int i, page, mmu_idx;
+    target_ulong addr_and_mmuidx = (target_ulong) data.target_ptr;
+    target_ulong addr = addr_and_mmuidx & TARGET_PAGE_MASK;
+    unsigned long mmu_idx_bitmap = addr_and_mmuidx & ALL_MMUIDX_BITS;
+    int page = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
+    int mmu_idx;
+    int i;
 
     assert_cpu_is_self(cpu);
-    tlb_debug("addr "TARGET_FMT_lx"\n", addr);
-
-    /* Check if we need to flush due to large pages.  */
-    if ((addr & env->tlb_flush_mask) == env->tlb_flush_addr) {
-        tlb_debug("forced full flush ("
-                  TARGET_FMT_lx "/" TARGET_FMT_lx ")\n",
-                  env->tlb_flush_addr, env->tlb_flush_mask);
-
-        v_tlb_flush_by_mmuidx(cpu, idxmap);
-        return;
-    }
 
-    addr &= TARGET_PAGE_MASK;
-    page = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
+    tlb_debug("page:%d addr:"TARGET_FMT_lx" mmu_idx:0x%lx\n",
+              page, addr, mmu_idx_bitmap);
 
     for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
         if (test_bit(mmu_idx, &mmu_idx_bitmap)) {
@@ -251,6 +275,48 @@ void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, uint16_t idxmap)
     tb_flush_jmp_cache(cpu, addr);
 }
 
+static void tlb_check_page_and_flush_by_mmuidx_async_work(CPUState *cpu,
+                                                          run_on_cpu_data data)
+{
+    CPUArchState *env = cpu->env_ptr;
+    target_ulong addr_and_mmuidx = (target_ulong) data.target_ptr;
+    target_ulong addr = addr_and_mmuidx & TARGET_PAGE_MASK;
+    unsigned long mmu_idx_bitmap = addr_and_mmuidx & ALL_MMUIDX_BITS;
+
+    tlb_debug("addr:"TARGET_FMT_lx" mmu_idx: %04lx\n", addr, mmu_idx_bitmap);
+
+    /* Check if we need to flush due to large pages.  */
+    if ((addr & env->tlb_flush_mask) == env->tlb_flush_addr) {
+        tlb_debug("forced full flush ("
+                  TARGET_FMT_lx "/" TARGET_FMT_lx ")\n",
+                  env->tlb_flush_addr, env->tlb_flush_mask);
+
+        tlb_flush_by_mmuidx_async_work(cpu,
+                                       RUN_ON_CPU_HOST_INT(mmu_idx_bitmap));
+    } else {
+        tlb_flush_page_by_mmuidx_async_work(cpu, data);
+    }
+}
+
+void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, uint16_t idxmap)
+{
+    target_ulong addr_and_mmu_idx;
+
+    tlb_debug("addr: "TARGET_FMT_lx" mmu_idx:%" PRIx16 "\n", addr, idxmap);
+
+    /* This should already be page aligned */
+    addr_and_mmu_idx = addr & TARGET_PAGE_MASK;
+    addr_and_mmu_idx |= idxmap;
+
+    if (!qemu_cpu_is_self(cpu)) {
+        async_run_on_cpu(cpu, tlb_check_page_and_flush_by_mmuidx_async_work,
+                         RUN_ON_CPU_TARGET_PTR(addr_and_mmu_idx));
+    } else {
+        tlb_check_page_and_flush_by_mmuidx_async_work(
+            cpu, RUN_ON_CPU_TARGET_PTR(addr_and_mmu_idx));
+    }
+}
+
 void tlb_flush_page_all(target_ulong addr)
 {
     CPUState *cpu;
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index e80bf7a64a..3e61c880da 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -407,7 +407,7 @@ struct CPUState {
      * avoid potential races. The aim of the flag is to avoid
      * unnecessary flushes.
      */
-    bool pending_tlb_flush;
+    uint16_t pending_tlb_flush;
 };
 
 QTAILQ_HEAD(CPUTailQ, CPUState);
-- 
2.11.0


* [Qemu-devel] [PULL 18/24] cputlb: atomically update tlb fields used by tlb_reset_dirty
  2017-02-24 11:20 [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Alex Bennée
                   ` (16 preceding siblings ...)
  2017-02-24 11:21 ` [Qemu-devel] [PULL 17/24] cputlb: add tlb_flush_by_mmuidx async routines Alex Bennée
@ 2017-02-24 11:21 ` Alex Bennée
  2017-02-24 11:21 ` [Qemu-devel] [PULL 19/24] cputlb: introduce tlb_flush_*_all_cpus[_synced] Alex Bennée
                   ` (7 subsequent siblings)
  25 siblings, 0 replies; 55+ messages in thread
From: Alex Bennée @ 2017-02-24 11:21 UTC (permalink / raw)
  To: peter.maydell
  Cc: qemu-devel, Alex Bennée, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson

The main use case for tlb_reset_dirty is to set the TLB_NOTDIRTY flags
in TLB entries to force the slow path on writes. This is used to mark
page ranges containing code which has been translated so it can be
invalidated if written to. To do this safely we need to ensure the TLB
entries in question for all vCPUs are updated before we attempt to run
the code; otherwise a race could be introduced.

To achieve this we atomically set the flag in tlb_reset_dirty_range and
take care when setting it when the TLB entry is filled.

On 32-bit hosts attempting to emulate 64-bit guests we don't even
bother, as we might not have the atomic primitives available. MTTCG is
disabled in this case and can't be forced on. The copy_tlb_helper
function helps keep the atomic semantics in one place to avoid
confusion.

The dirty helper function is made static as it isn't used outside of
cputlb.
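
The lock-free update in tlb_reset_dirty_range can be sketched with C11
atomics standing in for QEMU's atomic_* macros. `TLB_NOTDIRTY` here is
an illustrative low-bit flag, not QEMU's actual value, and `set_notdirty`
is a hypothetical name.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define TLB_NOTDIRTY 0x1 /* illustrative flag bit, not QEMU's value */

/* Sketch of the cmpxchg in tlb_reset_dirty_range: read the entry's
 * addr_write, and only OR in the flag if no other vCPU changed the
 * entry in the meantime.  On failure we deliberately skip the update,
 * exactly as the patch does. */
static void set_notdirty(_Atomic uintptr_t *addr_write)
{
    uintptr_t orig = atomic_load(addr_write);
    uintptr_t flagged = orig | TLB_NOTDIRTY;

    /* Fails harmlessly if *addr_write no longer equals orig. */
    atomic_compare_exchange_strong(addr_write, &orig, flagged);
}

/* Tiny demonstration: flag a fresh entry and return its final value. */
static uintptr_t demo(void)
{
    _Atomic uintptr_t w = 0x1000;

    set_notdirty(&w);
    return atomic_load(&w);
}
```

The key design point the commit message describes is that a failed
compare-and-swap means the target vCPU refilled the entry concurrently,
so the stale update is simply dropped rather than retried.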

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 cputlb.c              | 120 +++++++++++++++++++++++++++++++++++++++-----------
 include/exec/cputlb.h |   2 -
 2 files changed, 95 insertions(+), 27 deletions(-)

diff --git a/cputlb.c b/cputlb.c
index c50254be26..65003350e3 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -342,32 +342,90 @@ void tlb_unprotect_code(ram_addr_t ram_addr)
     cpu_physical_memory_set_dirty_flag(ram_addr, DIRTY_MEMORY_CODE);
 }
 
-static bool tlb_is_dirty_ram(CPUTLBEntry *tlbe)
-{
-    return (tlbe->addr_write & (TLB_INVALID_MASK|TLB_MMIO|TLB_NOTDIRTY)) == 0;
-}
 
-void tlb_reset_dirty_range(CPUTLBEntry *tlb_entry, uintptr_t start,
+/*
+ * Dirty write flag handling
+ *
+ * When the TCG code writes to a location it looks up the address in
+ * the TLB and uses that data to compute the final address. If any of
+ * the lower bits of the address are set then the slow path is forced.
+ * There are a number of reasons to do this but for normal RAM the
+ * most usual is detecting writes to code regions which may invalidate
+ * generated code.
+ *
+ * Because we want other vCPUs to respond to changes straight away we
+ * update the te->addr_write field atomically. If the TLB entry has
+ * been changed by the vCPU in the mean time we skip the update.
+ *
+ * As this function uses atomic accesses we also need to ensure
+ * updates to tlb_entries follow the same access rules. We don't need
+ * to worry about this for oversized guests as MTTCG is disabled for
+ * them.
+ */
+
+static void tlb_reset_dirty_range(CPUTLBEntry *tlb_entry, uintptr_t start,
                            uintptr_t length)
 {
-    uintptr_t addr;
+#if TCG_OVERSIZED_GUEST
+    uintptr_t addr = tlb_entry->addr_write;
 
-    if (tlb_is_dirty_ram(tlb_entry)) {
-        addr = (tlb_entry->addr_write & TARGET_PAGE_MASK) + tlb_entry->addend;
+    if ((addr & (TLB_INVALID_MASK | TLB_MMIO | TLB_NOTDIRTY)) == 0) {
+        addr &= TARGET_PAGE_MASK;
+        addr += tlb_entry->addend;
         if ((addr - start) < length) {
             tlb_entry->addr_write |= TLB_NOTDIRTY;
         }
     }
+#else
+    /* paired with atomic_mb_set in tlb_set_page_with_attrs */
+    uintptr_t orig_addr = atomic_mb_read(&tlb_entry->addr_write);
+    uintptr_t addr = orig_addr;
+
+    if ((addr & (TLB_INVALID_MASK | TLB_MMIO | TLB_NOTDIRTY)) == 0) {
+        addr &= TARGET_PAGE_MASK;
+        addr += atomic_read(&tlb_entry->addend);
+        if ((addr - start) < length) {
+            uintptr_t notdirty_addr = orig_addr | TLB_NOTDIRTY;
+            atomic_cmpxchg(&tlb_entry->addr_write, orig_addr, notdirty_addr);
+        }
+    }
+#endif
+}
+
+/* For atomic correctness when running MTTCG we need to use the right
+ * primitives when copying entries */
+static inline void copy_tlb_helper(CPUTLBEntry *d, CPUTLBEntry *s,
+                                   bool atomic_set)
+{
+#if TCG_OVERSIZED_GUEST
+    *d = *s;
+#else
+    if (atomic_set) {
+        d->addr_read = s->addr_read;
+        d->addr_code = s->addr_code;
+        atomic_set(&d->addend, atomic_read(&s->addend));
+        /* Pairs with flag setting in tlb_reset_dirty_range */
+        atomic_mb_set(&d->addr_write, atomic_read(&s->addr_write));
+    } else {
+        d->addr_read = s->addr_read;
+        d->addr_write = atomic_read(&s->addr_write);
+        d->addr_code = s->addr_code;
+        d->addend = atomic_read(&s->addend);
+    }
+#endif
 }
 
+/* This is a cross vCPU call (i.e. another vCPU resetting the flags of
+ * the target vCPU). As such care needs to be taken that we don't
+ * dangerously race with another vCPU update. The only thing actually
+ * updated is the target TLB entry ->addr_write flags.
+ */
 void tlb_reset_dirty(CPUState *cpu, ram_addr_t start1, ram_addr_t length)
 {
     CPUArchState *env;
 
     int mmu_idx;
 
-    assert_cpu_is_self(cpu);
-
     env = cpu->env_ptr;
     for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
         unsigned int i;
@@ -455,7 +513,7 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
     target_ulong address;
     target_ulong code_address;
     uintptr_t addend;
-    CPUTLBEntry *te;
+    CPUTLBEntry *te, *tv, tn;
     hwaddr iotlb, xlat, sz;
     unsigned vidx = env->vtlb_index++ % CPU_VTLB_SIZE;
     int asidx = cpu_asidx_from_attrs(cpu, attrs);
@@ -490,41 +548,50 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
 
     index = (vaddr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
     te = &env->tlb_table[mmu_idx][index];
-
     /* do not discard the translation in te, evict it into a victim tlb */
-    env->tlb_v_table[mmu_idx][vidx] = *te;
+    tv = &env->tlb_v_table[mmu_idx][vidx];
+
+    /* addr_write can race with tlb_reset_dirty_range */
+    copy_tlb_helper(tv, te, true);
+
     env->iotlb_v[mmu_idx][vidx] = env->iotlb[mmu_idx][index];
 
     /* refill the tlb */
     env->iotlb[mmu_idx][index].addr = iotlb - vaddr;
     env->iotlb[mmu_idx][index].attrs = attrs;
-    te->addend = addend - vaddr;
+
+    /* Now calculate the new entry */
+    tn.addend = addend - vaddr;
     if (prot & PAGE_READ) {
-        te->addr_read = address;
+        tn.addr_read = address;
     } else {
-        te->addr_read = -1;
+        tn.addr_read = -1;
     }
 
     if (prot & PAGE_EXEC) {
-        te->addr_code = code_address;
+        tn.addr_code = code_address;
     } else {
-        te->addr_code = -1;
+        tn.addr_code = -1;
     }
+
+    tn.addr_write = -1;
     if (prot & PAGE_WRITE) {
         if ((memory_region_is_ram(section->mr) && section->readonly)
             || memory_region_is_romd(section->mr)) {
             /* Write access calls the I/O callback.  */
-            te->addr_write = address | TLB_MMIO;
+            tn.addr_write = address | TLB_MMIO;
         } else if (memory_region_is_ram(section->mr)
                    && cpu_physical_memory_is_clean(
                         memory_region_get_ram_addr(section->mr) + xlat)) {
-            te->addr_write = address | TLB_NOTDIRTY;
+            tn.addr_write = address | TLB_NOTDIRTY;
         } else {
-            te->addr_write = address;
+            tn.addr_write = address;
         }
-    } else {
-        te->addr_write = -1;
     }
+
+    /* Pairs with flag setting in tlb_reset_dirty_range */
+    copy_tlb_helper(te, &tn, true);
+    /* atomic_mb_set(&te->addr_write, write_address); */
 }
 
 /* Add a new TLB entry, but without specifying the memory
@@ -687,10 +754,13 @@ static bool victim_tlb_hit(CPUArchState *env, size_t mmu_idx, size_t index,
         if (cmp == page) {
             /* Found entry in victim tlb, swap tlb and iotlb.  */
             CPUTLBEntry tmptlb, *tlb = &env->tlb_table[mmu_idx][index];
+
+            copy_tlb_helper(&tmptlb, tlb, false);
+            copy_tlb_helper(tlb, vtlb, true);
+            copy_tlb_helper(vtlb, &tmptlb, true);
+
             CPUIOTLBEntry tmpio, *io = &env->iotlb[mmu_idx][index];
             CPUIOTLBEntry *vio = &env->iotlb_v[mmu_idx][vidx];
-
-            tmptlb = *tlb; *tlb = *vtlb; *vtlb = tmptlb;
             tmpio = *io; *io = *vio; *vio = tmpio;
             return true;
         }
diff --git a/include/exec/cputlb.h b/include/exec/cputlb.h
index d454c005b7..3f941783c5 100644
--- a/include/exec/cputlb.h
+++ b/include/exec/cputlb.h
@@ -23,8 +23,6 @@
 /* cputlb.c */
 void tlb_protect_code(ram_addr_t ram_addr);
 void tlb_unprotect_code(ram_addr_t ram_addr);
-void tlb_reset_dirty_range(CPUTLBEntry *tlb_entry, uintptr_t start,
-                           uintptr_t length);
 extern int tlb_flush_count;
 
 #endif
-- 
2.11.0


* [Qemu-devel] [PULL 19/24] cputlb: introduce tlb_flush_*_all_cpus[_synced]
  2017-02-24 11:20 [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Alex Bennée
                   ` (17 preceding siblings ...)
  2017-02-24 11:21 ` [Qemu-devel] [PULL 18/24] cputlb: atomically update tlb fields used by tlb_reset_dirty Alex Bennée
@ 2017-02-24 11:21 ` Alex Bennée
  2017-02-24 11:21 ` [Qemu-devel] [PULL 20/24] target-arm/powerctl: defer cpu reset work to CPU context Alex Bennée
                   ` (6 subsequent siblings)
  25 siblings, 0 replies; 55+ messages in thread
From: Alex Bennée @ 2017-02-24 11:21 UTC (permalink / raw)
  To: peter.maydell
  Cc: qemu-devel, Alex Bennée, Paolo Bonzini, Peter Crosthwaite,
	Richard Henderson

This introduces support to the cputlb API for flushing all CPUs' TLBs
with one call. This avoids the need for target helpers to iterate
through the vCPUs themselves.

An additional variant of the API (_synced) causes the source vCPU's
work to be scheduled as "safe work". The result is that all the flush
operations will be complete by the time the originating vCPU executes
its safe work. The calling implementation can either end the TB
straight away (which will then pick up cpu->exit_request on entering
the next block) or defer the exit until the architectural sync point
(usually a barrier instruction).
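
The "run on every other vCPU, then on the source" pattern that
flush_all_helper implements can be sketched stand-alone. Here the work
runs synchronously for illustration, whereas QEMU queues it via
async_run_on_cpu; `NR_CPUS`, `do_flush`, and `flush_all` are
illustrative names only.

```c
#include <assert.h>

#define NR_CPUS 4

typedef void run_fn(int cpu, int data);

static int flushed[NR_CPUS];

/* Stand-in for the per-vCPU flush work item. */
static void do_flush(int cpu, int data)
{
    flushed[cpu] = data;
}

/* Sketch of the flush_all_helper pattern: dispatch the work to every
 * vCPU except the source, then run it on the source last. */
static void flush_all(int src, run_fn *fn, int data)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (cpu != src) {
            fn(cpu, data);
        }
    }
    fn(src, data);
}

/* Demonstration: flush from vCPU 1 and check every vCPU saw the work. */
static int demo_flush(void)
{
    int ok = 1;

    flush_all(1, do_flush, 7);
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        ok &= (flushed[cpu] == 7);
    }
    return ok;
}
```

In the _synced variants the final call on the source becomes
async_safe_run_on_cpu, which is what creates the synchronisation point
the commit message describes.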

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 cputlb.c                | 108 +++++++++++++++++++++++++++++++++++++++++---
 include/exec/exec-all.h | 116 ++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 215 insertions(+), 9 deletions(-)

diff --git a/cputlb.c b/cputlb.c
index 65003350e3..7fa7fefa05 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -73,6 +73,25 @@ QEMU_BUILD_BUG_ON(sizeof(target_ulong) > sizeof(run_on_cpu_data));
 QEMU_BUILD_BUG_ON(NB_MMU_MODES > 16);
 #define ALL_MMUIDX_BITS ((1 << NB_MMU_MODES) - 1)
 
+/* flush_all_helper: run fn across all cpus
+ *
+ * If the wait flag is set then the src cpu's helper will be queued as
+ * "safe" work and the loop exited creating a synchronisation point
+ * where all queued work will be finished before execution starts
+ * again.
+ */
+static void flush_all_helper(CPUState *src, run_on_cpu_func fn,
+                             run_on_cpu_data d)
+{
+    CPUState *cpu;
+
+    CPU_FOREACH(cpu) {
+        if (cpu != src) {
+            async_run_on_cpu(cpu, fn, d);
+        }
+    }
+}
+
 /* statistics */
 int tlb_flush_count;
 
@@ -128,6 +147,20 @@ void tlb_flush(CPUState *cpu)
     }
 }
 
+void tlb_flush_all_cpus(CPUState *src_cpu)
+{
+    const run_on_cpu_func fn = tlb_flush_global_async_work;
+    flush_all_helper(src_cpu, fn, RUN_ON_CPU_NULL);
+    fn(src_cpu, RUN_ON_CPU_NULL);
+}
+
+void tlb_flush_all_cpus_synced(CPUState *src_cpu)
+{
+    const run_on_cpu_func fn = tlb_flush_global_async_work;
+    flush_all_helper(src_cpu, fn, RUN_ON_CPU_NULL);
+    async_safe_run_on_cpu(src_cpu, fn, RUN_ON_CPU_NULL);
+}
+
 static void tlb_flush_by_mmuidx_async_work(CPUState *cpu, run_on_cpu_data data)
 {
     CPUArchState *env = cpu->env_ptr;
@@ -178,6 +211,29 @@ void tlb_flush_by_mmuidx(CPUState *cpu, uint16_t idxmap)
     }
 }
 
+void tlb_flush_by_mmuidx_all_cpus(CPUState *src_cpu, uint16_t idxmap)
+{
+    const run_on_cpu_func fn = tlb_flush_by_mmuidx_async_work;
+
+    tlb_debug("mmu_idx: 0x%"PRIx16"\n", idxmap);
+
+    flush_all_helper(src_cpu, fn, RUN_ON_CPU_HOST_INT(idxmap));
+    fn(src_cpu, RUN_ON_CPU_HOST_INT(idxmap));
+}
+
+void tlb_flush_by_mmuidx_all_cpus_synced(CPUState *src_cpu,
+                                                       uint16_t idxmap)
+{
+    const run_on_cpu_func fn = tlb_flush_by_mmuidx_async_work;
+
+    tlb_debug("mmu_idx: 0x%"PRIx16"\n", idxmap);
+
+    flush_all_helper(src_cpu, fn, RUN_ON_CPU_HOST_INT(idxmap));
+    async_safe_run_on_cpu(src_cpu, fn, RUN_ON_CPU_HOST_INT(idxmap));
+}
+
+
+
 static inline void tlb_flush_entry(CPUTLBEntry *tlb_entry, target_ulong addr)
 {
     if (addr == (tlb_entry->addr_read &
@@ -317,14 +373,54 @@ void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, uint16_t idxmap)
     }
 }
 
-void tlb_flush_page_all(target_ulong addr)
+void tlb_flush_page_by_mmuidx_all_cpus(CPUState *src_cpu, target_ulong addr,
+                                       uint16_t idxmap)
 {
-    CPUState *cpu;
+    const run_on_cpu_func fn = tlb_check_page_and_flush_by_mmuidx_async_work;
+    target_ulong addr_and_mmu_idx;
 
-    CPU_FOREACH(cpu) {
-        async_run_on_cpu(cpu, tlb_flush_page_async_work,
-                         RUN_ON_CPU_TARGET_PTR(addr));
-    }
+    tlb_debug("addr: "TARGET_FMT_lx" mmu_idx:%"PRIx16"\n", addr, idxmap);
+
+    /* This should already be page aligned */
+    addr_and_mmu_idx = addr & TARGET_PAGE_MASK;
+    addr_and_mmu_idx |= idxmap;
+
+    flush_all_helper(src_cpu, fn, RUN_ON_CPU_TARGET_PTR(addr_and_mmu_idx));
+    fn(src_cpu, RUN_ON_CPU_TARGET_PTR(addr_and_mmu_idx));
+}
+
+void tlb_flush_page_by_mmuidx_all_cpus_synced(CPUState *src_cpu,
+                                                            target_ulong addr,
+                                                            uint16_t idxmap)
+{
+    const run_on_cpu_func fn = tlb_check_page_and_flush_by_mmuidx_async_work;
+    target_ulong addr_and_mmu_idx;
+
+    tlb_debug("addr: "TARGET_FMT_lx" mmu_idx:%"PRIx16"\n", addr, idxmap);
+
+    /* This should already be page aligned */
+    addr_and_mmu_idx = addr & TARGET_PAGE_MASK;
+    addr_and_mmu_idx |= idxmap;
+
+    flush_all_helper(src_cpu, fn, RUN_ON_CPU_TARGET_PTR(addr_and_mmu_idx));
+    async_safe_run_on_cpu(src_cpu, fn, RUN_ON_CPU_TARGET_PTR(addr_and_mmu_idx));
+}
+
+void tlb_flush_page_all_cpus(CPUState *src, target_ulong addr)
+{
+    const run_on_cpu_func fn = tlb_flush_page_async_work;
+
+    flush_all_helper(src, fn, RUN_ON_CPU_TARGET_PTR(addr));
+    fn(src, RUN_ON_CPU_TARGET_PTR(addr));
+}
+
+void tlb_flush_page_all_cpus_synced(CPUState *src,
+                                                  target_ulong addr)
+{
+    const run_on_cpu_func fn = tlb_flush_page_async_work;
+
+    flush_all_helper(src, fn, RUN_ON_CPU_TARGET_PTR(addr));
+    async_safe_run_on_cpu(src, fn, RUN_ON_CPU_TARGET_PTR(addr));
 }
 
 /* update the TLBs so that writes to code in the virtual page 'addr'
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index e94e6849dd..bcde1e6a14 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -93,6 +93,27 @@ void cpu_address_space_init(CPUState *cpu, AddressSpace *as, int asidx);
  */
 void tlb_flush_page(CPUState *cpu, target_ulong addr);
 /**
+ * tlb_flush_page_all_cpus:
+ * @src: source CPU of the flush
+ * @addr: virtual address of page to be flushed
+ *
+ * Flush one page from the TLB of all CPUs, for all
+ * MMU indexes.
+ */
+void tlb_flush_page_all_cpus(CPUState *src, target_ulong addr);
+/**
+ * tlb_flush_page_all_cpus_synced:
+ * @src: source CPU of the flush
+ * @addr: virtual address of page to be flushed
+ *
+ * Flush one page from the TLB of all CPUs, for all MMU indexes.
+ * Like tlb_flush_page_all_cpus, except the source vCPU's work is
+ * scheduled as safe work, meaning all flushes will be complete once
+ * the source vCPU's safe work is complete. This will depend on when
+ * the guest's translation ends the TB.
+ */
+void tlb_flush_page_all_cpus_synced(CPUState *src, target_ulong addr);
+/**
  * tlb_flush:
  * @cpu: CPU whose TLB should be flushed
  *
@@ -103,6 +124,21 @@ void tlb_flush_page(CPUState *cpu, target_ulong addr);
  */
 void tlb_flush(CPUState *cpu);
 /**
+ * tlb_flush_all_cpus:
+ * @src_cpu: source CPU of the flush
+ *
+ * Flush the entire TLB of all CPUs, for all MMU indexes.
+ */
+void tlb_flush_all_cpus(CPUState *src_cpu);
+/**
+ * tlb_flush_all_cpus_synced:
+ * @src_cpu: source CPU of the flush
+ *
+ * Flush the entire TLB of all CPUs, for all MMU indexes. Like
+ * tlb_flush_all_cpus, except the source vCPU's work is scheduled as
+ * safe work, meaning all flushes will be complete once the source
+ * vCPU's safe work is complete. This will depend on when the guest's
+ * translation ends the TB.
+ */
+void tlb_flush_all_cpus_synced(CPUState *src_cpu);
+/**
  * tlb_flush_page_by_mmuidx:
  * @cpu: CPU whose TLB should be flushed
  * @addr: virtual address of page to be flushed
@@ -114,8 +150,34 @@ void tlb_flush(CPUState *cpu);
 void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr,
                               uint16_t idxmap);
 /**
+ * tlb_flush_page_by_mmuidx_all_cpus:
+ * @cpu: Originating CPU of the flush
+ * @addr: virtual address of page to be flushed
+ * @idxmap: bitmap of MMU indexes to flush
+ *
+ * Flush one page from the TLB of all CPUs, for the specified
+ * MMU indexes.
+ */
+void tlb_flush_page_by_mmuidx_all_cpus(CPUState *cpu, target_ulong addr,
+                                       uint16_t idxmap);
+/**
+ * tlb_flush_page_by_mmuidx_all_cpus_synced:
+ * @cpu: Originating CPU of the flush
+ * @addr: virtual address of page to be flushed
+ * @idxmap: bitmap of MMU indexes to flush
+ *
+ * Flush one page from the TLB of all CPUs, for the specified MMU
+ * indexes. Like tlb_flush_page_by_mmuidx_all_cpus, except the source
+ * vCPU's work is scheduled as safe work, meaning all flushes will be
+ * complete once the source vCPU's safe work is complete. This will
+ * depend on when the guest's translation ends the TB.
+ */
+void tlb_flush_page_by_mmuidx_all_cpus_synced(CPUState *cpu, target_ulong addr,
+                                              uint16_t idxmap);
+/**
  * tlb_flush_by_mmuidx:
  * @cpu: CPU whose TLB should be flushed
  * @idxmap: bitmap of MMU indexes to flush
  *
  * Flush all entries from the TLB of the specified CPU, for the specified
@@ -123,6 +185,27 @@ void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr,
  */
 void tlb_flush_by_mmuidx(CPUState *cpu, uint16_t idxmap);
 /**
+ * tlb_flush_by_mmuidx_all_cpus:
+ * @cpu: Originating CPU of the flush
+ * @idxmap: bitmap of MMU indexes to flush
+ *
+ * Flush all entries from all TLBs of all CPUs, for the specified
+ * MMU indexes.
+ */
+void tlb_flush_by_mmuidx_all_cpus(CPUState *cpu, uint16_t idxmap);
+/**
+ * tlb_flush_by_mmuidx_all_cpus_synced:
+ * @cpu: Originating CPU of the flush
+ * @idxmap: bitmap of MMU indexes to flush
+ *
+ * Flush all entries from all TLBs of all CPUs, for the specified
+ * MMU indexes. Like tlb_flush_by_mmuidx_all_cpus, except the source
+ * vCPU's work is scheduled as safe work, meaning all flushes will be
+ * complete once the source vCPU's safe work is complete. This will
+ * depend on when the guest's translation ends the TB.
+ */
+void tlb_flush_by_mmuidx_all_cpus_synced(CPUState *cpu, uint16_t idxmap);
+/**
  * tlb_set_page_with_attrs:
  * @cpu: CPU to add this TLB entry for
  * @vaddr: virtual address of page to add entry for
@@ -159,16 +242,26 @@ void tlb_set_page(CPUState *cpu, target_ulong vaddr,
 void tb_invalidate_phys_addr(AddressSpace *as, hwaddr addr);
 void probe_write(CPUArchState *env, target_ulong addr, int mmu_idx,
                  uintptr_t retaddr);
-void tlb_flush_page_all(target_ulong addr);
 #else
 static inline void tlb_flush_page(CPUState *cpu, target_ulong addr)
 {
 }
-
+static inline void tlb_flush_page_all_cpus(CPUState *src, target_ulong addr)
+{
+}
+static inline void tlb_flush_page_all_cpus_synced(CPUState *src,
+                                                  target_ulong addr)
+{
+}
 static inline void tlb_flush(CPUState *cpu)
 {
 }
-
+static inline void tlb_flush_all_cpus(CPUState *src_cpu)
+{
+}
+static inline void tlb_flush_all_cpus_synced(CPUState *src_cpu)
+{
+}
 static inline void tlb_flush_page_by_mmuidx(CPUState *cpu,
                                             target_ulong addr, uint16_t idxmap)
 {
@@ -177,6 +270,23 @@ static inline void tlb_flush_page_by_mmuidx(CPUState *cpu,
 static inline void tlb_flush_by_mmuidx(CPUState *cpu, uint16_t idxmap)
 {
 }
+static inline void tlb_flush_page_by_mmuidx_all_cpus(CPUState *cpu,
+                                                     target_ulong addr,
+                                                     uint16_t idxmap)
+{
+}
+static inline void tlb_flush_page_by_mmuidx_all_cpus_synced(CPUState *cpu,
+                                                            target_ulong addr,
+                                                            uint16_t idxmap)
+{
+}
+static inline void tlb_flush_by_mmuidx_all_cpus(CPUState *cpu, uint16_t idxmap)
+{
+}
+static inline void tlb_flush_by_mmuidx_all_cpus_synced(CPUState *cpu,
+                                                       uint16_t idxmap)
+{
+}
 #endif
 
 #define CODE_GEN_ALIGN           16 /* must be >= of the size of a icache line */
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [Qemu-devel] [PULL 20/24] target-arm/powerctl: defer cpu reset work to CPU context
  2017-02-24 11:20 [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Alex Bennée
                   ` (18 preceding siblings ...)
  2017-02-24 11:21 ` [Qemu-devel] [PULL 19/24] cputlb: introduce tlb_flush_*_all_cpus[_synced] Alex Bennée
@ 2017-02-24 11:21 ` Alex Bennée
  2017-02-24 11:21 ` [Qemu-devel] [PULL 21/24] target-arm: don't generate WFE/YIELD calls for MTTCG Alex Bennée
                   ` (5 subsequent siblings)
  25 siblings, 0 replies; 55+ messages in thread
From: Alex Bennée @ 2017-02-24 11:21 UTC (permalink / raw)
  To: peter.maydell; +Cc: qemu-devel, Alex Bennée, open list:ARM

When switching a new vCPU on we want to complete a number of setup
tasks before we start scheduling the vCPU thread. To do this cleanly
we defer the vCPU setup to async work, which will run in the vCPU's
execution context as the thread is woken up. Scheduling the work will
kick the vCPU awake.

This avoids potential races in MTTCG system emulation.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/arm-powerctl.c | 202 +++++++++++++++++++++++++++++++---------------
 target/arm/arm-powerctl.h |   2 +
 target/arm/cpu.c          |   4 +-
 target/arm/cpu.h          |  15 +++-
 target/arm/kvm.c          |   7 +-
 target/arm/machine.c      |  41 +++++++++-
 target/arm/psci.c         |   4 +-
 7 files changed, 201 insertions(+), 74 deletions(-)

diff --git a/target/arm/arm-powerctl.c b/target/arm/arm-powerctl.c
index fbb7a15daa..25207cb850 100644
--- a/target/arm/arm-powerctl.c
+++ b/target/arm/arm-powerctl.c
@@ -14,6 +14,7 @@
 #include "internals.h"
 #include "arm-powerctl.h"
 #include "qemu/log.h"
+#include "qemu/main-loop.h"
 #include "exec/exec-all.h"
 
 #ifndef DEBUG_ARM_POWERCTL
@@ -48,11 +49,93 @@ CPUState *arm_get_cpu_by_id(uint64_t id)
     return NULL;
 }
 
+struct CpuOnInfo {
+    uint64_t entry;
+    uint64_t context_id;
+    uint32_t target_el;
+    bool target_aa64;
+};
+
+
+static void arm_set_cpu_on_async_work(CPUState *target_cpu_state,
+                                      run_on_cpu_data data)
+{
+    ARMCPU *target_cpu = ARM_CPU(target_cpu_state);
+    struct CpuOnInfo *info = (struct CpuOnInfo *) data.host_ptr;
+
+    /* Initialize the cpu we are turning on */
+    cpu_reset(target_cpu_state);
+    target_cpu_state->halted = 0;
+
+    if (info->target_aa64) {
+        if ((info->target_el < 3) && arm_feature(&target_cpu->env,
+                                                 ARM_FEATURE_EL3)) {
+            /*
+             * As target mode is AArch64, we need to set lower
+             * exception level (the requested level 2) to AArch64
+             */
+            target_cpu->env.cp15.scr_el3 |= SCR_RW;
+        }
+
+        if ((info->target_el < 2) && arm_feature(&target_cpu->env,
+                                                 ARM_FEATURE_EL2)) {
+            /*
+             * As target mode is AArch64, we need to set lower
+             * exception level (the requested level 1) to AArch64
+             */
+            target_cpu->env.cp15.hcr_el2 |= HCR_RW;
+        }
+
+        target_cpu->env.pstate = aarch64_pstate_mode(info->target_el, true);
+    } else {
+        /* We are requested to boot in AArch32 mode */
+        static const uint32_t mode_for_el[] = { 0,
+                                                ARM_CPU_MODE_SVC,
+                                                ARM_CPU_MODE_HYP,
+                                                ARM_CPU_MODE_SVC };
+
+        cpsr_write(&target_cpu->env, mode_for_el[info->target_el], CPSR_M,
+                   CPSRWriteRaw);
+    }
+
+    if (info->target_el == 3) {
+        /* Processor is in secure mode */
+        target_cpu->env.cp15.scr_el3 &= ~SCR_NS;
+    } else {
+        /* Processor is not in secure mode */
+        target_cpu->env.cp15.scr_el3 |= SCR_NS;
+    }
+
+    /* We check if the started CPU is now at the correct level */
+    assert(info->target_el == arm_current_el(&target_cpu->env));
+
+    if (info->target_aa64) {
+        target_cpu->env.xregs[0] = info->context_id;
+        target_cpu->env.thumb = false;
+    } else {
+        target_cpu->env.regs[0] = info->context_id;
+        target_cpu->env.thumb = info->entry & 1;
+        info->entry &= 0xfffffffe;
+    }
+
+    /* Start the new CPU at the requested address */
+    cpu_set_pc(target_cpu_state, info->entry);
+
+    g_free(info);
+
+    /* Finally set the power status */
+    assert(qemu_mutex_iothread_locked());
+    target_cpu->power_state = PSCI_ON;
+}
+
 int arm_set_cpu_on(uint64_t cpuid, uint64_t entry, uint64_t context_id,
                    uint32_t target_el, bool target_aa64)
 {
     CPUState *target_cpu_state;
     ARMCPU *target_cpu;
+    struct CpuOnInfo *info;
+
+    assert(qemu_mutex_iothread_locked());
 
     DPRINTF("cpu %" PRId64 " (EL %d, %s) @ 0x%" PRIx64 " with R0 = 0x%" PRIx64
             "\n", cpuid, target_el, target_aa64 ? "aarch64" : "aarch32", entry,
@@ -77,7 +160,7 @@ int arm_set_cpu_on(uint64_t cpuid, uint64_t entry, uint64_t context_id,
     }
 
     target_cpu = ARM_CPU(target_cpu_state);
-    if (!target_cpu->powered_off) {
+    if (target_cpu->power_state == PSCI_ON) {
         qemu_log_mask(LOG_GUEST_ERROR,
                       "[ARM]%s: CPU %" PRId64 " is already on\n",
                       __func__, cpuid);
@@ -109,74 +192,54 @@ int arm_set_cpu_on(uint64_t cpuid, uint64_t entry, uint64_t context_id,
         return QEMU_ARM_POWERCTL_INVALID_PARAM;
     }
 
-    /* Initialize the cpu we are turning on */
-    cpu_reset(target_cpu_state);
-    target_cpu->powered_off = false;
-    target_cpu_state->halted = 0;
-
-    if (target_aa64) {
-        if ((target_el < 3) && arm_feature(&target_cpu->env, ARM_FEATURE_EL3)) {
-            /*
-             * As target mode is AArch64, we need to set lower
-             * exception level (the requested level 2) to AArch64
-             */
-            target_cpu->env.cp15.scr_el3 |= SCR_RW;
-        }
-
-        if ((target_el < 2) && arm_feature(&target_cpu->env, ARM_FEATURE_EL2)) {
-            /*
-             * As target mode is AArch64, we need to set lower
-             * exception level (the requested level 1) to AArch64
-             */
-            target_cpu->env.cp15.hcr_el2 |= HCR_RW;
-        }
-
-        target_cpu->env.pstate = aarch64_pstate_mode(target_el, true);
-    } else {
-        /* We are requested to boot in AArch32 mode */
-        static uint32_t mode_for_el[] = { 0,
-                                          ARM_CPU_MODE_SVC,
-                                          ARM_CPU_MODE_HYP,
-                                          ARM_CPU_MODE_SVC };
-
-        cpsr_write(&target_cpu->env, mode_for_el[target_el], CPSR_M,
-                   CPSRWriteRaw);
-    }
-
-    if (target_el == 3) {
-        /* Processor is in secure mode */
-        target_cpu->env.cp15.scr_el3 &= ~SCR_NS;
-    } else {
-        /* Processor is not in secure mode */
-        target_cpu->env.cp15.scr_el3 |= SCR_NS;
-    }
-
-    /* We check if the started CPU is now at the correct level */
-    assert(target_el == arm_current_el(&target_cpu->env));
-
-    if (target_aa64) {
-        target_cpu->env.xregs[0] = context_id;
-        target_cpu->env.thumb = false;
-    } else {
-        target_cpu->env.regs[0] = context_id;
-        target_cpu->env.thumb = entry & 1;
-        entry &= 0xfffffffe;
+    /*
+     * If another CPU has powered the target on we are in the state
+     * ON_PENDING and additional attempts to power on the CPU should
+     * fail (see 6.6 Implementation CPU_ON/CPU_OFF races in the PSCI
+     * spec)
+     */
+    if (target_cpu->power_state == PSCI_ON_PENDING) {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "[ARM]%s: CPU %" PRId64 " is already powering on\n",
+                      __func__, cpuid);
+        return QEMU_ARM_POWERCTL_ON_PENDING;
     }
 
-    /* Start the new CPU at the requested address */
-    cpu_set_pc(target_cpu_state, entry);
+    /* To avoid racing with a CPU we are just kicking off, we do the
+     * final bit of preparation for the work in the target CPU's
+     * context.
+     */
+    info = g_new(struct CpuOnInfo, 1);
+    info->entry = entry;
+    info->context_id = context_id;
+    info->target_el = target_el;
+    info->target_aa64 = target_aa64;
 
-    qemu_cpu_kick(target_cpu_state);
+    async_run_on_cpu(target_cpu_state, arm_set_cpu_on_async_work,
+                     RUN_ON_CPU_HOST_PTR(info));
 
     /* We are good to go */
     return QEMU_ARM_POWERCTL_RET_SUCCESS;
 }
 
+static void arm_set_cpu_off_async_work(CPUState *target_cpu_state,
+                                       run_on_cpu_data data)
+{
+    ARMCPU *target_cpu = ARM_CPU(target_cpu_state);
+
+    assert(qemu_mutex_iothread_locked());
+    target_cpu->power_state = PSCI_OFF;
+    target_cpu_state->halted = 1;
+    target_cpu_state->exception_index = EXCP_HLT;
+}
+
 int arm_set_cpu_off(uint64_t cpuid)
 {
     CPUState *target_cpu_state;
     ARMCPU *target_cpu;
 
+    assert(qemu_mutex_iothread_locked());
+
     DPRINTF("cpu %" PRId64 "\n", cpuid);
 
     /* change to the cpu we are powering up */
@@ -185,27 +248,34 @@ int arm_set_cpu_off(uint64_t cpuid)
         return QEMU_ARM_POWERCTL_INVALID_PARAM;
     }
     target_cpu = ARM_CPU(target_cpu_state);
-    if (target_cpu->powered_off) {
+    if (target_cpu->power_state == PSCI_OFF) {
         qemu_log_mask(LOG_GUEST_ERROR,
                       "[ARM]%s: CPU %" PRId64 " is already off\n",
                       __func__, cpuid);
         return QEMU_ARM_POWERCTL_IS_OFF;
     }
 
-    target_cpu->powered_off = true;
-    target_cpu_state->halted = 1;
-    target_cpu_state->exception_index = EXCP_HLT;
-    cpu_loop_exit(target_cpu_state);
-    /* notreached */
+    /* Queue work to run under the target vCPU's context */
+    async_run_on_cpu(target_cpu_state, arm_set_cpu_off_async_work,
+                     RUN_ON_CPU_NULL);
 
     return QEMU_ARM_POWERCTL_RET_SUCCESS;
 }
 
+static void arm_reset_cpu_async_work(CPUState *target_cpu_state,
+                                     run_on_cpu_data data)
+{
+    /* Reset the cpu */
+    cpu_reset(target_cpu_state);
+}
+
 int arm_reset_cpu(uint64_t cpuid)
 {
     CPUState *target_cpu_state;
     ARMCPU *target_cpu;
 
+    assert(qemu_mutex_iothread_locked());
+
     DPRINTF("cpu %" PRId64 "\n", cpuid);
 
     /* change to the cpu we are resetting */
@@ -214,15 +284,17 @@ int arm_reset_cpu(uint64_t cpuid)
         return QEMU_ARM_POWERCTL_INVALID_PARAM;
     }
     target_cpu = ARM_CPU(target_cpu_state);
-    if (target_cpu->powered_off) {
+
+    if (target_cpu->power_state == PSCI_OFF) {
         qemu_log_mask(LOG_GUEST_ERROR,
                       "[ARM]%s: CPU %" PRId64 " is off\n",
                       __func__, cpuid);
         return QEMU_ARM_POWERCTL_IS_OFF;
     }
 
-    /* Reset the cpu */
-    cpu_reset(target_cpu_state);
+    /* Queue work to run under the target vCPU's context */
+    async_run_on_cpu(target_cpu_state, arm_reset_cpu_async_work,
+                     RUN_ON_CPU_NULL);
 
     return QEMU_ARM_POWERCTL_RET_SUCCESS;
 }
diff --git a/target/arm/arm-powerctl.h b/target/arm/arm-powerctl.h
index 98ee04989b..04353923c0 100644
--- a/target/arm/arm-powerctl.h
+++ b/target/arm/arm-powerctl.h
@@ -17,6 +17,7 @@
 #define QEMU_ARM_POWERCTL_INVALID_PARAM QEMU_PSCI_RET_INVALID_PARAMS
 #define QEMU_ARM_POWERCTL_ALREADY_ON QEMU_PSCI_RET_ALREADY_ON
 #define QEMU_ARM_POWERCTL_IS_OFF QEMU_PSCI_RET_DENIED
+#define QEMU_ARM_POWERCTL_ON_PENDING QEMU_PSCI_RET_ON_PENDING
 
 /*
  * arm_get_cpu_by_id:
@@ -43,6 +44,7 @@ CPUState *arm_get_cpu_by_id(uint64_t cpuid);
  * Returns: QEMU_ARM_POWERCTL_RET_SUCCESS on success.
  * QEMU_ARM_POWERCTL_INVALID_PARAM if bad parameters are provided.
  * QEMU_ARM_POWERCTL_ALREADY_ON if the CPU was already started.
+ * QEMU_ARM_POWERCTL_ON_PENDING if the CPU is still powering up
  */
 int arm_set_cpu_on(uint64_t cpuid, uint64_t entry, uint64_t context_id,
                    uint32_t target_el, bool target_aa64);
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 4a069f6985..f7157dc0e5 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -45,7 +45,7 @@ static bool arm_cpu_has_work(CPUState *cs)
 {
     ARMCPU *cpu = ARM_CPU(cs);
 
-    return !cpu->powered_off
+    return (cpu->power_state != PSCI_OFF)
         && cs->interrupt_request &
         (CPU_INTERRUPT_FIQ | CPU_INTERRUPT_HARD
          | CPU_INTERRUPT_VFIQ | CPU_INTERRUPT_VIRQ
@@ -132,7 +132,7 @@ static void arm_cpu_reset(CPUState *s)
     env->vfp.xregs[ARM_VFP_MVFR1] = cpu->mvfr1;
     env->vfp.xregs[ARM_VFP_MVFR2] = cpu->mvfr2;
 
-    cpu->powered_off = cpu->start_powered_off;
+    cpu->power_state = cpu->start_powered_off ? PSCI_OFF : PSCI_ON;
     s->halted = cpu->start_powered_off;
 
     if (arm_feature(env, ARM_FEATURE_IWMMXT)) {
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 0956a54e89..e285ba3b4b 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -526,6 +526,15 @@ typedef struct CPUARMState {
  */
 typedef void ARMELChangeHook(ARMCPU *cpu, void *opaque);
 
+
+/* These values map onto the return values for
+ * QEMU_PSCI_0_2_FN_AFFINITY_INFO */
+typedef enum ARMPSCIState {
+    PSCI_OFF = 0,
+    PSCI_ON = 1,
+    PSCI_ON_PENDING = 2
+} ARMPSCIState;
+
 /**
  * ARMCPU:
  * @env: #CPUARMState
@@ -582,8 +591,10 @@ struct ARMCPU {
 
     /* Should CPU start in PSCI powered-off state? */
     bool start_powered_off;
-    /* CPU currently in PSCI powered-off state */
-    bool powered_off;
+
+    /* Current power state, access guarded by BQL */
+    ARMPSCIState power_state;
+
     /* CPU has virtualization extension */
     bool has_el2;
     /* CPU has security extension */
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index c00b94e42a..395e986973 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -488,8 +488,8 @@ int kvm_arm_sync_mpstate_to_kvm(ARMCPU *cpu)
 {
     if (cap_has_mp_state) {
         struct kvm_mp_state mp_state = {
-            .mp_state =
-            cpu->powered_off ? KVM_MP_STATE_STOPPED : KVM_MP_STATE_RUNNABLE
+            .mp_state = (cpu->power_state == PSCI_OFF) ?
+            KVM_MP_STATE_STOPPED : KVM_MP_STATE_RUNNABLE
         };
         int ret = kvm_vcpu_ioctl(CPU(cpu), KVM_SET_MP_STATE, &mp_state);
         if (ret) {
@@ -515,7 +515,8 @@ int kvm_arm_sync_mpstate_to_qemu(ARMCPU *cpu)
                     __func__, ret, strerror(-ret));
             abort();
         }
-        cpu->powered_off = (mp_state.mp_state == KVM_MP_STATE_STOPPED);
+        cpu->power_state = (mp_state.mp_state == KVM_MP_STATE_STOPPED) ?
+            PSCI_OFF : PSCI_ON;
     }
 
     return 0;
diff --git a/target/arm/machine.c b/target/arm/machine.c
index fa5ec76090..d8094a840b 100644
--- a/target/arm/machine.c
+++ b/target/arm/machine.c
@@ -211,6 +211,38 @@ static const VMStateInfo vmstate_cpsr = {
     .put = put_cpsr,
 };
 
+static int get_power(QEMUFile *f, void *opaque, size_t size,
+                    VMStateField *field)
+{
+    ARMCPU *cpu = opaque;
+    bool powered_off = qemu_get_byte(f);
+    cpu->power_state = powered_off ? PSCI_OFF : PSCI_ON;
+    return 0;
+}
+
+static int put_power(QEMUFile *f, void *opaque, size_t size,
+                    VMStateField *field, QJSON *vmdesc)
+{
+    ARMCPU *cpu = opaque;
+
+    /* Migration should never happen while we transition power states */
+
+    if (cpu->power_state == PSCI_ON ||
+        cpu->power_state == PSCI_OFF) {
+        bool powered_off = (cpu->power_state == PSCI_OFF) ? true : false;
+        qemu_put_byte(f, powered_off);
+        return 0;
+    } else {
+        return 1;
+    }
+}
+
+static const VMStateInfo vmstate_powered_off = {
+    .name = "powered_off",
+    .get = get_power,
+    .put = put_power,
+};
+
 static void cpu_pre_save(void *opaque)
 {
     ARMCPU *cpu = opaque;
@@ -329,7 +361,14 @@ const VMStateDescription vmstate_arm_cpu = {
         VMSTATE_UINT64(env.exception.vaddress, ARMCPU),
         VMSTATE_TIMER_PTR(gt_timer[GTIMER_PHYS], ARMCPU),
         VMSTATE_TIMER_PTR(gt_timer[GTIMER_VIRT], ARMCPU),
-        VMSTATE_BOOL(powered_off, ARMCPU),
+        {
+            .name = "power_state",
+            .version_id = 0,
+            .size = sizeof(bool),
+            .info = &vmstate_powered_off,
+            .flags = VMS_SINGLE,
+            .offset = 0,
+        },
         VMSTATE_END_OF_LIST()
     },
     .subsections = (const VMStateDescription*[]) {
diff --git a/target/arm/psci.c b/target/arm/psci.c
index 64bf82eea1..ade9fe2ede 100644
--- a/target/arm/psci.c
+++ b/target/arm/psci.c
@@ -127,7 +127,9 @@ void arm_handle_psci_call(ARMCPU *cpu)
                 break;
             }
             target_cpu = ARM_CPU(target_cpu_state);
-            ret = target_cpu->powered_off ? 1 : 0;
+
+            g_assert(qemu_mutex_iothread_locked());
+            ret = target_cpu->power_state;
             break;
         default:
             /* Everything above affinity level 0 is always on. */
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [Qemu-devel] [PULL 21/24] target-arm: don't generate WFE/YIELD calls for MTTCG
  2017-02-24 11:20 [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Alex Bennée
                   ` (19 preceding siblings ...)
  2017-02-24 11:21 ` [Qemu-devel] [PULL 20/24] target-arm/powerctl: defer cpu reset work to CPU context Alex Bennée
@ 2017-02-24 11:21 ` Alex Bennée
  2017-02-24 11:21 ` [Qemu-devel] [PULL 22/24] target-arm: ensure all cross vCPUs TLB flushes complete Alex Bennée
                   ` (4 subsequent siblings)
  25 siblings, 0 replies; 55+ messages in thread
From: Alex Bennée @ 2017-02-24 11:21 UTC (permalink / raw)
  To: peter.maydell; +Cc: qemu-devel, Alex Bennée, open list:ARM

The WFE and YIELD instructions are really only hints, and in TCG's
case they were useful to move scheduling on from one vCPU to the
next. In the parallel context (MTTCG) this just causes an unnecessary
cpu_exit and contention for the BQL.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/op_helper.c     |  7 +++++++
 target/arm/translate-a64.c |  8 ++++++--
 target/arm/translate.c     | 20 ++++++++++++++++----
 3 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
index 5f3e3bdae2..d64c8670fa 100644
--- a/target/arm/op_helper.c
+++ b/target/arm/op_helper.c
@@ -436,6 +436,13 @@ void HELPER(yield)(CPUARMState *env)
     ARMCPU *cpu = arm_env_get_cpu(env);
     CPUState *cs = CPU(cpu);
 
+    /* When running in MTTCG we don't generate jumps to the yield and
+     * WFE helpers as it won't affect the scheduling of other vCPUs.
+     * If we wanted to more completely model WFE/SEV so we don't busy
+     * spin unnecessarily we would need to do something more involved.
+     */
+    g_assert(!parallel_cpus);
+
     /* This is a non-trappable hint instruction that generally indicates
      * that the guest is currently busy-looping. Yield control back to the
      * top level loop so that a more deserving VCPU has a chance to run.
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index e61bbd6b3b..e15eae6d41 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -1328,10 +1328,14 @@ static void handle_hint(DisasContext *s, uint32_t insn,
         s->is_jmp = DISAS_WFI;
         return;
     case 1: /* YIELD */
-        s->is_jmp = DISAS_YIELD;
+        if (!parallel_cpus) {
+            s->is_jmp = DISAS_YIELD;
+        }
         return;
     case 2: /* WFE */
-        s->is_jmp = DISAS_WFE;
+        if (!parallel_cpus) {
+            s->is_jmp = DISAS_WFE;
+        }
         return;
     case 4: /* SEV */
     case 5: /* SEVL */
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 4436d8f3a2..abc1f77ee4 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4404,20 +4404,32 @@ static void gen_exception_return(DisasContext *s, TCGv_i32 pc)
     gen_rfe(s, pc, load_cpu_field(spsr));
 }
 
+/*
+ * For WFI we will halt the vCPU until an IRQ. For WFE and YIELD we
+ * only call the helper when running single threaded TCG code to ensure
+ * the next round-robin scheduled vCPU gets a crack. In MTTCG mode we
+ * just skip this instruction. Currently the SEV/SEVL instructions
+ * which are *one* of many ways to wake the CPU from WFE are not
+ * implemented so we can't sleep like WFI does.
+ */
 static void gen_nop_hint(DisasContext *s, int val)
 {
     switch (val) {
     case 1: /* yield */
-        gen_set_pc_im(s, s->pc);
-        s->is_jmp = DISAS_YIELD;
+        if (!parallel_cpus) {
+            gen_set_pc_im(s, s->pc);
+            s->is_jmp = DISAS_YIELD;
+        }
         break;
     case 3: /* wfi */
         gen_set_pc_im(s, s->pc);
         s->is_jmp = DISAS_WFI;
         break;
     case 2: /* wfe */
-        gen_set_pc_im(s, s->pc);
-        s->is_jmp = DISAS_WFE;
+        if (!parallel_cpus) {
+            gen_set_pc_im(s, s->pc);
+            s->is_jmp = DISAS_WFE;
+        }
         break;
     case 4: /* sev */
     case 5: /* sevl */
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [Qemu-devel] [PULL 22/24] target-arm: ensure all cross vCPUs TLB flushes complete
  2017-02-24 11:20 [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Alex Bennée
                   ` (20 preceding siblings ...)
  2017-02-24 11:21 ` [Qemu-devel] [PULL 21/24] target-arm: don't generate WFE/YIELD calls for MTTCG Alex Bennée
@ 2017-02-24 11:21 ` Alex Bennée
  2017-09-17 13:07   ` Dmitry Osipenko
  2017-02-24 11:21 ` [Qemu-devel] [PULL 23/24] hw/misc/imx6_src: defer clearing of SRC_SCR reset bits Alex Bennée
                   ` (3 subsequent siblings)
  25 siblings, 1 reply; 55+ messages in thread
From: Alex Bennée @ 2017-02-24 11:21 UTC (permalink / raw)
  To: peter.maydell; +Cc: qemu-devel, Alex Bennée, open list:ARM

Previously, flushes on other vCPUs would only get serviced when those
vCPUs exited their TranslationBlocks. While this isn't overly
problematic, it violates the semantics of a TLB flush from the point
of view of the source vCPU.

To solve this we call the cputlb *_all_cpus_synced() functions to do
the flushes, which ensures all flushes are completed by the time the
calling vCPU next schedules its own work. As the TLB instructions are
modelled as CP writes, the TB ends at this point, meaning
cpu->exit_request will be checked before the next instruction is
executed.

Deferring the work until the architectural sync point is a possible
future optimisation.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper.c | 165 ++++++++++++++++++++++------------------------------
 1 file changed, 69 insertions(+), 96 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index b41d0494d1..bcedb4a808 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -536,41 +536,33 @@ static void tlbimvaa_write(CPUARMState *env, const ARMCPRegInfo *ri,
 static void tlbiall_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
                              uint64_t value)
 {
-    CPUState *other_cs;
+    CPUState *cs = ENV_GET_CPU(env);
 
-    CPU_FOREACH(other_cs) {
-        tlb_flush(other_cs);
-    }
+    tlb_flush_all_cpus_synced(cs);
 }
 
 static void tlbiasid_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
                              uint64_t value)
 {
-    CPUState *other_cs;
+    CPUState *cs = ENV_GET_CPU(env);
 
-    CPU_FOREACH(other_cs) {
-        tlb_flush(other_cs);
-    }
+    tlb_flush_all_cpus_synced(cs);
 }
 
 static void tlbimva_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
                              uint64_t value)
 {
-    CPUState *other_cs;
+    CPUState *cs = ENV_GET_CPU(env);
 
-    CPU_FOREACH(other_cs) {
-        tlb_flush_page(other_cs, value & TARGET_PAGE_MASK);
-    }
+    tlb_flush_page_all_cpus_synced(cs, value & TARGET_PAGE_MASK);
 }
 
 static void tlbimvaa_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
                              uint64_t value)
 {
-    CPUState *other_cs;
+    CPUState *cs = ENV_GET_CPU(env);
 
-    CPU_FOREACH(other_cs) {
-        tlb_flush_page(other_cs, value & TARGET_PAGE_MASK);
-    }
+    tlb_flush_page_all_cpus_synced(cs, value & TARGET_PAGE_MASK);
 }
 
 static void tlbiall_nsnh_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -587,14 +579,12 @@ static void tlbiall_nsnh_write(CPUARMState *env, const ARMCPRegInfo *ri,
 static void tlbiall_nsnh_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                   uint64_t value)
 {
-    CPUState *other_cs;
+    CPUState *cs = ENV_GET_CPU(env);
 
-    CPU_FOREACH(other_cs) {
-        tlb_flush_by_mmuidx(other_cs,
-                            (1 << ARMMMUIdx_S12NSE1) |
-                            (1 << ARMMMUIdx_S12NSE0) |
-                            (1 << ARMMMUIdx_S2NS));
-    }
+    tlb_flush_by_mmuidx_all_cpus_synced(cs,
+                                        (1 << ARMMMUIdx_S12NSE1) |
+                                        (1 << ARMMMUIdx_S12NSE0) |
+                                        (1 << ARMMMUIdx_S2NS));
 }
 
 static void tlbiipas2_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -621,7 +611,7 @@ static void tlbiipas2_write(CPUARMState *env, const ARMCPRegInfo *ri,
 static void tlbiipas2_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                uint64_t value)
 {
-    CPUState *other_cs;
+    CPUState *cs = ENV_GET_CPU(env);
     uint64_t pageaddr;
 
     if (!arm_feature(env, ARM_FEATURE_EL2) || !(env->cp15.scr_el3 & SCR_NS)) {
@@ -630,9 +620,8 @@ static void tlbiipas2_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
 
     pageaddr = sextract64(value << 12, 0, 40);
 
-    CPU_FOREACH(other_cs) {
-        tlb_flush_page_by_mmuidx(other_cs, pageaddr, (1 << ARMMMUIdx_S2NS));
-    }
+    tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
+                                             (1 << ARMMMUIdx_S2NS));
 }
 
 static void tlbiall_hyp_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -646,11 +635,9 @@ static void tlbiall_hyp_write(CPUARMState *env, const ARMCPRegInfo *ri,
 static void tlbiall_hyp_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                  uint64_t value)
 {
-    CPUState *other_cs;
+    CPUState *cs = ENV_GET_CPU(env);
 
-    CPU_FOREACH(other_cs) {
-        tlb_flush_by_mmuidx(other_cs, (1 << ARMMMUIdx_S1E2));
-    }
+    tlb_flush_by_mmuidx_all_cpus_synced(cs, (1 << ARMMMUIdx_S1E2));
 }
 
 static void tlbimva_hyp_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -665,12 +652,11 @@ static void tlbimva_hyp_write(CPUARMState *env, const ARMCPRegInfo *ri,
 static void tlbimva_hyp_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                  uint64_t value)
 {
-    CPUState *other_cs;
+    CPUState *cs = ENV_GET_CPU(env);
     uint64_t pageaddr = value & ~MAKE_64BIT_MASK(0, 12);
 
-    CPU_FOREACH(other_cs) {
-        tlb_flush_page_by_mmuidx(other_cs, pageaddr, (1 << ARMMMUIdx_S1E2));
-    }
+    tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
+                                             (1 << ARMMMUIdx_S1E2));
 }
 
 static const ARMCPRegInfo cp_reginfo[] = {
@@ -2904,8 +2890,7 @@ static CPAccessResult aa64_cacheop_access(CPUARMState *env,
 static void tlbi_aa64_vmalle1_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                     uint64_t value)
 {
-    ARMCPU *cpu = arm_env_get_cpu(env);
-    CPUState *cs = CPU(cpu);
+    CPUState *cs = ENV_GET_CPU(env);
 
     if (arm_is_secure_below_el3(env)) {
         tlb_flush_by_mmuidx(cs,
@@ -2921,19 +2906,17 @@ static void tlbi_aa64_vmalle1_write(CPUARMState *env, const ARMCPRegInfo *ri,
 static void tlbi_aa64_vmalle1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                       uint64_t value)
 {
+    CPUState *cs = ENV_GET_CPU(env);
     bool sec = arm_is_secure_below_el3(env);
-    CPUState *other_cs;
 
-    CPU_FOREACH(other_cs) {
-        if (sec) {
-            tlb_flush_by_mmuidx(other_cs,
-                                (1 << ARMMMUIdx_S1SE1) |
-                                (1 << ARMMMUIdx_S1SE0));
-        } else {
-            tlb_flush_by_mmuidx(other_cs,
-                                (1 << ARMMMUIdx_S12NSE1) |
-                                (1 << ARMMMUIdx_S12NSE0));
-        }
+    if (sec) {
+        tlb_flush_by_mmuidx_all_cpus_synced(cs,
+                                            (1 << ARMMMUIdx_S1SE1) |
+                                            (1 << ARMMMUIdx_S1SE0));
+    } else {
+        tlb_flush_by_mmuidx_all_cpus_synced(cs,
+                                            (1 << ARMMMUIdx_S12NSE1) |
+                                            (1 << ARMMMUIdx_S12NSE0));
     }
 }
 
@@ -2990,46 +2973,40 @@ static void tlbi_aa64_alle1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
      * stage 2 translations, whereas most other scopes only invalidate
      * stage 1 translations.
      */
+    CPUState *cs = ENV_GET_CPU(env);
     bool sec = arm_is_secure_below_el3(env);
     bool has_el2 = arm_feature(env, ARM_FEATURE_EL2);
-    CPUState *other_cs;
-
-    CPU_FOREACH(other_cs) {
-        if (sec) {
-            tlb_flush_by_mmuidx(other_cs,
-                                (1 << ARMMMUIdx_S1SE1) |
-                                (1 << ARMMMUIdx_S1SE0));
-        } else if (has_el2) {
-            tlb_flush_by_mmuidx(other_cs,
-                                (1 << ARMMMUIdx_S12NSE1) |
-                                (1 << ARMMMUIdx_S12NSE0) |
-                                (1 << ARMMMUIdx_S2NS));
-        } else {
-            tlb_flush_by_mmuidx(other_cs,
-                                (1 << ARMMMUIdx_S12NSE1) |
-                                (1 << ARMMMUIdx_S12NSE0));
-        }
+
+    if (sec) {
+        tlb_flush_by_mmuidx_all_cpus_synced(cs,
+                                            (1 << ARMMMUIdx_S1SE1) |
+                                            (1 << ARMMMUIdx_S1SE0));
+    } else if (has_el2) {
+        tlb_flush_by_mmuidx_all_cpus_synced(cs,
+                                            (1 << ARMMMUIdx_S12NSE1) |
+                                            (1 << ARMMMUIdx_S12NSE0) |
+                                            (1 << ARMMMUIdx_S2NS));
+    } else {
+          tlb_flush_by_mmuidx_all_cpus_synced(cs,
+                                              (1 << ARMMMUIdx_S12NSE1) |
+                                              (1 << ARMMMUIdx_S12NSE0));
     }
 }
 
 static void tlbi_aa64_alle2is_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                     uint64_t value)
 {
-    CPUState *other_cs;
+    CPUState *cs = ENV_GET_CPU(env);
 
-    CPU_FOREACH(other_cs) {
-        tlb_flush_by_mmuidx(other_cs, (1 << ARMMMUIdx_S1E2));
-    }
+    tlb_flush_by_mmuidx_all_cpus_synced(cs, (1 << ARMMMUIdx_S1E2));
 }
 
 static void tlbi_aa64_alle3is_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                     uint64_t value)
 {
-    CPUState *other_cs;
+    CPUState *cs = ENV_GET_CPU(env);
 
-    CPU_FOREACH(other_cs) {
-        tlb_flush_by_mmuidx(other_cs, (1 << ARMMMUIdx_S1E3));
-    }
+    tlb_flush_by_mmuidx_all_cpus_synced(cs, (1 << ARMMMUIdx_S1E3));
 }
 
 static void tlbi_aa64_vae1_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -3086,43 +3063,40 @@ static void tlbi_aa64_vae3_write(CPUARMState *env, const ARMCPRegInfo *ri,
 static void tlbi_aa64_vae1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                    uint64_t value)
 {
+    ARMCPU *cpu = arm_env_get_cpu(env);
+    CPUState *cs = CPU(cpu);
     bool sec = arm_is_secure_below_el3(env);
-    CPUState *other_cs;
     uint64_t pageaddr = sextract64(value << 12, 0, 56);
 
-    CPU_FOREACH(other_cs) {
-        if (sec) {
-            tlb_flush_page_by_mmuidx(other_cs, pageaddr,
-                                     (1 << ARMMMUIdx_S1SE1) |
-                                     (1 << ARMMMUIdx_S1SE0));
-        } else {
-            tlb_flush_page_by_mmuidx(other_cs, pageaddr,
-                                     (1 << ARMMMUIdx_S12NSE1) |
-                                     (1 << ARMMMUIdx_S12NSE0));
-        }
+    if (sec) {
+        tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
+                                                 (1 << ARMMMUIdx_S1SE1) |
+                                                 (1 << ARMMMUIdx_S1SE0));
+    } else {
+        tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
+                                                 (1 << ARMMMUIdx_S12NSE1) |
+                                                 (1 << ARMMMUIdx_S12NSE0));
     }
 }
 
 static void tlbi_aa64_vae2is_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                    uint64_t value)
 {
-    CPUState *other_cs;
+    CPUState *cs = ENV_GET_CPU(env);
     uint64_t pageaddr = sextract64(value << 12, 0, 56);
 
-    CPU_FOREACH(other_cs) {
-        tlb_flush_page_by_mmuidx(other_cs, pageaddr, (1 << ARMMMUIdx_S1E2));
-    }
+    tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
+                                             (1 << ARMMMUIdx_S1E2));
 }
 
 static void tlbi_aa64_vae3is_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                    uint64_t value)
 {
-    CPUState *other_cs;
+    CPUState *cs = ENV_GET_CPU(env);
     uint64_t pageaddr = sextract64(value << 12, 0, 56);
 
-    CPU_FOREACH(other_cs) {
-        tlb_flush_page_by_mmuidx(other_cs, pageaddr, (1 << ARMMMUIdx_S1E3));
-    }
+    tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
+                                             (1 << ARMMMUIdx_S1E3));
 }
 
 static void tlbi_aa64_ipas2e1_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -3150,7 +3124,7 @@ static void tlbi_aa64_ipas2e1_write(CPUARMState *env, const ARMCPRegInfo *ri,
 static void tlbi_aa64_ipas2e1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                       uint64_t value)
 {
-    CPUState *other_cs;
+    CPUState *cs = ENV_GET_CPU(env);
     uint64_t pageaddr;
 
     if (!arm_feature(env, ARM_FEATURE_EL2) || !(env->cp15.scr_el3 & SCR_NS)) {
@@ -3159,9 +3133,8 @@ static void tlbi_aa64_ipas2e1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
 
     pageaddr = sextract64(value << 12, 0, 48);
 
-    CPU_FOREACH(other_cs) {
-        tlb_flush_page_by_mmuidx(other_cs, pageaddr, (1 << ARMMMUIdx_S2NS));
-    }
+    tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
+                                             (1 << ARMMMUIdx_S2NS));
 }
 
 static CPAccessResult aa64_zva_access(CPUARMState *env, const ARMCPRegInfo *ri,
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [Qemu-devel] [PULL 23/24] hw/misc/imx6_src: defer clearing of SRC_SCR reset bits
  2017-02-24 11:20 [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Alex Bennée
                   ` (21 preceding siblings ...)
  2017-02-24 11:21 ` [Qemu-devel] [PULL 22/24] target-arm: ensure all cross vCPUs TLB flushes complete Alex Bennée
@ 2017-02-24 11:21 ` Alex Bennée
  2017-02-24 11:21 ` [Qemu-devel] [PULL 24/24] tcg: enable MTTCG by default for ARM on x86 hosts Alex Bennée
                   ` (2 subsequent siblings)
  25 siblings, 0 replies; 55+ messages in thread
From: Alex Bennée @ 2017-02-24 11:21 UTC (permalink / raw)
  To: peter.maydell; +Cc: qemu-devel, Alex Bennée, Peter Chubb, open list:i.MX31

The arm_reset_cpu/set_cpu_on/set_cpu_off() functions do their work
asynchronously in the target vCPU's context. As a result we need to
ensure the SRC_SCR reset bits correctly report the reset status at the
right time. To do this we defer the clearing of the bit with an async
job which will run after the work queued by ARM powerctl functions.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/misc/imx6_src.c | 58 +++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 49 insertions(+), 9 deletions(-)

diff --git a/hw/misc/imx6_src.c b/hw/misc/imx6_src.c
index 55b817b8d7..edbb756c36 100644
--- a/hw/misc/imx6_src.c
+++ b/hw/misc/imx6_src.c
@@ -14,6 +14,7 @@
 #include "qemu/bitops.h"
 #include "qemu/log.h"
 #include "arm-powerctl.h"
+#include "qom/cpu.h"
 
 #ifndef DEBUG_IMX6_SRC
 #define DEBUG_IMX6_SRC 0
@@ -113,6 +114,45 @@ static uint64_t imx6_src_read(void *opaque, hwaddr offset, unsigned size)
     return value;
 }
 
+
+/* The reset is asynchronous so we need to defer clearing the reset
+ * bit until the work is completed.
+ */
+
+struct SRCSCRResetInfo {
+    IMX6SRCState *s;
+    int reset_bit;
+};
+
+static void imx6_clear_reset_bit(CPUState *cpu, run_on_cpu_data data)
+{
+    struct SRCSCRResetInfo *ri = data.host_ptr;
+    IMX6SRCState *s = ri->s;
+
+    assert(qemu_mutex_iothread_locked());
+
+    s->regs[SRC_SCR] = deposit32(s->regs[SRC_SCR], ri->reset_bit, 1, 0);
+    DPRINTF("reg[%s] <= 0x%" PRIx32 "\n",
+            imx6_src_reg_name(SRC_SCR), s->regs[SRC_SCR]);
+
+    g_free(ri);
+}
+
+static void imx6_defer_clear_reset_bit(int cpuid,
+                                       IMX6SRCState *s,
+                                       unsigned long reset_shift)
+{
+    struct SRCSCRResetInfo *ri;
+
+    ri = g_malloc(sizeof(struct SRCSCRResetInfo));
+    ri->s = s;
+    ri->reset_bit = reset_shift;
+
+    async_run_on_cpu(arm_get_cpu_by_id(cpuid), imx6_clear_reset_bit,
+                     RUN_ON_CPU_HOST_PTR(ri));
+}
+
+
 static void imx6_src_write(void *opaque, hwaddr offset, uint64_t value,
                            unsigned size)
 {
@@ -153,7 +193,7 @@ static void imx6_src_write(void *opaque, hwaddr offset, uint64_t value,
                 arm_set_cpu_off(3);
             }
             /* We clear the reset bits as the processor changed state */
-            clear_bit(CORE3_RST_SHIFT, &current_value);
+            imx6_defer_clear_reset_bit(3, s, CORE3_RST_SHIFT);
             clear_bit(CORE3_RST_SHIFT, &change_mask);
         }
         if (EXTRACT(change_mask, CORE2_ENABLE)) {
@@ -162,11 +202,11 @@ static void imx6_src_write(void *opaque, hwaddr offset, uint64_t value,
                 arm_set_cpu_on(2, s->regs[SRC_GPR5], s->regs[SRC_GPR6],
                                3, false);
             } else {
-                /* CORE 3 is shut down */
+                /* CORE 2 is shut down */
                 arm_set_cpu_off(2);
             }
             /* We clear the reset bits as the processor changed state */
-            clear_bit(CORE2_RST_SHIFT, &current_value);
+            imx6_defer_clear_reset_bit(2, s, CORE2_RST_SHIFT);
             clear_bit(CORE2_RST_SHIFT, &change_mask);
         }
         if (EXTRACT(change_mask, CORE1_ENABLE)) {
@@ -175,28 +215,28 @@ static void imx6_src_write(void *opaque, hwaddr offset, uint64_t value,
                 arm_set_cpu_on(1, s->regs[SRC_GPR3], s->regs[SRC_GPR4],
                                3, false);
             } else {
-                /* CORE 3 is shut down */
+                /* CORE 1 is shut down */
                 arm_set_cpu_off(1);
             }
             /* We clear the reset bits as the processor changed state */
-            clear_bit(CORE1_RST_SHIFT, &current_value);
+            imx6_defer_clear_reset_bit(1, s, CORE1_RST_SHIFT);
             clear_bit(CORE1_RST_SHIFT, &change_mask);
         }
         if (EXTRACT(change_mask, CORE0_RST)) {
             arm_reset_cpu(0);
-            clear_bit(CORE0_RST_SHIFT, &current_value);
+            imx6_defer_clear_reset_bit(0, s, CORE0_RST_SHIFT);
         }
         if (EXTRACT(change_mask, CORE1_RST)) {
             arm_reset_cpu(1);
-            clear_bit(CORE1_RST_SHIFT, &current_value);
+            imx6_defer_clear_reset_bit(1, s, CORE1_RST_SHIFT);
         }
         if (EXTRACT(change_mask, CORE2_RST)) {
             arm_reset_cpu(2);
-            clear_bit(CORE2_RST_SHIFT, &current_value);
+            imx6_defer_clear_reset_bit(2, s, CORE2_RST_SHIFT);
         }
         if (EXTRACT(change_mask, CORE3_RST)) {
             arm_reset_cpu(3);
-            clear_bit(CORE3_RST_SHIFT, &current_value);
+            imx6_defer_clear_reset_bit(3, s, CORE3_RST_SHIFT);
         }
         if (EXTRACT(change_mask, SW_IPU2_RST)) {
             /* We pretend the IPU2 is reset */
-- 
2.11.0

* [Qemu-devel] [PULL 24/24] tcg: enable MTTCG by default for ARM on x86 hosts
  2017-02-24 11:20 [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Alex Bennée
                   ` (22 preceding siblings ...)
  2017-02-24 11:21 ` [Qemu-devel] [PULL 23/24] hw/misc/imx6_src: defer clearing of SRC_SCR reset bits Alex Bennée
@ 2017-02-24 11:21 ` Alex Bennée
  2017-02-25 21:14 ` [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Peter Maydell
  2017-02-27 12:39 ` Paolo Bonzini
  25 siblings, 0 replies; 55+ messages in thread
From: Alex Bennée @ 2017-02-24 11:21 UTC (permalink / raw)
  To: peter.maydell
  Cc: qemu-devel, Alex Bennée, Richard Henderson, open list:ARM

This enables the multi-threaded system emulation by default for ARMv7
and ARMv8 guests using the x86_64 TCG backend. This is because on the
guest side:

  - The ARM translate.c/translate-64.c have been converted to
    - use MTTCG safe atomic primitives
    - emit the appropriate barrier ops
  - The ARM machine has been updated to
    - hold the BQL when modifying shared cross-vCPU state
    - defer powerctl changes to async safe work

All the host backends support the barrier and atomic primitives but
need to provide same-or-better support for normal load/store
operations.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Pranith Kumar <bobby.prani@gmail.com>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
---
 configure             |  6 ++++++
 target/arm/cpu.h      |  3 +++
 tcg/i386/tcg-target.h | 11 +++++++++++
 3 files changed, 20 insertions(+)

diff --git a/configure b/configure
index 4b68861992..44ecbe6f74 100755
--- a/configure
+++ b/configure
@@ -5879,6 +5879,7 @@ mkdir -p $target_dir
 echo "# Automatically generated by configure - do not modify" > $config_target_mak
 
 bflt="no"
+mttcg="no"
 interp_prefix1=$(echo "$interp_prefix" | sed "s/%M/$target_name/g")
 gdb_xml_files=""
 
@@ -5897,11 +5898,13 @@ case "$target_name" in
   arm|armeb)
     TARGET_ARCH=arm
     bflt="yes"
+    mttcg="yes"
     gdb_xml_files="arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml"
   ;;
   aarch64)
     TARGET_BASE_ARCH=arm
     bflt="yes"
+    mttcg="yes"
     gdb_xml_files="aarch64-core.xml aarch64-fpu.xml arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml"
   ;;
   cris)
@@ -6066,6 +6069,9 @@ if test "$target_bigendian" = "yes" ; then
 fi
 if test "$target_softmmu" = "yes" ; then
   echo "CONFIG_SOFTMMU=y" >> $config_target_mak
+  if test "$mttcg" = "yes" ; then
+    echo "TARGET_SUPPORTS_MTTCG=y" >> $config_target_mak
+  fi
 fi
 if test "$target_user_only" = "yes" ; then
   echo "CONFIG_USER_ONLY=y" >> $config_target_mak
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index e285ba3b4b..38a8e00908 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -30,6 +30,9 @@
 #  define TARGET_LONG_BITS 32
 #endif
 
+/* ARM processors have a weak memory model */
+#define TCG_GUEST_DEFAULT_MO      (0)
+
 #define CPUArchState struct CPUARMState
 
 #include "qemu-common.h"
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index 21d96ec35c..4275787db9 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -165,4 +165,15 @@ static inline void flush_icache_range(uintptr_t start, uintptr_t stop)
 {
 }
 
+/* This defines the natural memory order supported by this
+ * architecture before guarantees made by various barrier
+ * instructions.
+ *
+ * The x86 has a pretty strong memory ordering which only really
+ * allows for some stores to be re-ordered after loads.
+ */
+#include "tcg-mo.h"
+
+#define TCG_TARGET_DEFAULT_MO (TCG_MO_ALL & ~TCG_MO_ST_LD)
+
 #endif
-- 
2.11.0

* Re: [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement
  2017-02-24 11:20 [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Alex Bennée
                   ` (23 preceding siblings ...)
  2017-02-24 11:21 ` [Qemu-devel] [PULL 24/24] tcg: enable MTTCG by default for ARM on x86 hosts Alex Bennée
@ 2017-02-25 21:14 ` Peter Maydell
  2017-02-27  8:48   ` Christian Borntraeger
  2017-02-27 12:39 ` Paolo Bonzini
  25 siblings, 1 reply; 55+ messages in thread
From: Peter Maydell @ 2017-02-25 21:14 UTC (permalink / raw)
  To: Alex Bennée; +Cc: QEMU Developers

On 24 February 2017 at 11:20, Alex Bennée <alex.bennee@linaro.org> wrote:
> The following changes since commit 2d896b454a0e19ec4c1ddbb0e0b65b7e54fcedf3:
>
>   Revert "hw/mips: MIPS Boston board support" (2017-02-23 18:04:45 +0000)
>
> are available in the git repository at:
>
>   https://github.com/stsquad/qemu.git tags/pull-mttcg-240217-1
>
> for you to fetch changes up to ca759f9e387db87e1719911f019bc60c74be9ed8:
>
>   tcg: enable MTTCG by default for ARM on x86 hosts (2017-02-24 10:32:46 +0000)
>
> ----------------------------------------------------------------
> This is the MTTCG pull-request as posted yesterday.
>
> ----------------------------------------------------------------

Applied, thanks.

-- PMM

* Re: [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement
  2017-02-25 21:14 ` [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Peter Maydell
@ 2017-02-27  8:48   ` Christian Borntraeger
  2017-02-27  9:11     ` Alex Bennée
  0 siblings, 1 reply; 55+ messages in thread
From: Christian Borntraeger @ 2017-02-27  8:48 UTC (permalink / raw)
  To: Peter Maydell, Alex Bennée; +Cc: QEMU Developers

On 02/25/2017 10:14 PM, Peter Maydell wrote:
> On 24 February 2017 at 11:20, Alex Bennée <alex.bennee@linaro.org> wrote:
>> The following changes since commit 2d896b454a0e19ec4c1ddbb0e0b65b7e54fcedf3:
>>
>>   Revert "hw/mips: MIPS Boston board support" (2017-02-23 18:04:45 +0000)
>>
>> are available in the git repository at:
>>
>>   https://github.com/stsquad/qemu.git tags/pull-mttcg-240217-1
>>
>> for you to fetch changes up to ca759f9e387db87e1719911f019bc60c74be9ed8:
>>
>>   tcg: enable MTTCG by default for ARM on x86 hosts (2017-02-24 10:32:46 +0000)
>>
>> ----------------------------------------------------------------
>> This is the MTTCG pull-request as posted yesterday.
>>
>> ----------------------------------------------------------------
> 
> Applied, thanks.
> 
> -- PMM
> 

This seems to trigger 

/home/cborntra/REPOS/qemu/vl.c: In function ‘main’:
/home/cborntra/REPOS/qemu/vl.c:3700:18: error: ‘QEMU_OPTION_accel’ undeclared (first use in this function)
             case QEMU_OPTION_accel:
                  ^~~~~~~~~~~~~~~~~
/home/cborntra/REPOS/qemu/vl.c:3700:18: note: each undeclared identifier is reported only once for each function it appears in


on s390.

Christian

* Re: [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement
  2017-02-27  8:48   ` Christian Borntraeger
@ 2017-02-27  9:11     ` Alex Bennée
  2017-02-27  9:25       ` Christian Borntraeger
  2017-02-27  9:35       ` Christian Borntraeger
  0 siblings, 2 replies; 55+ messages in thread
From: Alex Bennée @ 2017-02-27  9:11 UTC (permalink / raw)
  To: Christian Borntraeger; +Cc: Peter Maydell, QEMU Developers


Christian Borntraeger <borntraeger@de.ibm.com> writes:

> On 02/25/2017 10:14 PM, Peter Maydell wrote:
>> On 24 February 2017 at 11:20, Alex Bennée <alex.bennee@linaro.org> wrote:
>>> The following changes since commit 2d896b454a0e19ec4c1ddbb0e0b65b7e54fcedf3:
>>>
>>>   Revert "hw/mips: MIPS Boston board support" (2017-02-23 18:04:45 +0000)
>>>
>>> are available in the git repository at:
>>>
>>>   https://github.com/stsquad/qemu.git tags/pull-mttcg-240217-1
>>>
>>> for you to fetch changes up to ca759f9e387db87e1719911f019bc60c74be9ed8:
>>>
>>>   tcg: enable MTTCG by default for ARM on x86 hosts (2017-02-24 10:32:46 +0000)
>>>
>>> ----------------------------------------------------------------
>>> This is the MTTCG pull-request as posted yesterday.
>>>
>>> ----------------------------------------------------------------
>>
>> Applied, thanks.
>>
>> -- PMM
>>
>
> This seems to trigger
>
> /home/cborntra/REPOS/qemu/vl.c: In function ‘main’:
> /home/cborntra/REPOS/qemu/vl.c:3700:18: error: ‘QEMU_OPTION_accel’ undeclared (first use in this function)
>              case QEMU_OPTION_accel:
>                   ^~~~~~~~~~~~~~~~~
> /home/cborntra/REPOS/qemu/vl.c:3700:18: note: each undeclared identifier is reported only once for each function it appears in
>
>
> on s390.

Is this for softmmu compilation? I'll have a look but I'll have to set
up some s390 images to test so you might beat me to it if you have real
hardware around.

--
Alex Bennée

* Re: [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement
  2017-02-27  9:11     ` Alex Bennée
@ 2017-02-27  9:25       ` Christian Borntraeger
  2017-02-27  9:35       ` Christian Borntraeger
  1 sibling, 0 replies; 55+ messages in thread
From: Christian Borntraeger @ 2017-02-27  9:25 UTC (permalink / raw)
  To: Alex Bennée; +Cc: Peter Maydell, QEMU Developers

On 02/27/2017 10:11 AM, Alex Bennée wrote:
> 
> Christian Borntraeger <borntraeger@de.ibm.com> writes:
> 
>> On 02/25/2017 10:14 PM, Peter Maydell wrote:
>>> On 24 February 2017 at 11:20, Alex Bennée <alex.bennee@linaro.org> wrote:
>>>> The following changes since commit 2d896b454a0e19ec4c1ddbb0e0b65b7e54fcedf3:
>>>>
>>>>   Revert "hw/mips: MIPS Boston board support" (2017-02-23 18:04:45 +0000)
>>>>
>>>> are available in the git repository at:
>>>>
>>>>   https://github.com/stsquad/qemu.git tags/pull-mttcg-240217-1
>>>>
>>>> for you to fetch changes up to ca759f9e387db87e1719911f019bc60c74be9ed8:
>>>>
>>>>   tcg: enable MTTCG by default for ARM on x86 hosts (2017-02-24 10:32:46 +0000)
>>>>
>>>> ----------------------------------------------------------------
>>>> This is the MTTCG pull-request as posted yesterday.
>>>>
>>>> ----------------------------------------------------------------
>>>
>>> Applied, thanks.
>>>
>>> -- PMM
>>>
>>
>> This seems to trigger
>>
>> /home/cborntra/REPOS/qemu/vl.c: In function ‘main’:
>> /home/cborntra/REPOS/qemu/vl.c:3700:18: error: ‘QEMU_OPTION_accel’ undeclared (first use in this function)
>>              case QEMU_OPTION_accel:
>>                   ^~~~~~~~~~~~~~~~~
>> /home/cborntra/REPOS/qemu/vl.c:3700:18: note: each undeclared identifier is reported only once for each function it appears in
>>
>>
>> on s390.
> 
> Is this for softmmu compilation? I'll have a look but I'll have to set
> up some s390 images to test so you might beat me to it if you have real
> hardware around.

Yes, softmmu. I do not yet understand why it is failing. Still looking 8-|

* Re: [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement
  2017-02-27  9:11     ` Alex Bennée
  2017-02-27  9:25       ` Christian Borntraeger
@ 2017-02-27  9:35       ` Christian Borntraeger
  1 sibling, 0 replies; 55+ messages in thread
From: Christian Borntraeger @ 2017-02-27  9:35 UTC (permalink / raw)
  To: Alex Bennée; +Cc: Peter Maydell, QEMU Developers

On 02/27/2017 10:11 AM, Alex Bennée wrote:
> 
> Christian Borntraeger <borntraeger@de.ibm.com> writes:
> 
>> On 02/25/2017 10:14 PM, Peter Maydell wrote:
>>> On 24 February 2017 at 11:20, Alex Bennée <alex.bennee@linaro.org> wrote:
>>>> The following changes since commit 2d896b454a0e19ec4c1ddbb0e0b65b7e54fcedf3:
>>>>
>>>>   Revert "hw/mips: MIPS Boston board support" (2017-02-23 18:04:45 +0000)
>>>>
>>>> are available in the git repository at:
>>>>
>>>>   https://github.com/stsquad/qemu.git tags/pull-mttcg-240217-1
>>>>
>>>> for you to fetch changes up to ca759f9e387db87e1719911f019bc60c74be9ed8:
>>>>
>>>>   tcg: enable MTTCG by default for ARM on x86 hosts (2017-02-24 10:32:46 +0000)
>>>>
>>>> ----------------------------------------------------------------
>>>> This is the MTTCG pull-request as posted yesterday.
>>>>
>>>> ----------------------------------------------------------------
>>>
>>> Applied, thanks.
>>>
>>> -- PMM
>>>
>>
>> This seems to trigger
>>
>> /home/cborntra/REPOS/qemu/vl.c: In function ‘main’:
>> /home/cborntra/REPOS/qemu/vl.c:3700:18: error: ‘QEMU_OPTION_accel’ undeclared (first use in this function)
>>              case QEMU_OPTION_accel:
>>                   ^~~~~~~~~~~~~~~~~
>> /home/cborntra/REPOS/qemu/vl.c:3700:18: note: each undeclared identifier is reported only once for each function it appears in
>>
>>
>> on s390.
> 
> Is this for softmmu compilation? I'll have a look but I'll have to set
> up some s390 images to test so you might beat me to it if you have real
> hardware around.

Ok, my fault. I seem to have run make in the source folder at some point in the
past, which created a qemu-options.def file in the source folder. When doing the 
rebuild in my build folder, it used qemu-options.def from the source folder
and not from the build folder.

With a clean restart everything seems fine

Christian

* Re: [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement
  2017-02-24 11:20 [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Alex Bennée
                   ` (24 preceding siblings ...)
  2017-02-25 21:14 ` [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Peter Maydell
@ 2017-02-27 12:39 ` Paolo Bonzini
  2017-02-27 15:48   ` Alex Bennée
  25 siblings, 1 reply; 55+ messages in thread
From: Paolo Bonzini @ 2017-02-27 12:39 UTC (permalink / raw)
  To: Alex Bennée, peter.maydell; +Cc: qemu-devel

On 24/02/2017 12:20, Alex Bennée wrote:
> The following changes since commit 2d896b454a0e19ec4c1ddbb0e0b65b7e54fcedf3:
> 
>   Revert "hw/mips: MIPS Boston board support" (2017-02-23 18:04:45 +0000)
> 
> are available in the git repository at:
> 
>   https://github.com/stsquad/qemu.git tags/pull-mttcg-240217-1
> 
> for you to fetch changes up to ca759f9e387db87e1719911f019bc60c74be9ed8:
> 
>   tcg: enable MTTCG by default for ARM on x86 hosts (2017-02-24 10:32:46 +0000)
> 
> ----------------------------------------------------------------
> This is the MTTCG pull-request as posted yesterday.

This breaks "-icount auto" on qemu-system-aarch64 with "-M virt" and
AAVMF firmware, in two ways:

1) "-icount auto" doesn't work;

2) "-icount auto -accel tcg,thread=single" hangs fairly early, printing
this on the serial console.  It's okay if it hangs at

[Bds]=============End Load Options Dumping=============
[Bds]BdsWait ...Zzzzzzzzzzzz...
[Bds]BdsWait(3)..Zzzz...

(pressing Enter a few times then seems to unhang it), but it now hangs
much earlier than that.


Also, x86 "-accel tcg,thread=multi" prints the scary message on memory
ordering.

Paolo

> ----------------------------------------------------------------
> Alex Bennée (18):
>       docs: new design document multi-thread-tcg.txt
>       tcg: move TCG_MO/BAR types into own file
>       tcg: add kick timer for single-threaded vCPU emulation
>       tcg: rename tcg_current_cpu to tcg_current_rr_cpu
>       tcg: remove global exit_request
>       tcg: enable tb_lock() for SoftMMU
>       tcg: enable thread-per-vCPU
>       cputlb: add assert_cpu_is_self checks
>       cputlb: tweak qemu_ram_addr_from_host_nofail reporting
>       cputlb and arm/sparc targets: convert mmuidx flushes from varg to bitmap
>       cputlb: add tlb_flush_by_mmuidx async routines
>       cputlb: atomically update tlb fields used by tlb_reset_dirty
>       cputlb: introduce tlb_flush_*_all_cpus[_synced]
>       target-arm/powerctl: defer cpu reset work to CPU context
>       target-arm: don't generate WFE/YIELD calls for MTTCG
>       target-arm: ensure all cross vCPUs TLB flushes complete
>       hw/misc/imx6_src: defer clearing of SRC_SCR reset bits
>       tcg: enable MTTCG by default for ARM on x86 hosts
> 
> Jan Kiszka (1):
>       tcg: drop global lock during TCG code execution
> 
> KONRAD Frederic (2):
>       tcg: add options for enabling MTTCG
>       cputlb: introduce tlb_flush_* async work.
> 
> Pranith Kumar (3):
>       mttcg: translate-all: Enable locking debug in a debug build
>       mttcg: Add missing tb_lock/unlock() in cpu_exec_step()
>       tcg: handle EXCP_ATOMIC exception for system emulation
> 
>  configure                  |   6 +
>  cpu-exec-common.c          |   3 -
>  cpu-exec.c                 |  89 ++++++---
>  cpus.c                     | 345 ++++++++++++++++++++++++++-------
>  cputlb.c                   | 463 +++++++++++++++++++++++++++++++++++++--------
>  docs/multi-thread-tcg.txt  | 350 ++++++++++++++++++++++++++++++++++
>  exec.c                     |  12 +-
>  hw/core/irq.c              |   1 +
>  hw/i386/kvmvapic.c         |   4 +-
>  hw/intc/arm_gicv3_cpuif.c  |   3 +
>  hw/misc/imx6_src.c         |  58 +++++-
>  hw/ppc/ppc.c               |  16 +-
>  hw/ppc/spapr.c             |   3 +
>  include/exec/cputlb.h      |   2 -
>  include/exec/exec-all.h    | 132 +++++++++++--
>  include/qom/cpu.h          |  16 ++
>  include/sysemu/cpus.h      |   2 +
>  memory.c                   |   2 +
>  qemu-options.hx            |  20 ++
>  qom/cpu.c                  |  10 +
>  target/arm/arm-powerctl.c  | 202 +++++++++++++-------
>  target/arm/arm-powerctl.h  |   2 +
>  target/arm/cpu.c           |   4 +-
>  target/arm/cpu.h           |  18 +-
>  target/arm/helper.c        | 219 ++++++++++-----------
>  target/arm/kvm.c           |   7 +-
>  target/arm/machine.c       |  41 +++-
>  target/arm/op_helper.c     |  50 ++++-
>  target/arm/psci.c          |   4 +-
>  target/arm/translate-a64.c |   8 +-
>  target/arm/translate.c     |  20 +-
>  target/i386/smm_helper.c   |   7 +
>  target/s390x/misc_helper.c |   5 +-
>  target/sparc/ldst_helper.c |   8 +-
>  tcg/i386/tcg-target.h      |  11 ++
>  tcg/tcg-mo.h               |  48 +++++
>  tcg/tcg.h                  |  27 +--
>  translate-all.c            |  66 ++-----
>  translate-common.c         |  21 +-
>  vl.c                       |  49 ++++-
>  40 files changed, 1878 insertions(+), 476 deletions(-)
>  create mode 100644 docs/multi-thread-tcg.txt
>  create mode 100644 tcg/tcg-mo.h
> 
> 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [Qemu-devel] [PULL 08/24] tcg: drop global lock during TCG code execution
  2017-02-24 11:20 ` [Qemu-devel] [PULL 08/24] tcg: drop global lock during TCG code execution Alex Bennée
@ 2017-02-27 12:48   ` Laurent Desnogues
  2017-02-27 14:39     ` Alex Bennée
  0 siblings, 1 reply; 55+ messages in thread
From: Laurent Desnogues @ 2017-02-27 12:48 UTC (permalink / raw)
  To: Alex Bennée, Jan Kiszka; +Cc: qemu-devel, open list:ARM cores

Hello,

On Fri, Feb 24, 2017 at 12:20 PM, Alex Bennée <alex.bennee@linaro.org> wrote:
> From: Jan Kiszka <jan.kiszka@siemens.com>
>
> This finally allows TCG to benefit from the iothread introduction: Drop
> the global mutex while running pure TCG CPU code. Reacquire the lock
> when entering MMIO or PIO emulation, or when leaving the TCG loop.
>
> We have to revert a few optimizations for the current TCG threading
> model, namely kicking the TCG thread in qemu_mutex_lock_iothread and not
> kicking it in qemu_cpu_kick. We also need to disable RAM block
> reordering until we have a more efficient locking mechanism at hand.
>
> Still, a Linux x86 UP guest and my Musicpal ARM model boot fine here.
> These numbers demonstrate where we gain something:
>
> 20338 jan       20   0  331m  75m 6904 R   99  0.9   0:50.95 qemu-system-arm
> 20337 jan       20   0  331m  75m 6904 S   20  0.9   0:26.50 qemu-system-arm
>
> The guest CPU was fully loaded, but the iothread could still run mostly
> independently on a second core. Without the patch we don't get beyond
>
> 32206 jan       20   0  330m  73m 7036 R   82  0.9   1:06.00 qemu-system-arm
> 32204 jan       20   0  330m  73m 7036 S   21  0.9   0:17.03 qemu-system-arm
>
> We don't benefit significantly, though, when the guest is not fully
> loading a host CPU.

I tried this patch (8d04fb55 in the repository) with the following image:

   http://wiki.qemu.org/download/arm-test-0.2.tar.gz

Running the image with no options works fine.  But specifying '-icount
1' results in a (guest) deadlock. Enabling some heavy logging (-d
in_asm,exec) sometimes results in a 'Bad ram offset' error.

Is it expected that this patch breaks -icount?

Thanks,

Laurent

PS - To clarify 791158d9 works.

> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> Message-Id: <1439220437-23957-10-git-send-email-fred.konrad@greensocs.com>
> [FK: Rebase, fix qemu_devices_reset deadlock, rm address_space_* mutex]
> Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
> [EGC: fixed iothread lock for cpu-exec IRQ handling]
> Signed-off-by: Emilio G. Cota <cota@braap.org>
> [AJB: -smp single-threaded fix, clean commit msg, BQL fixes]
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> Reviewed-by: Richard Henderson <rth@twiddle.net>
> Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
> [PM: target-arm changes]
> Acked-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  cpu-exec.c                 | 23 +++++++++++++++++++++--
>  cpus.c                     | 28 +++++-----------------------
>  cputlb.c                   | 21 ++++++++++++++++++++-
>  exec.c                     | 12 +++++++++---
>  hw/core/irq.c              |  1 +
>  hw/i386/kvmvapic.c         |  4 ++--
>  hw/intc/arm_gicv3_cpuif.c  |  3 +++
>  hw/ppc/ppc.c               | 16 +++++++++++++++-
>  hw/ppc/spapr.c             |  3 +++
>  include/qom/cpu.h          |  1 +
>  memory.c                   |  2 ++
>  qom/cpu.c                  | 10 ++++++++++
>  target/arm/helper.c        |  6 ++++++
>  target/arm/op_helper.c     | 43 +++++++++++++++++++++++++++++++++++++++----
>  target/i386/smm_helper.c   |  7 +++++++
>  target/s390x/misc_helper.c |  5 ++++-
>  translate-all.c            |  9 +++++++--
>  translate-common.c         | 21 +++++++++++----------
>  18 files changed, 166 insertions(+), 49 deletions(-)
>
> diff --git a/cpu-exec.c b/cpu-exec.c
> index 06a6b25564..1bd3d72002 100644
> --- a/cpu-exec.c
> +++ b/cpu-exec.c
> @@ -29,6 +29,7 @@
>  #include "qemu/rcu.h"
>  #include "exec/tb-hash.h"
>  #include "exec/log.h"
> +#include "qemu/main-loop.h"
>  #if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY)
>  #include "hw/i386/apic.h"
>  #endif
> @@ -388,8 +389,10 @@ static inline bool cpu_handle_halt(CPUState *cpu)
>          if ((cpu->interrupt_request & CPU_INTERRUPT_POLL)
>              && replay_interrupt()) {
>              X86CPU *x86_cpu = X86_CPU(cpu);
> +            qemu_mutex_lock_iothread();
>              apic_poll_irq(x86_cpu->apic_state);
>              cpu_reset_interrupt(cpu, CPU_INTERRUPT_POLL);
> +            qemu_mutex_unlock_iothread();
>          }
>  #endif
>          if (!cpu_has_work(cpu)) {
> @@ -443,7 +446,9 @@ static inline bool cpu_handle_exception(CPUState *cpu, int *ret)
>  #else
>              if (replay_exception()) {
>                  CPUClass *cc = CPU_GET_CLASS(cpu);
> +                qemu_mutex_lock_iothread();
>                  cc->do_interrupt(cpu);
> +                qemu_mutex_unlock_iothread();
>                  cpu->exception_index = -1;
>              } else if (!replay_has_interrupt()) {
>                  /* give a chance to iothread in replay mode */
> @@ -469,9 +474,11 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
>                                          TranslationBlock **last_tb)
>  {
>      CPUClass *cc = CPU_GET_CLASS(cpu);
> -    int interrupt_request = cpu->interrupt_request;
>
> -    if (unlikely(interrupt_request)) {
> +    if (unlikely(atomic_read(&cpu->interrupt_request))) {
> +        int interrupt_request;
> +        qemu_mutex_lock_iothread();
> +        interrupt_request = cpu->interrupt_request;
>          if (unlikely(cpu->singlestep_enabled & SSTEP_NOIRQ)) {
>              /* Mask out external interrupts for this step. */
>              interrupt_request &= ~CPU_INTERRUPT_SSTEP_MASK;
> @@ -479,6 +486,7 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
>          if (interrupt_request & CPU_INTERRUPT_DEBUG) {
>              cpu->interrupt_request &= ~CPU_INTERRUPT_DEBUG;
>              cpu->exception_index = EXCP_DEBUG;
> +            qemu_mutex_unlock_iothread();
>              return true;
>          }
>          if (replay_mode == REPLAY_MODE_PLAY && !replay_has_interrupt()) {
> @@ -488,6 +496,7 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
>              cpu->interrupt_request &= ~CPU_INTERRUPT_HALT;
>              cpu->halted = 1;
>              cpu->exception_index = EXCP_HLT;
> +            qemu_mutex_unlock_iothread();
>              return true;
>          }
>  #if defined(TARGET_I386)
> @@ -498,12 +507,14 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
>              cpu_svm_check_intercept_param(env, SVM_EXIT_INIT, 0, 0);
>              do_cpu_init(x86_cpu);
>              cpu->exception_index = EXCP_HALTED;
> +            qemu_mutex_unlock_iothread();
>              return true;
>          }
>  #else
>          else if (interrupt_request & CPU_INTERRUPT_RESET) {
>              replay_interrupt();
>              cpu_reset(cpu);
> +            qemu_mutex_unlock_iothread();
>              return true;
>          }
>  #endif
> @@ -526,7 +537,12 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
>                 the program flow was changed */
>              *last_tb = NULL;
>          }
> +
> +        /* If we exit via cpu_loop_exit/longjmp it is reset in cpu_exec */
> +        qemu_mutex_unlock_iothread();
>      }
> +
> +
>      if (unlikely(atomic_read(&cpu->exit_request) || replay_has_interrupt())) {
>          atomic_set(&cpu->exit_request, 0);
>          cpu->exception_index = EXCP_INTERRUPT;
> @@ -643,6 +659,9 @@ int cpu_exec(CPUState *cpu)
>  #endif /* buggy compiler */
>          cpu->can_do_io = 1;
>          tb_lock_reset();
> +        if (qemu_mutex_iothread_locked()) {
> +            qemu_mutex_unlock_iothread();
> +        }
>      }
>
>      /* if an exception is pending, we execute it here */
> diff --git a/cpus.c b/cpus.c
> index 860034a794..0ae8f69be5 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -1027,8 +1027,6 @@ static void qemu_kvm_init_cpu_signals(CPUState *cpu)
>  #endif /* _WIN32 */
>
>  static QemuMutex qemu_global_mutex;
> -static QemuCond qemu_io_proceeded_cond;
> -static unsigned iothread_requesting_mutex;
>
>  static QemuThread io_thread;
>
> @@ -1042,7 +1040,6 @@ void qemu_init_cpu_loop(void)
>      qemu_init_sigbus();
>      qemu_cond_init(&qemu_cpu_cond);
>      qemu_cond_init(&qemu_pause_cond);
> -    qemu_cond_init(&qemu_io_proceeded_cond);
>      qemu_mutex_init(&qemu_global_mutex);
>
>      qemu_thread_get_self(&io_thread);
> @@ -1085,10 +1082,6 @@ static void qemu_tcg_wait_io_event(CPUState *cpu)
>
>      start_tcg_kick_timer();
>
> -    while (iothread_requesting_mutex) {
> -        qemu_cond_wait(&qemu_io_proceeded_cond, &qemu_global_mutex);
> -    }
> -
>      CPU_FOREACH(cpu) {
>          qemu_wait_io_event_common(cpu);
>      }
> @@ -1249,9 +1242,11 @@ static int tcg_cpu_exec(CPUState *cpu)
>          cpu->icount_decr.u16.low = decr;
>          cpu->icount_extra = count;
>      }
> +    qemu_mutex_unlock_iothread();
>      cpu_exec_start(cpu);
>      ret = cpu_exec(cpu);
>      cpu_exec_end(cpu);
> +    qemu_mutex_lock_iothread();
>  #ifdef CONFIG_PROFILER
>      tcg_time += profile_getclock() - ti;
>  #endif
> @@ -1479,27 +1474,14 @@ bool qemu_mutex_iothread_locked(void)
>
>  void qemu_mutex_lock_iothread(void)
>  {
> -    atomic_inc(&iothread_requesting_mutex);
> -    /* In the simple case there is no need to bump the VCPU thread out of
> -     * TCG code execution.
> -     */
> -    if (!tcg_enabled() || qemu_in_vcpu_thread() ||
> -        !first_cpu || !first_cpu->created) {
> -        qemu_mutex_lock(&qemu_global_mutex);
> -        atomic_dec(&iothread_requesting_mutex);
> -    } else {
> -        if (qemu_mutex_trylock(&qemu_global_mutex)) {
> -            qemu_cpu_kick_rr_cpu();
> -            qemu_mutex_lock(&qemu_global_mutex);
> -        }
> -        atomic_dec(&iothread_requesting_mutex);
> -        qemu_cond_broadcast(&qemu_io_proceeded_cond);
> -    }
> +    g_assert(!qemu_mutex_iothread_locked());
> +    qemu_mutex_lock(&qemu_global_mutex);
>      iothread_locked = true;
>  }
>
>  void qemu_mutex_unlock_iothread(void)
>  {
> +    g_assert(qemu_mutex_iothread_locked());
>      iothread_locked = false;
>      qemu_mutex_unlock(&qemu_global_mutex);
>  }
> diff --git a/cputlb.c b/cputlb.c
> index 6c39927455..1cc9d9da51 100644
> --- a/cputlb.c
> +++ b/cputlb.c
> @@ -18,6 +18,7 @@
>   */
>
>  #include "qemu/osdep.h"
> +#include "qemu/main-loop.h"
>  #include "cpu.h"
>  #include "exec/exec-all.h"
>  #include "exec/memory.h"
> @@ -495,6 +496,7 @@ static uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
>      hwaddr physaddr = iotlbentry->addr;
>      MemoryRegion *mr = iotlb_to_region(cpu, physaddr, iotlbentry->attrs);
>      uint64_t val;
> +    bool locked = false;
>
>      physaddr = (physaddr & TARGET_PAGE_MASK) + addr;
>      cpu->mem_io_pc = retaddr;
> @@ -503,7 +505,16 @@ static uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
>      }
>
>      cpu->mem_io_vaddr = addr;
> +
> +    if (mr->global_locking) {
> +        qemu_mutex_lock_iothread();
> +        locked = true;
> +    }
>      memory_region_dispatch_read(mr, physaddr, &val, size, iotlbentry->attrs);
> +    if (locked) {
> +        qemu_mutex_unlock_iothread();
> +    }
> +
>      return val;
>  }
>
> @@ -514,15 +525,23 @@ static void io_writex(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
>      CPUState *cpu = ENV_GET_CPU(env);
>      hwaddr physaddr = iotlbentry->addr;
>      MemoryRegion *mr = iotlb_to_region(cpu, physaddr, iotlbentry->attrs);
> +    bool locked = false;
>
>      physaddr = (physaddr & TARGET_PAGE_MASK) + addr;
>      if (mr != &io_mem_rom && mr != &io_mem_notdirty && !cpu->can_do_io) {
>          cpu_io_recompile(cpu, retaddr);
>      }
> -
>      cpu->mem_io_vaddr = addr;
>      cpu->mem_io_pc = retaddr;
> +
> +    if (mr->global_locking) {
> +        qemu_mutex_lock_iothread();
> +        locked = true;
> +    }
>      memory_region_dispatch_write(mr, physaddr, val, size, iotlbentry->attrs);
> +    if (locked) {
> +        qemu_mutex_unlock_iothread();
> +    }
>  }
>
>  /* Return true if ADDR is present in the victim tlb, and has been copied
> diff --git a/exec.c b/exec.c
> index 865a1e8295..3adf2b1861 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -2134,9 +2134,9 @@ static void check_watchpoint(int offset, int len, MemTxAttrs attrs, int flags)
>                  }
>                  cpu->watchpoint_hit = wp;
>
> -                /* The tb_lock will be reset when cpu_loop_exit or
> -                 * cpu_loop_exit_noexc longjmp back into the cpu_exec
> -                 * main loop.
> +                /* Both tb_lock and iothread_mutex will be reset when
> +                 * cpu_loop_exit or cpu_loop_exit_noexc longjmp
> +                 * back into the cpu_exec main loop.
>                   */
>                  tb_lock();
>                  tb_check_watchpoint(cpu);
> @@ -2371,8 +2371,14 @@ static void io_mem_init(void)
>      memory_region_init_io(&io_mem_rom, NULL, &unassigned_mem_ops, NULL, NULL, UINT64_MAX);
>      memory_region_init_io(&io_mem_unassigned, NULL, &unassigned_mem_ops, NULL,
>                            NULL, UINT64_MAX);
> +
> +    /* io_mem_notdirty calls tb_invalidate_phys_page_fast,
> +     * which can be called without the iothread mutex.
> +     */
>      memory_region_init_io(&io_mem_notdirty, NULL, &notdirty_mem_ops, NULL,
>                            NULL, UINT64_MAX);
> +    memory_region_clear_global_locking(&io_mem_notdirty);
> +
>      memory_region_init_io(&io_mem_watch, NULL, &watch_mem_ops, NULL,
>                            NULL, UINT64_MAX);
>  }
> diff --git a/hw/core/irq.c b/hw/core/irq.c
> index 49ff2e64fe..b98d1d69f5 100644
> --- a/hw/core/irq.c
> +++ b/hw/core/irq.c
> @@ -22,6 +22,7 @@
>   * THE SOFTWARE.
>   */
>  #include "qemu/osdep.h"
> +#include "qemu/main-loop.h"
>  #include "qemu-common.h"
>  #include "hw/irq.h"
>  #include "qom/object.h"
> diff --git a/hw/i386/kvmvapic.c b/hw/i386/kvmvapic.c
> index 7135633863..82a49556af 100644
> --- a/hw/i386/kvmvapic.c
> +++ b/hw/i386/kvmvapic.c
> @@ -457,8 +457,8 @@ static void patch_instruction(VAPICROMState *s, X86CPU *cpu, target_ulong ip)
>      resume_all_vcpus();
>
>      if (!kvm_enabled()) {
> -        /* tb_lock will be reset when cpu_loop_exit_noexc longjmps
> -         * back into the cpu_exec loop. */
> +        /* Both tb_lock and iothread_mutex will be reset when
> +         *  longjmps back into the cpu_exec loop. */
>          tb_lock();
>          tb_gen_code(cs, current_pc, current_cs_base, current_flags, 1);
>          cpu_loop_exit_noexc(cs);
> diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
> index c25ee03556..f775aba507 100644
> --- a/hw/intc/arm_gicv3_cpuif.c
> +++ b/hw/intc/arm_gicv3_cpuif.c
> @@ -14,6 +14,7 @@
>
>  #include "qemu/osdep.h"
>  #include "qemu/bitops.h"
> +#include "qemu/main-loop.h"
>  #include "trace.h"
>  #include "gicv3_internal.h"
>  #include "cpu.h"
> @@ -733,6 +734,8 @@ void gicv3_cpuif_update(GICv3CPUState *cs)
>      ARMCPU *cpu = ARM_CPU(cs->cpu);
>      CPUARMState *env = &cpu->env;
>
> +    g_assert(qemu_mutex_iothread_locked());
> +
>      trace_gicv3_cpuif_update(gicv3_redist_affid(cs), cs->hppi.irq,
>                               cs->hppi.grp, cs->hppi.prio);
>
> diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
> index d171e60b5c..5f93083d4a 100644
> --- a/hw/ppc/ppc.c
> +++ b/hw/ppc/ppc.c
> @@ -62,7 +62,16 @@ void ppc_set_irq(PowerPCCPU *cpu, int n_IRQ, int level)
>  {
>      CPUState *cs = CPU(cpu);
>      CPUPPCState *env = &cpu->env;
> -    unsigned int old_pending = env->pending_interrupts;
> +    unsigned int old_pending;
> +    bool locked = false;
> +
> +    /* We may already have the BQL if coming from the reset path */
> +    if (!qemu_mutex_iothread_locked()) {
> +        locked = true;
> +        qemu_mutex_lock_iothread();
> +    }
> +
> +    old_pending = env->pending_interrupts;
>
>      if (level) {
>          env->pending_interrupts |= 1 << n_IRQ;
> @@ -80,9 +89,14 @@ void ppc_set_irq(PowerPCCPU *cpu, int n_IRQ, int level)
>  #endif
>      }
>
> +
>      LOG_IRQ("%s: %p n_IRQ %d level %d => pending %08" PRIx32
>                  "req %08x\n", __func__, env, n_IRQ, level,
>                  env->pending_interrupts, CPU(cpu)->interrupt_request);
> +
> +    if (locked) {
> +        qemu_mutex_unlock_iothread();
> +    }
>  }
>
>  /* PowerPC 6xx / 7xx internal IRQ controller */
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index e465d7ac98..b1e374f3f9 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1010,6 +1010,9 @@ static void emulate_spapr_hypercall(PPCVirtualHypervisor *vhyp,
>  {
>      CPUPPCState *env = &cpu->env;
>
> +    /* The TCG path should also be holding the BQL at this point */
> +    g_assert(qemu_mutex_iothread_locked());
> +
>      if (msr_pr) {
>          hcall_dprintf("Hypercall made with MSR[PR]=1\n");
>          env->gpr[3] = H_PRIVILEGE;
> diff --git a/include/qom/cpu.h b/include/qom/cpu.h
> index 2cf4ecf144..10db89b16a 100644
> --- a/include/qom/cpu.h
> +++ b/include/qom/cpu.h
> @@ -329,6 +329,7 @@ struct CPUState {
>      bool unplug;
>      bool crash_occurred;
>      bool exit_request;
> +    /* updates protected by BQL */
>      uint32_t interrupt_request;
>      int singlestep_enabled;
>      int64_t icount_extra;
> diff --git a/memory.c b/memory.c
> index ed8b5aa83e..d61caee867 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -917,6 +917,8 @@ void memory_region_transaction_commit(void)
>      AddressSpace *as;
>
>      assert(memory_region_transaction_depth);
> +    assert(qemu_mutex_iothread_locked());
> +
>      --memory_region_transaction_depth;
>      if (!memory_region_transaction_depth) {
>          if (memory_region_update_pending) {
> diff --git a/qom/cpu.c b/qom/cpu.c
> index ed87c50cea..58784bcbea 100644
> --- a/qom/cpu.c
> +++ b/qom/cpu.c
> @@ -113,9 +113,19 @@ static void cpu_common_get_memory_mapping(CPUState *cpu,
>      error_setg(errp, "Obtaining memory mappings is unsupported on this CPU.");
>  }
>
> +/* Resetting the IRQ comes from across the code base so we take the
> + * BQL here if we need to.  cpu_interrupt assumes it is held.*/
>  void cpu_reset_interrupt(CPUState *cpu, int mask)
>  {
> +    bool need_lock = !qemu_mutex_iothread_locked();
> +
> +    if (need_lock) {
> +        qemu_mutex_lock_iothread();
> +    }
>      cpu->interrupt_request &= ~mask;
> +    if (need_lock) {
> +        qemu_mutex_unlock_iothread();
> +    }
>  }
>
>  void cpu_exit(CPUState *cpu)
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index 47250bcf16..753a69d40d 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -6769,6 +6769,12 @@ void arm_cpu_do_interrupt(CPUState *cs)
>          arm_cpu_do_interrupt_aarch32(cs);
>      }
>
> +    /* Hooks may change global state so BQL should be held, also the
> +     * BQL needs to be held for any modification of
> +     * cs->interrupt_request.
> +     */
> +    g_assert(qemu_mutex_iothread_locked());
> +
>      arm_call_el_change_hook(cpu);
>
>      if (!kvm_enabled()) {
> diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
> index fb366fdc35..5f3e3bdae2 100644
> --- a/target/arm/op_helper.c
> +++ b/target/arm/op_helper.c
> @@ -18,6 +18,7 @@
>   */
>  #include "qemu/osdep.h"
>  #include "qemu/log.h"
> +#include "qemu/main-loop.h"
>  #include "cpu.h"
>  #include "exec/helper-proto.h"
>  #include "internals.h"
> @@ -487,7 +488,9 @@ void HELPER(cpsr_write_eret)(CPUARMState *env, uint32_t val)
>       */
>      env->regs[15] &= (env->thumb ? ~1 : ~3);
>
> +    qemu_mutex_lock_iothread();
>      arm_call_el_change_hook(arm_env_get_cpu(env));
> +    qemu_mutex_unlock_iothread();
>  }
>
>  /* Access to user mode registers from privileged modes.  */
> @@ -735,28 +738,58 @@ void HELPER(set_cp_reg)(CPUARMState *env, void *rip, uint32_t value)
>  {
>      const ARMCPRegInfo *ri = rip;
>
> -    ri->writefn(env, ri, value);
> +    if (ri->type & ARM_CP_IO) {
> +        qemu_mutex_lock_iothread();
> +        ri->writefn(env, ri, value);
> +        qemu_mutex_unlock_iothread();
> +    } else {
> +        ri->writefn(env, ri, value);
> +    }
>  }
>
>  uint32_t HELPER(get_cp_reg)(CPUARMState *env, void *rip)
>  {
>      const ARMCPRegInfo *ri = rip;
> +    uint32_t res;
>
> -    return ri->readfn(env, ri);
> +    if (ri->type & ARM_CP_IO) {
> +        qemu_mutex_lock_iothread();
> +        res = ri->readfn(env, ri);
> +        qemu_mutex_unlock_iothread();
> +    } else {
> +        res = ri->readfn(env, ri);
> +    }
> +
> +    return res;
>  }
>
>  void HELPER(set_cp_reg64)(CPUARMState *env, void *rip, uint64_t value)
>  {
>      const ARMCPRegInfo *ri = rip;
>
> -    ri->writefn(env, ri, value);
> +    if (ri->type & ARM_CP_IO) {
> +        qemu_mutex_lock_iothread();
> +        ri->writefn(env, ri, value);
> +        qemu_mutex_unlock_iothread();
> +    } else {
> +        ri->writefn(env, ri, value);
> +    }
>  }
>
>  uint64_t HELPER(get_cp_reg64)(CPUARMState *env, void *rip)
>  {
>      const ARMCPRegInfo *ri = rip;
> +    uint64_t res;
> +
> +    if (ri->type & ARM_CP_IO) {
> +        qemu_mutex_lock_iothread();
> +        res = ri->readfn(env, ri);
> +        qemu_mutex_unlock_iothread();
> +    } else {
> +        res = ri->readfn(env, ri);
> +    }
>
> -    return ri->readfn(env, ri);
> +    return res;
>  }
>
>  void HELPER(msr_i_pstate)(CPUARMState *env, uint32_t op, uint32_t imm)
> @@ -989,7 +1022,9 @@ void HELPER(exception_return)(CPUARMState *env)
>                        cur_el, new_el, env->pc);
>      }
>
> +    qemu_mutex_lock_iothread();
>      arm_call_el_change_hook(arm_env_get_cpu(env));
> +    qemu_mutex_unlock_iothread();
>
>      return;
>
> diff --git a/target/i386/smm_helper.c b/target/i386/smm_helper.c
> index 4dd6a2c544..f051a77c4a 100644
> --- a/target/i386/smm_helper.c
> +++ b/target/i386/smm_helper.c
> @@ -18,6 +18,7 @@
>   */
>
>  #include "qemu/osdep.h"
> +#include "qemu/main-loop.h"
>  #include "cpu.h"
>  #include "exec/helper-proto.h"
>  #include "exec/log.h"
> @@ -42,11 +43,14 @@ void helper_rsm(CPUX86State *env)
>  #define SMM_REVISION_ID 0x00020000
>  #endif
>
> +/* Called with iothread lock taken */
>  void cpu_smm_update(X86CPU *cpu)
>  {
>      CPUX86State *env = &cpu->env;
>      bool smm_enabled = (env->hflags & HF_SMM_MASK);
>
> +    g_assert(qemu_mutex_iothread_locked());
> +
>      if (cpu->smram) {
>          memory_region_set_enabled(cpu->smram, smm_enabled);
>      }
> @@ -333,7 +337,10 @@ void helper_rsm(CPUX86State *env)
>      }
>      env->hflags2 &= ~HF2_SMM_INSIDE_NMI_MASK;
>      env->hflags &= ~HF_SMM_MASK;
> +
> +    qemu_mutex_lock_iothread();
>      cpu_smm_update(cpu);
> +    qemu_mutex_unlock_iothread();
>
>      qemu_log_mask(CPU_LOG_INT, "SMM: after RSM\n");
>      log_cpu_state_mask(CPU_LOG_INT, CPU(cpu), CPU_DUMP_CCOP);
> diff --git a/target/s390x/misc_helper.c b/target/s390x/misc_helper.c
> index c9604ea9c7..3cb942e8bb 100644
> --- a/target/s390x/misc_helper.c
> +++ b/target/s390x/misc_helper.c
> @@ -25,6 +25,7 @@
>  #include "exec/helper-proto.h"
>  #include "sysemu/kvm.h"
>  #include "qemu/timer.h"
> +#include "qemu/main-loop.h"
>  #include "exec/address-spaces.h"
>  #ifdef CONFIG_KVM
>  #include <linux/kvm.h>
> @@ -109,11 +110,13 @@ void program_interrupt(CPUS390XState *env, uint32_t code, int ilen)
>  /* SCLP service call */
>  uint32_t HELPER(servc)(CPUS390XState *env, uint64_t r1, uint64_t r2)
>  {
> +    qemu_mutex_lock_iothread();
>      int r = sclp_service_call(env, r1, r2);
>      if (r < 0) {
>          program_interrupt(env, -r, 4);
> -        return 0;
> +        r = 0;
>      }
> +    qemu_mutex_unlock_iothread();
>      return r;
>  }
>
> diff --git a/translate-all.c b/translate-all.c
> index 8a861cb583..f810259c41 100644
> --- a/translate-all.c
> +++ b/translate-all.c
> @@ -55,6 +55,7 @@
>  #include "translate-all.h"
>  #include "qemu/bitmap.h"
>  #include "qemu/timer.h"
> +#include "qemu/main-loop.h"
>  #include "exec/log.h"
>
>  /* #define DEBUG_TB_INVALIDATE */
> @@ -1523,7 +1524,7 @@ void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end,
>  #ifdef CONFIG_SOFTMMU
>  /* len must be <= 8 and start must be a multiple of len.
>   * Called via softmmu_template.h when code areas are written to with
> - * tb_lock held.
> + * iothread mutex not held.
>   */
>  void tb_invalidate_phys_page_fast(tb_page_addr_t start, int len)
>  {
> @@ -1725,7 +1726,10 @@ void tb_check_watchpoint(CPUState *cpu)
>
>  #ifndef CONFIG_USER_ONLY
>  /* in deterministic execution mode, instructions doing device I/Os
> -   must be at the end of the TB */
> + * must be at the end of the TB.
> + *
> + * Called by softmmu_template.h, with iothread mutex not held.
> + */
>  void cpu_io_recompile(CPUState *cpu, uintptr_t retaddr)
>  {
>  #if defined(TARGET_MIPS) || defined(TARGET_SH4)
> @@ -1937,6 +1941,7 @@ void dump_opcount_info(FILE *f, fprintf_function cpu_fprintf)
>
>  void cpu_interrupt(CPUState *cpu, int mask)
>  {
> +    g_assert(qemu_mutex_iothread_locked());
>      cpu->interrupt_request |= mask;
>      cpu->tcg_exit_req = 1;
>  }
> diff --git a/translate-common.c b/translate-common.c
> index 5e989cdf70..d504dd0d33 100644
> --- a/translate-common.c
> +++ b/translate-common.c
> @@ -21,6 +21,7 @@
>  #include "qemu-common.h"
>  #include "qom/cpu.h"
>  #include "sysemu/cpus.h"
> +#include "qemu/main-loop.h"
>
>  uintptr_t qemu_real_host_page_size;
>  intptr_t qemu_real_host_page_mask;
> @@ -30,6 +31,7 @@ intptr_t qemu_real_host_page_mask;
>  static void tcg_handle_interrupt(CPUState *cpu, int mask)
>  {
>      int old_mask;
> +    g_assert(qemu_mutex_iothread_locked());
>
>      old_mask = cpu->interrupt_request;
>      cpu->interrupt_request |= mask;
> @@ -40,17 +42,16 @@ static void tcg_handle_interrupt(CPUState *cpu, int mask)
>       */
>      if (!qemu_cpu_is_self(cpu)) {
>          qemu_cpu_kick(cpu);
> -        return;
> -    }
> -
> -    if (use_icount) {
> -        cpu->icount_decr.u16.high = 0xffff;
> -        if (!cpu->can_do_io
> -            && (mask & ~old_mask) != 0) {
> -            cpu_abort(cpu, "Raised interrupt while not in I/O function");
> -        }
>      } else {
> -        cpu->tcg_exit_req = 1;
> +        if (use_icount) {
> +            cpu->icount_decr.u16.high = 0xffff;
> +            if (!cpu->can_do_io
> +                && (mask & ~old_mask) != 0) {
> +                cpu_abort(cpu, "Raised interrupt while not in I/O function");
> +            }
> +        } else {
> +            cpu->tcg_exit_req = 1;
> +        }
>      }
>  }
>
> --
> 2.11.0
>
>


* Re: [Qemu-devel] [PULL 11/24] tcg: enable thread-per-vCPU
  2017-02-24 11:20 ` [Qemu-devel] [PULL 11/24] tcg: enable thread-per-vCPU Alex Bennée
@ 2017-02-27 12:48   ` Laurent Vivier
  2017-02-27 14:38     ` Alex Bennée
  0 siblings, 1 reply; 55+ messages in thread
From: Laurent Vivier @ 2017-02-27 12:48 UTC (permalink / raw)
  To: Alex Bennée, peter.maydell
  Cc: Peter Crosthwaite, qemu-devel, Paolo Bonzini, KONRAD Frederic,
	Richard Henderson

On 24/02/2017 at 12:20, Alex Bennée wrote:
> There are a couple of changes that occur at the same time here:
> 
>   - introduce a single vCPU qemu_tcg_cpu_thread_fn
> 
>   One of these is spawned per vCPU with its own Thread and Condition
>   variables. qemu_tcg_rr_cpu_thread_fn is the new name for the old
>   single threaded function.
> 
>   - the TLS current_cpu variable is now live for the lifetime of MTTCG
>     vCPU threads. This is for future work where async jobs need to know
>     the vCPU context they are operating in.
> 
> The user can switch on multi-thread behaviour and spawn a thread
> per-vCPU. For a simple kvm-unit-test like:
> 
>   ./arm/run ./arm/locking-test.flat -smp 4 -accel tcg,thread=multi
> 
> Will now use 4 vCPU threads and have an expected FAIL (instead of the
> unexpected PASS) as the default mode of the test has no protection when
> incrementing a shared variable.
> 
> We enable the parallel_cpus flag to ensure we generate correct barrier
> and atomic code if supported by the front and backends. This doesn't
> automatically enable MTTCG until default_mttcg_enabled() is updated to
> check the configuration is supported.

This commit breaks linux-user mode:

debian-8 with qemu-ppc on x86_64 with ltp-full-20170116

cd /opt/ltp
./runltp -p -l "qemu-$(date +%FT%T).log" -f /opt/ltp/runtest/syscalls -s
setgroups03

setgroups03    1  TPASS  :  setgroups(65537) fails, Size is >
sysconf(_SC_NGROUPS_MAX), errno=22
qemu-ppc: /home/laurent/Projects/qemu/include/qemu/rcu.h:89:
rcu_read_unlock: Assertion `p_rcu_reader->depth != 0' failed.
qemu-ppc: /home/laurent/Projects/qemu/include/qemu/rcu.h:89:
rcu_read_unlock: Assertion `p_rcu_reader->depth != 0' failed.
qemu-ppc: /home/laurent/Projects/qemu/include/qemu/rcu.h:89:
rcu_read_unlock: Assertion `p_rcu_reader->depth != 0' failed.
...

Laurent


* Re: [Qemu-devel] [PULL 11/24] tcg: enable thread-per-vCPU
  2017-02-27 12:48   ` Laurent Vivier
@ 2017-02-27 14:38     ` Alex Bennée
  2017-03-13 14:03       ` Laurent Vivier
  0 siblings, 1 reply; 55+ messages in thread
From: Alex Bennée @ 2017-02-27 14:38 UTC (permalink / raw)
  To: Laurent Vivier
  Cc: peter.maydell, Peter Crosthwaite, qemu-devel, Paolo Bonzini,
	KONRAD Frederic, Richard Henderson


Laurent Vivier <laurent@vivier.eu> writes:

> On 24/02/2017 12:20, Alex Bennée wrote:
>> There are a couple of changes that occur at the same time here:
>>
>>   - introduce a single vCPU qemu_tcg_cpu_thread_fn
>>
>>   One of these is spawned per vCPU with its own Thread and Condition
>>   variables. qemu_tcg_rr_cpu_thread_fn is the new name for the old
>>   single threaded function.
>>
>>   - the TLS current_cpu variable is now live for the lifetime of MTTCG
>>     vCPU threads. This is for future work where async jobs need to know
>>     the vCPU context they are operating in.
>>
>> The user can now switch on multi-thread behaviour, spawning a thread
>> per-vCPU. For a simple kvm-unit-test like:
>>
>>   ./arm/run ./arm/locking-test.flat -smp 4 -accel tcg,thread=multi
>>
>> the run will now use 4 vCPU threads and report an expected FAIL
>> (instead of an unexpected PASS), as the default mode of the test has no
>> protection when incrementing a shared variable.
>>
>> We enable the parallel_cpus flag to ensure we generate correct barrier
>> and atomic code if supported by the front and backends. This doesn't
>> automatically enable MTTCG until default_mttcg_enabled() is updated to
>> check the configuration is supported.
>
> This commit breaks linux-user mode:
>
> debian-8 with qemu-ppc on x86_64 with ltp-full-20170116
>
> cd /opt/ltp
> ./runltp -p -l "qemu-$(date +%FT%T).log" -f /opt/ltp/runtest/syscalls -s
> setgroups03
>
> setgroups03    1  TPASS  :  setgroups(65537) fails, Size is >
> sysconf(_SC_NGROUPS_MAX), errno=22
> qemu-ppc: /home/laurent/Projects/qemu/include/qemu/rcu.h:89:
> rcu_read_unlock: Assertion `p_rcu_reader->depth != 0' failed.
> qemu-ppc: /home/laurent/Projects/qemu/include/qemu/rcu.h:89:
> rcu_read_unlock: Assertion `p_rcu_reader->depth != 0' failed.
> qemu-ppc: /home/laurent/Projects/qemu/include/qemu/rcu.h:89:
> rcu_read_unlock: Assertion `p_rcu_reader->depth != 0' failed.
> ...

Interesting. I can only think the current_cpu change has broken it
because most of the changes in this commit affect softmmu targets only
(linux-user has its own run loop).

Thanks for the report - I'll look into it.


--
Alex Bennée


* Re: [Qemu-devel] [PULL 08/24] tcg: drop global lock during TCG code execution
  2017-02-27 12:48   ` Laurent Desnogues
@ 2017-02-27 14:39     ` Alex Bennée
  2017-03-03 20:59       ` Aaron Lindsay
  0 siblings, 1 reply; 55+ messages in thread
From: Alex Bennée @ 2017-02-27 14:39 UTC (permalink / raw)
  To: Laurent Desnogues; +Cc: Jan Kiszka, qemu-devel, open list:ARM cores


Laurent Desnogues <laurent.desnogues@gmail.com> writes:

> Hello,
>
> On Fri, Feb 24, 2017 at 12:20 PM, Alex Bennée <alex.bennee@linaro.org> wrote:
>> From: Jan Kiszka <jan.kiszka@siemens.com>
>>
>> This finally allows TCG to benefit from the iothread introduction: Drop
>> the global mutex while running pure TCG CPU code. Reacquire the lock
>> when entering MMIO or PIO emulation, or when leaving the TCG loop.
>>
>> We have to revert a few optimizations for the current TCG threading
>> model, namely kicking the TCG thread in qemu_mutex_lock_iothread and not
>> kicking it in qemu_cpu_kick. We also need to disable RAM block
>> reordering until we have a more efficient locking mechanism at hand.
>>
>> Still, a Linux x86 UP guest and my Musicpal ARM model boot fine here.
>> These numbers demonstrate where we gain something:
>>
>> 20338 jan       20   0  331m  75m 6904 R   99  0.9   0:50.95 qemu-system-arm
>> 20337 jan       20   0  331m  75m 6904 S   20  0.9   0:26.50 qemu-system-arm
>>
>> The guest CPU was fully loaded, but the iothread could still run mostly
>> independently on a second core. Without the patch we don't get beyond
>>
>> 32206 jan       20   0  330m  73m 7036 R   82  0.9   1:06.00 qemu-system-arm
>> 32204 jan       20   0  330m  73m 7036 S   21  0.9   0:17.03 qemu-system-arm
>>
>> We don't benefit significantly, though, when the guest is not fully
>> loading a host CPU.
>
> I tried this patch (8d04fb55 in the repository) with the following image:
>
>    http://wiki.qemu.org/download/arm-test-0.2.tar.gz
>
> Running the image with no option works fine.  But specifying '-icount
> 1' results in a (guest) deadlock. Enabling some heavy logging (-d
> in_asm,exec) sometimes results in a 'Bad ram offset'.
>
> Is it expected that this patch breaks -icount?

Not really. Using icount will disable MTTCG and run single-threaded as
before. Paolo reported another icount failure so they may be related. I
shall have a look at it.

Thanks for the report.

>
> Thanks,
>
> Laurent
>
> PS - To clarify 791158d9 works.
>
>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>> Message-Id: <1439220437-23957-10-git-send-email-fred.konrad@greensocs.com>
>> [FK: Rebase, fix qemu_devices_reset deadlock, rm address_space_* mutex]
>> Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
>> [EGC: fixed iothread lock for cpu-exec IRQ handling]
>> Signed-off-by: Emilio G. Cota <cota@braap.org>
>> [AJB: -smp single-threaded fix, clean commit msg, BQL fixes]
>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> Reviewed-by: Richard Henderson <rth@twiddle.net>
>> Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
>> [PM: target-arm changes]
>> Acked-by: Peter Maydell <peter.maydell@linaro.org>
>> ---
>>  cpu-exec.c                 | 23 +++++++++++++++++++++--
>>  cpus.c                     | 28 +++++-----------------------
>>  cputlb.c                   | 21 ++++++++++++++++++++-
>>  exec.c                     | 12 +++++++++---
>>  hw/core/irq.c              |  1 +
>>  hw/i386/kvmvapic.c         |  4 ++--
>>  hw/intc/arm_gicv3_cpuif.c  |  3 +++
>>  hw/ppc/ppc.c               | 16 +++++++++++++++-
>>  hw/ppc/spapr.c             |  3 +++
>>  include/qom/cpu.h          |  1 +
>>  memory.c                   |  2 ++
>>  qom/cpu.c                  | 10 ++++++++++
>>  target/arm/helper.c        |  6 ++++++
>>  target/arm/op_helper.c     | 43 +++++++++++++++++++++++++++++++++++++++----
>>  target/i386/smm_helper.c   |  7 +++++++
>>  target/s390x/misc_helper.c |  5 ++++-
>>  translate-all.c            |  9 +++++++--
>>  translate-common.c         | 21 +++++++++++----------
>>  18 files changed, 166 insertions(+), 49 deletions(-)
>>
>> diff --git a/cpu-exec.c b/cpu-exec.c
>> index 06a6b25564..1bd3d72002 100644
>> --- a/cpu-exec.c
>> +++ b/cpu-exec.c
>> @@ -29,6 +29,7 @@
>>  #include "qemu/rcu.h"
>>  #include "exec/tb-hash.h"
>>  #include "exec/log.h"
>> +#include "qemu/main-loop.h"
>>  #if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY)
>>  #include "hw/i386/apic.h"
>>  #endif
>> @@ -388,8 +389,10 @@ static inline bool cpu_handle_halt(CPUState *cpu)
>>          if ((cpu->interrupt_request & CPU_INTERRUPT_POLL)
>>              && replay_interrupt()) {
>>              X86CPU *x86_cpu = X86_CPU(cpu);
>> +            qemu_mutex_lock_iothread();
>>              apic_poll_irq(x86_cpu->apic_state);
>>              cpu_reset_interrupt(cpu, CPU_INTERRUPT_POLL);
>> +            qemu_mutex_unlock_iothread();
>>          }
>>  #endif
>>          if (!cpu_has_work(cpu)) {
>> @@ -443,7 +446,9 @@ static inline bool cpu_handle_exception(CPUState *cpu, int *ret)
>>  #else
>>              if (replay_exception()) {
>>                  CPUClass *cc = CPU_GET_CLASS(cpu);
>> +                qemu_mutex_lock_iothread();
>>                  cc->do_interrupt(cpu);
>> +                qemu_mutex_unlock_iothread();
>>                  cpu->exception_index = -1;
>>              } else if (!replay_has_interrupt()) {
>>                  /* give a chance to iothread in replay mode */
>> @@ -469,9 +474,11 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
>>                                          TranslationBlock **last_tb)
>>  {
>>      CPUClass *cc = CPU_GET_CLASS(cpu);
>> -    int interrupt_request = cpu->interrupt_request;
>>
>> -    if (unlikely(interrupt_request)) {
>> +    if (unlikely(atomic_read(&cpu->interrupt_request))) {
>> +        int interrupt_request;
>> +        qemu_mutex_lock_iothread();
>> +        interrupt_request = cpu->interrupt_request;
>>          if (unlikely(cpu->singlestep_enabled & SSTEP_NOIRQ)) {
>>              /* Mask out external interrupts for this step. */
>>              interrupt_request &= ~CPU_INTERRUPT_SSTEP_MASK;
>> @@ -479,6 +486,7 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
>>          if (interrupt_request & CPU_INTERRUPT_DEBUG) {
>>              cpu->interrupt_request &= ~CPU_INTERRUPT_DEBUG;
>>              cpu->exception_index = EXCP_DEBUG;
>> +            qemu_mutex_unlock_iothread();
>>              return true;
>>          }
>>          if (replay_mode == REPLAY_MODE_PLAY && !replay_has_interrupt()) {
>> @@ -488,6 +496,7 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
>>              cpu->interrupt_request &= ~CPU_INTERRUPT_HALT;
>>              cpu->halted = 1;
>>              cpu->exception_index = EXCP_HLT;
>> +            qemu_mutex_unlock_iothread();
>>              return true;
>>          }
>>  #if defined(TARGET_I386)
>> @@ -498,12 +507,14 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
>>              cpu_svm_check_intercept_param(env, SVM_EXIT_INIT, 0, 0);
>>              do_cpu_init(x86_cpu);
>>              cpu->exception_index = EXCP_HALTED;
>> +            qemu_mutex_unlock_iothread();
>>              return true;
>>          }
>>  #else
>>          else if (interrupt_request & CPU_INTERRUPT_RESET) {
>>              replay_interrupt();
>>              cpu_reset(cpu);
>> +            qemu_mutex_unlock_iothread();
>>              return true;
>>          }
>>  #endif
>> @@ -526,7 +537,12 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
>>                 the program flow was changed */
>>              *last_tb = NULL;
>>          }
>> +
>> +        /* If we exit via cpu_loop_exit/longjmp it is reset in cpu_exec */
>> +        qemu_mutex_unlock_iothread();
>>      }
>> +
>> +
>>      if (unlikely(atomic_read(&cpu->exit_request) || replay_has_interrupt())) {
>>          atomic_set(&cpu->exit_request, 0);
>>          cpu->exception_index = EXCP_INTERRUPT;
>> @@ -643,6 +659,9 @@ int cpu_exec(CPUState *cpu)
>>  #endif /* buggy compiler */
>>          cpu->can_do_io = 1;
>>          tb_lock_reset();
>> +        if (qemu_mutex_iothread_locked()) {
>> +            qemu_mutex_unlock_iothread();
>> +        }
>>      }
>>
>>      /* if an exception is pending, we execute it here */
>> diff --git a/cpus.c b/cpus.c
>> index 860034a794..0ae8f69be5 100644
>> --- a/cpus.c
>> +++ b/cpus.c
>> @@ -1027,8 +1027,6 @@ static void qemu_kvm_init_cpu_signals(CPUState *cpu)
>>  #endif /* _WIN32 */
>>
>>  static QemuMutex qemu_global_mutex;
>> -static QemuCond qemu_io_proceeded_cond;
>> -static unsigned iothread_requesting_mutex;
>>
>>  static QemuThread io_thread;
>>
>> @@ -1042,7 +1040,6 @@ void qemu_init_cpu_loop(void)
>>      qemu_init_sigbus();
>>      qemu_cond_init(&qemu_cpu_cond);
>>      qemu_cond_init(&qemu_pause_cond);
>> -    qemu_cond_init(&qemu_io_proceeded_cond);
>>      qemu_mutex_init(&qemu_global_mutex);
>>
>>      qemu_thread_get_self(&io_thread);
>> @@ -1085,10 +1082,6 @@ static void qemu_tcg_wait_io_event(CPUState *cpu)
>>
>>      start_tcg_kick_timer();
>>
>> -    while (iothread_requesting_mutex) {
>> -        qemu_cond_wait(&qemu_io_proceeded_cond, &qemu_global_mutex);
>> -    }
>> -
>>      CPU_FOREACH(cpu) {
>>          qemu_wait_io_event_common(cpu);
>>      }
>> @@ -1249,9 +1242,11 @@ static int tcg_cpu_exec(CPUState *cpu)
>>          cpu->icount_decr.u16.low = decr;
>>          cpu->icount_extra = count;
>>      }
>> +    qemu_mutex_unlock_iothread();
>>      cpu_exec_start(cpu);
>>      ret = cpu_exec(cpu);
>>      cpu_exec_end(cpu);
>> +    qemu_mutex_lock_iothread();
>>  #ifdef CONFIG_PROFILER
>>      tcg_time += profile_getclock() - ti;
>>  #endif
>> @@ -1479,27 +1474,14 @@ bool qemu_mutex_iothread_locked(void)
>>
>>  void qemu_mutex_lock_iothread(void)
>>  {
>> -    atomic_inc(&iothread_requesting_mutex);
>> -    /* In the simple case there is no need to bump the VCPU thread out of
>> -     * TCG code execution.
>> -     */
>> -    if (!tcg_enabled() || qemu_in_vcpu_thread() ||
>> -        !first_cpu || !first_cpu->created) {
>> -        qemu_mutex_lock(&qemu_global_mutex);
>> -        atomic_dec(&iothread_requesting_mutex);
>> -    } else {
>> -        if (qemu_mutex_trylock(&qemu_global_mutex)) {
>> -            qemu_cpu_kick_rr_cpu();
>> -            qemu_mutex_lock(&qemu_global_mutex);
>> -        }
>> -        atomic_dec(&iothread_requesting_mutex);
>> -        qemu_cond_broadcast(&qemu_io_proceeded_cond);
>> -    }
>> +    g_assert(!qemu_mutex_iothread_locked());
>> +    qemu_mutex_lock(&qemu_global_mutex);
>>      iothread_locked = true;
>>  }
>>
>>  void qemu_mutex_unlock_iothread(void)
>>  {
>> +    g_assert(qemu_mutex_iothread_locked());
>>      iothread_locked = false;
>>      qemu_mutex_unlock(&qemu_global_mutex);
>>  }
>> diff --git a/cputlb.c b/cputlb.c
>> index 6c39927455..1cc9d9da51 100644
>> --- a/cputlb.c
>> +++ b/cputlb.c
>> @@ -18,6 +18,7 @@
>>   */
>>
>>  #include "qemu/osdep.h"
>> +#include "qemu/main-loop.h"
>>  #include "cpu.h"
>>  #include "exec/exec-all.h"
>>  #include "exec/memory.h"
>> @@ -495,6 +496,7 @@ static uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
>>      hwaddr physaddr = iotlbentry->addr;
>>      MemoryRegion *mr = iotlb_to_region(cpu, physaddr, iotlbentry->attrs);
>>      uint64_t val;
>> +    bool locked = false;
>>
>>      physaddr = (physaddr & TARGET_PAGE_MASK) + addr;
>>      cpu->mem_io_pc = retaddr;
>> @@ -503,7 +505,16 @@ static uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
>>      }
>>
>>      cpu->mem_io_vaddr = addr;
>> +
>> +    if (mr->global_locking) {
>> +        qemu_mutex_lock_iothread();
>> +        locked = true;
>> +    }
>>      memory_region_dispatch_read(mr, physaddr, &val, size, iotlbentry->attrs);
>> +    if (locked) {
>> +        qemu_mutex_unlock_iothread();
>> +    }
>> +
>>      return val;
>>  }
>>
>> @@ -514,15 +525,23 @@ static void io_writex(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
>>      CPUState *cpu = ENV_GET_CPU(env);
>>      hwaddr physaddr = iotlbentry->addr;
>>      MemoryRegion *mr = iotlb_to_region(cpu, physaddr, iotlbentry->attrs);
>> +    bool locked = false;
>>
>>      physaddr = (physaddr & TARGET_PAGE_MASK) + addr;
>>      if (mr != &io_mem_rom && mr != &io_mem_notdirty && !cpu->can_do_io) {
>>          cpu_io_recompile(cpu, retaddr);
>>      }
>> -
>>      cpu->mem_io_vaddr = addr;
>>      cpu->mem_io_pc = retaddr;
>> +
>> +    if (mr->global_locking) {
>> +        qemu_mutex_lock_iothread();
>> +        locked = true;
>> +    }
>>      memory_region_dispatch_write(mr, physaddr, val, size, iotlbentry->attrs);
>> +    if (locked) {
>> +        qemu_mutex_unlock_iothread();
>> +    }
>>  }
>>
>>  /* Return true if ADDR is present in the victim tlb, and has been copied
>> diff --git a/exec.c b/exec.c
>> index 865a1e8295..3adf2b1861 100644
>> --- a/exec.c
>> +++ b/exec.c
>> @@ -2134,9 +2134,9 @@ static void check_watchpoint(int offset, int len, MemTxAttrs attrs, int flags)
>>                  }
>>                  cpu->watchpoint_hit = wp;
>>
>> -                /* The tb_lock will be reset when cpu_loop_exit or
>> -                 * cpu_loop_exit_noexc longjmp back into the cpu_exec
>> -                 * main loop.
>> +                /* Both tb_lock and iothread_mutex will be reset when
>> +                 * cpu_loop_exit or cpu_loop_exit_noexc longjmp
>> +                 * back into the cpu_exec main loop.
>>                   */
>>                  tb_lock();
>>                  tb_check_watchpoint(cpu);
>> @@ -2371,8 +2371,14 @@ static void io_mem_init(void)
>>      memory_region_init_io(&io_mem_rom, NULL, &unassigned_mem_ops, NULL, NULL, UINT64_MAX);
>>      memory_region_init_io(&io_mem_unassigned, NULL, &unassigned_mem_ops, NULL,
>>                            NULL, UINT64_MAX);
>> +
>> +    /* io_mem_notdirty calls tb_invalidate_phys_page_fast,
>> +     * which can be called without the iothread mutex.
>> +     */
>>      memory_region_init_io(&io_mem_notdirty, NULL, &notdirty_mem_ops, NULL,
>>                            NULL, UINT64_MAX);
>> +    memory_region_clear_global_locking(&io_mem_notdirty);
>> +
>>      memory_region_init_io(&io_mem_watch, NULL, &watch_mem_ops, NULL,
>>                            NULL, UINT64_MAX);
>>  }
>> diff --git a/hw/core/irq.c b/hw/core/irq.c
>> index 49ff2e64fe..b98d1d69f5 100644
>> --- a/hw/core/irq.c
>> +++ b/hw/core/irq.c
>> @@ -22,6 +22,7 @@
>>   * THE SOFTWARE.
>>   */
>>  #include "qemu/osdep.h"
>> +#include "qemu/main-loop.h"
>>  #include "qemu-common.h"
>>  #include "hw/irq.h"
>>  #include "qom/object.h"
>> diff --git a/hw/i386/kvmvapic.c b/hw/i386/kvmvapic.c
>> index 7135633863..82a49556af 100644
>> --- a/hw/i386/kvmvapic.c
>> +++ b/hw/i386/kvmvapic.c
>> @@ -457,8 +457,8 @@ static void patch_instruction(VAPICROMState *s, X86CPU *cpu, target_ulong ip)
>>      resume_all_vcpus();
>>
>>      if (!kvm_enabled()) {
>> -        /* tb_lock will be reset when cpu_loop_exit_noexc longjmps
>> -         * back into the cpu_exec loop. */
>> +        /* Both tb_lock and iothread_mutex will be reset when
>> +         *  longjmps back into the cpu_exec loop. */
>>          tb_lock();
>>          tb_gen_code(cs, current_pc, current_cs_base, current_flags, 1);
>>          cpu_loop_exit_noexc(cs);
>> diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
>> index c25ee03556..f775aba507 100644
>> --- a/hw/intc/arm_gicv3_cpuif.c
>> +++ b/hw/intc/arm_gicv3_cpuif.c
>> @@ -14,6 +14,7 @@
>>
>>  #include "qemu/osdep.h"
>>  #include "qemu/bitops.h"
>> +#include "qemu/main-loop.h"
>>  #include "trace.h"
>>  #include "gicv3_internal.h"
>>  #include "cpu.h"
>> @@ -733,6 +734,8 @@ void gicv3_cpuif_update(GICv3CPUState *cs)
>>      ARMCPU *cpu = ARM_CPU(cs->cpu);
>>      CPUARMState *env = &cpu->env;
>>
>> +    g_assert(qemu_mutex_iothread_locked());
>> +
>>      trace_gicv3_cpuif_update(gicv3_redist_affid(cs), cs->hppi.irq,
>>                               cs->hppi.grp, cs->hppi.prio);
>>
>> diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
>> index d171e60b5c..5f93083d4a 100644
>> --- a/hw/ppc/ppc.c
>> +++ b/hw/ppc/ppc.c
>> @@ -62,7 +62,16 @@ void ppc_set_irq(PowerPCCPU *cpu, int n_IRQ, int level)
>>  {
>>      CPUState *cs = CPU(cpu);
>>      CPUPPCState *env = &cpu->env;
>> -    unsigned int old_pending = env->pending_interrupts;
>> +    unsigned int old_pending;
>> +    bool locked = false;
>> +
>> +    /* We may already have the BQL if coming from the reset path */
>> +    if (!qemu_mutex_iothread_locked()) {
>> +        locked = true;
>> +        qemu_mutex_lock_iothread();
>> +    }
>> +
>> +    old_pending = env->pending_interrupts;
>>
>>      if (level) {
>>          env->pending_interrupts |= 1 << n_IRQ;
>> @@ -80,9 +89,14 @@ void ppc_set_irq(PowerPCCPU *cpu, int n_IRQ, int level)
>>  #endif
>>      }
>>
>> +
>>      LOG_IRQ("%s: %p n_IRQ %d level %d => pending %08" PRIx32
>>                  "req %08x\n", __func__, env, n_IRQ, level,
>>                  env->pending_interrupts, CPU(cpu)->interrupt_request);
>> +
>> +    if (locked) {
>> +        qemu_mutex_unlock_iothread();
>> +    }
>>  }
>>
>>  /* PowerPC 6xx / 7xx internal IRQ controller */
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index e465d7ac98..b1e374f3f9 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -1010,6 +1010,9 @@ static void emulate_spapr_hypercall(PPCVirtualHypervisor *vhyp,
>>  {
>>      CPUPPCState *env = &cpu->env;
>>
>> +    /* The TCG path should also be holding the BQL at this point */
>> +    g_assert(qemu_mutex_iothread_locked());
>> +
>>      if (msr_pr) {
>>          hcall_dprintf("Hypercall made with MSR[PR]=1\n");
>>          env->gpr[3] = H_PRIVILEGE;
>> diff --git a/include/qom/cpu.h b/include/qom/cpu.h
>> index 2cf4ecf144..10db89b16a 100644
>> --- a/include/qom/cpu.h
>> +++ b/include/qom/cpu.h
>> @@ -329,6 +329,7 @@ struct CPUState {
>>      bool unplug;
>>      bool crash_occurred;
>>      bool exit_request;
>> +    /* updates protected by BQL */
>>      uint32_t interrupt_request;
>>      int singlestep_enabled;
>>      int64_t icount_extra;
>> diff --git a/memory.c b/memory.c
>> index ed8b5aa83e..d61caee867 100644
>> --- a/memory.c
>> +++ b/memory.c
>> @@ -917,6 +917,8 @@ void memory_region_transaction_commit(void)
>>      AddressSpace *as;
>>
>>      assert(memory_region_transaction_depth);
>> +    assert(qemu_mutex_iothread_locked());
>> +
>>      --memory_region_transaction_depth;
>>      if (!memory_region_transaction_depth) {
>>          if (memory_region_update_pending) {
>> diff --git a/qom/cpu.c b/qom/cpu.c
>> index ed87c50cea..58784bcbea 100644
>> --- a/qom/cpu.c
>> +++ b/qom/cpu.c
>> @@ -113,9 +113,19 @@ static void cpu_common_get_memory_mapping(CPUState *cpu,
>>      error_setg(errp, "Obtaining memory mappings is unsupported on this CPU.");
>>  }
>>
>> +/* Resetting the IRQ comes from across the code base so we take the
>> + * BQL here if we need to.  cpu_interrupt assumes it is held.*/
>>  void cpu_reset_interrupt(CPUState *cpu, int mask)
>>  {
>> +    bool need_lock = !qemu_mutex_iothread_locked();
>> +
>> +    if (need_lock) {
>> +        qemu_mutex_lock_iothread();
>> +    }
>>      cpu->interrupt_request &= ~mask;
>> +    if (need_lock) {
>> +        qemu_mutex_unlock_iothread();
>> +    }
>>  }
>>
>>  void cpu_exit(CPUState *cpu)
>> diff --git a/target/arm/helper.c b/target/arm/helper.c
>> index 47250bcf16..753a69d40d 100644
>> --- a/target/arm/helper.c
>> +++ b/target/arm/helper.c
>> @@ -6769,6 +6769,12 @@ void arm_cpu_do_interrupt(CPUState *cs)
>>          arm_cpu_do_interrupt_aarch32(cs);
>>      }
>>
>> +    /* Hooks may change global state so BQL should be held, also the
>> +     * BQL needs to be held for any modification of
>> +     * cs->interrupt_request.
>> +     */
>> +    g_assert(qemu_mutex_iothread_locked());
>> +
>>      arm_call_el_change_hook(cpu);
>>
>>      if (!kvm_enabled()) {
>> diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
>> index fb366fdc35..5f3e3bdae2 100644
>> --- a/target/arm/op_helper.c
>> +++ b/target/arm/op_helper.c
>> @@ -18,6 +18,7 @@
>>   */
>>  #include "qemu/osdep.h"
>>  #include "qemu/log.h"
>> +#include "qemu/main-loop.h"
>>  #include "cpu.h"
>>  #include "exec/helper-proto.h"
>>  #include "internals.h"
>> @@ -487,7 +488,9 @@ void HELPER(cpsr_write_eret)(CPUARMState *env, uint32_t val)
>>       */
>>      env->regs[15] &= (env->thumb ? ~1 : ~3);
>>
>> +    qemu_mutex_lock_iothread();
>>      arm_call_el_change_hook(arm_env_get_cpu(env));
>> +    qemu_mutex_unlock_iothread();
>>  }
>>
>>  /* Access to user mode registers from privileged modes.  */
>> @@ -735,28 +738,58 @@ void HELPER(set_cp_reg)(CPUARMState *env, void *rip, uint32_t value)
>>  {
>>      const ARMCPRegInfo *ri = rip;
>>
>> -    ri->writefn(env, ri, value);
>> +    if (ri->type & ARM_CP_IO) {
>> +        qemu_mutex_lock_iothread();
>> +        ri->writefn(env, ri, value);
>> +        qemu_mutex_unlock_iothread();
>> +    } else {
>> +        ri->writefn(env, ri, value);
>> +    }
>>  }
>>
>>  uint32_t HELPER(get_cp_reg)(CPUARMState *env, void *rip)
>>  {
>>      const ARMCPRegInfo *ri = rip;
>> +    uint32_t res;
>>
>> -    return ri->readfn(env, ri);
>> +    if (ri->type & ARM_CP_IO) {
>> +        qemu_mutex_lock_iothread();
>> +        res = ri->readfn(env, ri);
>> +        qemu_mutex_unlock_iothread();
>> +    } else {
>> +        res = ri->readfn(env, ri);
>> +    }
>> +
>> +    return res;
>>  }
>>
>>  void HELPER(set_cp_reg64)(CPUARMState *env, void *rip, uint64_t value)
>>  {
>>      const ARMCPRegInfo *ri = rip;
>>
>> -    ri->writefn(env, ri, value);
>> +    if (ri->type & ARM_CP_IO) {
>> +        qemu_mutex_lock_iothread();
>> +        ri->writefn(env, ri, value);
>> +        qemu_mutex_unlock_iothread();
>> +    } else {
>> +        ri->writefn(env, ri, value);
>> +    }
>>  }
>>
>>  uint64_t HELPER(get_cp_reg64)(CPUARMState *env, void *rip)
>>  {
>>      const ARMCPRegInfo *ri = rip;
>> +    uint64_t res;
>> +
>> +    if (ri->type & ARM_CP_IO) {
>> +        qemu_mutex_lock_iothread();
>> +        res = ri->readfn(env, ri);
>> +        qemu_mutex_unlock_iothread();
>> +    } else {
>> +        res = ri->readfn(env, ri);
>> +    }
>>
>> -    return ri->readfn(env, ri);
>> +    return res;
>>  }
>>
>>  void HELPER(msr_i_pstate)(CPUARMState *env, uint32_t op, uint32_t imm)
>> @@ -989,7 +1022,9 @@ void HELPER(exception_return)(CPUARMState *env)
>>                        cur_el, new_el, env->pc);
>>      }
>>
>> +    qemu_mutex_lock_iothread();
>>      arm_call_el_change_hook(arm_env_get_cpu(env));
>> +    qemu_mutex_unlock_iothread();
>>
>>      return;
>>
>> diff --git a/target/i386/smm_helper.c b/target/i386/smm_helper.c
>> index 4dd6a2c544..f051a77c4a 100644
>> --- a/target/i386/smm_helper.c
>> +++ b/target/i386/smm_helper.c
>> @@ -18,6 +18,7 @@
>>   */
>>
>>  #include "qemu/osdep.h"
>> +#include "qemu/main-loop.h"
>>  #include "cpu.h"
>>  #include "exec/helper-proto.h"
>>  #include "exec/log.h"
>> @@ -42,11 +43,14 @@ void helper_rsm(CPUX86State *env)
>>  #define SMM_REVISION_ID 0x00020000
>>  #endif
>>
>> +/* Called with iothread lock taken */
>>  void cpu_smm_update(X86CPU *cpu)
>>  {
>>      CPUX86State *env = &cpu->env;
>>      bool smm_enabled = (env->hflags & HF_SMM_MASK);
>>
>> +    g_assert(qemu_mutex_iothread_locked());
>> +
>>      if (cpu->smram) {
>>          memory_region_set_enabled(cpu->smram, smm_enabled);
>>      }
>> @@ -333,7 +337,10 @@ void helper_rsm(CPUX86State *env)
>>      }
>>      env->hflags2 &= ~HF2_SMM_INSIDE_NMI_MASK;
>>      env->hflags &= ~HF_SMM_MASK;
>> +
>> +    qemu_mutex_lock_iothread();
>>      cpu_smm_update(cpu);
>> +    qemu_mutex_unlock_iothread();
>>
>>      qemu_log_mask(CPU_LOG_INT, "SMM: after RSM\n");
>>      log_cpu_state_mask(CPU_LOG_INT, CPU(cpu), CPU_DUMP_CCOP);
>> diff --git a/target/s390x/misc_helper.c b/target/s390x/misc_helper.c
>> index c9604ea9c7..3cb942e8bb 100644
>> --- a/target/s390x/misc_helper.c
>> +++ b/target/s390x/misc_helper.c
>> @@ -25,6 +25,7 @@
>>  #include "exec/helper-proto.h"
>>  #include "sysemu/kvm.h"
>>  #include "qemu/timer.h"
>> +#include "qemu/main-loop.h"
>>  #include "exec/address-spaces.h"
>>  #ifdef CONFIG_KVM
>>  #include <linux/kvm.h>
>> @@ -109,11 +110,13 @@ void program_interrupt(CPUS390XState *env, uint32_t code, int ilen)
>>  /* SCLP service call */
>>  uint32_t HELPER(servc)(CPUS390XState *env, uint64_t r1, uint64_t r2)
>>  {
>> +    qemu_mutex_lock_iothread();
>>      int r = sclp_service_call(env, r1, r2);
>>      if (r < 0) {
>>          program_interrupt(env, -r, 4);
>> -        return 0;
>> +        r = 0;
>>      }
>> +    qemu_mutex_unlock_iothread();
>>      return r;
>>  }
>>
>> diff --git a/translate-all.c b/translate-all.c
>> index 8a861cb583..f810259c41 100644
>> --- a/translate-all.c
>> +++ b/translate-all.c
>> @@ -55,6 +55,7 @@
>>  #include "translate-all.h"
>>  #include "qemu/bitmap.h"
>>  #include "qemu/timer.h"
>> +#include "qemu/main-loop.h"
>>  #include "exec/log.h"
>>
>>  /* #define DEBUG_TB_INVALIDATE */
>> @@ -1523,7 +1524,7 @@ void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end,
>>  #ifdef CONFIG_SOFTMMU
>>  /* len must be <= 8 and start must be a multiple of len.
>>   * Called via softmmu_template.h when code areas are written to with
>> - * tb_lock held.
>> + * iothread mutex not held.
>>   */
>>  void tb_invalidate_phys_page_fast(tb_page_addr_t start, int len)
>>  {
>> @@ -1725,7 +1726,10 @@ void tb_check_watchpoint(CPUState *cpu)
>>
>>  #ifndef CONFIG_USER_ONLY
>>  /* in deterministic execution mode, instructions doing device I/Os
>> -   must be at the end of the TB */
>> + * must be at the end of the TB.
>> + *
>> + * Called by softmmu_template.h, with iothread mutex not held.
>> + */
>>  void cpu_io_recompile(CPUState *cpu, uintptr_t retaddr)
>>  {
>>  #if defined(TARGET_MIPS) || defined(TARGET_SH4)
>> @@ -1937,6 +1941,7 @@ void dump_opcount_info(FILE *f, fprintf_function cpu_fprintf)
>>
>>  void cpu_interrupt(CPUState *cpu, int mask)
>>  {
>> +    g_assert(qemu_mutex_iothread_locked());
>>      cpu->interrupt_request |= mask;
>>      cpu->tcg_exit_req = 1;
>>  }
>> diff --git a/translate-common.c b/translate-common.c
>> index 5e989cdf70..d504dd0d33 100644
>> --- a/translate-common.c
>> +++ b/translate-common.c
>> @@ -21,6 +21,7 @@
>>  #include "qemu-common.h"
>>  #include "qom/cpu.h"
>>  #include "sysemu/cpus.h"
>> +#include "qemu/main-loop.h"
>>
>>  uintptr_t qemu_real_host_page_size;
>>  intptr_t qemu_real_host_page_mask;
>> @@ -30,6 +31,7 @@ intptr_t qemu_real_host_page_mask;
>>  static void tcg_handle_interrupt(CPUState *cpu, int mask)
>>  {
>>      int old_mask;
>> +    g_assert(qemu_mutex_iothread_locked());
>>
>>      old_mask = cpu->interrupt_request;
>>      cpu->interrupt_request |= mask;
>> @@ -40,17 +42,16 @@ static void tcg_handle_interrupt(CPUState *cpu, int mask)
>>       */
>>      if (!qemu_cpu_is_self(cpu)) {
>>          qemu_cpu_kick(cpu);
>> -        return;
>> -    }
>> -
>> -    if (use_icount) {
>> -        cpu->icount_decr.u16.high = 0xffff;
>> -        if (!cpu->can_do_io
>> -            && (mask & ~old_mask) != 0) {
>> -            cpu_abort(cpu, "Raised interrupt while not in I/O function");
>> -        }
>>      } else {
>> -        cpu->tcg_exit_req = 1;
>> +        if (use_icount) {
>> +            cpu->icount_decr.u16.high = 0xffff;
>> +            if (!cpu->can_do_io
>> +                && (mask & ~old_mask) != 0) {
>> +                cpu_abort(cpu, "Raised interrupt while not in I/O function");
>> +            }
>> +        } else {
>> +            cpu->tcg_exit_req = 1;
>> +        }
>>      }
>>  }
>>
>> --
>> 2.11.0
>>
>>


--
Alex Bennée


* Re: [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement
  2017-02-27 12:39 ` Paolo Bonzini
@ 2017-02-27 15:48   ` Alex Bennée
  2017-02-27 16:17     ` Paolo Bonzini
  0 siblings, 1 reply; 55+ messages in thread
From: Alex Bennée @ 2017-02-27 15:48 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: peter.maydell, qemu-devel


Paolo Bonzini <pbonzini@redhat.com> writes:

> On 24/02/2017 12:20, Alex Bennée wrote:
>> The following changes since commit 2d896b454a0e19ec4c1ddbb0e0b65b7e54fcedf3:
>>
>>   Revert "hw/mips: MIPS Boston board support" (2017-02-23 18:04:45 +0000)
>>
>> are available in the git repository at:
>>
>>   https://github.com/stsquad/qemu.git tags/pull-mttcg-240217-1
>>
>> for you to fetch changes up to ca759f9e387db87e1719911f019bc60c74be9ed8:
>>
>>   tcg: enable MTTCG by default for ARM on x86 hosts (2017-02-24 10:32:46 +0000)
>>
>> ----------------------------------------------------------------
>> This is the MTTCG pull-request as posted yesterday.
>
> This breaks "-icount auto" on qemu-system-aarch64 with "-M virt" and
> AAVMF firmware, in two ways:
>
> 1) "-icount auto" doesn't work;

Currently the code does:

  static bool default_mttcg_enabled(void)
  {
      QemuOpts *icount_opts = qemu_find_opts_singleton("icount");
      const char *rr = qemu_opt_get(icount_opts, "rr");

      if (rr || TCG_OVERSIZED_GUEST) {
          return false;

I suspect I should just fail if any icount options are set. However
qemu_find_opts_singleton always returns the structure. How do I test
for any icount options?

>
> 2) "-icount auto -accel tcg,thread=single" hangs fairly early, printing
> this on the serial console.  It's okay if it hangs at
>
> [Bds]=============End Load Options Dumping=============
> [Bds]BdsWait ...Zzzzzzzzzzzz...
> [Bds]BdsWait(3)..Zzzz...
>
> (pressing Enter a few times then seems to unhang it), but it now hangs
> much earlier than that.

Hmm, I can see a hang on Linux booting like that. It looks like a vCPU
gets stuck in waitio and never schedules the others.

>
>
> Also, x86 "-accel tcg,thread=multi" prints the scary message on memory
> ordering.

That is expected until x86 is properly tested and we submit the default
enabling patch for x86 on x86. To be honest we could submit the MO patch
now to make that warning go away.

>
> Paolo
>
>> ----------------------------------------------------------------
>> Alex Bennée (18):
>>       docs: new design document multi-thread-tcg.txt
>>       tcg: move TCG_MO/BAR types into own file
>>       tcg: add kick timer for single-threaded vCPU emulation
>>       tcg: rename tcg_current_cpu to tcg_current_rr_cpu
>>       tcg: remove global exit_request
>>       tcg: enable tb_lock() for SoftMMU
>>       tcg: enable thread-per-vCPU
>>       cputlb: add assert_cpu_is_self checks
>>       cputlb: tweak qemu_ram_addr_from_host_nofail reporting
>>       cputlb and arm/sparc targets: convert mmuidx flushes from varg to bitmap
>>       cputlb: add tlb_flush_by_mmuidx async routines
>>       cputlb: atomically update tlb fields used by tlb_reset_dirty
>>       cputlb: introduce tlb_flush_*_all_cpus[_synced]
>>       target-arm/powerctl: defer cpu reset work to CPU context
>>       target-arm: don't generate WFE/YIELD calls for MTTCG
>>       target-arm: ensure all cross vCPUs TLB flushes complete
>>       hw/misc/imx6_src: defer clearing of SRC_SCR reset bits
>>       tcg: enable MTTCG by default for ARM on x86 hosts
>>
>> Jan Kiszka (1):
>>       tcg: drop global lock during TCG code execution
>>
>> KONRAD Frederic (2):
>>       tcg: add options for enabling MTTCG
>>       cputlb: introduce tlb_flush_* async work.
>>
>> Pranith Kumar (3):
>>       mttcg: translate-all: Enable locking debug in a debug build
>>       mttcg: Add missing tb_lock/unlock() in cpu_exec_step()
>>       tcg: handle EXCP_ATOMIC exception for system emulation
>>
>>  configure                  |   6 +
>>  cpu-exec-common.c          |   3 -
>>  cpu-exec.c                 |  89 ++++++---
>>  cpus.c                     | 345 ++++++++++++++++++++++++++-------
>>  cputlb.c                   | 463 +++++++++++++++++++++++++++++++++++++--------
>>  docs/multi-thread-tcg.txt  | 350 ++++++++++++++++++++++++++++++++++
>>  exec.c                     |  12 +-
>>  hw/core/irq.c              |   1 +
>>  hw/i386/kvmvapic.c         |   4 +-
>>  hw/intc/arm_gicv3_cpuif.c  |   3 +
>>  hw/misc/imx6_src.c         |  58 +++++-
>>  hw/ppc/ppc.c               |  16 +-
>>  hw/ppc/spapr.c             |   3 +
>>  include/exec/cputlb.h      |   2 -
>>  include/exec/exec-all.h    | 132 +++++++++++--
>>  include/qom/cpu.h          |  16 ++
>>  include/sysemu/cpus.h      |   2 +
>>  memory.c                   |   2 +
>>  qemu-options.hx            |  20 ++
>>  qom/cpu.c                  |  10 +
>>  target/arm/arm-powerctl.c  | 202 +++++++++++++-------
>>  target/arm/arm-powerctl.h  |   2 +
>>  target/arm/cpu.c           |   4 +-
>>  target/arm/cpu.h           |  18 +-
>>  target/arm/helper.c        | 219 ++++++++++-----------
>>  target/arm/kvm.c           |   7 +-
>>  target/arm/machine.c       |  41 +++-
>>  target/arm/op_helper.c     |  50 ++++-
>>  target/arm/psci.c          |   4 +-
>>  target/arm/translate-a64.c |   8 +-
>>  target/arm/translate.c     |  20 +-
>>  target/i386/smm_helper.c   |   7 +
>>  target/s390x/misc_helper.c |   5 +-
>>  target/sparc/ldst_helper.c |   8 +-
>>  tcg/i386/tcg-target.h      |  11 ++
>>  tcg/tcg-mo.h               |  48 +++++
>>  tcg/tcg.h                  |  27 +--
>>  translate-all.c            |  66 ++-----
>>  translate-common.c         |  21 +-
>>  vl.c                       |  49 ++++-
>>  40 files changed, 1878 insertions(+), 476 deletions(-)
>>  create mode 100644 docs/multi-thread-tcg.txt
>>  create mode 100644 tcg/tcg-mo.h
>>
>>


--
Alex Bennée


* Re: [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement
  2017-02-27 15:48   ` Alex Bennée
@ 2017-02-27 16:17     ` Paolo Bonzini
  0 siblings, 0 replies; 55+ messages in thread
From: Paolo Bonzini @ 2017-02-27 16:17 UTC (permalink / raw)
  To: Alex Bennée; +Cc: peter.maydell, qemu-devel



On 27/02/2017 16:48, Alex Bennée wrote:
> Currently the code does:
> 
>   static bool default_mttcg_enabled(void)
>   {
>       QemuOpts *icount_opts = qemu_find_opts_singleton("icount");
>       const char *rr = qemu_opt_get(icount_opts, "rr");
> 
>       if (rr || TCG_OVERSIZED_GUEST) {
>           return false;
> 
> I suspect I should just fail if any icount options are set.

Yes.

> However
> qemu_find_opts_singleton always returns the structure. How do I test
> for any icount options?

use_icount != 0 if configure_icount has been called already.

>> 2) "-icount auto -accel tcg,thread=single" hangs fairly early, printing
>> this on the serial console.  It's okay if it hangs at
>>
>> [Bds]=============End Load Options Dumping=============
>> [Bds]BdsWait ...Zzzzzzzzzzzz...
>> [Bds]BdsWait(3)..Zzzz...
>>
>> (pressing Enter a few times then seems to unhang it), but it now hangs
>> much earlier than that.
> 
> Hmm, I can see a hang on Linux booting like that. It looks like a vCPU
> gets stuck in waitio and never schedules the others.

I don't know if it's the same. For OVMF it fails in many different ways
and it complains about unexpected exceptions too.

>> Also, x86 "-accel tcg,thread=multi" prints the scary message on memory
>> ordering.
>
> That is expected until x86 is properly tested and we submit the default
> enabling patch for x86 on x86. To be honest we could submit the MO patch
> now to make that warning go away.

Yeah, this is just the message being unnecessarily specific.

Paolo


* Re: [Qemu-devel] [PULL 08/24] tcg: drop global lock during TCG code execution
  2017-02-27 14:39     ` Alex Bennée
@ 2017-03-03 20:59       ` Aaron Lindsay
  2017-03-03 21:08         ` Alex Bennée
  0 siblings, 1 reply; 55+ messages in thread
From: Aaron Lindsay @ 2017-03-03 20:59 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Laurent Desnogues, Jan Kiszka, open list:ARM cores, qemu-devel

On Feb 27 14:39, Alex Bennée wrote:
> 
> Laurent Desnogues <laurent.desnogues@gmail.com> writes:
> 
> > Hello,
> >
> > On Fri, Feb 24, 2017 at 12:20 PM, Alex Bennée <alex.bennee@linaro.org> wrote:
> >> From: Jan Kiszka <jan.kiszka@siemens.com>
> >>
> >> This finally allows TCG to benefit from the iothread introduction: Drop
> >> the global mutex while running pure TCG CPU code. Reacquire the lock
> >> when entering MMIO or PIO emulation, or when leaving the TCG loop.
> >>
> >> We have to revert a few optimizations for the current TCG threading
> >> model, namely kicking the TCG thread in qemu_mutex_lock_iothread and not
> >> kicking it in qemu_cpu_kick. We also need to disable RAM block
> >> reordering until we have a more efficient locking mechanism at hand.
> >>
> >> Still, a Linux x86 UP guest and my Musicpal ARM model boot fine here.
> >> These numbers demonstrate where we gain something:
> >>
> >> 20338 jan       20   0  331m  75m 6904 R   99  0.9   0:50.95 qemu-system-arm
> >> 20337 jan       20   0  331m  75m 6904 S   20  0.9   0:26.50 qemu-system-arm
> >>
> >> The guest CPU was fully loaded, but the iothread could still run mostly
> >> independent on a second core. Without the patch we don't get beyond
> >>
> >> 32206 jan       20   0  330m  73m 7036 R   82  0.9   1:06.00 qemu-system-arm
> >> 32204 jan       20   0  330m  73m 7036 S   21  0.9   0:17.03 qemu-system-arm
> >>
> >> We don't benefit significantly, though, when the guest is not fully
> >> loading a host CPU.
> >
> > I tried this patch (8d04fb55 in the repository) with the following image:
> >
> >    http://wiki.qemu.org/download/arm-test-0.2.tar.gz
> >
> > Running the image with no option works fine.  But specifying '-icount
> > 1' results in a (guest) deadlock. Enabling some heavy logging (-d
> > in_asm,exec) sometimes results in a 'Bad ram offset'.
> >
> > Is it expected that this patch breaks -icount?
> 
> Not really. Using icount will disable MTTCG and run single threaded as
> before. Paolo reported another icount failure so they may be related. I
> shall have a look at it.
> 
> Thanks for the report.

I have not experienced a guest deadlock, but for me this patch makes
booting a simple Linux distribution take about an order of magnitude
longer when using '-icount 0' (from about 1.6 seconds to 17.9). It was
slow enough getting to the first printk after recompiling that I killed
it, thinking it *had* deadlocked.

`perf report` from before this patch (snipped to >1%):
 23.81%  qemu-system-aar  perf-9267.map        [.] 0x0000000041a5cc9e
  7.15%  qemu-system-aar  [kernel.kallsyms]    [k] 0xffffffff8172bc82
  6.29%  qemu-system-aar  qemu-system-aarch64  [.] cpu_exec
  4.99%  qemu-system-aar  qemu-system-aarch64  [.] tcg_gen_code
  4.71%  qemu-system-aar  qemu-system-aarch64  [.] cpu_get_tb_cpu_state
  4.39%  qemu-system-aar  qemu-system-aarch64  [.] tcg_optimize
  3.28%  qemu-system-aar  qemu-system-aarch64  [.] helper_dc_zva
  2.66%  qemu-system-aar  qemu-system-aarch64  [.] liveness_pass_1
  1.98%  qemu-system-aar  qemu-system-aarch64  [.] qht_lookup
  1.93%  qemu-system-aar  qemu-system-aarch64  [.] tcg_out_opc
  1.81%  qemu-system-aar  qemu-system-aarch64  [.] get_phys_addr_lpae
  1.71%  qemu-system-aar  qemu-system-aarch64  [.] object_class_dynamic_cast_assert
  1.38%  qemu-system-aar  qemu-system-aarch64  [.] arm_regime_tbi1
  1.10%  qemu-system-aar  qemu-system-aarch64  [.] arm_regime_tbi0

and after this patch:
 20.10%  qemu-system-aar  perf-3285.map        [.] 0x0000000040a3b690
*18.08%  qemu-system-aar  [kernel.kallsyms]    [k] 0xffffffff81371865
  7.87%  qemu-system-aar  qemu-system-aarch64  [.] cpu_exec
  4.70%  qemu-system-aar  qemu-system-aarch64  [.] cpu_get_tb_cpu_state
* 2.64%  qemu-system-aar  qemu-system-aarch64  [.] g_mutex_get_impl
  2.39%  qemu-system-aar  qemu-system-aarch64  [.] gic_update
* 1.89%  qemu-system-aar  qemu-system-aarch64  [.] pthread_mutex_unlock
  1.61%  qemu-system-aar  qemu-system-aarch64  [.] object_class_dynamic_cast_assert
* 1.55%  qemu-system-aar  qemu-system-aarch64  [.] pthread_mutex_lock
  1.31%  qemu-system-aar  qemu-system-aarch64  [.] get_phys_addr_lpae
  1.21%  qemu-system-aar  qemu-system-aarch64  [.] arm_regime_tbi0
  1.13%  qemu-system-aar  qemu-system-aarch64  [.] arm_regime_tbi1

I've put asterisks by a few suspicious mutex-related functions, though I wonder
if the slowdowns are also partially inlined into some of the other functions.
The kernel also jumps up, presumably from handling more mutexes?

I confess I'm not familiar enough with this code to suggest optimizations, but
I'll be glad to test any.

-Aaron

> 
> >
> > Thanks,
> >
> > Laurent
> >
> > PS - To clarify 791158d9 works.
> >
> >> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> >> Message-Id: <1439220437-23957-10-git-send-email-fred.konrad@greensocs.com>
> >> [FK: Rebase, fix qemu_devices_reset deadlock, rm address_space_* mutex]
> >> Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
> >> [EGC: fixed iothread lock for cpu-exec IRQ handling]
> >> Signed-off-by: Emilio G. Cota <cota@braap.org>
> >> [AJB: -smp single-threaded fix, clean commit msg, BQL fixes]
> >> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> >> Reviewed-by: Richard Henderson <rth@twiddle.net>
> >> Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
> >> [PM: target-arm changes]
> >> Acked-by: Peter Maydell <peter.maydell@linaro.org>
> >> ---
> >>  cpu-exec.c                 | 23 +++++++++++++++++++++--
> >>  cpus.c                     | 28 +++++-----------------------
> >>  cputlb.c                   | 21 ++++++++++++++++++++-
> >>  exec.c                     | 12 +++++++++---
> >>  hw/core/irq.c              |  1 +
> >>  hw/i386/kvmvapic.c         |  4 ++--
> >>  hw/intc/arm_gicv3_cpuif.c  |  3 +++
> >>  hw/ppc/ppc.c               | 16 +++++++++++++++-
> >>  hw/ppc/spapr.c             |  3 +++
> >>  include/qom/cpu.h          |  1 +
> >>  memory.c                   |  2 ++
> >>  qom/cpu.c                  | 10 ++++++++++
> >>  target/arm/helper.c        |  6 ++++++
> >>  target/arm/op_helper.c     | 43 +++++++++++++++++++++++++++++++++++++++----
> >>  target/i386/smm_helper.c   |  7 +++++++
> >>  target/s390x/misc_helper.c |  5 ++++-
> >>  translate-all.c            |  9 +++++++--
> >>  translate-common.c         | 21 +++++++++++----------
> >>  18 files changed, 166 insertions(+), 49 deletions(-)
> >>
> >> diff --git a/cpu-exec.c b/cpu-exec.c
> >> index 06a6b25564..1bd3d72002 100644
> >> --- a/cpu-exec.c
> >> +++ b/cpu-exec.c
> >> @@ -29,6 +29,7 @@
> >>  #include "qemu/rcu.h"
> >>  #include "exec/tb-hash.h"
> >>  #include "exec/log.h"
> >> +#include "qemu/main-loop.h"
> >>  #if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY)
> >>  #include "hw/i386/apic.h"
> >>  #endif
> >> @@ -388,8 +389,10 @@ static inline bool cpu_handle_halt(CPUState *cpu)
> >>          if ((cpu->interrupt_request & CPU_INTERRUPT_POLL)
> >>              && replay_interrupt()) {
> >>              X86CPU *x86_cpu = X86_CPU(cpu);
> >> +            qemu_mutex_lock_iothread();
> >>              apic_poll_irq(x86_cpu->apic_state);
> >>              cpu_reset_interrupt(cpu, CPU_INTERRUPT_POLL);
> >> +            qemu_mutex_unlock_iothread();
> >>          }
> >>  #endif
> >>          if (!cpu_has_work(cpu)) {
> >> @@ -443,7 +446,9 @@ static inline bool cpu_handle_exception(CPUState *cpu, int *ret)
> >>  #else
> >>              if (replay_exception()) {
> >>                  CPUClass *cc = CPU_GET_CLASS(cpu);
> >> +                qemu_mutex_lock_iothread();
> >>                  cc->do_interrupt(cpu);
> >> +                qemu_mutex_unlock_iothread();
> >>                  cpu->exception_index = -1;
> >>              } else if (!replay_has_interrupt()) {
> >>                  /* give a chance to iothread in replay mode */
> >> @@ -469,9 +474,11 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
> >>                                          TranslationBlock **last_tb)
> >>  {
> >>      CPUClass *cc = CPU_GET_CLASS(cpu);
> >> -    int interrupt_request = cpu->interrupt_request;
> >>
> >> -    if (unlikely(interrupt_request)) {
> >> +    if (unlikely(atomic_read(&cpu->interrupt_request))) {
> >> +        int interrupt_request;
> >> +        qemu_mutex_lock_iothread();
> >> +        interrupt_request = cpu->interrupt_request;
> >>          if (unlikely(cpu->singlestep_enabled & SSTEP_NOIRQ)) {
> >>              /* Mask out external interrupts for this step. */
> >>              interrupt_request &= ~CPU_INTERRUPT_SSTEP_MASK;
> >> @@ -479,6 +486,7 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
> >>          if (interrupt_request & CPU_INTERRUPT_DEBUG) {
> >>              cpu->interrupt_request &= ~CPU_INTERRUPT_DEBUG;
> >>              cpu->exception_index = EXCP_DEBUG;
> >> +            qemu_mutex_unlock_iothread();
> >>              return true;
> >>          }
> >>          if (replay_mode == REPLAY_MODE_PLAY && !replay_has_interrupt()) {
> >> @@ -488,6 +496,7 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
> >>              cpu->interrupt_request &= ~CPU_INTERRUPT_HALT;
> >>              cpu->halted = 1;
> >>              cpu->exception_index = EXCP_HLT;
> >> +            qemu_mutex_unlock_iothread();
> >>              return true;
> >>          }
> >>  #if defined(TARGET_I386)
> >> @@ -498,12 +507,14 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
> >>              cpu_svm_check_intercept_param(env, SVM_EXIT_INIT, 0, 0);
> >>              do_cpu_init(x86_cpu);
> >>              cpu->exception_index = EXCP_HALTED;
> >> +            qemu_mutex_unlock_iothread();
> >>              return true;
> >>          }
> >>  #else
> >>          else if (interrupt_request & CPU_INTERRUPT_RESET) {
> >>              replay_interrupt();
> >>              cpu_reset(cpu);
> >> +            qemu_mutex_unlock_iothread();
> >>              return true;
> >>          }
> >>  #endif
> >> @@ -526,7 +537,12 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
> >>                 the program flow was changed */
> >>              *last_tb = NULL;
> >>          }
> >> +
> >> +        /* If we exit via cpu_loop_exit/longjmp it is reset in cpu_exec */
> >> +        qemu_mutex_unlock_iothread();
> >>      }
> >> +
> >> +
> >>      if (unlikely(atomic_read(&cpu->exit_request) || replay_has_interrupt())) {
> >>          atomic_set(&cpu->exit_request, 0);
> >>          cpu->exception_index = EXCP_INTERRUPT;
> >> @@ -643,6 +659,9 @@ int cpu_exec(CPUState *cpu)
> >>  #endif /* buggy compiler */
> >>          cpu->can_do_io = 1;
> >>          tb_lock_reset();
> >> +        if (qemu_mutex_iothread_locked()) {
> >> +            qemu_mutex_unlock_iothread();
> >> +        }
> >>      }
> >>
> >>      /* if an exception is pending, we execute it here */
> >> diff --git a/cpus.c b/cpus.c
> >> index 860034a794..0ae8f69be5 100644
> >> --- a/cpus.c
> >> +++ b/cpus.c
> >> @@ -1027,8 +1027,6 @@ static void qemu_kvm_init_cpu_signals(CPUState *cpu)
> >>  #endif /* _WIN32 */
> >>
> >>  static QemuMutex qemu_global_mutex;
> >> -static QemuCond qemu_io_proceeded_cond;
> >> -static unsigned iothread_requesting_mutex;
> >>
> >>  static QemuThread io_thread;
> >>
> >> @@ -1042,7 +1040,6 @@ void qemu_init_cpu_loop(void)
> >>      qemu_init_sigbus();
> >>      qemu_cond_init(&qemu_cpu_cond);
> >>      qemu_cond_init(&qemu_pause_cond);
> >> -    qemu_cond_init(&qemu_io_proceeded_cond);
> >>      qemu_mutex_init(&qemu_global_mutex);
> >>
> >>      qemu_thread_get_self(&io_thread);
> >> @@ -1085,10 +1082,6 @@ static void qemu_tcg_wait_io_event(CPUState *cpu)
> >>
> >>      start_tcg_kick_timer();
> >>
> >> -    while (iothread_requesting_mutex) {
> >> -        qemu_cond_wait(&qemu_io_proceeded_cond, &qemu_global_mutex);
> >> -    }
> >> -
> >>      CPU_FOREACH(cpu) {
> >>          qemu_wait_io_event_common(cpu);
> >>      }
> >> @@ -1249,9 +1242,11 @@ static int tcg_cpu_exec(CPUState *cpu)
> >>          cpu->icount_decr.u16.low = decr;
> >>          cpu->icount_extra = count;
> >>      }
> >> +    qemu_mutex_unlock_iothread();
> >>      cpu_exec_start(cpu);
> >>      ret = cpu_exec(cpu);
> >>      cpu_exec_end(cpu);
> >> +    qemu_mutex_lock_iothread();
> >>  #ifdef CONFIG_PROFILER
> >>      tcg_time += profile_getclock() - ti;
> >>  #endif
> >> @@ -1479,27 +1474,14 @@ bool qemu_mutex_iothread_locked(void)
> >>
> >>  void qemu_mutex_lock_iothread(void)
> >>  {
> >> -    atomic_inc(&iothread_requesting_mutex);
> >> -    /* In the simple case there is no need to bump the VCPU thread out of
> >> -     * TCG code execution.
> >> -     */
> >> -    if (!tcg_enabled() || qemu_in_vcpu_thread() ||
> >> -        !first_cpu || !first_cpu->created) {
> >> -        qemu_mutex_lock(&qemu_global_mutex);
> >> -        atomic_dec(&iothread_requesting_mutex);
> >> -    } else {
> >> -        if (qemu_mutex_trylock(&qemu_global_mutex)) {
> >> -            qemu_cpu_kick_rr_cpu();
> >> -            qemu_mutex_lock(&qemu_global_mutex);
> >> -        }
> >> -        atomic_dec(&iothread_requesting_mutex);
> >> -        qemu_cond_broadcast(&qemu_io_proceeded_cond);
> >> -    }
> >> +    g_assert(!qemu_mutex_iothread_locked());
> >> +    qemu_mutex_lock(&qemu_global_mutex);
> >>      iothread_locked = true;
> >>  }
> >>
> >>  void qemu_mutex_unlock_iothread(void)
> >>  {
> >> +    g_assert(qemu_mutex_iothread_locked());
> >>      iothread_locked = false;
> >>      qemu_mutex_unlock(&qemu_global_mutex);
> >>  }
> >> diff --git a/cputlb.c b/cputlb.c
> >> index 6c39927455..1cc9d9da51 100644
> >> --- a/cputlb.c
> >> +++ b/cputlb.c
> >> @@ -18,6 +18,7 @@
> >>   */
> >>
> >>  #include "qemu/osdep.h"
> >> +#include "qemu/main-loop.h"
> >>  #include "cpu.h"
> >>  #include "exec/exec-all.h"
> >>  #include "exec/memory.h"
> >> @@ -495,6 +496,7 @@ static uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
> >>      hwaddr physaddr = iotlbentry->addr;
> >>      MemoryRegion *mr = iotlb_to_region(cpu, physaddr, iotlbentry->attrs);
> >>      uint64_t val;
> >> +    bool locked = false;
> >>
> >>      physaddr = (physaddr & TARGET_PAGE_MASK) + addr;
> >>      cpu->mem_io_pc = retaddr;
> >> @@ -503,7 +505,16 @@ static uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
> >>      }
> >>
> >>      cpu->mem_io_vaddr = addr;
> >> +
> >> +    if (mr->global_locking) {
> >> +        qemu_mutex_lock_iothread();
> >> +        locked = true;
> >> +    }
> >>      memory_region_dispatch_read(mr, physaddr, &val, size, iotlbentry->attrs);
> >> +    if (locked) {
> >> +        qemu_mutex_unlock_iothread();
> >> +    }
> >> +
> >>      return val;
> >>  }
> >>
> >> @@ -514,15 +525,23 @@ static void io_writex(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
> >>      CPUState *cpu = ENV_GET_CPU(env);
> >>      hwaddr physaddr = iotlbentry->addr;
> >>      MemoryRegion *mr = iotlb_to_region(cpu, physaddr, iotlbentry->attrs);
> >> +    bool locked = false;
> >>
> >>      physaddr = (physaddr & TARGET_PAGE_MASK) + addr;
> >>      if (mr != &io_mem_rom && mr != &io_mem_notdirty && !cpu->can_do_io) {
> >>          cpu_io_recompile(cpu, retaddr);
> >>      }
> >> -
> >>      cpu->mem_io_vaddr = addr;
> >>      cpu->mem_io_pc = retaddr;
> >> +
> >> +    if (mr->global_locking) {
> >> +        qemu_mutex_lock_iothread();
> >> +        locked = true;
> >> +    }
> >>      memory_region_dispatch_write(mr, physaddr, val, size, iotlbentry->attrs);
> >> +    if (locked) {
> >> +        qemu_mutex_unlock_iothread();
> >> +    }
> >>  }
> >>
> >>  /* Return true if ADDR is present in the victim tlb, and has been copied
> >> diff --git a/exec.c b/exec.c
> >> index 865a1e8295..3adf2b1861 100644
> >> --- a/exec.c
> >> +++ b/exec.c
> >> @@ -2134,9 +2134,9 @@ static void check_watchpoint(int offset, int len, MemTxAttrs attrs, int flags)
> >>                  }
> >>                  cpu->watchpoint_hit = wp;
> >>
> >> -                /* The tb_lock will be reset when cpu_loop_exit or
> >> -                 * cpu_loop_exit_noexc longjmp back into the cpu_exec
> >> -                 * main loop.
> >> +                /* Both tb_lock and iothread_mutex will be reset when
> >> +                 * cpu_loop_exit or cpu_loop_exit_noexc longjmp
> >> +                 * back into the cpu_exec main loop.
> >>                   */
> >>                  tb_lock();
> >>                  tb_check_watchpoint(cpu);
> >> @@ -2371,8 +2371,14 @@ static void io_mem_init(void)
> >>      memory_region_init_io(&io_mem_rom, NULL, &unassigned_mem_ops, NULL, NULL, UINT64_MAX);
> >>      memory_region_init_io(&io_mem_unassigned, NULL, &unassigned_mem_ops, NULL,
> >>                            NULL, UINT64_MAX);
> >> +
> >> +    /* io_mem_notdirty calls tb_invalidate_phys_page_fast,
> >> +     * which can be called without the iothread mutex.
> >> +     */
> >>      memory_region_init_io(&io_mem_notdirty, NULL, &notdirty_mem_ops, NULL,
> >>                            NULL, UINT64_MAX);
> >> +    memory_region_clear_global_locking(&io_mem_notdirty);
> >> +
> >>      memory_region_init_io(&io_mem_watch, NULL, &watch_mem_ops, NULL,
> >>                            NULL, UINT64_MAX);
> >>  }
> >> diff --git a/hw/core/irq.c b/hw/core/irq.c
> >> index 49ff2e64fe..b98d1d69f5 100644
> >> --- a/hw/core/irq.c
> >> +++ b/hw/core/irq.c
> >> @@ -22,6 +22,7 @@
> >>   * THE SOFTWARE.
> >>   */
> >>  #include "qemu/osdep.h"
> >> +#include "qemu/main-loop.h"
> >>  #include "qemu-common.h"
> >>  #include "hw/irq.h"
> >>  #include "qom/object.h"
> >> diff --git a/hw/i386/kvmvapic.c b/hw/i386/kvmvapic.c
> >> index 7135633863..82a49556af 100644
> >> --- a/hw/i386/kvmvapic.c
> >> +++ b/hw/i386/kvmvapic.c
> >> @@ -457,8 +457,8 @@ static void patch_instruction(VAPICROMState *s, X86CPU *cpu, target_ulong ip)
> >>      resume_all_vcpus();
> >>
> >>      if (!kvm_enabled()) {
> >> -        /* tb_lock will be reset when cpu_loop_exit_noexc longjmps
> >> -         * back into the cpu_exec loop. */
> >> +        /* Both tb_lock and iothread_mutex will be reset when
> >> +         *  longjmps back into the cpu_exec loop. */
> >>          tb_lock();
> >>          tb_gen_code(cs, current_pc, current_cs_base, current_flags, 1);
> >>          cpu_loop_exit_noexc(cs);
> >> diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
> >> index c25ee03556..f775aba507 100644
> >> --- a/hw/intc/arm_gicv3_cpuif.c
> >> +++ b/hw/intc/arm_gicv3_cpuif.c
> >> @@ -14,6 +14,7 @@
> >>
> >>  #include "qemu/osdep.h"
> >>  #include "qemu/bitops.h"
> >> +#include "qemu/main-loop.h"
> >>  #include "trace.h"
> >>  #include "gicv3_internal.h"
> >>  #include "cpu.h"
> >> @@ -733,6 +734,8 @@ void gicv3_cpuif_update(GICv3CPUState *cs)
> >>      ARMCPU *cpu = ARM_CPU(cs->cpu);
> >>      CPUARMState *env = &cpu->env;
> >>
> >> +    g_assert(qemu_mutex_iothread_locked());
> >> +
> >>      trace_gicv3_cpuif_update(gicv3_redist_affid(cs), cs->hppi.irq,
> >>                               cs->hppi.grp, cs->hppi.prio);
> >>
> >> diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
> >> index d171e60b5c..5f93083d4a 100644
> >> --- a/hw/ppc/ppc.c
> >> +++ b/hw/ppc/ppc.c
> >> @@ -62,7 +62,16 @@ void ppc_set_irq(PowerPCCPU *cpu, int n_IRQ, int level)
> >>  {
> >>      CPUState *cs = CPU(cpu);
> >>      CPUPPCState *env = &cpu->env;
> >> -    unsigned int old_pending = env->pending_interrupts;
> >> +    unsigned int old_pending;
> >> +    bool locked = false;
> >> +
> >> +    /* We may already have the BQL if coming from the reset path */
> >> +    if (!qemu_mutex_iothread_locked()) {
> >> +        locked = true;
> >> +        qemu_mutex_lock_iothread();
> >> +    }
> >> +
> >> +    old_pending = env->pending_interrupts;
> >>
> >>      if (level) {
> >>          env->pending_interrupts |= 1 << n_IRQ;
> >> @@ -80,9 +89,14 @@ void ppc_set_irq(PowerPCCPU *cpu, int n_IRQ, int level)
> >>  #endif
> >>      }
> >>
> >> +
> >>      LOG_IRQ("%s: %p n_IRQ %d level %d => pending %08" PRIx32
> >>                  "req %08x\n", __func__, env, n_IRQ, level,
> >>                  env->pending_interrupts, CPU(cpu)->interrupt_request);
> >> +
> >> +    if (locked) {
> >> +        qemu_mutex_unlock_iothread();
> >> +    }
> >>  }
> >>
> >>  /* PowerPC 6xx / 7xx internal IRQ controller */
> >> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >> index e465d7ac98..b1e374f3f9 100644
> >> --- a/hw/ppc/spapr.c
> >> +++ b/hw/ppc/spapr.c
> >> @@ -1010,6 +1010,9 @@ static void emulate_spapr_hypercall(PPCVirtualHypervisor *vhyp,
> >>  {
> >>      CPUPPCState *env = &cpu->env;
> >>
> >> +    /* The TCG path should also be holding the BQL at this point */
> >> +    g_assert(qemu_mutex_iothread_locked());
> >> +
> >>      if (msr_pr) {
> >>          hcall_dprintf("Hypercall made with MSR[PR]=1\n");
> >>          env->gpr[3] = H_PRIVILEGE;
> >> diff --git a/include/qom/cpu.h b/include/qom/cpu.h
> >> index 2cf4ecf144..10db89b16a 100644
> >> --- a/include/qom/cpu.h
> >> +++ b/include/qom/cpu.h
> >> @@ -329,6 +329,7 @@ struct CPUState {
> >>      bool unplug;
> >>      bool crash_occurred;
> >>      bool exit_request;
> >> +    /* updates protected by BQL */
> >>      uint32_t interrupt_request;
> >>      int singlestep_enabled;
> >>      int64_t icount_extra;
> >> diff --git a/memory.c b/memory.c
> >> index ed8b5aa83e..d61caee867 100644
> >> --- a/memory.c
> >> +++ b/memory.c
> >> @@ -917,6 +917,8 @@ void memory_region_transaction_commit(void)
> >>      AddressSpace *as;
> >>
> >>      assert(memory_region_transaction_depth);
> >> +    assert(qemu_mutex_iothread_locked());
> >> +
> >>      --memory_region_transaction_depth;
> >>      if (!memory_region_transaction_depth) {
> >>          if (memory_region_update_pending) {
> >> diff --git a/qom/cpu.c b/qom/cpu.c
> >> index ed87c50cea..58784bcbea 100644
> >> --- a/qom/cpu.c
> >> +++ b/qom/cpu.c
> >> @@ -113,9 +113,19 @@ static void cpu_common_get_memory_mapping(CPUState *cpu,
> >>      error_setg(errp, "Obtaining memory mappings is unsupported on this CPU.");
> >>  }
> >>
> >> +/* Resetting the IRQ comes from across the code base so we take the
> >> + * BQL here if we need to.  cpu_interrupt assumes it is held.*/
> >>  void cpu_reset_interrupt(CPUState *cpu, int mask)
> >>  {
> >> +    bool need_lock = !qemu_mutex_iothread_locked();
> >> +
> >> +    if (need_lock) {
> >> +        qemu_mutex_lock_iothread();
> >> +    }
> >>      cpu->interrupt_request &= ~mask;
> >> +    if (need_lock) {
> >> +        qemu_mutex_unlock_iothread();
> >> +    }
> >>  }
> >>
> >>  void cpu_exit(CPUState *cpu)
> >> diff --git a/target/arm/helper.c b/target/arm/helper.c
> >> index 47250bcf16..753a69d40d 100644
> >> --- a/target/arm/helper.c
> >> +++ b/target/arm/helper.c
> >> @@ -6769,6 +6769,12 @@ void arm_cpu_do_interrupt(CPUState *cs)
> >>          arm_cpu_do_interrupt_aarch32(cs);
> >>      }
> >>
> >> +    /* Hooks may change global state so BQL should be held, also the
> >> +     * BQL needs to be held for any modification of
> >> +     * cs->interrupt_request.
> >> +     */
> >> +    g_assert(qemu_mutex_iothread_locked());
> >> +
> >>      arm_call_el_change_hook(cpu);
> >>
> >>      if (!kvm_enabled()) {
> >> diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
> >> index fb366fdc35..5f3e3bdae2 100644
> >> --- a/target/arm/op_helper.c
> >> +++ b/target/arm/op_helper.c
> >> @@ -18,6 +18,7 @@
> >>   */
> >>  #include "qemu/osdep.h"
> >>  #include "qemu/log.h"
> >> +#include "qemu/main-loop.h"
> >>  #include "cpu.h"
> >>  #include "exec/helper-proto.h"
> >>  #include "internals.h"
> >> @@ -487,7 +488,9 @@ void HELPER(cpsr_write_eret)(CPUARMState *env, uint32_t val)
> >>       */
> >>      env->regs[15] &= (env->thumb ? ~1 : ~3);
> >>
> >> +    qemu_mutex_lock_iothread();
> >>      arm_call_el_change_hook(arm_env_get_cpu(env));
> >> +    qemu_mutex_unlock_iothread();
> >>  }
> >>
> >>  /* Access to user mode registers from privileged modes.  */
> >> @@ -735,28 +738,58 @@ void HELPER(set_cp_reg)(CPUARMState *env, void *rip, uint32_t value)
> >>  {
> >>      const ARMCPRegInfo *ri = rip;
> >>
> >> -    ri->writefn(env, ri, value);
> >> +    if (ri->type & ARM_CP_IO) {
> >> +        qemu_mutex_lock_iothread();
> >> +        ri->writefn(env, ri, value);
> >> +        qemu_mutex_unlock_iothread();
> >> +    } else {
> >> +        ri->writefn(env, ri, value);
> >> +    }
> >>  }
> >>
> >>  uint32_t HELPER(get_cp_reg)(CPUARMState *env, void *rip)
> >>  {
> >>      const ARMCPRegInfo *ri = rip;
> >> +    uint32_t res;
> >>
> >> -    return ri->readfn(env, ri);
> >> +    if (ri->type & ARM_CP_IO) {
> >> +        qemu_mutex_lock_iothread();
> >> +        res = ri->readfn(env, ri);
> >> +        qemu_mutex_unlock_iothread();
> >> +    } else {
> >> +        res = ri->readfn(env, ri);
> >> +    }
> >> +
> >> +    return res;
> >>  }
> >>
> >>  void HELPER(set_cp_reg64)(CPUARMState *env, void *rip, uint64_t value)
> >>  {
> >>      const ARMCPRegInfo *ri = rip;
> >>
> >> -    ri->writefn(env, ri, value);
> >> +    if (ri->type & ARM_CP_IO) {
> >> +        qemu_mutex_lock_iothread();
> >> +        ri->writefn(env, ri, value);
> >> +        qemu_mutex_unlock_iothread();
> >> +    } else {
> >> +        ri->writefn(env, ri, value);
> >> +    }
> >>  }
> >>
> >>  uint64_t HELPER(get_cp_reg64)(CPUARMState *env, void *rip)
> >>  {
> >>      const ARMCPRegInfo *ri = rip;
> >> +    uint64_t res;
> >> +
> >> +    if (ri->type & ARM_CP_IO) {
> >> +        qemu_mutex_lock_iothread();
> >> +        res = ri->readfn(env, ri);
> >> +        qemu_mutex_unlock_iothread();
> >> +    } else {
> >> +        res = ri->readfn(env, ri);
> >> +    }
> >>
> >> -    return ri->readfn(env, ri);
> >> +    return res;
> >>  }
> >>
> >>  void HELPER(msr_i_pstate)(CPUARMState *env, uint32_t op, uint32_t imm)
> >> @@ -989,7 +1022,9 @@ void HELPER(exception_return)(CPUARMState *env)
> >>                        cur_el, new_el, env->pc);
> >>      }
> >>
> >> +    qemu_mutex_lock_iothread();
> >>      arm_call_el_change_hook(arm_env_get_cpu(env));
> >> +    qemu_mutex_unlock_iothread();
> >>
> >>      return;
> >>
> >> diff --git a/target/i386/smm_helper.c b/target/i386/smm_helper.c
> >> index 4dd6a2c544..f051a77c4a 100644
> >> --- a/target/i386/smm_helper.c
> >> +++ b/target/i386/smm_helper.c
> >> @@ -18,6 +18,7 @@
> >>   */
> >>
> >>  #include "qemu/osdep.h"
> >> +#include "qemu/main-loop.h"
> >>  #include "cpu.h"
> >>  #include "exec/helper-proto.h"
> >>  #include "exec/log.h"
> >> @@ -42,11 +43,14 @@ void helper_rsm(CPUX86State *env)
> >>  #define SMM_REVISION_ID 0x00020000
> >>  #endif
> >>
> >> +/* Called with iothread lock taken */
> >>  void cpu_smm_update(X86CPU *cpu)
> >>  {
> >>      CPUX86State *env = &cpu->env;
> >>      bool smm_enabled = (env->hflags & HF_SMM_MASK);
> >>
> >> +    g_assert(qemu_mutex_iothread_locked());
> >> +
> >>      if (cpu->smram) {
> >>          memory_region_set_enabled(cpu->smram, smm_enabled);
> >>      }
> >> @@ -333,7 +337,10 @@ void helper_rsm(CPUX86State *env)
> >>      }
> >>      env->hflags2 &= ~HF2_SMM_INSIDE_NMI_MASK;
> >>      env->hflags &= ~HF_SMM_MASK;
> >> +
> >> +    qemu_mutex_lock_iothread();
> >>      cpu_smm_update(cpu);
> >> +    qemu_mutex_unlock_iothread();
> >>
> >>      qemu_log_mask(CPU_LOG_INT, "SMM: after RSM\n");
> >>      log_cpu_state_mask(CPU_LOG_INT, CPU(cpu), CPU_DUMP_CCOP);
> >> diff --git a/target/s390x/misc_helper.c b/target/s390x/misc_helper.c
> >> index c9604ea9c7..3cb942e8bb 100644
> >> --- a/target/s390x/misc_helper.c
> >> +++ b/target/s390x/misc_helper.c
> >> @@ -25,6 +25,7 @@
> >>  #include "exec/helper-proto.h"
> >>  #include "sysemu/kvm.h"
> >>  #include "qemu/timer.h"
> >> +#include "qemu/main-loop.h"
> >>  #include "exec/address-spaces.h"
> >>  #ifdef CONFIG_KVM
> >>  #include <linux/kvm.h>
> >> @@ -109,11 +110,13 @@ void program_interrupt(CPUS390XState *env, uint32_t code, int ilen)
> >>  /* SCLP service call */
> >>  uint32_t HELPER(servc)(CPUS390XState *env, uint64_t r1, uint64_t r2)
> >>  {
> >> +    qemu_mutex_lock_iothread();
> >>      int r = sclp_service_call(env, r1, r2);
> >>      if (r < 0) {
> >>          program_interrupt(env, -r, 4);
> >> -        return 0;
> >> +        r = 0;
> >>      }
> >> +    qemu_mutex_unlock_iothread();
> >>      return r;
> >>  }
> >>
> >> diff --git a/translate-all.c b/translate-all.c
> >> index 8a861cb583..f810259c41 100644
> >> --- a/translate-all.c
> >> +++ b/translate-all.c
> >> @@ -55,6 +55,7 @@
> >>  #include "translate-all.h"
> >>  #include "qemu/bitmap.h"
> >>  #include "qemu/timer.h"
> >> +#include "qemu/main-loop.h"
> >>  #include "exec/log.h"
> >>
> >>  /* #define DEBUG_TB_INVALIDATE */
> >> @@ -1523,7 +1524,7 @@ void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end,
> >>  #ifdef CONFIG_SOFTMMU
> >>  /* len must be <= 8 and start must be a multiple of len.
> >>   * Called via softmmu_template.h when code areas are written to with
> >> - * tb_lock held.
> >> + * iothread mutex not held.
> >>   */
> >>  void tb_invalidate_phys_page_fast(tb_page_addr_t start, int len)
> >>  {
> >> @@ -1725,7 +1726,10 @@ void tb_check_watchpoint(CPUState *cpu)
> >>
> >>  #ifndef CONFIG_USER_ONLY
> >>  /* in deterministic execution mode, instructions doing device I/Os
> >> -   must be at the end of the TB */
> >> + * must be at the end of the TB.
> >> + *
> >> + * Called by softmmu_template.h, with iothread mutex not held.
> >> + */
> >>  void cpu_io_recompile(CPUState *cpu, uintptr_t retaddr)
> >>  {
> >>  #if defined(TARGET_MIPS) || defined(TARGET_SH4)
> >> @@ -1937,6 +1941,7 @@ void dump_opcount_info(FILE *f, fprintf_function cpu_fprintf)
> >>
> >>  void cpu_interrupt(CPUState *cpu, int mask)
> >>  {
> >> +    g_assert(qemu_mutex_iothread_locked());
> >>      cpu->interrupt_request |= mask;
> >>      cpu->tcg_exit_req = 1;
> >>  }
> >> diff --git a/translate-common.c b/translate-common.c
> >> index 5e989cdf70..d504dd0d33 100644
> >> --- a/translate-common.c
> >> +++ b/translate-common.c
> >> @@ -21,6 +21,7 @@
> >>  #include "qemu-common.h"
> >>  #include "qom/cpu.h"
> >>  #include "sysemu/cpus.h"
> >> +#include "qemu/main-loop.h"
> >>
> >>  uintptr_t qemu_real_host_page_size;
> >>  intptr_t qemu_real_host_page_mask;
> >> @@ -30,6 +31,7 @@ intptr_t qemu_real_host_page_mask;
> >>  static void tcg_handle_interrupt(CPUState *cpu, int mask)
> >>  {
> >>      int old_mask;
> >> +    g_assert(qemu_mutex_iothread_locked());
> >>
> >>      old_mask = cpu->interrupt_request;
> >>      cpu->interrupt_request |= mask;
> >> @@ -40,17 +42,16 @@ static void tcg_handle_interrupt(CPUState *cpu, int mask)
> >>       */
> >>      if (!qemu_cpu_is_self(cpu)) {
> >>          qemu_cpu_kick(cpu);
> >> -        return;
> >> -    }
> >> -
> >> -    if (use_icount) {
> >> -        cpu->icount_decr.u16.high = 0xffff;
> >> -        if (!cpu->can_do_io
> >> -            && (mask & ~old_mask) != 0) {
> >> -            cpu_abort(cpu, "Raised interrupt while not in I/O function");
> >> -        }
> >>      } else {
> >> -        cpu->tcg_exit_req = 1;
> >> +        if (use_icount) {
> >> +            cpu->icount_decr.u16.high = 0xffff;
> >> +            if (!cpu->can_do_io
> >> +                && (mask & ~old_mask) != 0) {
> >> +                cpu_abort(cpu, "Raised interrupt while not in I/O function");
> >> +            }
> >> +        } else {
> >> +            cpu->tcg_exit_req = 1;
> >> +        }
> >>      }
> >>  }
> >>
> >> --
> >> 2.11.0
> >>
> >>
> 
> 
> --
> Alex Bennée
> 

-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [Qemu-devel] [PULL 08/24] tcg: drop global lock during TCG code execution
  2017-03-03 20:59       ` Aaron Lindsay
@ 2017-03-03 21:08         ` Alex Bennée
  0 siblings, 0 replies; 55+ messages in thread
From: Alex Bennée @ 2017-03-03 21:08 UTC (permalink / raw)
  To: Aaron Lindsay
  Cc: Laurent Desnogues, Jan Kiszka, open list:ARM cores, qemu-devel


Aaron Lindsay <alindsay@codeaurora.org> writes:

> On Feb 27 14:39, Alex Bennée wrote:
>>
>> Laurent Desnogues <laurent.desnogues@gmail.com> writes:
>>
>> > Hello,
>> >
>> > On Fri, Feb 24, 2017 at 12:20 PM, Alex Bennée <alex.bennee@linaro.org> wrote:
>> >> From: Jan Kiszka <jan.kiszka@siemens.com>
>> >>
>> >> This finally allows TCG to benefit from the iothread introduction: Drop
>> >> the global mutex while running pure TCG CPU code. Reacquire the lock
>> >> when entering MMIO or PIO emulation, or when leaving the TCG loop.
>> >>
>> >> We have to revert a few optimizations for the current TCG threading
>> >> model, namely kicking the TCG thread in qemu_mutex_lock_iothread and not
>> >> kicking it in qemu_cpu_kick. We also need to disable RAM block
>> >> reordering until we have a more efficient locking mechanism at hand.
>> >>
>> >> Still, a Linux x86 UP guest and my Musicpal ARM model boot fine here.
>> >> These numbers demonstrate where we gain something:
>> >>
>> >> 20338 jan       20   0  331m  75m 6904 R   99  0.9   0:50.95 qemu-system-arm
>> >> 20337 jan       20   0  331m  75m 6904 S   20  0.9   0:26.50 qemu-system-arm
>> >>
>> >> The guest CPU was fully loaded, but the iothread could still run mostly
>> >> independently on a second core. Without the patch we don't get beyond
>> >>
>> >> 32206 jan       20   0  330m  73m 7036 R   82  0.9   1:06.00 qemu-system-arm
>> >> 32204 jan       20   0  330m  73m 7036 S   21  0.9   0:17.03 qemu-system-arm
>> >>
>> >> We don't benefit significantly, though, when the guest is not fully
>> >> loading a host CPU.
>> >
>> > I tried this patch (8d04fb55 in the repository) with the following image:
>> >
>> >    http://wiki.qemu.org/download/arm-test-0.2.tar.gz
>> >
>> > Running the image with no option works fine.  But specifying '-icount
>> > 1' results in a (guest) deadlock. Enabling some heavy logging (-d
>> > in_asm,exec) sometimes results in a 'Bad ram offset'.
>> >
>> > Is it expected that this patch breaks -icount?
>>
>> Not really. Using icount will disable MTTCG and run single-threaded as
>> before. Paolo reported another icount failure, so they may be related. I
>> shall have a look at it.
>>
>> Thanks for the report.
>
> I have not experienced a guest deadlock, but for me this patch makes
> booting a simple Linux distribution take about an order of magnitude
> longer when using '-icount 0' (from about 1.6 seconds to 17.9). It was
> slow enough to reach the first printk after recompiling that I killed
> it, thinking it *had* deadlocked.
>
> `perf report` from before this patch (snipped to >1%):
>  23.81%  qemu-system-aar  perf-9267.map        [.] 0x0000000041a5cc9e
>   7.15%  qemu-system-aar  [kernel.kallsyms]    [k] 0xffffffff8172bc82
>   6.29%  qemu-system-aar  qemu-system-aarch64  [.] cpu_exec
>   4.99%  qemu-system-aar  qemu-system-aarch64  [.] tcg_gen_code
>   4.71%  qemu-system-aar  qemu-system-aarch64  [.] cpu_get_tb_cpu_state
>   4.39%  qemu-system-aar  qemu-system-aarch64  [.] tcg_optimize
>   3.28%  qemu-system-aar  qemu-system-aarch64  [.] helper_dc_zva
>   2.66%  qemu-system-aar  qemu-system-aarch64  [.] liveness_pass_1
>   1.98%  qemu-system-aar  qemu-system-aarch64  [.] qht_lookup
>   1.93%  qemu-system-aar  qemu-system-aarch64  [.] tcg_out_opc
>   1.81%  qemu-system-aar  qemu-system-aarch64  [.] get_phys_addr_lpae
>   1.71%  qemu-system-aar  qemu-system-aarch64  [.] object_class_dynamic_cast_assert
>   1.38%  qemu-system-aar  qemu-system-aarch64  [.] arm_regime_tbi1
>   1.10%  qemu-system-aar  qemu-system-aarch64  [.] arm_regime_tbi0
>
> and after this patch:
>  20.10%  qemu-system-aar  perf-3285.map        [.] 0x0000000040a3b690
> *18.08%  qemu-system-aar  [kernel.kallsyms]    [k] 0xffffffff81371865
>   7.87%  qemu-system-aar  qemu-system-aarch64  [.] cpu_exec
>   4.70%  qemu-system-aar  qemu-system-aarch64  [.] cpu_get_tb_cpu_state
> * 2.64%  qemu-system-aar  qemu-system-aarch64  [.] g_mutex_get_impl
>   2.39%  qemu-system-aar  qemu-system-aarch64  [.] gic_update
> * 1.89%  qemu-system-aar  qemu-system-aarch64  [.] pthread_mutex_unlock
>   1.61%  qemu-system-aar  qemu-system-aarch64  [.] object_class_dynamic_cast_assert
> * 1.55%  qemu-system-aar  qemu-system-aarch64  [.] pthread_mutex_lock
>   1.31%  qemu-system-aar  qemu-system-aarch64  [.] get_phys_addr_lpae
>   1.21%  qemu-system-aar  qemu-system-aarch64  [.] arm_regime_tbi0
>   1.13%  qemu-system-aar  qemu-system-aarch64  [.] arm_regime_tbi1
>
> I've put asterisks by a few suspicious mutex-related functions, though I wonder
> whether some of the slowdown is also inlined into the other functions.
> Kernel time also jumps up, presumably from handling more mutex traffic?
>
> I confess I'm not familiar enough with this code to suggest optimizations, but
> I'll be glad to test any.

Please see the series Paolo posted:

  Subject: [PATCH 0/6] tcg: fix icount super slowdown
  Date: Fri,  3 Mar 2017 14:11:08 +0100
  Message-Id: <20170303131113.25898-1-pbonzini@redhat.com>

>
> -Aaron
>
>>
>> >
>> > Thanks,
>> >
>> > Laurent
>> >
>> > PS - To clarify, 791158d9 works.
>> >
>> >> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>> >> Message-Id: <1439220437-23957-10-git-send-email-fred.konrad@greensocs.com>
>> >> [FK: Rebase, fix qemu_devices_reset deadlock, rm address_space_* mutex]
>> >> Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
>> >> [EGC: fixed iothread lock for cpu-exec IRQ handling]
>> >> Signed-off-by: Emilio G. Cota <cota@braap.org>
>> >> [AJB: -smp single-threaded fix, clean commit msg, BQL fixes]
>> >> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> >> Reviewed-by: Richard Henderson <rth@twiddle.net>
>> >> Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
>> >> [PM: target-arm changes]
>> >> Acked-by: Peter Maydell <peter.maydell@linaro.org>
>> >> ---
>> >>  cpu-exec.c                 | 23 +++++++++++++++++++++--
>> >>  cpus.c                     | 28 +++++-----------------------
>> >>  cputlb.c                   | 21 ++++++++++++++++++++-
>> >>  exec.c                     | 12 +++++++++---
>> >>  hw/core/irq.c              |  1 +
>> >>  hw/i386/kvmvapic.c         |  4 ++--
>> >>  hw/intc/arm_gicv3_cpuif.c  |  3 +++
>> >>  hw/ppc/ppc.c               | 16 +++++++++++++++-
>> >>  hw/ppc/spapr.c             |  3 +++
>> >>  include/qom/cpu.h          |  1 +
>> >>  memory.c                   |  2 ++
>> >>  qom/cpu.c                  | 10 ++++++++++
>> >>  target/arm/helper.c        |  6 ++++++
>> >>  target/arm/op_helper.c     | 43 +++++++++++++++++++++++++++++++++++++++----
>> >>  target/i386/smm_helper.c   |  7 +++++++
>> >>  target/s390x/misc_helper.c |  5 ++++-
>> >>  translate-all.c            |  9 +++++++--
>> >>  translate-common.c         | 21 +++++++++++----------
>> >>  18 files changed, 166 insertions(+), 49 deletions(-)
>> >>
>> >> diff --git a/cpu-exec.c b/cpu-exec.c
>> >> index 06a6b25564..1bd3d72002 100644
>> >> --- a/cpu-exec.c
>> >> +++ b/cpu-exec.c
>> >> @@ -29,6 +29,7 @@
>> >>  #include "qemu/rcu.h"
>> >>  #include "exec/tb-hash.h"
>> >>  #include "exec/log.h"
>> >> +#include "qemu/main-loop.h"
>> >>  #if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY)
>> >>  #include "hw/i386/apic.h"
>> >>  #endif
>> >> @@ -388,8 +389,10 @@ static inline bool cpu_handle_halt(CPUState *cpu)
>> >>          if ((cpu->interrupt_request & CPU_INTERRUPT_POLL)
>> >>              && replay_interrupt()) {
>> >>              X86CPU *x86_cpu = X86_CPU(cpu);
>> >> +            qemu_mutex_lock_iothread();
>> >>              apic_poll_irq(x86_cpu->apic_state);
>> >>              cpu_reset_interrupt(cpu, CPU_INTERRUPT_POLL);
>> >> +            qemu_mutex_unlock_iothread();
>> >>          }
>> >>  #endif
>> >>          if (!cpu_has_work(cpu)) {
>> >> @@ -443,7 +446,9 @@ static inline bool cpu_handle_exception(CPUState *cpu, int *ret)
>> >>  #else
>> >>              if (replay_exception()) {
>> >>                  CPUClass *cc = CPU_GET_CLASS(cpu);
>> >> +                qemu_mutex_lock_iothread();
>> >>                  cc->do_interrupt(cpu);
>> >> +                qemu_mutex_unlock_iothread();
>> >>                  cpu->exception_index = -1;
>> >>              } else if (!replay_has_interrupt()) {
>> >>                  /* give a chance to iothread in replay mode */
>> >> @@ -469,9 +474,11 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
>> >>                                          TranslationBlock **last_tb)
>> >>  {
>> >>      CPUClass *cc = CPU_GET_CLASS(cpu);
>> >> -    int interrupt_request = cpu->interrupt_request;
>> >>
>> >> -    if (unlikely(interrupt_request)) {
>> >> +    if (unlikely(atomic_read(&cpu->interrupt_request))) {
>> >> +        int interrupt_request;
>> >> +        qemu_mutex_lock_iothread();
>> >> +        interrupt_request = cpu->interrupt_request;
>> >>          if (unlikely(cpu->singlestep_enabled & SSTEP_NOIRQ)) {
>> >>              /* Mask out external interrupts for this step. */
>> >>              interrupt_request &= ~CPU_INTERRUPT_SSTEP_MASK;
>> >> @@ -479,6 +486,7 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
>> >>          if (interrupt_request & CPU_INTERRUPT_DEBUG) {
>> >>              cpu->interrupt_request &= ~CPU_INTERRUPT_DEBUG;
>> >>              cpu->exception_index = EXCP_DEBUG;
>> >> +            qemu_mutex_unlock_iothread();
>> >>              return true;
>> >>          }
>> >>          if (replay_mode == REPLAY_MODE_PLAY && !replay_has_interrupt()) {
>> >> @@ -488,6 +496,7 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
>> >>              cpu->interrupt_request &= ~CPU_INTERRUPT_HALT;
>> >>              cpu->halted = 1;
>> >>              cpu->exception_index = EXCP_HLT;
>> >> +            qemu_mutex_unlock_iothread();
>> >>              return true;
>> >>          }
>> >>  #if defined(TARGET_I386)
>> >> @@ -498,12 +507,14 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
>> >>              cpu_svm_check_intercept_param(env, SVM_EXIT_INIT, 0, 0);
>> >>              do_cpu_init(x86_cpu);
>> >>              cpu->exception_index = EXCP_HALTED;
>> >> +            qemu_mutex_unlock_iothread();
>> >>              return true;
>> >>          }
>> >>  #else
>> >>          else if (interrupt_request & CPU_INTERRUPT_RESET) {
>> >>              replay_interrupt();
>> >>              cpu_reset(cpu);
>> >> +            qemu_mutex_unlock_iothread();
>> >>              return true;
>> >>          }
>> >>  #endif
>> >> @@ -526,7 +537,12 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
>> >>                 the program flow was changed */
>> >>              *last_tb = NULL;
>> >>          }
>> >> +
>> >> +        /* If we exit via cpu_loop_exit/longjmp it is reset in cpu_exec */
>> >> +        qemu_mutex_unlock_iothread();
>> >>      }
>> >> +
>> >> +
>> >>      if (unlikely(atomic_read(&cpu->exit_request) || replay_has_interrupt())) {
>> >>          atomic_set(&cpu->exit_request, 0);
>> >>          cpu->exception_index = EXCP_INTERRUPT;
>> >> @@ -643,6 +659,9 @@ int cpu_exec(CPUState *cpu)
>> >>  #endif /* buggy compiler */
>> >>          cpu->can_do_io = 1;
>> >>          tb_lock_reset();
>> >> +        if (qemu_mutex_iothread_locked()) {
>> >> +            qemu_mutex_unlock_iothread();
>> >> +        }
>> >>      }
>> >>
>> >>      /* if an exception is pending, we execute it here */
>> >> diff --git a/cpus.c b/cpus.c
>> >> index 860034a794..0ae8f69be5 100644
>> >> --- a/cpus.c
>> >> +++ b/cpus.c
>> >> @@ -1027,8 +1027,6 @@ static void qemu_kvm_init_cpu_signals(CPUState *cpu)
>> >>  #endif /* _WIN32 */
>> >>
>> >>  static QemuMutex qemu_global_mutex;
>> >> -static QemuCond qemu_io_proceeded_cond;
>> >> -static unsigned iothread_requesting_mutex;
>> >>
>> >>  static QemuThread io_thread;
>> >>
>> >> @@ -1042,7 +1040,6 @@ void qemu_init_cpu_loop(void)
>> >>      qemu_init_sigbus();
>> >>      qemu_cond_init(&qemu_cpu_cond);
>> >>      qemu_cond_init(&qemu_pause_cond);
>> >> -    qemu_cond_init(&qemu_io_proceeded_cond);
>> >>      qemu_mutex_init(&qemu_global_mutex);
>> >>
>> >>      qemu_thread_get_self(&io_thread);
>> >> @@ -1085,10 +1082,6 @@ static void qemu_tcg_wait_io_event(CPUState *cpu)
>> >>
>> >>      start_tcg_kick_timer();
>> >>
>> >> -    while (iothread_requesting_mutex) {
>> >> -        qemu_cond_wait(&qemu_io_proceeded_cond, &qemu_global_mutex);
>> >> -    }
>> >> -
>> >>      CPU_FOREACH(cpu) {
>> >>          qemu_wait_io_event_common(cpu);
>> >>      }
>> >> @@ -1249,9 +1242,11 @@ static int tcg_cpu_exec(CPUState *cpu)
>> >>          cpu->icount_decr.u16.low = decr;
>> >>          cpu->icount_extra = count;
>> >>      }
>> >> +    qemu_mutex_unlock_iothread();
>> >>      cpu_exec_start(cpu);
>> >>      ret = cpu_exec(cpu);
>> >>      cpu_exec_end(cpu);
>> >> +    qemu_mutex_lock_iothread();
>> >>  #ifdef CONFIG_PROFILER
>> >>      tcg_time += profile_getclock() - ti;
>> >>  #endif
>> >> @@ -1479,27 +1474,14 @@ bool qemu_mutex_iothread_locked(void)
>> >>
>> >>  void qemu_mutex_lock_iothread(void)
>> >>  {
>> >> -    atomic_inc(&iothread_requesting_mutex);
>> >> -    /* In the simple case there is no need to bump the VCPU thread out of
>> >> -     * TCG code execution.
>> >> -     */
>> >> -    if (!tcg_enabled() || qemu_in_vcpu_thread() ||
>> >> -        !first_cpu || !first_cpu->created) {
>> >> -        qemu_mutex_lock(&qemu_global_mutex);
>> >> -        atomic_dec(&iothread_requesting_mutex);
>> >> -    } else {
>> >> -        if (qemu_mutex_trylock(&qemu_global_mutex)) {
>> >> -            qemu_cpu_kick_rr_cpu();
>> >> -            qemu_mutex_lock(&qemu_global_mutex);
>> >> -        }
>> >> -        atomic_dec(&iothread_requesting_mutex);
>> >> -        qemu_cond_broadcast(&qemu_io_proceeded_cond);
>> >> -    }
>> >> +    g_assert(!qemu_mutex_iothread_locked());
>> >> +    qemu_mutex_lock(&qemu_global_mutex);
>> >>      iothread_locked = true;
>> >>  }
>> >>
>> >>  void qemu_mutex_unlock_iothread(void)
>> >>  {
>> >> +    g_assert(qemu_mutex_iothread_locked());
>> >>      iothread_locked = false;
>> >>      qemu_mutex_unlock(&qemu_global_mutex);
>> >>  }
>> >> diff --git a/cputlb.c b/cputlb.c
>> >> index 6c39927455..1cc9d9da51 100644
>> >> --- a/cputlb.c
>> >> +++ b/cputlb.c
>> >> @@ -18,6 +18,7 @@
>> >>   */
>> >>
>> >>  #include "qemu/osdep.h"
>> >> +#include "qemu/main-loop.h"
>> >>  #include "cpu.h"
>> >>  #include "exec/exec-all.h"
>> >>  #include "exec/memory.h"
>> >> @@ -495,6 +496,7 @@ static uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
>> >>      hwaddr physaddr = iotlbentry->addr;
>> >>      MemoryRegion *mr = iotlb_to_region(cpu, physaddr, iotlbentry->attrs);
>> >>      uint64_t val;
>> >> +    bool locked = false;
>> >>
>> >>      physaddr = (physaddr & TARGET_PAGE_MASK) + addr;
>> >>      cpu->mem_io_pc = retaddr;
>> >> @@ -503,7 +505,16 @@ static uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
>> >>      }
>> >>
>> >>      cpu->mem_io_vaddr = addr;
>> >> +
>> >> +    if (mr->global_locking) {
>> >> +        qemu_mutex_lock_iothread();
>> >> +        locked = true;
>> >> +    }
>> >>      memory_region_dispatch_read(mr, physaddr, &val, size, iotlbentry->attrs);
>> >> +    if (locked) {
>> >> +        qemu_mutex_unlock_iothread();
>> >> +    }
>> >> +
>> >>      return val;
>> >>  }
>> >>
>> >> @@ -514,15 +525,23 @@ static void io_writex(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
>> >>      CPUState *cpu = ENV_GET_CPU(env);
>> >>      hwaddr physaddr = iotlbentry->addr;
>> >>      MemoryRegion *mr = iotlb_to_region(cpu, physaddr, iotlbentry->attrs);
>> >> +    bool locked = false;
>> >>
>> >>      physaddr = (physaddr & TARGET_PAGE_MASK) + addr;
>> >>      if (mr != &io_mem_rom && mr != &io_mem_notdirty && !cpu->can_do_io) {
>> >>          cpu_io_recompile(cpu, retaddr);
>> >>      }
>> >> -
>> >>      cpu->mem_io_vaddr = addr;
>> >>      cpu->mem_io_pc = retaddr;
>> >> +
>> >> +    if (mr->global_locking) {
>> >> +        qemu_mutex_lock_iothread();
>> >> +        locked = true;
>> >> +    }
>> >>      memory_region_dispatch_write(mr, physaddr, val, size, iotlbentry->attrs);
>> >> +    if (locked) {
>> >> +        qemu_mutex_unlock_iothread();
>> >> +    }
>> >>  }
>> >>
>> >>  /* Return true if ADDR is present in the victim tlb, and has been copied
>> >> diff --git a/exec.c b/exec.c
>> >> index 865a1e8295..3adf2b1861 100644
>> >> --- a/exec.c
>> >> +++ b/exec.c
>> >> @@ -2134,9 +2134,9 @@ static void check_watchpoint(int offset, int len, MemTxAttrs attrs, int flags)
>> >>                  }
>> >>                  cpu->watchpoint_hit = wp;
>> >>
>> >> -                /* The tb_lock will be reset when cpu_loop_exit or
>> >> -                 * cpu_loop_exit_noexc longjmp back into the cpu_exec
>> >> -                 * main loop.
>> >> +                /* Both tb_lock and iothread_mutex will be reset when
>> >> +                 * cpu_loop_exit or cpu_loop_exit_noexc longjmp
>> >> +                 * back into the cpu_exec main loop.
>> >>                   */
>> >>                  tb_lock();
>> >>                  tb_check_watchpoint(cpu);
>> >> @@ -2371,8 +2371,14 @@ static void io_mem_init(void)
>> >>      memory_region_init_io(&io_mem_rom, NULL, &unassigned_mem_ops, NULL, NULL, UINT64_MAX);
>> >>      memory_region_init_io(&io_mem_unassigned, NULL, &unassigned_mem_ops, NULL,
>> >>                            NULL, UINT64_MAX);
>> >> +
>> >> +    /* io_mem_notdirty calls tb_invalidate_phys_page_fast,
>> >> +     * which can be called without the iothread mutex.
>> >> +     */
>> >>      memory_region_init_io(&io_mem_notdirty, NULL, &notdirty_mem_ops, NULL,
>> >>                            NULL, UINT64_MAX);
>> >> +    memory_region_clear_global_locking(&io_mem_notdirty);
>> >> +
>> >>      memory_region_init_io(&io_mem_watch, NULL, &watch_mem_ops, NULL,
>> >>                            NULL, UINT64_MAX);
>> >>  }
>> >> diff --git a/hw/core/irq.c b/hw/core/irq.c
>> >> index 49ff2e64fe..b98d1d69f5 100644
>> >> --- a/hw/core/irq.c
>> >> +++ b/hw/core/irq.c
>> >> @@ -22,6 +22,7 @@
>> >>   * THE SOFTWARE.
>> >>   */
>> >>  #include "qemu/osdep.h"
>> >> +#include "qemu/main-loop.h"
>> >>  #include "qemu-common.h"
>> >>  #include "hw/irq.h"
>> >>  #include "qom/object.h"
>> >> diff --git a/hw/i386/kvmvapic.c b/hw/i386/kvmvapic.c
>> >> index 7135633863..82a49556af 100644
>> >> --- a/hw/i386/kvmvapic.c
>> >> +++ b/hw/i386/kvmvapic.c
>> >> @@ -457,8 +457,8 @@ static void patch_instruction(VAPICROMState *s, X86CPU *cpu, target_ulong ip)
>> >>      resume_all_vcpus();
>> >>
>> >>      if (!kvm_enabled()) {
>> >> -        /* tb_lock will be reset when cpu_loop_exit_noexc longjmps
>> >> -         * back into the cpu_exec loop. */
>> >> +        /* Both tb_lock and iothread_mutex will be reset when
>> >> +         *  longjmps back into the cpu_exec loop. */
>> >>          tb_lock();
>> >>          tb_gen_code(cs, current_pc, current_cs_base, current_flags, 1);
>> >>          cpu_loop_exit_noexc(cs);
>> >> diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
>> >> index c25ee03556..f775aba507 100644
>> >> --- a/hw/intc/arm_gicv3_cpuif.c
>> >> +++ b/hw/intc/arm_gicv3_cpuif.c
>> >> @@ -14,6 +14,7 @@
>> >>
>> >>  #include "qemu/osdep.h"
>> >>  #include "qemu/bitops.h"
>> >> +#include "qemu/main-loop.h"
>> >>  #include "trace.h"
>> >>  #include "gicv3_internal.h"
>> >>  #include "cpu.h"
>> >> @@ -733,6 +734,8 @@ void gicv3_cpuif_update(GICv3CPUState *cs)
>> >>      ARMCPU *cpu = ARM_CPU(cs->cpu);
>> >>      CPUARMState *env = &cpu->env;
>> >>
>> >> +    g_assert(qemu_mutex_iothread_locked());
>> >> +
>> >>      trace_gicv3_cpuif_update(gicv3_redist_affid(cs), cs->hppi.irq,
>> >>                               cs->hppi.grp, cs->hppi.prio);
>> >>
>> >> diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
>> >> index d171e60b5c..5f93083d4a 100644
>> >> --- a/hw/ppc/ppc.c
>> >> +++ b/hw/ppc/ppc.c
>> >> @@ -62,7 +62,16 @@ void ppc_set_irq(PowerPCCPU *cpu, int n_IRQ, int level)
>> >>  {
>> >>      CPUState *cs = CPU(cpu);
>> >>      CPUPPCState *env = &cpu->env;
>> >> -    unsigned int old_pending = env->pending_interrupts;
>> >> +    unsigned int old_pending;
>> >> +    bool locked = false;
>> >> +
>> >> +    /* We may already have the BQL if coming from the reset path */
>> >> +    if (!qemu_mutex_iothread_locked()) {
>> >> +        locked = true;
>> >> +        qemu_mutex_lock_iothread();
>> >> +    }
>> >> +
>> >> +    old_pending = env->pending_interrupts;
>> >>
>> >>      if (level) {
>> >>          env->pending_interrupts |= 1 << n_IRQ;
>> >> @@ -80,9 +89,14 @@ void ppc_set_irq(PowerPCCPU *cpu, int n_IRQ, int level)
>> >>  #endif
>> >>      }
>> >>
>> >> +
>> >>      LOG_IRQ("%s: %p n_IRQ %d level %d => pending %08" PRIx32
>> >>                  "req %08x\n", __func__, env, n_IRQ, level,
>> >>                  env->pending_interrupts, CPU(cpu)->interrupt_request);
>> >> +
>> >> +    if (locked) {
>> >> +        qemu_mutex_unlock_iothread();
>> >> +    }
>> >>  }
>> >>
>> >>  /* PowerPC 6xx / 7xx internal IRQ controller */
>> >> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> >> index e465d7ac98..b1e374f3f9 100644
>> >> --- a/hw/ppc/spapr.c
>> >> +++ b/hw/ppc/spapr.c
>> >> @@ -1010,6 +1010,9 @@ static void emulate_spapr_hypercall(PPCVirtualHypervisor *vhyp,
>> >>  {
>> >>      CPUPPCState *env = &cpu->env;
>> >>
>> >> +    /* The TCG path should also be holding the BQL at this point */
>> >> +    g_assert(qemu_mutex_iothread_locked());
>> >> +
>> >>      if (msr_pr) {
>> >>          hcall_dprintf("Hypercall made with MSR[PR]=1\n");
>> >>          env->gpr[3] = H_PRIVILEGE;
>> >> diff --git a/include/qom/cpu.h b/include/qom/cpu.h
>> >> index 2cf4ecf144..10db89b16a 100644
>> >> --- a/include/qom/cpu.h
>> >> +++ b/include/qom/cpu.h
>> >> @@ -329,6 +329,7 @@ struct CPUState {
>> >>      bool unplug;
>> >>      bool crash_occurred;
>> >>      bool exit_request;
>> >> +    /* updates protected by BQL */
>> >>      uint32_t interrupt_request;
>> >>      int singlestep_enabled;
>> >>      int64_t icount_extra;
>> >> diff --git a/memory.c b/memory.c
>> >> index ed8b5aa83e..d61caee867 100644
>> >> --- a/memory.c
>> >> +++ b/memory.c
>> >> @@ -917,6 +917,8 @@ void memory_region_transaction_commit(void)
>> >>      AddressSpace *as;
>> >>
>> >>      assert(memory_region_transaction_depth);
>> >> +    assert(qemu_mutex_iothread_locked());
>> >> +
>> >>      --memory_region_transaction_depth;
>> >>      if (!memory_region_transaction_depth) {
>> >>          if (memory_region_update_pending) {
>> >> diff --git a/qom/cpu.c b/qom/cpu.c
>> >> index ed87c50cea..58784bcbea 100644
>> >> --- a/qom/cpu.c
>> >> +++ b/qom/cpu.c
>> >> @@ -113,9 +113,19 @@ static void cpu_common_get_memory_mapping(CPUState *cpu,
>> >>      error_setg(errp, "Obtaining memory mappings is unsupported on this CPU.");
>> >>  }
>> >>
>> >> +/* Resetting the IRQ comes from across the code base so we take the
>> >> + * BQL here if we need to.  cpu_interrupt assumes it is held.*/
>> >>  void cpu_reset_interrupt(CPUState *cpu, int mask)
>> >>  {
>> >> +    bool need_lock = !qemu_mutex_iothread_locked();
>> >> +
>> >> +    if (need_lock) {
>> >> +        qemu_mutex_lock_iothread();
>> >> +    }
>> >>      cpu->interrupt_request &= ~mask;
>> >> +    if (need_lock) {
>> >> +        qemu_mutex_unlock_iothread();
>> >> +    }
>> >>  }
>> >>
>> >>  void cpu_exit(CPUState *cpu)
>> >> diff --git a/target/arm/helper.c b/target/arm/helper.c
>> >> index 47250bcf16..753a69d40d 100644
>> >> --- a/target/arm/helper.c
>> >> +++ b/target/arm/helper.c
>> >> @@ -6769,6 +6769,12 @@ void arm_cpu_do_interrupt(CPUState *cs)
>> >>          arm_cpu_do_interrupt_aarch32(cs);
>> >>      }
>> >>
>> >> +    /* Hooks may change global state so BQL should be held, also the
>> >> +     * BQL needs to be held for any modification of
>> >> +     * cs->interrupt_request.
>> >> +     */
>> >> +    g_assert(qemu_mutex_iothread_locked());
>> >> +
>> >>      arm_call_el_change_hook(cpu);
>> >>
>> >>      if (!kvm_enabled()) {
>> >> diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
>> >> index fb366fdc35..5f3e3bdae2 100644
>> >> --- a/target/arm/op_helper.c
>> >> +++ b/target/arm/op_helper.c
>> >> @@ -18,6 +18,7 @@
>> >>   */
>> >>  #include "qemu/osdep.h"
>> >>  #include "qemu/log.h"
>> >> +#include "qemu/main-loop.h"
>> >>  #include "cpu.h"
>> >>  #include "exec/helper-proto.h"
>> >>  #include "internals.h"
>> >> @@ -487,7 +488,9 @@ void HELPER(cpsr_write_eret)(CPUARMState *env, uint32_t val)
>> >>       */
>> >>      env->regs[15] &= (env->thumb ? ~1 : ~3);
>> >>
>> >> +    qemu_mutex_lock_iothread();
>> >>      arm_call_el_change_hook(arm_env_get_cpu(env));
>> >> +    qemu_mutex_unlock_iothread();
>> >>  }
>> >>
>> >>  /* Access to user mode registers from privileged modes.  */
>> >> @@ -735,28 +738,58 @@ void HELPER(set_cp_reg)(CPUARMState *env, void *rip, uint32_t value)
>> >>  {
>> >>      const ARMCPRegInfo *ri = rip;
>> >>
>> >> -    ri->writefn(env, ri, value);
>> >> +    if (ri->type & ARM_CP_IO) {
>> >> +        qemu_mutex_lock_iothread();
>> >> +        ri->writefn(env, ri, value);
>> >> +        qemu_mutex_unlock_iothread();
>> >> +    } else {
>> >> +        ri->writefn(env, ri, value);
>> >> +    }
>> >>  }
>> >>
>> >>  uint32_t HELPER(get_cp_reg)(CPUARMState *env, void *rip)
>> >>  {
>> >>      const ARMCPRegInfo *ri = rip;
>> >> +    uint32_t res;
>> >>
>> >> -    return ri->readfn(env, ri);
>> >> +    if (ri->type & ARM_CP_IO) {
>> >> +        qemu_mutex_lock_iothread();
>> >> +        res = ri->readfn(env, ri);
>> >> +        qemu_mutex_unlock_iothread();
>> >> +    } else {
>> >> +        res = ri->readfn(env, ri);
>> >> +    }
>> >> +
>> >> +    return res;
>> >>  }
>> >>
>> >>  void HELPER(set_cp_reg64)(CPUARMState *env, void *rip, uint64_t value)
>> >>  {
>> >>      const ARMCPRegInfo *ri = rip;
>> >>
>> >> -    ri->writefn(env, ri, value);
>> >> +    if (ri->type & ARM_CP_IO) {
>> >> +        qemu_mutex_lock_iothread();
>> >> +        ri->writefn(env, ri, value);
>> >> +        qemu_mutex_unlock_iothread();
>> >> +    } else {
>> >> +        ri->writefn(env, ri, value);
>> >> +    }
>> >>  }
>> >>
>> >>  uint64_t HELPER(get_cp_reg64)(CPUARMState *env, void *rip)
>> >>  {
>> >>      const ARMCPRegInfo *ri = rip;
>> >> +    uint64_t res;
>> >> +
>> >> +    if (ri->type & ARM_CP_IO) {
>> >> +        qemu_mutex_lock_iothread();
>> >> +        res = ri->readfn(env, ri);
>> >> +        qemu_mutex_unlock_iothread();
>> >> +    } else {
>> >> +        res = ri->readfn(env, ri);
>> >> +    }
>> >>
>> >> -    return ri->readfn(env, ri);
>> >> +    return res;
>> >>  }
>> >>
>> >>  void HELPER(msr_i_pstate)(CPUARMState *env, uint32_t op, uint32_t imm)
>> >> @@ -989,7 +1022,9 @@ void HELPER(exception_return)(CPUARMState *env)
>> >>                        cur_el, new_el, env->pc);
>> >>      }
>> >>
>> >> +    qemu_mutex_lock_iothread();
>> >>      arm_call_el_change_hook(arm_env_get_cpu(env));
>> >> +    qemu_mutex_unlock_iothread();
>> >>
>> >>      return;
>> >>
>> >> diff --git a/target/i386/smm_helper.c b/target/i386/smm_helper.c
>> >> index 4dd6a2c544..f051a77c4a 100644
>> >> --- a/target/i386/smm_helper.c
>> >> +++ b/target/i386/smm_helper.c
>> >> @@ -18,6 +18,7 @@
>> >>   */
>> >>
>> >>  #include "qemu/osdep.h"
>> >> +#include "qemu/main-loop.h"
>> >>  #include "cpu.h"
>> >>  #include "exec/helper-proto.h"
>> >>  #include "exec/log.h"
>> >> @@ -42,11 +43,14 @@ void helper_rsm(CPUX86State *env)
>> >>  #define SMM_REVISION_ID 0x00020000
>> >>  #endif
>> >>
>> >> +/* Called with iothread lock taken */
>> >>  void cpu_smm_update(X86CPU *cpu)
>> >>  {
>> >>      CPUX86State *env = &cpu->env;
>> >>      bool smm_enabled = (env->hflags & HF_SMM_MASK);
>> >>
>> >> +    g_assert(qemu_mutex_iothread_locked());
>> >> +
>> >>      if (cpu->smram) {
>> >>          memory_region_set_enabled(cpu->smram, smm_enabled);
>> >>      }
>> >> @@ -333,7 +337,10 @@ void helper_rsm(CPUX86State *env)
>> >>      }
>> >>      env->hflags2 &= ~HF2_SMM_INSIDE_NMI_MASK;
>> >>      env->hflags &= ~HF_SMM_MASK;
>> >> +
>> >> +    qemu_mutex_lock_iothread();
>> >>      cpu_smm_update(cpu);
>> >> +    qemu_mutex_unlock_iothread();
>> >>
>> >>      qemu_log_mask(CPU_LOG_INT, "SMM: after RSM\n");
>> >>      log_cpu_state_mask(CPU_LOG_INT, CPU(cpu), CPU_DUMP_CCOP);
>> >> diff --git a/target/s390x/misc_helper.c b/target/s390x/misc_helper.c
>> >> index c9604ea9c7..3cb942e8bb 100644
>> >> --- a/target/s390x/misc_helper.c
>> >> +++ b/target/s390x/misc_helper.c
>> >> @@ -25,6 +25,7 @@
>> >>  #include "exec/helper-proto.h"
>> >>  #include "sysemu/kvm.h"
>> >>  #include "qemu/timer.h"
>> >> +#include "qemu/main-loop.h"
>> >>  #include "exec/address-spaces.h"
>> >>  #ifdef CONFIG_KVM
>> >>  #include <linux/kvm.h>
>> >> @@ -109,11 +110,13 @@ void program_interrupt(CPUS390XState *env, uint32_t code, int ilen)
>> >>  /* SCLP service call */
>> >>  uint32_t HELPER(servc)(CPUS390XState *env, uint64_t r1, uint64_t r2)
>> >>  {
>> >> +    qemu_mutex_lock_iothread();
>> >>      int r = sclp_service_call(env, r1, r2);
>> >>      if (r < 0) {
>> >>          program_interrupt(env, -r, 4);
>> >> -        return 0;
>> >> +        r = 0;
>> >>      }
>> >> +    qemu_mutex_unlock_iothread();
>> >>      return r;
>> >>  }
>> >>
>> >> diff --git a/translate-all.c b/translate-all.c
>> >> index 8a861cb583..f810259c41 100644
>> >> --- a/translate-all.c
>> >> +++ b/translate-all.c
>> >> @@ -55,6 +55,7 @@
>> >>  #include "translate-all.h"
>> >>  #include "qemu/bitmap.h"
>> >>  #include "qemu/timer.h"
>> >> +#include "qemu/main-loop.h"
>> >>  #include "exec/log.h"
>> >>
>> >>  /* #define DEBUG_TB_INVALIDATE */
>> >> @@ -1523,7 +1524,7 @@ void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end,
>> >>  #ifdef CONFIG_SOFTMMU
>> >>  /* len must be <= 8 and start must be a multiple of len.
>> >>   * Called via softmmu_template.h when code areas are written to with
>> >> - * tb_lock held.
>> >> + * iothread mutex not held.
>> >>   */
>> >>  void tb_invalidate_phys_page_fast(tb_page_addr_t start, int len)
>> >>  {
>> >> @@ -1725,7 +1726,10 @@ void tb_check_watchpoint(CPUState *cpu)
>> >>
>> >>  #ifndef CONFIG_USER_ONLY
>> >>  /* in deterministic execution mode, instructions doing device I/Os
>> >> -   must be at the end of the TB */
>> >> + * must be at the end of the TB.
>> >> + *
>> >> + * Called by softmmu_template.h, with iothread mutex not held.
>> >> + */
>> >>  void cpu_io_recompile(CPUState *cpu, uintptr_t retaddr)
>> >>  {
>> >>  #if defined(TARGET_MIPS) || defined(TARGET_SH4)
>> >> @@ -1937,6 +1941,7 @@ void dump_opcount_info(FILE *f, fprintf_function cpu_fprintf)
>> >>
>> >>  void cpu_interrupt(CPUState *cpu, int mask)
>> >>  {
>> >> +    g_assert(qemu_mutex_iothread_locked());
>> >>      cpu->interrupt_request |= mask;
>> >>      cpu->tcg_exit_req = 1;
>> >>  }
>> >> diff --git a/translate-common.c b/translate-common.c
>> >> index 5e989cdf70..d504dd0d33 100644
>> >> --- a/translate-common.c
>> >> +++ b/translate-common.c
>> >> @@ -21,6 +21,7 @@
>> >>  #include "qemu-common.h"
>> >>  #include "qom/cpu.h"
>> >>  #include "sysemu/cpus.h"
>> >> +#include "qemu/main-loop.h"
>> >>
>> >>  uintptr_t qemu_real_host_page_size;
>> >>  intptr_t qemu_real_host_page_mask;
>> >> @@ -30,6 +31,7 @@ intptr_t qemu_real_host_page_mask;
>> >>  static void tcg_handle_interrupt(CPUState *cpu, int mask)
>> >>  {
>> >>      int old_mask;
>> >> +    g_assert(qemu_mutex_iothread_locked());
>> >>
>> >>      old_mask = cpu->interrupt_request;
>> >>      cpu->interrupt_request |= mask;
>> >> @@ -40,17 +42,16 @@ static void tcg_handle_interrupt(CPUState *cpu, int mask)
>> >>       */
>> >>      if (!qemu_cpu_is_self(cpu)) {
>> >>          qemu_cpu_kick(cpu);
>> >> -        return;
>> >> -    }
>> >> -
>> >> -    if (use_icount) {
>> >> -        cpu->icount_decr.u16.high = 0xffff;
>> >> -        if (!cpu->can_do_io
>> >> -            && (mask & ~old_mask) != 0) {
>> >> -            cpu_abort(cpu, "Raised interrupt while not in I/O function");
>> >> -        }
>> >>      } else {
>> >> -        cpu->tcg_exit_req = 1;
>> >> +        if (use_icount) {
>> >> +            cpu->icount_decr.u16.high = 0xffff;
>> >> +            if (!cpu->can_do_io
>> >> +                && (mask & ~old_mask) != 0) {
>> >> +                cpu_abort(cpu, "Raised interrupt while not in I/O function");
>> >> +            }
>> >> +        } else {
>> >> +            cpu->tcg_exit_req = 1;
>> >> +        }
>> >>      }
>> >>  }
>> >>
>> >> --
>> >> 2.11.0
>> >>
>> >>
>>
>>
>> --
>> Alex Bennée
>>


--
Alex Bennée
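
[Editorial note: the conditional-BQL pattern in the quoted cp-reg helper hunks
can be boiled down to a small stand-alone sketch. Only the `ARM_CP_IO` flag
test reflects the quoted patch; the types and lock functions below are
illustrative stand-ins for the real QEMU ones, so that the point — only
registers marked as I/O pay the cost of taking the iothread mutex — is
testable in isolation:]

```c
#include <stdbool.h>

/* Illustrative stand-ins for the QEMU types and functions quoted above. */
typedef struct { int type; } ARMCPRegInfo;
#define ARM_CP_IO 0x01

static bool iothread_locked;
static void qemu_mutex_lock_iothread(void)   { iothread_locked = true; }
static void qemu_mutex_unlock_iothread(void) { iothread_locked = false; }

/* Stand-in for ri->readfn(env, ri); reports whether the lock was held. */
static unsigned long long read_reg(const ARMCPRegInfo *ri)
{
    return iothread_locked ? 1 : 0;
}

/* Mirrors HELPER(get_cp_reg64) after the patch: only I/O registers take
 * the iothread mutex; ordinary system registers stay lock-free. */
unsigned long long get_cp_reg64_sketch(const ARMCPRegInfo *ri)
{
    unsigned long long res;

    if (ri->type & ARM_CP_IO) {
        qemu_mutex_lock_iothread();
        res = read_reg(ri);
        qemu_mutex_unlock_iothread();
    } else {
        res = read_reg(ri);
    }
    return res;
}
```

The same guarded-lock shape is repeated for the 32-bit and write variants in
the quoted diff; taking the lock unconditionally would penalise the common,
non-I/O register accesses.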

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [Qemu-devel] [PULL 11/24] tcg: enable thread-per-vCPU
  2017-02-27 14:38     ` Alex Bennée
@ 2017-03-13 14:03       ` Laurent Vivier
  2017-03-13 16:58         ` Alex Bennée
                           ` (2 more replies)
  0 siblings, 3 replies; 55+ messages in thread
From: Laurent Vivier @ 2017-03-13 14:03 UTC (permalink / raw)
  To: Alex Bennée
  Cc: peter.maydell, Peter Crosthwaite, qemu-devel, Paolo Bonzini,
	KONRAD Frederic, Richard Henderson

On 27/02/2017 at 15:38, Alex Bennée wrote:
> 
> Laurent Vivier <laurent@vivier.eu> writes:
> 
>> On 24/02/2017 at 12:20, Alex Bennée wrote:
>>> There are a couple of changes that occur at the same time here:
>>>
>>>   - introduce a single vCPU qemu_tcg_cpu_thread_fn
>>>
>>>   One of these is spawned per vCPU with its own Thread and Condition
>>>   variables. qemu_tcg_rr_cpu_thread_fn is the new name for the old
>>>   single threaded function.
>>>
>>>   - the TLS current_cpu variable is now live for the lifetime of MTTCG
>>>     vCPU threads. This is for future work where async jobs need to know
>>>     the vCPU context they are operating in.
>>>
>>> The user can switch on multi-thread behaviour and spawn a thread
>>> per-vCPU. For a simple kvm-unit-test like:
>>>
>>>   ./arm/run ./arm/locking-test.flat -smp 4 -accel tcg,thread=multi
>>>
>>> Will now use 4 vCPU threads and have an expected FAIL (instead of the
>>> unexpected PASS) as the default mode of the test has no protection when
>>> incrementing a shared variable.
>>>
>>> We enable the parallel_cpus flag to ensure we generate correct barrier
>>> and atomic code if supported by the front and backends. This doesn't
>>> automatically enable MTTCG until default_mttcg_enabled() is updated to
>>> check the configuration is supported.
>>
>> This commit breaks linux-user mode:
>>
>> debian-8 with qemu-ppc on x86_64 with ltp-full-20170116
>>
>> cd /opt/ltp
>> ./runltp -p -l "qemu-$(date +%FT%T).log" -f /opt/ltp/runtest/syscalls -s
>> setgroups03
>>
>> setgroups03    1  TPASS  :  setgroups(65537) fails, Size is >
>> sysconf(_SC_NGROUPS_MAX), errno=22
>> qemu-ppc: /home/laurent/Projects/qemu/include/qemu/rcu.h:89:
>> rcu_read_unlock: Assertion `p_rcu_reader->depth != 0' failed.
>> qemu-ppc: /home/laurent/Projects/qemu/include/qemu/rcu.h:89:
>> rcu_read_unlock: Assertion `p_rcu_reader->depth != 0' failed.
>> qemu-ppc: /home/laurent/Projects/qemu/include/qemu/rcu.h:89:
>> rcu_read_unlock: Assertion `p_rcu_reader->depth != 0' failed.
>> ...
> 
> Interesting. I can only think the current_cpu change has broken it
> because most of the changes in this commit affect softmmu targets only
> (linux-user has its own run loop).
> 
> Thanks for the report - I'll look into it.

After:

     95b0eca Merge remote-tracking branch
'remotes/stsquad/tags/pull-mttcg-fixups-090317-1' into staging

[Tested with my HEAD on:
b1616fe Merge remote-tracking branch
'remotes/famz/tags/docker-pull-request' into staging]

I have now:

<<<test_start>>>
tag=setgroups03 stime=1489413401
cmdline="setgroups03"
contacts=""
analysis=exit
<<<test_output>>>
**
ERROR:/home/laurent/Projects/qemu/cpu-exec.c:656:cpu_exec: assertion
failed: (cpu == current_cpu)
**

Laurent
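
[Editorial note: the "one thread plus condition variable per vCPU"
arrangement described in the quoted commit message can be sketched with plain
pthreads. The names mirror QEMU's cpus.c, but the structure below is a
simplified, stand-alone illustration, not the real implementation:]

```c
/* Sketch of spawning one thread per vCPU, each with its own condition
 * variable, and waiting for the thread to come up before continuing. */
#include <pthread.h>
#include <stdio.h>

typedef struct CPUState {
    int cpu_index;
    pthread_t thread;              /* its own Thread ...             */
    pthread_cond_t halt_cond;      /* ... and Condition variable     */
    pthread_mutex_t lock;
    int created;
} CPUState;

static void *qemu_tcg_cpu_thread_fn(void *arg)
{
    CPUState *cpu = arg;

    /* Tell the spawning thread that this vCPU thread is up. */
    pthread_mutex_lock(&cpu->lock);
    cpu->created = 1;
    pthread_cond_signal(&cpu->halt_cond);
    pthread_mutex_unlock(&cpu->lock);

    /* ... the per-vCPU execution loop would run here ... */
    return NULL;
}

static void qemu_tcg_init_vcpu_sketch(CPUState *cpu)
{
    pthread_mutex_init(&cpu->lock, NULL);
    pthread_cond_init(&cpu->halt_cond, NULL);
    cpu->created = 0;

    if (pthread_create(&cpu->thread, NULL, qemu_tcg_cpu_thread_fn, cpu)) {
        perror("pthread_create");
        return;
    }

    /* Wait (predicate loop, as POSIX requires) until the thread is live. */
    pthread_mutex_lock(&cpu->lock);
    while (!cpu->created) {
        pthread_cond_wait(&cpu->halt_cond, &cpu->lock);
    }
    pthread_mutex_unlock(&cpu->lock);
}
```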


* Re: [Qemu-devel] [PULL 11/24] tcg: enable thread-per-vCPU
  2017-03-13 14:03       ` Laurent Vivier
@ 2017-03-13 16:58         ` Alex Bennée
  2017-03-13 18:21           ` Laurent Vivier
  2017-03-16 17:31         ` Alex Bennée
  2017-03-17 20:43         ` Alex Bennée
  2 siblings, 1 reply; 55+ messages in thread
From: Alex Bennée @ 2017-03-13 16:58 UTC (permalink / raw)
  To: Laurent Vivier
  Cc: peter.maydell, Peter Crosthwaite, qemu-devel, Paolo Bonzini,
	KONRAD Frederic, Richard Henderson


Laurent Vivier <laurent@vivier.eu> writes:

> On 27/02/2017 at 15:38, Alex Bennée wrote:
>>
>> Laurent Vivier <laurent@vivier.eu> writes:
>>
>>> On 24/02/2017 at 12:20, Alex Bennée wrote:
>>>> There are a couple of changes that occur at the same time here:
>>>>
>>>>   - introduce a single vCPU qemu_tcg_cpu_thread_fn
>>>>
>>>>   One of these is spawned per vCPU with its own Thread and Condition
>>>>   variables. qemu_tcg_rr_cpu_thread_fn is the new name for the old
>>>>   single threaded function.
>>>>
>>>>   - the TLS current_cpu variable is now live for the lifetime of MTTCG
>>>>     vCPU threads. This is for future work where async jobs need to know
>>>>     the vCPU context they are operating in.
>>>>
>>>> The user can switch on multi-thread behaviour and spawn a thread
>>>> per-vCPU. For a simple kvm-unit-test like:
>>>>
>>>>   ./arm/run ./arm/locking-test.flat -smp 4 -accel tcg,thread=multi
>>>>
>>>> Will now use 4 vCPU threads and have an expected FAIL (instead of the
>>>> unexpected PASS) as the default mode of the test has no protection when
>>>> incrementing a shared variable.
>>>>
>>>> We enable the parallel_cpus flag to ensure we generate correct barrier
>>>> and atomic code if supported by the front and backends. This doesn't
>>>> automatically enable MTTCG until default_mttcg_enabled() is updated to
>>>> check the configuration is supported.
>>>
>>> This commit breaks linux-user mode:
>>>
>>> debian-8 with qemu-ppc on x86_64 with ltp-full-20170116
>>>
>>> cd /opt/ltp
>>> ./runltp -p -l "qemu-$(date +%FT%T).log" -f /opt/ltp/runtest/syscalls -s
>>> setgroups03
>>>
>>> setgroups03    1  TPASS  :  setgroups(65537) fails, Size is >
>>> sysconf(_SC_NGROUPS_MAX), errno=22
>>> qemu-ppc: /home/laurent/Projects/qemu/include/qemu/rcu.h:89:
>>> rcu_read_unlock: Assertion `p_rcu_reader->depth != 0' failed.
>>> qemu-ppc: /home/laurent/Projects/qemu/include/qemu/rcu.h:89:
>>> rcu_read_unlock: Assertion `p_rcu_reader->depth != 0' failed.
>>> qemu-ppc: /home/laurent/Projects/qemu/include/qemu/rcu.h:89:
>>> rcu_read_unlock: Assertion `p_rcu_reader->depth != 0' failed.
>>> ...
>>
>> Interesting. I can only think the current_cpu change has broken it
>> because most of the changes in this commit affect softmmu targets only
>> (linux-user has its own run loop).
>>
>> Thanks for the report - I'll look into it.
>
> After:
>
>      95b0eca Merge remote-tracking branch
> 'remotes/stsquad/tags/pull-mttcg-fixups-090317-1' into staging
>
> [Tested with my HEAD on:
> b1616fe Merge remote-tracking branch
> 'remotes/famz/tags/docker-pull-request' into staging]
>
> I have now:
>
> <<<test_start>>>
> tag=setgroups03 stime=1489413401
> cmdline="setgroups03"
> contacts=""
> analysis=exit
> <<<test_output>>>
> **
> ERROR:/home/laurent/Projects/qemu/cpu-exec.c:656:cpu_exec: assertion
> failed: (cpu == current_cpu)
> **

So I think this is saying that we were outside the tcg_exec_loop for
this cpu and somehow longjmp'ed back into the loop.

I'll start setting up LTP on my system but in the meantime you might
find it useful to add the cpu == current_cpu assert into all the places
in cpu-exec-common.c before siglongjmp is called. Then a backtrace of
the offending call will be easier to follow.

>
> Laurent


--
Alex Bennée


* Re: [Qemu-devel] [PULL 11/24] tcg: enable thread-per-vCPU
  2017-03-13 16:58         ` Alex Bennée
@ 2017-03-13 18:21           ` Laurent Vivier
  0 siblings, 0 replies; 55+ messages in thread
From: Laurent Vivier @ 2017-03-13 18:21 UTC (permalink / raw)
  To: Alex Bennée
  Cc: peter.maydell, Peter Crosthwaite, qemu-devel, Paolo Bonzini,
	KONRAD Frederic, Richard Henderson

On 13/03/2017 at 17:58, Alex Bennée wrote:
> 
> Laurent Vivier <laurent@vivier.eu> writes:
> 
>> On 27/02/2017 at 15:38, Alex Bennée wrote:
>>>
>>> Laurent Vivier <laurent@vivier.eu> writes:
>>>
>>>> On 24/02/2017 at 12:20, Alex Bennée wrote:
>>>>> There are a couple of changes that occur at the same time here:
>>>>>
>>>>>   - introduce a single vCPU qemu_tcg_cpu_thread_fn
>>>>>
>>>>>   One of these is spawned per vCPU with its own Thread and Condition
>>>>>   variables. qemu_tcg_rr_cpu_thread_fn is the new name for the old
>>>>>   single threaded function.
>>>>>
>>>>>   - the TLS current_cpu variable is now live for the lifetime of MTTCG
>>>>>     vCPU threads. This is for future work where async jobs need to know
>>>>>     the vCPU context they are operating in.
>>>>>
>>>>> The user can switch on multi-thread behaviour and spawn a thread
>>>>> per-vCPU. For a simple kvm-unit-test like:
>>>>>
>>>>>   ./arm/run ./arm/locking-test.flat -smp 4 -accel tcg,thread=multi
>>>>>
>>>>> Will now use 4 vCPU threads and have an expected FAIL (instead of the
>>>>> unexpected PASS) as the default mode of the test has no protection when
>>>>> incrementing a shared variable.
>>>>>
>>>>> We enable the parallel_cpus flag to ensure we generate correct barrier
>>>>> and atomic code if supported by the front and backends. This doesn't
>>>>> automatically enable MTTCG until default_mttcg_enabled() is updated to
>>>>> check the configuration is supported.
>>>>
>>>> This commit breaks linux-user mode:
>>>>
>>>> debian-8 with qemu-ppc on x86_64 with ltp-full-20170116
>>>>
>>>> cd /opt/ltp
>>>> ./runltp -p -l "qemu-$(date +%FT%T).log" -f /opt/ltp/runtest/syscalls -s
>>>> setgroups03
>>>>
>>>> setgroups03    1  TPASS  :  setgroups(65537) fails, Size is >
>>>> sysconf(_SC_NGROUPS_MAX), errno=22
>>>> qemu-ppc: /home/laurent/Projects/qemu/include/qemu/rcu.h:89:
>>>> rcu_read_unlock: Assertion `p_rcu_reader->depth != 0' failed.
>>>> qemu-ppc: /home/laurent/Projects/qemu/include/qemu/rcu.h:89:
>>>> rcu_read_unlock: Assertion `p_rcu_reader->depth != 0' failed.
>>>> qemu-ppc: /home/laurent/Projects/qemu/include/qemu/rcu.h:89:
>>>> rcu_read_unlock: Assertion `p_rcu_reader->depth != 0' failed.
>>>> ...
>>>
>>> Interesting. I can only think the current_cpu change has broken it
>>> because most of the changes in this commit affect softmmu targets only
>>> (linux-user has its own run loop).
>>>
>>> Thanks for the report - I'll look into it.
>>
>> After:
>>
>>      95b0eca Merge remote-tracking branch
>> 'remotes/stsquad/tags/pull-mttcg-fixups-090317-1' into staging
>>
>> [Tested with my HEAD on:
>> b1616fe Merge remote-tracking branch
>> 'remotes/famz/tags/docker-pull-request' into staging]
>>
>> I have now:
>>
>> <<<test_start>>>
>> tag=setgroups03 stime=1489413401
>> cmdline="setgroups03"
>> contacts=""
>> analysis=exit
>> <<<test_output>>>
>> **
>> ERROR:/home/laurent/Projects/qemu/cpu-exec.c:656:cpu_exec: assertion
>> failed: (cpu == current_cpu)
>> **
> 
> So I think this is saying that we were outside the tcg_exec_loop for
> this cpu and somehow longjmp'ed back into the loop.
> 
> I'll start setting up LTP on my system but in the meantime you might
> find it useful to add the cpu == current_cpu assert into all the places
> in cpu-exec-common.c before siglongjmp is called. Then a backtrace of
> the offending call will be easier to follow.

If I patch cpu-exec-common.c:
diff --git a/cpu-exec-common.c b/cpu-exec-common.c
index 0504a94..4bdf295 100644
--- a/cpu-exec-common.c
+++ b/cpu-exec-common.c
@@ -29,6 +29,7 @@ void cpu_loop_exit_noexc(CPUState *cpu)
     /* XXX: restore cpu registers saved in host registers */

     cpu->exception_index = -1;
+g_assert(cpu == current_cpu);
     siglongjmp(cpu->jmp_env, 1);
 }

@@ -64,6 +65,7 @@ void cpu_reloading_memory_map(void)

 void cpu_loop_exit(CPUState *cpu)
 {
+g_assert(cpu == current_cpu);
     siglongjmp(cpu->jmp_env, 1);
 }

@@ -72,6 +74,7 @@ void cpu_loop_exit_restore(CPUState *cpu, uintptr_t pc)
     if (pc) {
         cpu_restore_state(cpu, pc);
     }
+g_assert(cpu == current_cpu);
     siglongjmp(cpu->jmp_env, 1);
 }

I have exactly the same trace:

**
ERROR:/home/laurent/Projects/qemu/cpu-exec.c:656:cpu_exec: assertion
failed: (cpu == current_cpu)
**
ERROR:/home/laurent/Projects/qemu/cpu-exec.c:656:cpu_exec: assertion
failed: (cpu == current_cpu)
**
ERROR:/home/laurent/Projects/qemu/cpu-exec.c:656:cpu_exec: assertion
failed: (cpu == current_cpu)
**
ERROR:/home/laurent/Projects/qemu/cpu-exec.c:656:cpu_exec: assertion
failed: (cpu == current_cpu)
**
ERROR:/home/laurent/Projects/qemu/cpu-exec.c:656:cpu_exec: assertion
failed: (cpu == current_cpu)

QEMU_STRACE gives:

6805 close(3) = 0
6805 setgroups(65536,-159891448,0,-150998360,0,0)**
ERROR:/home/laurent/Projects/qemu/cpu-exec.c:656:cpu_exec: assertion
failed: (cpu == current_cpu)

and strace gives:

sudo strace -ffff chroot /var/lib/lxc/debian-8-ppc/rootfs
/opt/ltp/testcases/bin/setgroups03
...
[pid  6690] futex(0x7ffce8bc3340, FUTEX_WAIT_PRIVATE, 1, NULL
<unfinished ...>
[pid  6691] --- SIGRT_1 {si_signo=SIGRT_1, si_code=SI_TKILL,
si_pid=6690, si_uid=0} ---
[pid  6691] setgroups(65536, [65534, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...]) = 0
[pid  6691] futex(0x7f656a601d1c, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  6691] futex(0x7ffce8bc3340, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  6690] <... futex resumed> )       = 0
[pid  6691] <... futex resumed> )       = 1
[pid  6690] setgroups(65536, [65534, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...]
<unfinished ...>
[pid  6691] rt_sigreturn({mask=~[KILL STOP RTMIN RT_1]} <unfinished ...>
[pid  6690] <... setgroups resumed> )   = -1 EPERM (Operation not permitted)
[pid  6691] <... rt_sigreturn resumed> ) = 202
[pid  6690] rt_sigprocmask(SIG_UNBLOCK, [ABRT],  <unfinished ...>
[pid  6691] futex(0x625ffba4, FUTEX_WAIT, 4294967295, NULL <unfinished ...>
[pid  6690] <... rt_sigprocmask resumed> NULL, 8) = 0
[pid  6690] rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [], 8) = 0
[pid  6690] getpid()                    = 6690
[pid  6690] gettid()                    = 6690
[pid  6690] tgkill(6690, 6690, SIGABRT) = 0
[pid  6690] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid  6690] --- SIGABRT {si_signo=SIGABRT, si_code=SI_TKILL,
si_pid=6690, si_uid=0} ---
[pid  6690] rt_sigreturn({mask=~[BUS SEGV]}) = 0
[pid  6690] rt_sigaction(SIGABRT, {sa_handler=SIG_DFL, sa_mask=~[],
sa_flags=SA_RESTORER, sa_restorer=0x6018b100}, NULL, 8) = 0
[pid  6690] rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], ~[BUS KILL SEGV
STOP], 8) = 0
[pid  6690] getpid()                    = 6690
[pid  6690] gettid()                    = 6690
[pid  6690] tgkill(6690, 6690, SIGABRT) = 0
[pid  6690] rt_sigprocmask(SIG_SETMASK, ~[BUS KILL SEGV STOP], NULL, 8) = 0
[pid  6690] --- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL,
si_addr=NULL} ---
[pid  6690] rt_sigprocmask(SIG_SETMASK, ~[BUS KILL SEGV STOP], NULL, 8) = 0
[pid  6690] open("/usr/lib64/charset.alias", O_RDONLY) = -1 ENOENT (No
such file or directory)
[pid  6690] open("/usr/lib64/gconv/gconv-modules.cache", O_RDONLY) = -1
ENOENT (No such file or directory)
[pid  6690] open("/usr/lib64/gconv/gconv-modules", O_RDONLY|O_CLOEXEC) =
-1 ENOENT (No such file or directory)
[pid  6690] futex(0x62605a30, FUTEX_WAKE_PRIVATE, 2147483647) = 0
[pid  6690] brk(0x636dc000)             = 0x636dc000
[pid  6690] write(2, "**\nERROR:/home/laurent/Projects/"..., 101**
ERROR:/home/laurent/Projects/qemu/cpu-exec.c:656:cpu_exec: assertion
failed: (cpu == current_cpu)
) = 101
[pid  6690] brk(0x636d4000)             = 0x636d4000
[pid  6690] --- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL,
si_addr=NULL} ---
[pid  6690] rt_sigprocmask(SIG_SETMASK, ~[BUS KILL SEGV STOP], NULL, 8) = 0
[pid  6690] write(2, "**\nERROR:/home/laurent/Projects/"..., 101**
ERROR:/home/laurent/Projects/qemu/cpu-exec.c:656:cpu_exec: assertion
failed: (cpu == current_cpu)
) = 101
[pid  6690] --- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL,
si_addr=NULL} ---
[pid  6690] rt_sigprocmask(SIG_SETMASK, ~[BUS KILL SEGV STOP], NULL, 8) = 0
[pid  6690] write(2, "**\nERROR:/home/laurent/Projects/"..., 101**
ERROR:/home/laurent/Projects/qemu/cpu-exec.c:656:cpu_exec: assertion
failed: (cpu == current_cpu)
) = 101
[pid  6690] --- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL,
si_addr=NULL} ---
[pid  6690] rt_sigprocmask(SIG_SETMASK, ~[BUS KILL SEGV STOP], NULL, 8) = 0
[pid  6690] write(2, "**\nERROR:/home/laurent/Projects/"..., 101**
ERROR:/home/laurent/Projects/qemu/cpu-exec.c:656:cpu_exec: assertion
failed: (cpu == current_cpu)

Laurent


* Re: [Qemu-devel] [PULL 11/24] tcg: enable thread-per-vCPU
  2017-03-13 14:03       ` Laurent Vivier
  2017-03-13 16:58         ` Alex Bennée
@ 2017-03-16 17:31         ` Alex Bennée
  2017-03-16 18:36           ` Laurent Vivier
  2017-03-17 20:43         ` Alex Bennée
  2 siblings, 1 reply; 55+ messages in thread
From: Alex Bennée @ 2017-03-16 17:31 UTC (permalink / raw)
  To: Laurent Vivier
  Cc: peter.maydell, Peter Crosthwaite, qemu-devel, Paolo Bonzini,
	KONRAD Frederic, Richard Henderson


Laurent Vivier <laurent@vivier.eu> writes:

> On 27/02/2017 at 15:38, Alex Bennée wrote:
>>
>> Laurent Vivier <laurent@vivier.eu> writes:
>>
>>> On 24/02/2017 at 12:20, Alex Bennée wrote:
>>>> There are a couple of changes that occur at the same time here:
>>>>
>>>>   - introduce a single vCPU qemu_tcg_cpu_thread_fn
>>>>
>>>>   One of these is spawned per vCPU with its own Thread and Condition
>>>>   variables. qemu_tcg_rr_cpu_thread_fn is the new name for the old
>>>>   single threaded function.
>>>>
>>>>   - the TLS current_cpu variable is now live for the lifetime of MTTCG
>>>>     vCPU threads. This is for future work where async jobs need to know
>>>>     the vCPU context they are operating in.
>>>>
>>>> The user can switch on multi-thread behaviour and spawn a thread
>>>> per-vCPU. For a simple kvm-unit-test like:
>>>>
>>>>   ./arm/run ./arm/locking-test.flat -smp 4 -accel tcg,thread=multi
>>>>
>>>> Will now use 4 vCPU threads and have an expected FAIL (instead of the
>>>> unexpected PASS) as the default mode of the test has no protection when
>>>> incrementing a shared variable.
>>>>
>>>> We enable the parallel_cpus flag to ensure we generate correct barrier
>>>> and atomic code if supported by the front and backends. This doesn't
>>>> automatically enable MTTCG until default_mttcg_enabled() is updated to
>>>> check the configuration is supported.
>>>
>>> This commit breaks linux-user mode:
>>>
>>> debian-8 with qemu-ppc on x86_64 with ltp-full-20170116
>>>
>>> cd /opt/ltp
>>> ./runltp -p -l "qemu-$(date +%FT%T).log" -f /opt/ltp/runtest/syscalls -s
>>> setgroups03
>>>
>>> setgroups03    1  TPASS  :  setgroups(65537) fails, Size is >
>>> sysconf(_SC_NGROUPS_MAX), errno=22
>>> qemu-ppc: /home/laurent/Projects/qemu/include/qemu/rcu.h:89:
>>> rcu_read_unlock: Assertion `p_rcu_reader->depth != 0' failed.
>>> qemu-ppc: /home/laurent/Projects/qemu/include/qemu/rcu.h:89:
>>> rcu_read_unlock: Assertion `p_rcu_reader->depth != 0' failed.
>>> qemu-ppc: /home/laurent/Projects/qemu/include/qemu/rcu.h:89:
>>> rcu_read_unlock: Assertion `p_rcu_reader->depth != 0' failed.
>>> ...
>>
>> Interesting. I can only think the current_cpu change has broken it
>> because most of the changes in this commit affect softmmu targets only
>> (linux-user has its own run loop).
>>
>> Thanks for the report - I'll look into it.
>
> After:
>
>      95b0eca Merge remote-tracking branch
> 'remotes/stsquad/tags/pull-mttcg-fixups-090317-1' into staging
>
> [Tested with my HEAD on:
> b1616fe Merge remote-tracking branch
> 'remotes/famz/tags/docker-pull-request' into staging]
>
> I have now:
>
> <<<test_start>>>
> tag=setgroups03 stime=1489413401
> cmdline="setgroups03"
> contacts=""
> analysis=exit
> <<<test_output>>>
> **
> ERROR:/home/laurent/Projects/qemu/cpu-exec.c:656:cpu_exec: assertion
> failed: (cpu == current_cpu)
> **

Sorry about the delay. After lengthy fighting to get LTP on PowerPC
built I got this behaviour:

  17:26 alex@zen taken:41, git:mttcg/more-fixes-for-rc1, [/home/alex/lsrc/qemu/qemu.git]> sudo ./ppc-linux-user/qemu-ppc ./ppc-linux-user/setgroups03
  setgroups03    1  TPASS  :  setgroups(65537) fails, Size is > sysconf(_SC_NGROUPS_MAX), errno=22
  setgroups03    2  TBROK  :  tst_sig.c:233: unexpected signal SIGSEGV(11) received (pid = 22137).
  setgroups03    3  TBROK  :  tst_sig.c:233: Remaining cases broken

I'm afraid I can't compare the result to real hardware so maybe my LTP
build is broken. But the main thing is I can't seem to reproduce it
here.

Could you ping me your LTP binary so I can have a look?

The other thing to note is that the assert you now see firing is a guard
for buggy compilers. What version are you building with, and can you try any
other versions?

--
Alex Bennée


* Re: [Qemu-devel] [PULL 11/24] tcg: enable thread-per-vCPU
  2017-03-16 17:31         ` Alex Bennée
@ 2017-03-16 18:36           ` Laurent Vivier
  0 siblings, 0 replies; 55+ messages in thread
From: Laurent Vivier @ 2017-03-16 18:36 UTC (permalink / raw)
  To: Alex Bennée
  Cc: peter.maydell, Peter Crosthwaite, qemu-devel, Paolo Bonzini,
	KONRAD Frederic, Richard Henderson

[-- Attachment #1: Type: text/plain, Size: 4243 bytes --]

On 16/03/2017 at 18:31, Alex Bennée wrote:
> 
> Laurent Vivier <laurent@vivier.eu> writes:
> 
>> On 27/02/2017 at 15:38, Alex Bennée wrote:
>>>
>>> Laurent Vivier <laurent@vivier.eu> writes:
>>>
>>>> On 24/02/2017 at 12:20, Alex Bennée wrote:
>>>>> There are a couple of changes that occur at the same time here:
>>>>>
>>>>>   - introduce a single vCPU qemu_tcg_cpu_thread_fn
>>>>>
>>>>>   One of these is spawned per vCPU with its own Thread and Condition
>>>>>   variables. qemu_tcg_rr_cpu_thread_fn is the new name for the old
>>>>>   single threaded function.
>>>>>
>>>>>   - the TLS current_cpu variable is now live for the lifetime of MTTCG
>>>>>     vCPU threads. This is for future work where async jobs need to know
>>>>>     the vCPU context they are operating in.
>>>>>
>>>>> The user can switch on multi-thread behaviour and spawn a thread
>>>>> per-vCPU. For a simple kvm-unit-test like:
>>>>>
>>>>>   ./arm/run ./arm/locking-test.flat -smp 4 -accel tcg,thread=multi
>>>>>
>>>>> Will now use 4 vCPU threads and have an expected FAIL (instead of the
>>>>> unexpected PASS) as the default mode of the test has no protection when
>>>>> incrementing a shared variable.
>>>>>
>>>>> We enable the parallel_cpus flag to ensure we generate correct barrier
>>>>> and atomic code if supported by the front and backends. This doesn't
>>>>> automatically enable MTTCG until default_mttcg_enabled() is updated to
>>>>> check the configuration is supported.
>>>>
>>>> This commit breaks linux-user mode:
>>>>
>>>> debian-8 with qemu-ppc on x86_64 with ltp-full-20170116
>>>>
>>>> cd /opt/ltp
>>>> ./runltp -p -l "qemu-$(date +%FT%T).log" -f /opt/ltp/runtest/syscalls -s
>>>> setgroups03
>>>>
>>>> setgroups03    1  TPASS  :  setgroups(65537) fails, Size is >
>>>> sysconf(_SC_NGROUPS_MAX), errno=22
>>>> qemu-ppc: /home/laurent/Projects/qemu/include/qemu/rcu.h:89:
>>>> rcu_read_unlock: Assertion `p_rcu_reader->depth != 0' failed.
>>>> qemu-ppc: /home/laurent/Projects/qemu/include/qemu/rcu.h:89:
>>>> rcu_read_unlock: Assertion `p_rcu_reader->depth != 0' failed.
>>>> qemu-ppc: /home/laurent/Projects/qemu/include/qemu/rcu.h:89:
>>>> rcu_read_unlock: Assertion `p_rcu_reader->depth != 0' failed.
>>>> ...
>>>
>>> Interesting. I can only think the current_cpu change has broken it
>>> because most of the changes in this commit affect softmmu targets only
>>> (linux-user has its own run loop).
>>>
>>> Thanks for the report - I'll look into it.
>>
>> After:
>>
>>      95b0eca Merge remote-tracking branch
>> 'remotes/stsquad/tags/pull-mttcg-fixups-090317-1' into staging
>>
>> [Tested with my HEAD on:
>> b1616fe Merge remote-tracking branch
>> 'remotes/famz/tags/docker-pull-request' into staging]
>>
>> I have now:
>>
>> <<<test_start>>>
>> tag=setgroups03 stime=1489413401
>> cmdline="setgroups03"
>> contacts=""
>> analysis=exit
>> <<<test_output>>>
>> **
>> ERROR:/home/laurent/Projects/qemu/cpu-exec.c:656:cpu_exec: assertion
>> failed: (cpu == current_cpu)
>> **
> 
> Sorry about the delay. After lengthy fighting to get LTP on PowerPC
> built I got this behaviour:
> 
>   17:26 alex@zen taken:41, git:mttcg/more-fixes-for-rc1, [/home/alex/lsrc/qemu/qemu.git]> sudo ./ppc-linux-user/qemu-ppc ./ppc-linux-user/setgroups03
>   setgroups03    1  TPASS  :  setgroups(65537) fails, Size is > sysconf(_SC_NGROUPS_MAX), errno=22
>   setgroups03    2  TBROK  :  tst_sig.c:233: unexpected signal SIGSEGV(11) received (pid = 22137).
>   setgroups03    3  TBROK  :  tst_sig.c:233: Remaining cases broken

I've just tested with master (272d7de) and I still hit the "(cpu ==
current_cpu)" assertion.
I think this test has never worked correctly (I also saw the SIGSEGV
signal before); what is annoying here is the infinite loop triggered by
this error (the test never ends).

> 
> I'm afraid I can't compare the result to real hardware so maybe my LTP
> build is broken. But the main thing is I can't seem to reproduce it
> here.

It's attached.

> Could you ping me your LTP binary so I can have a look?
> 
> The other thing to note is the assert you now see firing is a guard for
> buggy compilers. What version are you building with and can you try any
> other versions?

gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1)

Laurent



[-- Attachment #2: setgroups03.xz --]
[-- Type: application/x-xz, Size: 84476 bytes --]

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [Qemu-devel] [PULL 11/24] tcg: enable thread-per-vCPU
  2017-03-13 14:03       ` Laurent Vivier
  2017-03-13 16:58         ` Alex Bennée
  2017-03-16 17:31         ` Alex Bennée
@ 2017-03-17 20:43         ` Alex Bennée
  2017-03-18 11:19           ` Laurent Vivier
  2017-03-20 11:19           ` Paolo Bonzini
  2 siblings, 2 replies; 55+ messages in thread
From: Alex Bennée @ 2017-03-17 20:43 UTC (permalink / raw)
  To: Laurent Vivier
  Cc: peter.maydell, Peter Crosthwaite, qemu-devel, Paolo Bonzini,
	KONRAD Frederic, Richard Henderson


Laurent Vivier <laurent@vivier.eu> writes:

> On 27/02/2017 at 15:38, Alex Bennée wrote:
>>
>> Laurent Vivier <laurent@vivier.eu> writes:
>>
>>> On 24/02/2017 at 12:20, Alex Bennée wrote:
>>>> There are a couple of changes that occur at the same time here:
>>>>
>>>>   - introduce a single vCPU qemu_tcg_cpu_thread_fn
>>>>
>>>>   One of these is spawned per vCPU with its own Thread and Condition
>>>>   variables. qemu_tcg_rr_cpu_thread_fn is the new name for the old
>>>>   single threaded function.
>>>>
>>>>   - the TLS current_cpu variable is now live for the lifetime of MTTCG
>>>>     vCPU threads. This is for future work where async jobs need to know
>>>>     the vCPU context they are operating in.
>>>>
>>>> The user can now switch on multi-thread behaviour and spawn a thread
>>>> per-vCPU. For a simple kvm-unit-test like:
>>>>
>>>>   ./arm/run ./arm/locking-test.flat -smp 4 -accel tcg,thread=multi
>>>>
>>>> This will now use 4 vCPU threads and have an expected FAIL (instead of the
>>>> unexpected PASS) as the default mode of the test has no protection when
>>>> incrementing a shared variable.
>>>>
>>>> We enable the parallel_cpus flag to ensure we generate correct barrier
>>>> and atomic code if supported by the front and backends. This doesn't
>>>> automatically enable MTTCG until default_mttcg_enabled() is updated to
>>>> check the configuration is supported.
>>>
>>> This commit breaks linux-user mode:
>>>
>>> debian-8 with qemu-ppc on x86_64 with ltp-full-20170116
>>>
>>> cd /opt/ltp
>>> ./runltp -p -l "qemu-$(date +%FT%T).log" -f /opt/ltp/runtest/syscalls -s
>>> setgroups03
>>>
>>> setgroups03    1  TPASS  :  setgroups(65537) fails, Size is >
>>> sysconf(_SC_NGROUPS_MAX), errno=22
>>> qemu-ppc: /home/laurent/Projects/qemu/include/qemu/rcu.h:89:
>>> rcu_read_unlock: Assertion `p_rcu_reader->depth != 0' failed.
>>> qemu-ppc: /home/laurent/Projects/qemu/include/qemu/rcu.h:89:
>>> rcu_read_unlock: Assertion `p_rcu_reader->depth != 0' failed.
>>> qemu-ppc: /home/laurent/Projects/qemu/include/qemu/rcu.h:89:
>>> rcu_read_unlock: Assertion `p_rcu_reader->depth != 0' failed.
>>> ...
>>
>> Interesting. I can only think the current_cpu change has broken it
>> because most of the changes in this commit affect softmmu targets only
>> (linux-user has its own run loop).
>>
>> Thanks for the report - I'll look into it.
>
> After:
>
>      95b0eca Merge remote-tracking branch
> 'remotes/stsquad/tags/pull-mttcg-fixups-090317-1' into staging
>
> [Tested with my HEAD on:
> b1616fe Merge remote-tracking branch
> 'remotes/famz/tags/docker-pull-request' into staging]
>
> I have now:
>
> <<<test_start>>>
> tag=setgroups03 stime=1489413401
> cmdline="setgroups03"
> contacts=""
> analysis=exit
> <<<test_output>>>
> **
> ERROR:/home/laurent/Projects/qemu/cpu-exec.c:656:cpu_exec: assertion
> failed: (cpu == current_cpu)
> **

OK we now understand what's happening:

 - setgroups calls __nptl_setxid_error, triggers abort()
   - this sends sig_num 6, then 11
 - host_signal_handler tries to handle 11
 - -> handle_cpu_signal

Before "tcg: enable thread-per-vCPU" the problem was:

 - current_cpu was reset to NULL on the way out of the loop
 - therefore handle_cpu_signal went boom because
     cpu = current_cpu;
     cc = CPU_GET_CLASS(cpu);

After "tcg: enable thread-per-vCPU" the problem is:

 - current_cpu is now live outside cpu_exec_loop
   - this is mainly so async_work functions can assert (cpu == current_cpu)
 - hence handle_cpu_signal gets further and calls
    cpu_loop_exit(cpu);
 - hilarity ensues as we siglongjmp into a stale context

Obviously we shouldn't try to siglongjmp. But we also shouldn't rely on
current_cpu as a proxy to crash early when outside of the loop. There is
a slight wrinkle that we also have special handling of SEGVs during
translation if a guest jumps to code in an as-yet unmapped region of
memory.

There is currently cpu->running which is set/cleared by
cpu_exec_start/end. Although if we crash between cpu_exec_start and
sigsetjmp the same sort of brokenness might happen.

Anyway understood now. If anyone has any suggestions for neater stuff
over the weekend please shout, otherwise I'll probably just hack
handle_cpu_signal to do:

   cpu = current_cpu;
   if (!cpu->running) {
      /* we weren't running or translating JIT code when the signal came */
      return 1;
   }
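The guard above can be modelled in isolation. This is a minimal sketch, not the real QEMU code: `CPUState`, `current_cpu`, `cpu_exec_start/end` and `handle_cpu_signal` here are mocked stand-ins for the real structures, and the real handler does much more than return:

```c
#include <assert.h>
#include <stddef.h>

/* Mock of the relevant state: cpu->running is set by cpu_exec_start()
 * and cleared by cpu_exec_end(). A host SIGSEGV arriving outside that
 * window must not siglongjmp back into a stale cpu_exec() context, so
 * the handler bails out early instead. */
typedef struct CPUState {
    int running;   /* set while between cpu_exec_start() and cpu_exec_end() */
} CPUState;

static CPUState *current_cpu;   /* thread-local in real QEMU */

static void cpu_exec_start(CPUState *cpu) { cpu->running = 1; }
static void cpu_exec_end(CPUState *cpu)   { cpu->running = 0; }

/* Returns 1 when the signal did not originate from running guest code,
 * i.e. the handler must not attempt cpu_loop_exit(). */
static int handle_cpu_signal(void)
{
    CPUState *cpu = current_cpu;

    if (cpu == NULL || !cpu->running) {
        /* we weren't running or translating JIT code when the signal came */
        return 1;
    }
    /* ... the real handler would fix up the TB and cpu_loop_exit(cpu) ... */
    return 0;
}
```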


--
Alex Bennée


* Re: [Qemu-devel] [PULL 11/24] tcg: enable thread-per-vCPU
  2017-03-17 20:43         ` Alex Bennée
@ 2017-03-18 11:19           ` Laurent Vivier
  2017-03-20 11:19           ` Paolo Bonzini
  1 sibling, 0 replies; 55+ messages in thread
From: Laurent Vivier @ 2017-03-18 11:19 UTC (permalink / raw)
  To: Alex Bennée
  Cc: peter.maydell, Peter Crosthwaite, qemu-devel, Paolo Bonzini,
	KONRAD Frederic, Richard Henderson

On 17/03/2017 at 21:43, Alex Bennée wrote:
> 
> Laurent Vivier <laurent@vivier.eu> writes:
> 
>> On 27/02/2017 at 15:38, Alex Bennée wrote:
>>>
>>> Laurent Vivier <laurent@vivier.eu> writes:
>>>
>>>> On 24/02/2017 at 12:20, Alex Bennée wrote:
>>>>> There are a couple of changes that occur at the same time here:
>>>>>
>>>>>   - introduce a single vCPU qemu_tcg_cpu_thread_fn
>>>>>
>>>>>   One of these is spawned per vCPU with its own Thread and Condition
>>>>>   variables. qemu_tcg_rr_cpu_thread_fn is the new name for the old
>>>>>   single threaded function.
>>>>>
>>>>>   - the TLS current_cpu variable is now live for the lifetime of MTTCG
>>>>>     vCPU threads. This is for future work where async jobs need to know
>>>>>     the vCPU context they are operating in.
>>>>>
>>>>> The user can now switch on multi-thread behaviour and spawn a thread
>>>>> per-vCPU. For a simple kvm-unit-test like:
>>>>>
>>>>>   ./arm/run ./arm/locking-test.flat -smp 4 -accel tcg,thread=multi
>>>>>
>>>>> This will now use 4 vCPU threads and have an expected FAIL (instead of the
>>>>> unexpected PASS) as the default mode of the test has no protection when
>>>>> incrementing a shared variable.
>>>>>
>>>>> We enable the parallel_cpus flag to ensure we generate correct barrier
>>>>> and atomic code if supported by the front and backends. This doesn't
>>>>> automatically enable MTTCG until default_mttcg_enabled() is updated to
>>>>> check the configuration is supported.
>>>>
>>>> This commit breaks linux-user mode:
>>>>
>>>> debian-8 with qemu-ppc on x86_64 with ltp-full-20170116
>>>>
>>>> cd /opt/ltp
>>>> ./runltp -p -l "qemu-$(date +%FT%T).log" -f /opt/ltp/runtest/syscalls -s
>>>> setgroups03
>>>>
>>>> setgroups03    1  TPASS  :  setgroups(65537) fails, Size is >
>>>> sysconf(_SC_NGROUPS_MAX), errno=22
>>>> qemu-ppc: /home/laurent/Projects/qemu/include/qemu/rcu.h:89:
>>>> rcu_read_unlock: Assertion `p_rcu_reader->depth != 0' failed.
>>>> qemu-ppc: /home/laurent/Projects/qemu/include/qemu/rcu.h:89:
>>>> rcu_read_unlock: Assertion `p_rcu_reader->depth != 0' failed.
>>>> qemu-ppc: /home/laurent/Projects/qemu/include/qemu/rcu.h:89:
>>>> rcu_read_unlock: Assertion `p_rcu_reader->depth != 0' failed.
>>>> ...
>>>
>>> Interesting. I can only think the current_cpu change has broken it
>>> because most of the changes in this commit affect softmmu targets only
>>> (linux-user has its own run loop).
>>>
>>> Thanks for the report - I'll look into it.
>>
>> After:
>>
>>      95b0eca Merge remote-tracking branch
>> 'remotes/stsquad/tags/pull-mttcg-fixups-090317-1' into staging
>>
>> [Tested with my HEAD on:
>> b1616fe Merge remote-tracking branch
>> 'remotes/famz/tags/docker-pull-request' into staging]
>>
>> I have now:
>>
>> <<<test_start>>>
>> tag=setgroups03 stime=1489413401
>> cmdline="setgroups03"
>> contacts=""
>> analysis=exit
>> <<<test_output>>>
>> **
>> ERROR:/home/laurent/Projects/qemu/cpu-exec.c:656:cpu_exec: assertion
>> failed: (cpu == current_cpu)
>> **
> 
> OK we now understand what's happening:
> 
>  - setgroups calls __nptl_setxid_error, triggers abort()
>    - this sends sig_num 6, then 11
>  - host_signal_handler tries to handle 11
>  - -> handle_cpu_signal
> 
> Before "tcg: enable thread-per-vCPU" the problem was:
> 
>  - current_cpu was reset to NULL on the way out of the loop
>  - therefore handle_cpu_signal went boom because
>      cpu = current_cpu;
>      cc = CPU_GET_CLASS(cpu);
> 
> After "tcg: enable thread-per-vCPU" the problem is:
> 
>  - current_cpu is now live outside cpu_exec_loop
>    - this is mainly so async_work functions can assert (cpu == current_cpu)
>  - hence handle_cpu_signal gets further and calls
>     cpu_loop_exit(cpu);
>  - hilarity ensues as we siglongjmp into a stale context
> 
> Obviously we shouldn't try to siglongjmp. But we also shouldn't rely on
> current_cpu as a proxy to crash early when outside of the loop. There is
> a slight wrinkle that we also have special handling of SEGVs during
> translation if a guest jumps to code in an as-yet unmapped region of
> memory.
> 
> There is currently cpu->running which is set/cleared by
> cpu_exec_start/end. Although if we crash between cpu_exec_start and
> sigsetjmp the same sort of brokenness might happen.
> 
> Anyway understood now. If anyone has any suggestions for neater stuff
> over the weekend please shout, otherwise I'll probably just hack
> handle_cpu_signal to do:
> 
>    cpu = current_cpu;
>    if (!cpu->running) {
>       /* we weren't running or translating JIT code when the signal came */
>       return 1;
>    }

The return doesn't break out of the loop, but an abort() does.
I think we can put an abort() here, as this can be seen as an internal
error (and we get back the previous behaviour).

Laurent


* Re: [Qemu-devel] [PULL 11/24] tcg: enable thread-per-vCPU
  2017-03-17 20:43         ` Alex Bennée
  2017-03-18 11:19           ` Laurent Vivier
@ 2017-03-20 11:19           ` Paolo Bonzini
  2017-03-20 11:47             ` Alex Bennée
  1 sibling, 1 reply; 55+ messages in thread
From: Paolo Bonzini @ 2017-03-20 11:19 UTC (permalink / raw)
  To: Alex Bennée, Laurent Vivier
  Cc: peter.maydell, Peter Crosthwaite, qemu-devel, KONRAD Frederic,
	Richard Henderson



On 17/03/2017 21:43, Alex Bennée wrote:
> There is currently cpu->running which is set/cleared by
> cpu_exec_start/end. Although if we crash between cpu_exec_start and
> sigsetjmp the same sort of brokenness might happen.

I think cpu_exec_start/end should be moved into cpu_exec itself (but
probably just in 2.10).

Paolo

> Anyway understood now. If anyone has any suggestions for neater stuff
> over the weekend please shout, otherwise I'll probably just hack
> handle_cpu_signal to do:
> 
>    cpu = current_cpu;
>    if (!cpu->running) {
>       /* we weren't running or translating JIT code when the signal came */
>       return 1;
>    }


* Re: [Qemu-devel] [PULL 11/24] tcg: enable thread-per-vCPU
  2017-03-20 11:19           ` Paolo Bonzini
@ 2017-03-20 11:47             ` Alex Bennée
  0 siblings, 0 replies; 55+ messages in thread
From: Alex Bennée @ 2017-03-20 11:47 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Laurent Vivier, peter.maydell, Peter Crosthwaite, qemu-devel,
	KONRAD Frederic, Richard Henderson


Paolo Bonzini <pbonzini@redhat.com> writes:

> On 17/03/2017 21:43, Alex Bennée wrote:
>> There is currently cpu->running which is set/cleared by
>> cpu_exec_start/end. Although if we crash between cpu_exec_start and
>> sigsetjmp the same sort of brokenness might happen.
>
> I think cpu_exec_start/end should be moved into cpu_exec itself (but
> probably just in 2.10).

Sure. Although hopefully we can resist the temptation to insert segging
code into that small window in the meantime ;-)

>
> Paolo
>
>> Anyway understood now. If anyone has any suggestions for neater stuff
>> over the weekend please shout, otherwise I'll probably just hack
>> handle_cpu_signal to do:
>>
>>    cpu = current_cpu;
>>    if (!cpu->running) {
>>       /* we weren't running or translating JIT code when the signal came */
>>       return 1;
>>    }


--
Alex Bennée


* Re: [Qemu-devel] [PULL 22/24] target-arm: ensure all cross vCPUs TLB flushes complete
  2017-02-24 11:21 ` [Qemu-devel] [PULL 22/24] target-arm: ensure all cross vCPUs TLB flushes complete Alex Bennée
@ 2017-09-17 13:07   ` Dmitry Osipenko
  2017-09-17 13:22     ` Alex Bennée
  0 siblings, 1 reply; 55+ messages in thread
From: Dmitry Osipenko @ 2017-09-17 13:07 UTC (permalink / raw)
  To: Alex Bennée, peter.maydell; +Cc: open list:ARM, qemu-devel

On 24.02.2017 14:21, Alex Bennée wrote:
> Previously flushes on other vCPUs would only get serviced when they
> exited their TranslationBlocks. While this isn't overly problematic it
> violates the semantics of TLB flush from the point of view of source
> vCPU.
> 
> To solve this we call the cputlb *_all_cpus_synced() functions to do
> the flushes which ensures all flushes are completed by the time the
> vCPU next schedules its own work. As the TLB instructions are modelled
> as CP writes the TB ends at this point meaning cpu->exit_request will
> be checked before the next instruction is executed.
> 
> Deferring the work until the architectural sync point is a possible
> future optimisation.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> Reviewed-by: Richard Henderson <rth@twiddle.net>
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  target/arm/helper.c | 165 ++++++++++++++++++++++------------------------------
>  1 file changed, 69 insertions(+), 96 deletions(-)
> 

Hello,

I have an issue where the Linux kernel stops booting on SMP 32-bit ARM (I
haven't checked 64-bit) in single-threaded TCG mode. The kernel reaches the
point where it should mount the rootfs over NFS and the vCPUs stop. This issue
is reproducible with any 32-bit ARM machine type. The kernel boots fine with
the MTTCG accel; only single-threaded TCG is affected. Git bisection led to
this patch, any ideas?

Example:

qemu-system-arm -M vexpress-a9 -smp cpus=2 -accel accel=tcg,thread=single
-kernel arch/arm/boot/zImage -dtb arch/arm/boot/dts/vexpress-v2p-ca9.dtb -serial
stdio -net nic,model=lan9118 -net user -d in_asm,out_asm -D /tmp/qemulog

Last TB from the log:
----------------
IN:
0xc011a450:  ee080f73      mcr	15, 0, r0, cr8, cr3, {3}

OUT: [size=68]
0x7f32d8b93f80:  mov    -0x18(%r14),%ebp
0x7f32d8b93f84:  test   %ebp,%ebp
0x7f32d8b93f86:  jne    0x7f32d8b93fb8
0x7f32d8b93f8c:  mov    %r14,%rdi
0x7f32d8b93f8f:  mov    $0x5620f2aea5d0,%rsi
0x7f32d8b93f99:  mov    (%r14),%edx
0x7f32d8b93f9c:  mov    $0x5620f18107ca,%r10
0x7f32d8b93fa6:  callq  *%r10
0x7f32d8b93fa9:  movl   $0xc011a454,0x3c(%r14)
0x7f32d8b93fb1:  xor    %eax,%eax
0x7f32d8b93fb3:  jmpq   0x7f32d7a4e016
0x7f32d8b93fb8:  lea    -0x14aa07c(%rip),%rax        # 0x7f32d76e9f43
0x7f32d8b93fbf:  jmpq   0x7f32d7a4e016


> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index b41d0494d1..bcedb4a808 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -536,41 +536,33 @@ static void tlbimvaa_write(CPUARMState *env, const ARMCPRegInfo *ri,
>  static void tlbiall_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>                               uint64_t value)
>  {
> -    CPUState *other_cs;
> +    CPUState *cs = ENV_GET_CPU(env);
>  
> -    CPU_FOREACH(other_cs) {
> -        tlb_flush(other_cs);
> -    }
> +    tlb_flush_all_cpus_synced(cs);
>  }
>  
>  static void tlbiasid_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>                               uint64_t value)
>  {
> -    CPUState *other_cs;
> +    CPUState *cs = ENV_GET_CPU(env);
>  
> -    CPU_FOREACH(other_cs) {
> -        tlb_flush(other_cs);
> -    }
> +    tlb_flush_all_cpus_synced(cs);
>  }
>  
>  static void tlbimva_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>                               uint64_t value)
>  {
> -    CPUState *other_cs;
> +    CPUState *cs = ENV_GET_CPU(env);
>  
> -    CPU_FOREACH(other_cs) {
> -        tlb_flush_page(other_cs, value & TARGET_PAGE_MASK);
> -    }
> +    tlb_flush_page_all_cpus_synced(cs, value & TARGET_PAGE_MASK);
>  }
>  
>  static void tlbimvaa_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>                               uint64_t value)
>  {
> -    CPUState *other_cs;
> +    CPUState *cs = ENV_GET_CPU(env);
>  
> -    CPU_FOREACH(other_cs) {
> -        tlb_flush_page(other_cs, value & TARGET_PAGE_MASK);
> -    }
> +    tlb_flush_page_all_cpus_synced(cs, value & TARGET_PAGE_MASK);
>  }
>  
>  static void tlbiall_nsnh_write(CPUARMState *env, const ARMCPRegInfo *ri,
> @@ -587,14 +579,12 @@ static void tlbiall_nsnh_write(CPUARMState *env, const ARMCPRegInfo *ri,
>  static void tlbiall_nsnh_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>                                    uint64_t value)
>  {
> -    CPUState *other_cs;
> +    CPUState *cs = ENV_GET_CPU(env);
>  
> -    CPU_FOREACH(other_cs) {
> -        tlb_flush_by_mmuidx(other_cs,
> -                            (1 << ARMMMUIdx_S12NSE1) |
> -                            (1 << ARMMMUIdx_S12NSE0) |
> -                            (1 << ARMMMUIdx_S2NS));
> -    }
> +    tlb_flush_by_mmuidx_all_cpus_synced(cs,
> +                                        (1 << ARMMMUIdx_S12NSE1) |
> +                                        (1 << ARMMMUIdx_S12NSE0) |
> +                                        (1 << ARMMMUIdx_S2NS));
>  }
>  
>  static void tlbiipas2_write(CPUARMState *env, const ARMCPRegInfo *ri,
> @@ -621,7 +611,7 @@ static void tlbiipas2_write(CPUARMState *env, const ARMCPRegInfo *ri,
>  static void tlbiipas2_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>                                 uint64_t value)
>  {
> -    CPUState *other_cs;
> +    CPUState *cs = ENV_GET_CPU(env);
>      uint64_t pageaddr;
>  
>      if (!arm_feature(env, ARM_FEATURE_EL2) || !(env->cp15.scr_el3 & SCR_NS)) {
> @@ -630,9 +620,8 @@ static void tlbiipas2_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>  
>      pageaddr = sextract64(value << 12, 0, 40);
>  
> -    CPU_FOREACH(other_cs) {
> -        tlb_flush_page_by_mmuidx(other_cs, pageaddr, (1 << ARMMMUIdx_S2NS));
> -    }
> +    tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
> +                                             (1 << ARMMMUIdx_S2NS));
>  }
>  
>  static void tlbiall_hyp_write(CPUARMState *env, const ARMCPRegInfo *ri,
> @@ -646,11 +635,9 @@ static void tlbiall_hyp_write(CPUARMState *env, const ARMCPRegInfo *ri,
>  static void tlbiall_hyp_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>                                   uint64_t value)
>  {
> -    CPUState *other_cs;
> +    CPUState *cs = ENV_GET_CPU(env);
>  
> -    CPU_FOREACH(other_cs) {
> -        tlb_flush_by_mmuidx(other_cs, (1 << ARMMMUIdx_S1E2));
> -    }
> +    tlb_flush_by_mmuidx_all_cpus_synced(cs, (1 << ARMMMUIdx_S1E2));
>  }
>  
>  static void tlbimva_hyp_write(CPUARMState *env, const ARMCPRegInfo *ri,
> @@ -665,12 +652,11 @@ static void tlbimva_hyp_write(CPUARMState *env, const ARMCPRegInfo *ri,
>  static void tlbimva_hyp_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>                                   uint64_t value)
>  {
> -    CPUState *other_cs;
> +    CPUState *cs = ENV_GET_CPU(env);
>      uint64_t pageaddr = value & ~MAKE_64BIT_MASK(0, 12);
>  
> -    CPU_FOREACH(other_cs) {
> -        tlb_flush_page_by_mmuidx(other_cs, pageaddr, (1 << ARMMMUIdx_S1E2));
> -    }
> +    tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
> +                                             (1 << ARMMMUIdx_S1E2));
>  }
>  
>  static const ARMCPRegInfo cp_reginfo[] = {
> @@ -2904,8 +2890,7 @@ static CPAccessResult aa64_cacheop_access(CPUARMState *env,
>  static void tlbi_aa64_vmalle1_write(CPUARMState *env, const ARMCPRegInfo *ri,
>                                      uint64_t value)
>  {
> -    ARMCPU *cpu = arm_env_get_cpu(env);
> -    CPUState *cs = CPU(cpu);
> +    CPUState *cs = ENV_GET_CPU(env);
>  
>      if (arm_is_secure_below_el3(env)) {
>          tlb_flush_by_mmuidx(cs,
> @@ -2921,19 +2906,17 @@ static void tlbi_aa64_vmalle1_write(CPUARMState *env, const ARMCPRegInfo *ri,
>  static void tlbi_aa64_vmalle1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>                                        uint64_t value)
>  {
> +    CPUState *cs = ENV_GET_CPU(env);
>      bool sec = arm_is_secure_below_el3(env);
> -    CPUState *other_cs;
>  
> -    CPU_FOREACH(other_cs) {
> -        if (sec) {
> -            tlb_flush_by_mmuidx(other_cs,
> -                                (1 << ARMMMUIdx_S1SE1) |
> -                                (1 << ARMMMUIdx_S1SE0));
> -        } else {
> -            tlb_flush_by_mmuidx(other_cs,
> -                                (1 << ARMMMUIdx_S12NSE1) |
> -                                (1 << ARMMMUIdx_S12NSE0));
> -        }
> +    if (sec) {
> +        tlb_flush_by_mmuidx_all_cpus_synced(cs,
> +                                            (1 << ARMMMUIdx_S1SE1) |
> +                                            (1 << ARMMMUIdx_S1SE0));
> +    } else {
> +        tlb_flush_by_mmuidx_all_cpus_synced(cs,
> +                                            (1 << ARMMMUIdx_S12NSE1) |
> +                                            (1 << ARMMMUIdx_S12NSE0));
>      }
>  }
>  
> @@ -2990,46 +2973,40 @@ static void tlbi_aa64_alle1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>       * stage 2 translations, whereas most other scopes only invalidate
>       * stage 1 translations.
>       */
> +    CPUState *cs = ENV_GET_CPU(env);
>      bool sec = arm_is_secure_below_el3(env);
>      bool has_el2 = arm_feature(env, ARM_FEATURE_EL2);
> -    CPUState *other_cs;
> -
> -    CPU_FOREACH(other_cs) {
> -        if (sec) {
> -            tlb_flush_by_mmuidx(other_cs,
> -                                (1 << ARMMMUIdx_S1SE1) |
> -                                (1 << ARMMMUIdx_S1SE0));
> -        } else if (has_el2) {
> -            tlb_flush_by_mmuidx(other_cs,
> -                                (1 << ARMMMUIdx_S12NSE1) |
> -                                (1 << ARMMMUIdx_S12NSE0) |
> -                                (1 << ARMMMUIdx_S2NS));
> -        } else {
> -            tlb_flush_by_mmuidx(other_cs,
> -                                (1 << ARMMMUIdx_S12NSE1) |
> -                                (1 << ARMMMUIdx_S12NSE0));
> -        }
> +
> +    if (sec) {
> +        tlb_flush_by_mmuidx_all_cpus_synced(cs,
> +                                            (1 << ARMMMUIdx_S1SE1) |
> +                                            (1 << ARMMMUIdx_S1SE0));
> +    } else if (has_el2) {
> +        tlb_flush_by_mmuidx_all_cpus_synced(cs,
> +                                            (1 << ARMMMUIdx_S12NSE1) |
> +                                            (1 << ARMMMUIdx_S12NSE0) |
> +                                            (1 << ARMMMUIdx_S2NS));
> +    } else {
> +          tlb_flush_by_mmuidx_all_cpus_synced(cs,
> +                                              (1 << ARMMMUIdx_S12NSE1) |
> +                                              (1 << ARMMMUIdx_S12NSE0));
>      }
>  }
>  
>  static void tlbi_aa64_alle2is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>                                      uint64_t value)
>  {
> -    CPUState *other_cs;
> +    CPUState *cs = ENV_GET_CPU(env);
>  
> -    CPU_FOREACH(other_cs) {
> -        tlb_flush_by_mmuidx(other_cs, (1 << ARMMMUIdx_S1E2));
> -    }
> +    tlb_flush_by_mmuidx_all_cpus_synced(cs, (1 << ARMMMUIdx_S1E2));
>  }
>  
>  static void tlbi_aa64_alle3is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>                                      uint64_t value)
>  {
> -    CPUState *other_cs;
> +    CPUState *cs = ENV_GET_CPU(env);
>  
> -    CPU_FOREACH(other_cs) {
> -        tlb_flush_by_mmuidx(other_cs, (1 << ARMMMUIdx_S1E3));
> -    }
> +    tlb_flush_by_mmuidx_all_cpus_synced(cs, (1 << ARMMMUIdx_S1E3));
>  }
>  
>  static void tlbi_aa64_vae1_write(CPUARMState *env, const ARMCPRegInfo *ri,
> @@ -3086,43 +3063,40 @@ static void tlbi_aa64_vae3_write(CPUARMState *env, const ARMCPRegInfo *ri,
>  static void tlbi_aa64_vae1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>                                     uint64_t value)
>  {
> +    ARMCPU *cpu = arm_env_get_cpu(env);
> +    CPUState *cs = CPU(cpu);
>      bool sec = arm_is_secure_below_el3(env);
> -    CPUState *other_cs;
>      uint64_t pageaddr = sextract64(value << 12, 0, 56);
>  
> -    CPU_FOREACH(other_cs) {
> -        if (sec) {
> -            tlb_flush_page_by_mmuidx(other_cs, pageaddr,
> -                                     (1 << ARMMMUIdx_S1SE1) |
> -                                     (1 << ARMMMUIdx_S1SE0));
> -        } else {
> -            tlb_flush_page_by_mmuidx(other_cs, pageaddr,
> -                                     (1 << ARMMMUIdx_S12NSE1) |
> -                                     (1 << ARMMMUIdx_S12NSE0));
> -        }
> +    if (sec) {
> +        tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
> +                                                 (1 << ARMMMUIdx_S1SE1) |
> +                                                 (1 << ARMMMUIdx_S1SE0));
> +    } else {
> +        tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
> +                                                 (1 << ARMMMUIdx_S12NSE1) |
> +                                                 (1 << ARMMMUIdx_S12NSE0));
>      }
>  }
>  
>  static void tlbi_aa64_vae2is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>                                     uint64_t value)
>  {
> -    CPUState *other_cs;
> +    CPUState *cs = ENV_GET_CPU(env);
>      uint64_t pageaddr = sextract64(value << 12, 0, 56);
>  
> -    CPU_FOREACH(other_cs) {
> -        tlb_flush_page_by_mmuidx(other_cs, pageaddr, (1 << ARMMMUIdx_S1E2));
> -    }
> +    tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
> +                                             (1 << ARMMMUIdx_S1E2));
>  }
>  
>  static void tlbi_aa64_vae3is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>                                     uint64_t value)
>  {
> -    CPUState *other_cs;
> +    CPUState *cs = ENV_GET_CPU(env);
>      uint64_t pageaddr = sextract64(value << 12, 0, 56);
>  
> -    CPU_FOREACH(other_cs) {
> -        tlb_flush_page_by_mmuidx(other_cs, pageaddr, (1 << ARMMMUIdx_S1E3));
> -    }
> +    tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
> +                                             (1 << ARMMMUIdx_S1E3));
>  }
>  
>  static void tlbi_aa64_ipas2e1_write(CPUARMState *env, const ARMCPRegInfo *ri,
> @@ -3150,7 +3124,7 @@ static void tlbi_aa64_ipas2e1_write(CPUARMState *env, const ARMCPRegInfo *ri,
>  static void tlbi_aa64_ipas2e1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>                                        uint64_t value)
>  {
> -    CPUState *other_cs;
> +    CPUState *cs = ENV_GET_CPU(env);
>      uint64_t pageaddr;
>  
>      if (!arm_feature(env, ARM_FEATURE_EL2) || !(env->cp15.scr_el3 & SCR_NS)) {
> @@ -3159,9 +3133,8 @@ static void tlbi_aa64_ipas2e1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>  
>      pageaddr = sextract64(value << 12, 0, 48);
>  
> -    CPU_FOREACH(other_cs) {
> -        tlb_flush_page_by_mmuidx(other_cs, pageaddr, (1 << ARMMMUIdx_S2NS));
> -    }
> +    tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
> +                                             (1 << ARMMMUIdx_S2NS));
>  }
>  
>  static CPAccessResult aa64_zva_access(CPUARMState *env, const ARMCPRegInfo *ri,
> 


-- 
Dmitry


* Re: [Qemu-devel] [PULL 22/24] target-arm: ensure all cross vCPUs TLB flushes complete
  2017-09-17 13:07   ` Dmitry Osipenko
@ 2017-09-17 13:22     ` Alex Bennée
  2017-09-17 13:46       ` Dmitry Osipenko
  0 siblings, 1 reply; 55+ messages in thread
From: Alex Bennée @ 2017-09-17 13:22 UTC (permalink / raw)
  To: Dmitry Osipenko; +Cc: peter.maydell, open list:ARM, qemu-devel


Dmitry Osipenko <digetx@gmail.com> writes:

> On 24.02.2017 14:21, Alex Bennée wrote:
>> Previously flushes on other vCPUs would only get serviced when they
>> exited their TranslationBlocks. While this isn't overly problematic it
>> violates the semantics of TLB flush from the point of view of source
>> vCPU.
>>
>> To solve this we call the cputlb *_all_cpus_synced() functions to do
>> the flushes which ensures all flushes are completed by the time the
>> vCPU next schedules its own work. As the TLB instructions are modelled
>> as CP writes the TB ends at this point meaning cpu->exit_request will
>> be checked before the next instruction is executed.
>>
>> Deferring the work until the architectural sync point is a possible
>> future optimisation.
>>
>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> Reviewed-by: Richard Henderson <rth@twiddle.net>
>> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
>> ---
>>  target/arm/helper.c | 165 ++++++++++++++++++++++------------------------------
>>  1 file changed, 69 insertions(+), 96 deletions(-)
>>
>
> Hello,
>
> I have an issue where the Linux kernel stops booting on SMP 32-bit ARM (I
> haven't checked 64-bit) in single-threaded TCG mode. The kernel reaches the
> point where it should mount the rootfs over NFS and the vCPUs stop. This
> issue is reproducible with any 32-bit ARM machine type. The kernel boots
> fine with the MTTCG accel; only single-threaded TCG is affected. Git
> bisection led to this patch, any ideas?

It shouldn't cause a problem, but can you obtain a backtrace of the
system when it hangs?

>
> Example:
>
> qemu-system-arm -M vexpress-a9 -smp cpus=2 -accel accel=tcg,thread=single
> -kernel arch/arm/boot/zImage -dtb arch/arm/boot/dts/vexpress-v2p-ca9.dtb -serial
> stdio -net nic,model=lan9118 -net user -d in_asm,out_asm -D /tmp/qemulog
>
> Last TB from the log:
> ----------------
> IN:
> 0xc011a450:  ee080f73      mcr	15, 0, r0, cr8, cr3, {3}
>
> OUT: [size=68]
> 0x7f32d8b93f80:  mov    -0x18(%r14),%ebp
> 0x7f32d8b93f84:  test   %ebp,%ebp
> 0x7f32d8b93f86:  jne    0x7f32d8b93fb8
> 0x7f32d8b93f8c:  mov    %r14,%rdi
> 0x7f32d8b93f8f:  mov    $0x5620f2aea5d0,%rsi
> 0x7f32d8b93f99:  mov    (%r14),%edx
> 0x7f32d8b93f9c:  mov    $0x5620f18107ca,%r10
> 0x7f32d8b93fa6:  callq  *%r10
> 0x7f32d8b93fa9:  movl   $0xc011a454,0x3c(%r14)
> 0x7f32d8b93fb1:  xor    %eax,%eax
> 0x7f32d8b93fb3:  jmpq   0x7f32d7a4e016
> 0x7f32d8b93fb8:  lea    -0x14aa07c(%rip),%rax        # 0x7f32d76e9f43
> 0x7f32d8b93fbf:  jmpq   0x7f32d7a4e016
>
>
>> diff --git a/target/arm/helper.c b/target/arm/helper.c
>> index b41d0494d1..bcedb4a808 100644
>> --- a/target/arm/helper.c
>> +++ b/target/arm/helper.c
>> @@ -536,41 +536,33 @@ static void tlbimvaa_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>  static void tlbiall_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>                               uint64_t value)
>>  {
>> -    CPUState *other_cs;
>> +    CPUState *cs = ENV_GET_CPU(env);
>>
>> -    CPU_FOREACH(other_cs) {
>> -        tlb_flush(other_cs);
>> -    }
>> +    tlb_flush_all_cpus_synced(cs);
>>  }
>>
>>  static void tlbiasid_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>                               uint64_t value)
>>  {
>> -    CPUState *other_cs;
>> +    CPUState *cs = ENV_GET_CPU(env);
>>
>> -    CPU_FOREACH(other_cs) {
>> -        tlb_flush(other_cs);
>> -    }
>> +    tlb_flush_all_cpus_synced(cs);
>>  }
>>
>>  static void tlbimva_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>                               uint64_t value)
>>  {
>> -    CPUState *other_cs;
>> +    CPUState *cs = ENV_GET_CPU(env);
>>
>> -    CPU_FOREACH(other_cs) {
>> -        tlb_flush_page(other_cs, value & TARGET_PAGE_MASK);
>> -    }
>> +    tlb_flush_page_all_cpus_synced(cs, value & TARGET_PAGE_MASK);
>>  }
>>
>>  static void tlbimvaa_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>                               uint64_t value)
>>  {
>> -    CPUState *other_cs;
>> +    CPUState *cs = ENV_GET_CPU(env);
>>
>> -    CPU_FOREACH(other_cs) {
>> -        tlb_flush_page(other_cs, value & TARGET_PAGE_MASK);
>> -    }
>> +    tlb_flush_page_all_cpus_synced(cs, value & TARGET_PAGE_MASK);
>>  }
>>
>>  static void tlbiall_nsnh_write(CPUARMState *env, const ARMCPRegInfo *ri,
>> @@ -587,14 +579,12 @@ static void tlbiall_nsnh_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>  static void tlbiall_nsnh_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>                                    uint64_t value)
>>  {
>> -    CPUState *other_cs;
>> +    CPUState *cs = ENV_GET_CPU(env);
>>
>> -    CPU_FOREACH(other_cs) {
>> -        tlb_flush_by_mmuidx(other_cs,
>> -                            (1 << ARMMMUIdx_S12NSE1) |
>> -                            (1 << ARMMMUIdx_S12NSE0) |
>> -                            (1 << ARMMMUIdx_S2NS));
>> -    }
>> +    tlb_flush_by_mmuidx_all_cpus_synced(cs,
>> +                                        (1 << ARMMMUIdx_S12NSE1) |
>> +                                        (1 << ARMMMUIdx_S12NSE0) |
>> +                                        (1 << ARMMMUIdx_S2NS));
>>  }
>>
>>  static void tlbiipas2_write(CPUARMState *env, const ARMCPRegInfo *ri,
>> @@ -621,7 +611,7 @@ static void tlbiipas2_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>  static void tlbiipas2_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>                                 uint64_t value)
>>  {
>> -    CPUState *other_cs;
>> +    CPUState *cs = ENV_GET_CPU(env);
>>      uint64_t pageaddr;
>>
>>      if (!arm_feature(env, ARM_FEATURE_EL2) || !(env->cp15.scr_el3 & SCR_NS)) {
>> @@ -630,9 +620,8 @@ static void tlbiipas2_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>
>>      pageaddr = sextract64(value << 12, 0, 40);
>>
>> -    CPU_FOREACH(other_cs) {
>> -        tlb_flush_page_by_mmuidx(other_cs, pageaddr, (1 << ARMMMUIdx_S2NS));
>> -    }
>> +    tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
>> +                                             (1 << ARMMMUIdx_S2NS));
>>  }
>>
>>  static void tlbiall_hyp_write(CPUARMState *env, const ARMCPRegInfo *ri,
>> @@ -646,11 +635,9 @@ static void tlbiall_hyp_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>  static void tlbiall_hyp_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>                                   uint64_t value)
>>  {
>> -    CPUState *other_cs;
>> +    CPUState *cs = ENV_GET_CPU(env);
>>
>> -    CPU_FOREACH(other_cs) {
>> -        tlb_flush_by_mmuidx(other_cs, (1 << ARMMMUIdx_S1E2));
>> -    }
>> +    tlb_flush_by_mmuidx_all_cpus_synced(cs, (1 << ARMMMUIdx_S1E2));
>>  }
>>
>>  static void tlbimva_hyp_write(CPUARMState *env, const ARMCPRegInfo *ri,
>> @@ -665,12 +652,11 @@ static void tlbimva_hyp_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>  static void tlbimva_hyp_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>                                   uint64_t value)
>>  {
>> -    CPUState *other_cs;
>> +    CPUState *cs = ENV_GET_CPU(env);
>>      uint64_t pageaddr = value & ~MAKE_64BIT_MASK(0, 12);
>>
>> -    CPU_FOREACH(other_cs) {
>> -        tlb_flush_page_by_mmuidx(other_cs, pageaddr, (1 << ARMMMUIdx_S1E2));
>> -    }
>> +    tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
>> +                                             (1 << ARMMMUIdx_S1E2));
>>  }
>>
>>  static const ARMCPRegInfo cp_reginfo[] = {
>> @@ -2904,8 +2890,7 @@ static CPAccessResult aa64_cacheop_access(CPUARMState *env,
>>  static void tlbi_aa64_vmalle1_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>                                      uint64_t value)
>>  {
>> -    ARMCPU *cpu = arm_env_get_cpu(env);
>> -    CPUState *cs = CPU(cpu);
>> +    CPUState *cs = ENV_GET_CPU(env);
>>
>>      if (arm_is_secure_below_el3(env)) {
>>          tlb_flush_by_mmuidx(cs,
>> @@ -2921,19 +2906,17 @@ static void tlbi_aa64_vmalle1_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>  static void tlbi_aa64_vmalle1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>                                        uint64_t value)
>>  {
>> +    CPUState *cs = ENV_GET_CPU(env);
>>      bool sec = arm_is_secure_below_el3(env);
>> -    CPUState *other_cs;
>>
>> -    CPU_FOREACH(other_cs) {
>> -        if (sec) {
>> -            tlb_flush_by_mmuidx(other_cs,
>> -                                (1 << ARMMMUIdx_S1SE1) |
>> -                                (1 << ARMMMUIdx_S1SE0));
>> -        } else {
>> -            tlb_flush_by_mmuidx(other_cs,
>> -                                (1 << ARMMMUIdx_S12NSE1) |
>> -                                (1 << ARMMMUIdx_S12NSE0));
>> -        }
>> +    if (sec) {
>> +        tlb_flush_by_mmuidx_all_cpus_synced(cs,
>> +                                            (1 << ARMMMUIdx_S1SE1) |
>> +                                            (1 << ARMMMUIdx_S1SE0));
>> +    } else {
>> +        tlb_flush_by_mmuidx_all_cpus_synced(cs,
>> +                                            (1 << ARMMMUIdx_S12NSE1) |
>> +                                            (1 << ARMMMUIdx_S12NSE0));
>>      }
>>  }
>>
>> @@ -2990,46 +2973,40 @@ static void tlbi_aa64_alle1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>       * stage 2 translations, whereas most other scopes only invalidate
>>       * stage 1 translations.
>>       */
>> +    CPUState *cs = ENV_GET_CPU(env);
>>      bool sec = arm_is_secure_below_el3(env);
>>      bool has_el2 = arm_feature(env, ARM_FEATURE_EL2);
>> -    CPUState *other_cs;
>> -
>> -    CPU_FOREACH(other_cs) {
>> -        if (sec) {
>> -            tlb_flush_by_mmuidx(other_cs,
>> -                                (1 << ARMMMUIdx_S1SE1) |
>> -                                (1 << ARMMMUIdx_S1SE0));
>> -        } else if (has_el2) {
>> -            tlb_flush_by_mmuidx(other_cs,
>> -                                (1 << ARMMMUIdx_S12NSE1) |
>> -                                (1 << ARMMMUIdx_S12NSE0) |
>> -                                (1 << ARMMMUIdx_S2NS));
>> -        } else {
>> -            tlb_flush_by_mmuidx(other_cs,
>> -                                (1 << ARMMMUIdx_S12NSE1) |
>> -                                (1 << ARMMMUIdx_S12NSE0));
>> -        }
>> +
>> +    if (sec) {
>> +        tlb_flush_by_mmuidx_all_cpus_synced(cs,
>> +                                            (1 << ARMMMUIdx_S1SE1) |
>> +                                            (1 << ARMMMUIdx_S1SE0));
>> +    } else if (has_el2) {
>> +        tlb_flush_by_mmuidx_all_cpus_synced(cs,
>> +                                            (1 << ARMMMUIdx_S12NSE1) |
>> +                                            (1 << ARMMMUIdx_S12NSE0) |
>> +                                            (1 << ARMMMUIdx_S2NS));
>> +    } else {
>> +          tlb_flush_by_mmuidx_all_cpus_synced(cs,
>> +                                              (1 << ARMMMUIdx_S12NSE1) |
>> +                                              (1 << ARMMMUIdx_S12NSE0));
>>      }
>>  }
>>
>>  static void tlbi_aa64_alle2is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>                                      uint64_t value)
>>  {
>> -    CPUState *other_cs;
>> +    CPUState *cs = ENV_GET_CPU(env);
>>
>> -    CPU_FOREACH(other_cs) {
>> -        tlb_flush_by_mmuidx(other_cs, (1 << ARMMMUIdx_S1E2));
>> -    }
>> +    tlb_flush_by_mmuidx_all_cpus_synced(cs, (1 << ARMMMUIdx_S1E2));
>>  }
>>
>>  static void tlbi_aa64_alle3is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>                                      uint64_t value)
>>  {
>> -    CPUState *other_cs;
>> +    CPUState *cs = ENV_GET_CPU(env);
>>
>> -    CPU_FOREACH(other_cs) {
>> -        tlb_flush_by_mmuidx(other_cs, (1 << ARMMMUIdx_S1E3));
>> -    }
>> +    tlb_flush_by_mmuidx_all_cpus_synced(cs, (1 << ARMMMUIdx_S1E3));
>>  }
>>
>>  static void tlbi_aa64_vae1_write(CPUARMState *env, const ARMCPRegInfo *ri,
>> @@ -3086,43 +3063,40 @@ static void tlbi_aa64_vae3_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>  static void tlbi_aa64_vae1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>                                     uint64_t value)
>>  {
>> +    ARMCPU *cpu = arm_env_get_cpu(env);
>> +    CPUState *cs = CPU(cpu);
>>      bool sec = arm_is_secure_below_el3(env);
>> -    CPUState *other_cs;
>>      uint64_t pageaddr = sextract64(value << 12, 0, 56);
>>
>> -    CPU_FOREACH(other_cs) {
>> -        if (sec) {
>> -            tlb_flush_page_by_mmuidx(other_cs, pageaddr,
>> -                                     (1 << ARMMMUIdx_S1SE1) |
>> -                                     (1 << ARMMMUIdx_S1SE0));
>> -        } else {
>> -            tlb_flush_page_by_mmuidx(other_cs, pageaddr,
>> -                                     (1 << ARMMMUIdx_S12NSE1) |
>> -                                     (1 << ARMMMUIdx_S12NSE0));
>> -        }
>> +    if (sec) {
>> +        tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
>> +                                                 (1 << ARMMMUIdx_S1SE1) |
>> +                                                 (1 << ARMMMUIdx_S1SE0));
>> +    } else {
>> +        tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
>> +                                                 (1 << ARMMMUIdx_S12NSE1) |
>> +                                                 (1 << ARMMMUIdx_S12NSE0));
>>      }
>>  }
>>
>>  static void tlbi_aa64_vae2is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>                                     uint64_t value)
>>  {
>> -    CPUState *other_cs;
>> +    CPUState *cs = ENV_GET_CPU(env);
>>      uint64_t pageaddr = sextract64(value << 12, 0, 56);
>>
>> -    CPU_FOREACH(other_cs) {
>> -        tlb_flush_page_by_mmuidx(other_cs, pageaddr, (1 << ARMMMUIdx_S1E2));
>> -    }
>> +    tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
>> +                                             (1 << ARMMMUIdx_S1E2));
>>  }
>>
>>  static void tlbi_aa64_vae3is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>                                     uint64_t value)
>>  {
>> -    CPUState *other_cs;
>> +    CPUState *cs = ENV_GET_CPU(env);
>>      uint64_t pageaddr = sextract64(value << 12, 0, 56);
>>
>> -    CPU_FOREACH(other_cs) {
>> -        tlb_flush_page_by_mmuidx(other_cs, pageaddr, (1 << ARMMMUIdx_S1E3));
>> -    }
>> +    tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
>> +                                             (1 << ARMMMUIdx_S1E3));
>>  }
>>
>>  static void tlbi_aa64_ipas2e1_write(CPUARMState *env, const ARMCPRegInfo *ri,
>> @@ -3150,7 +3124,7 @@ static void tlbi_aa64_ipas2e1_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>  static void tlbi_aa64_ipas2e1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>                                        uint64_t value)
>>  {
>> -    CPUState *other_cs;
>> +    CPUState *cs = ENV_GET_CPU(env);
>>      uint64_t pageaddr;
>>
>>      if (!arm_feature(env, ARM_FEATURE_EL2) || !(env->cp15.scr_el3 & SCR_NS)) {
>> @@ -3159,9 +3133,8 @@ static void tlbi_aa64_ipas2e1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>
>>      pageaddr = sextract64(value << 12, 0, 48);
>>
>> -    CPU_FOREACH(other_cs) {
>> -        tlb_flush_page_by_mmuidx(other_cs, pageaddr, (1 << ARMMMUIdx_S2NS));
>> -    }
>> +    tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
>> +                                             (1 << ARMMMUIdx_S2NS));
>>  }
>>
>>  static CPAccessResult aa64_zva_access(CPUARMState *env, const ARMCPRegInfo *ri,
>>
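The shape of this conversion, replacing a CPU_FOREACH loop of inline per-CPU flushes with a single call that queues the flush on every vCPU and guarantees completion by each vCPU's next safe point, can be sketched with a toy model. The Python below is purely illustrative; the names mimic, but are not, QEMU's real API:

```python
# Toy model of the "synced" cross-vCPU flush semantics described above.
# Illustration only: QEMU's real implementation queues async work that
# runs when each vCPU exits its TranslationBlock.

class VCPU:
    def __init__(self, idx):
        self.idx = idx
        self.tlb = {0x1000: "stale", 0x2000: "stale"}  # fake TLB entries
        self.work = []  # deferred work, drained at the next safe point

    def queue_work(self, fn):
        self.work.append(fn)

    def run_safe_point(self):
        # Runs when this vCPU leaves its TB and checks for queued work.
        while self.work:
            self.work.pop(0)()

def tlb_flush_all_cpus_synced(src_cpu, all_cpus):
    # Queue a full TLB flush on every vCPU instead of flushing inline;
    # returns how many work items are now pending.
    for cpu in all_cpus:
        cpu.queue_work(cpu.tlb.clear)
    return sum(len(cpu.work) for cpu in all_cpus)

cpus = [VCPU(i) for i in range(4)]
pending = tlb_flush_all_cpus_synced(cpus[0], cpus)
assert pending == 4                      # one flush queued per vCPU
for cpu in cpus:
    cpu.run_safe_point()                 # each vCPU drains its queue
assert all(not cpu.tlb for cpu in cpus)  # every TLB is now empty
```

The guarantee the patch relies on is the ordering one: by the time the
source vCPU is next scheduled, every queued flush has completed.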


--
Alex Bennée

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [Qemu-devel] [PULL 22/24] target-arm: ensure all cross vCPUs TLB flushes complete
  2017-09-17 13:22     ` Alex Bennée
@ 2017-09-17 13:46       ` Dmitry Osipenko
  2017-09-18 10:10         ` Alex Bennée
  0 siblings, 1 reply; 55+ messages in thread
From: Dmitry Osipenko @ 2017-09-17 13:46 UTC (permalink / raw)
  To: Alex Bennée; +Cc: peter.maydell, open list:ARM, qemu-devel

On 17.09.2017 16:22, Alex Bennée wrote:
> 
> Dmitry Osipenko <digetx@gmail.com> writes:
> 
>> On 24.02.2017 14:21, Alex Bennée wrote:
>>> Previously flushes on other vCPUs would only get serviced when they
>>> exited their TranslationBlocks. While this isn't overly problematic it
>>> violates the semantics of TLB flush from the point of view of source
>>> vCPU.
>>>
>>> To solve this we call the cputlb *_all_cpus_synced() functions to do
>>> the flushes which ensures all flushes are completed by the time the
>>> vCPU next schedules its own work. As the TLB instructions are modelled
>>> as CP writes the TB ends at this point meaning cpu->exit_request will
>>> be checked before the next instruction is executed.
>>>
>>> Deferring the work until the architectural sync point is a possible
>>> future optimisation.
>>>
>>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>>> Reviewed-by: Richard Henderson <rth@twiddle.net>
>>> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
>>> ---
>>>  target/arm/helper.c | 165 ++++++++++++++++++++++------------------------------
>>>  1 file changed, 69 insertions(+), 96 deletions(-)
>>>
>>
>> Hello,
>>
>> I have an issue with the Linux kernel failing to boot on SMP 32-bit ARM
>> (haven't checked 64-bit) in single-threaded TCG mode. The kernel reaches the
>> point where it should mount the rootfs over NFS and the vCPUs stop. This
>> issue is reproducible with any 32-bit ARM machine type. The kernel boots fine
>> with the MTTCG accel; only single-threaded TCG is affected. Git bisection led
>> to this patch; any ideas?
> 
> It shouldn't cause a problem, but can you obtain a backtrace of the
> system when it hangs?
> 

Actually, it looks like TCG enters an infinite loop. By 'backtrace of the
system', do you mean a backtrace of QEMU? If so, here it is:

Thread 4 (Thread 0x7ffa37f10700 (LWP 20716)):

#0  0x00007ffa601888bd in poll () at ../sysdeps/unix/syscall-template.S:84

#1  0x00007ffa5e3aa561 in poll (__timeout=-1, __nfds=2, __fds=0x7ffa30006dc0) at
/usr/include/bits/poll2.h:46
#2  poll_func (ufds=0x7ffa30006dc0, nfds=2, timeout=-1, userdata=0x557bd603eae0)
at
/var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/thread-mainloop.c:69
#3  0x00007ffa5e39bbb1 in pa_mainloop_poll (m=m@entry=0x557bd60401f0) at
/var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/mainloop.c:844
#4  0x00007ffa5e39c24e in pa_mainloop_iterate (m=0x557bd60401f0,
block=<optimized out>, retval=0x0) at
/var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/mainloop.c:926
#5  0x00007ffa5e39c300 in pa_mainloop_run (m=0x557bd60401f0,
retval=retval@entry=0x0) at
/var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/mainloop.c:944

#6  0x00007ffa5e3aa4a9 in thread (userdata=0x557bd60400f0) at
/var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/thread-mainloop.c:100

#7  0x00007ffa599eea38 in internal_thread_func (userdata=0x557bd603e090) at
/var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulsecore/thread-posix.c:81

#8  0x00007ffa60453657 in start_thread (arg=0x7ffa37f10700) at
pthread_create.c:456

#9  0x00007ffa60193c5f in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:97





Thread 3 (Thread 0x7ffa4adff700 (LWP 20715)):


#0  0x00007ffa53e51caf in code_gen_buffer ()


#1  0x0000557bd2fa7f17 in cpu_tb_exec (cpu=0x557bd56160a0, itb=0x7ffa53e51b80
<code_gen_buffer+15481686>) at /home/dima/vl/qemu-tests/accel/tcg/cpu-exec.c:166

#2  0x0000557bd2fa8e0f in cpu_loop_exec_tb (cpu=0x557bd56160a0,
tb=0x7ffa53e51b80 <code_gen_buffer+15481686>, last_tb=0x7ffa4adfea68,
tb_exit=0x7ffa4adfea64) at /home/dima/vl/qemu-tests/accel/tcg/cpu-exec.c:613
#3  0x0000557bd2fa90ff in cpu_exec (cpu=0x557bd56160a0) at
/home/dima/vl/qemu-tests/accel/tcg/cpu-exec.c:711

#4  0x0000557bd2f6dcba in tcg_cpu_exec (cpu=0x557bd56160a0) at
/home/dima/vl/qemu-tests/cpus.c:1270

#5  0x0000557bd2f6dee1 in qemu_tcg_rr_cpu_thread_fn (arg=0x557bd5598e20) at
/home/dima/vl/qemu-tests/cpus.c:1365

#6  0x00007ffa60453657 in start_thread (arg=0x7ffa4adff700) at
pthread_create.c:456

#7  0x00007ffa60193c5f in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:97







Thread 2 (Thread 0x7ffa561bf700 (LWP 20714)):



#0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38



#1  0x0000557bd34e1eaa in qemu_futex_wait (f=0x557bd4031798
<rcu_call_ready_event>, val=4294967295) at
/home/dima/vl/qemu-tests/include/qemu/futex.h:26


#2  0x0000557bd34e2071 in qemu_event_wait (ev=0x557bd4031798
<rcu_call_ready_event>) at util/qemu-thread-posix.c:442


#3  0x0000557bd34f9b1f in call_rcu_thread (opaque=0x0) at util/rcu.c:249



#4  0x00007ffa60453657 in start_thread (arg=0x7ffa561bf700) at
pthread_create.c:456


#5  0x00007ffa60193c5f in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:97







Thread 1 (Thread 0x7ffa67502600 (LWP 20713)):



#0  0x00007ffa601889ab in __GI_ppoll (fds=0x557bd5bbf160, nfds=11,
timeout=<optimized out>, sigmask=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:39


#1  0x0000557bd34dc460 in qemu_poll_ns (fds=0x557bd5bbf160, nfds=11,
timeout=29841115) at util/qemu-timer.c:334


#2  0x0000557bd34dd488 in os_host_main_loop_wait (timeout=29841115) at
util/main-loop.c:255


#3  0x0000557bd34dd557 in main_loop_wait (nonblocking=0) at util/main-loop.c:515



#4  0x0000557bd3120f0e in main_loop () at vl.c:1999



#5  0x0000557bd3128d4a in main (argc=17, argv=0x7ffe7de2a248,
envp=0x7ffe7de2a2d8) at vl.c:4877

>>
>> Example:
>>
>> qemu-system-arm -M vexpress-a9 -smp cpus=2 -accel accel=tcg,thread=single
>> -kernel arch/arm/boot/zImage -dtb arch/arm/boot/dts/vexpress-v2p-ca9.dtb -serial
>> stdio -net nic,model=lan9118 -net user -d in_asm,out_asm -D /tmp/qemulog
>>
>> Last TB from the log:
>> ----------------
>> IN:
>> 0xc011a450:  ee080f73      mcr	15, 0, r0, cr8, cr3, {3}
>>
>> OUT: [size=68]
>> 0x7f32d8b93f80:  mov    -0x18(%r14),%ebp
>> 0x7f32d8b93f84:  test   %ebp,%ebp
>> 0x7f32d8b93f86:  jne    0x7f32d8b93fb8
>> 0x7f32d8b93f8c:  mov    %r14,%rdi
>> 0x7f32d8b93f8f:  mov    $0x5620f2aea5d0,%rsi
>> 0x7f32d8b93f99:  mov    (%r14),%edx
>> 0x7f32d8b93f9c:  mov    $0x5620f18107ca,%r10
>> 0x7f32d8b93fa6:  callq  *%r10
>> 0x7f32d8b93fa9:  movl   $0xc011a454,0x3c(%r14)
>> 0x7f32d8b93fb1:  xor    %eax,%eax
>> 0x7f32d8b93fb3:  jmpq   0x7f32d7a4e016
>> 0x7f32d8b93fb8:  lea    -0x14aa07c(%rip),%rax        # 0x7f32d76e9f43
>> 0x7f32d8b93fbf:  jmpq   0x7f32d7a4e016
-- 
Dmitry

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [Qemu-devel] [PULL 22/24] target-arm: ensure all cross vCPUs TLB flushes complete
  2017-09-17 13:46       ` Dmitry Osipenko
@ 2017-09-18 10:10         ` Alex Bennée
  2017-09-18 12:23           ` Dmitry Osipenko
  0 siblings, 1 reply; 55+ messages in thread
From: Alex Bennée @ 2017-09-18 10:10 UTC (permalink / raw)
  To: Dmitry Osipenko; +Cc: peter.maydell, open list:ARM, qemu-devel


Dmitry Osipenko <digetx@gmail.com> writes:

> On 17.09.2017 16:22, Alex Bennée wrote:
>>
>> Dmitry Osipenko <digetx@gmail.com> writes:
>>
>>> On 24.02.2017 14:21, Alex Bennée wrote:
>>>> Previously flushes on other vCPUs would only get serviced when they
>>>> exited their TranslationBlocks. While this isn't overly problematic it
>>>> violates the semantics of TLB flush from the point of view of source
>>>> vCPU.
>>>>
>>>> To solve this we call the cputlb *_all_cpus_synced() functions to do
>>>> the flushes which ensures all flushes are completed by the time the
>>>> vCPU next schedules its own work. As the TLB instructions are modelled
>>>> as CP writes the TB ends at this point meaning cpu->exit_request will
>>>> be checked before the next instruction is executed.
>>>>
>>>> Deferring the work until the architectural sync point is a possible
>>>> future optimisation.
>>>>
>>>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>>>> Reviewed-by: Richard Henderson <rth@twiddle.net>
>>>> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
>>>> ---
>>>>  target/arm/helper.c | 165 ++++++++++++++++++++++------------------------------
>>>>  1 file changed, 69 insertions(+), 96 deletions(-)
>>>>
>>>
>>> Hello,
>>>
>>> I have an issue with the Linux kernel failing to boot on SMP 32-bit ARM
>>> (haven't checked 64-bit) in single-threaded TCG mode. The kernel reaches
>>> the point where it should mount the rootfs over NFS and the vCPUs stop.
>>> This issue is reproducible with any 32-bit ARM machine type. The kernel
>>> boots fine with the MTTCG accel; only single-threaded TCG is affected. Git
>>> bisection led to this patch; any ideas?
>>
>> It shouldn't cause a problem, but can you obtain a backtrace of the
>> system when it hangs?
>>
>
> Actually, it looks like TCG enters an infinite loop. By 'backtrace of the
> system', do you mean a backtrace of QEMU? If so, here it is:
>
> Thread 4 (Thread 0x7ffa37f10700 (LWP 20716)):
>
> #0  0x00007ffa601888bd in poll () at ../sysdeps/unix/syscall-template.S:84
>
> #1  0x00007ffa5e3aa561 in poll (__timeout=-1, __nfds=2, __fds=0x7ffa30006dc0) at
> /usr/include/bits/poll2.h:46
> #2  poll_func (ufds=0x7ffa30006dc0, nfds=2, timeout=-1, userdata=0x557bd603eae0)
> at
> /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/thread-mainloop.c:69
> #3  0x00007ffa5e39bbb1 in pa_mainloop_poll (m=m@entry=0x557bd60401f0) at
> /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/mainloop.c:844
> #4  0x00007ffa5e39c24e in pa_mainloop_iterate (m=0x557bd60401f0,
> block=<optimized out>, retval=0x0) at
> /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/mainloop.c:926
> #5  0x00007ffa5e39c300 in pa_mainloop_run (m=0x557bd60401f0,
> retval=retval@entry=0x0) at
> /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/mainloop.c:944
>
> #6  0x00007ffa5e3aa4a9 in thread (userdata=0x557bd60400f0) at
> /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/thread-mainloop.c:100
>
> #7  0x00007ffa599eea38 in internal_thread_func (userdata=0x557bd603e090) at
> /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulsecore/thread-posix.c:81
>
> #8  0x00007ffa60453657 in start_thread (arg=0x7ffa37f10700) at
> pthread_create.c:456
>
> #9  0x00007ffa60193c5f in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
>
>
>
>
>
> Thread 3 (Thread 0x7ffa4adff700 (LWP 20715)):
>
>
> #0  0x00007ffa53e51caf in code_gen_buffer ()
>

Well, it's not locked up servicing any flush tasks, as it's executing
code. Maybe the guest code is spinning on something?

In the monitor:

  info registers

will show you where things are; see if the PC is moving each time. You
can also do a disassembly dump from there to see what code it is stuck
on.

>
> #1  0x0000557bd2fa7f17 in cpu_tb_exec (cpu=0x557bd56160a0, itb=0x7ffa53e51b80
> <code_gen_buffer+15481686>) at /home/dima/vl/qemu-tests/accel/tcg/cpu-exec.c:166
>
> #2  0x0000557bd2fa8e0f in cpu_loop_exec_tb (cpu=0x557bd56160a0,
> tb=0x7ffa53e51b80 <code_gen_buffer+15481686>, last_tb=0x7ffa4adfea68,
> tb_exit=0x7ffa4adfea64) at /home/dima/vl/qemu-tests/accel/tcg/cpu-exec.c:613
> #3  0x0000557bd2fa90ff in cpu_exec (cpu=0x557bd56160a0) at
> /home/dima/vl/qemu-tests/accel/tcg/cpu-exec.c:711
>
> #4  0x0000557bd2f6dcba in tcg_cpu_exec (cpu=0x557bd56160a0) at
> /home/dima/vl/qemu-tests/cpus.c:1270
>
> #5  0x0000557bd2f6dee1 in qemu_tcg_rr_cpu_thread_fn (arg=0x557bd5598e20) at
> /home/dima/vl/qemu-tests/cpus.c:1365
>
> #6  0x00007ffa60453657 in start_thread (arg=0x7ffa4adff700) at
> pthread_create.c:456
>
> #7  0x00007ffa60193c5f in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
>
>
>
>
>
>
>
> Thread 2 (Thread 0x7ffa561bf700 (LWP 20714)):
>
>
>
> #0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
>
>
>
> #1  0x0000557bd34e1eaa in qemu_futex_wait (f=0x557bd4031798
> <rcu_call_ready_event>, val=4294967295) at
> /home/dima/vl/qemu-tests/include/qemu/futex.h:26
>
>
> #2  0x0000557bd34e2071 in qemu_event_wait (ev=0x557bd4031798
> <rcu_call_ready_event>) at util/qemu-thread-posix.c:442
>
>
> #3  0x0000557bd34f9b1f in call_rcu_thread (opaque=0x0) at util/rcu.c:249
>
>
>
> #4  0x00007ffa60453657 in start_thread (arg=0x7ffa561bf700) at
> pthread_create.c:456
>
>
> #5  0x00007ffa60193c5f in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
>
>
>
>
>
>
>
> Thread 1 (Thread 0x7ffa67502600 (LWP 20713)):
>
>
>
> #0  0x00007ffa601889ab in __GI_ppoll (fds=0x557bd5bbf160, nfds=11,
> timeout=<optimized out>, sigmask=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:39
>
>
> #1  0x0000557bd34dc460 in qemu_poll_ns (fds=0x557bd5bbf160, nfds=11,
> timeout=29841115) at util/qemu-timer.c:334
>
>
> #2  0x0000557bd34dd488 in os_host_main_loop_wait (timeout=29841115) at
> util/main-loop.c:255
>
>
> #3  0x0000557bd34dd557 in main_loop_wait (nonblocking=0) at util/main-loop.c:515
>
>
>
> #4  0x0000557bd3120f0e in main_loop () at vl.c:1999
>
>
>
> #5  0x0000557bd3128d4a in main (argc=17, argv=0x7ffe7de2a248,
> envp=0x7ffe7de2a2d8) at vl.c:4877
>
>>>
>>> Example:
>>>
>>> qemu-system-arm -M vexpress-a9 -smp cpus=2 -accel accel=tcg,thread=single
>>> -kernel arch/arm/boot/zImage -dtb arch/arm/boot/dts/vexpress-v2p-ca9.dtb -serial
>>> stdio -net nic,model=lan9118 -net user -d in_asm,out_asm -D /tmp/qemulog
>>>
>>> Last TB from the log:
>>> ----------------
>>> IN:
>>> 0xc011a450:  ee080f73      mcr	15, 0, r0, cr8, cr3, {3}
>>>
>>> OUT: [size=68]
>>> 0x7f32d8b93f80:  mov    -0x18(%r14),%ebp
>>> 0x7f32d8b93f84:  test   %ebp,%ebp
>>> 0x7f32d8b93f86:  jne    0x7f32d8b93fb8
>>> 0x7f32d8b93f8c:  mov    %r14,%rdi
>>> 0x7f32d8b93f8f:  mov    $0x5620f2aea5d0,%rsi
>>> 0x7f32d8b93f99:  mov    (%r14),%edx
>>> 0x7f32d8b93f9c:  mov    $0x5620f18107ca,%r10
>>> 0x7f32d8b93fa6:  callq  *%r10
>>> 0x7f32d8b93fa9:  movl   $0xc011a454,0x3c(%r14)
>>> 0x7f32d8b93fb1:  xor    %eax,%eax
>>> 0x7f32d8b93fb3:  jmpq   0x7f32d7a4e016
>>> 0x7f32d8b93fb8:  lea    -0x14aa07c(%rip),%rax        # 0x7f32d76e9f43
>>> 0x7f32d8b93fbf:  jmpq   0x7f32d7a4e016


--
Alex Bennée

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [Qemu-devel] [PULL 22/24] target-arm: ensure all cross vCPUs TLB flushes complete
  2017-09-18 10:10         ` Alex Bennée
@ 2017-09-18 12:23           ` Dmitry Osipenko
  2017-09-18 14:00             ` Alex Bennée
  0 siblings, 1 reply; 55+ messages in thread
From: Dmitry Osipenko @ 2017-09-18 12:23 UTC (permalink / raw)
  To: Alex Bennée; +Cc: peter.maydell, open list:ARM, qemu-devel

On 18.09.2017 13:10, Alex Bennée wrote:
> 
> Dmitry Osipenko <digetx@gmail.com> writes:
> 
>> On 17.09.2017 16:22, Alex Bennée wrote:
>>>
>>> Dmitry Osipenko <digetx@gmail.com> writes:
>>>
>>>> On 24.02.2017 14:21, Alex Bennée wrote:
>>>>> Previously flushes on other vCPUs would only get serviced when they
>>>>> exited their TranslationBlocks. While this isn't overly problematic it
>>>>> violates the semantics of TLB flush from the point of view of source
>>>>> vCPU.
>>>>>
>>>>> To solve this we call the cputlb *_all_cpus_synced() functions to do
>>>>> the flushes which ensures all flushes are completed by the time the
>>>>> vCPU next schedules its own work. As the TLB instructions are modelled
>>>>> as CP writes the TB ends at this point meaning cpu->exit_request will
>>>>> be checked before the next instruction is executed.
>>>>>
>>>>> Deferring the work until the architectural sync point is a possible
>>>>> future optimisation.
>>>>>
>>>>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>>>>> Reviewed-by: Richard Henderson <rth@twiddle.net>
>>>>> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
>>>>> ---
>>>>>  target/arm/helper.c | 165 ++++++++++++++++++++++------------------------------
>>>>>  1 file changed, 69 insertions(+), 96 deletions(-)
>>>>>
>>>>
>>>> Hello,
>>>>
>>>> I have an issue with the Linux kernel failing to boot on SMP 32-bit ARM
>>>> (haven't checked 64-bit) in single-threaded TCG mode. The kernel reaches
>>>> the point where it should mount the rootfs over NFS and the vCPUs stop.
>>>> This issue is reproducible with any 32-bit ARM machine type. The kernel
>>>> boots fine with the MTTCG accel; only single-threaded TCG is affected. Git
>>>> bisection led to this patch; any ideas?
>>>
>>> It shouldn't cause a problem, but can you obtain a backtrace of the
>>> system when it hangs?
>>>
>>
>> Actually, it looks like TCG enters an infinite loop. By 'backtrace of the
>> system', do you mean a backtrace of QEMU? If so, here it is:
>>
>> Thread 4 (Thread 0x7ffa37f10700 (LWP 20716)):
>>
>> #0  0x00007ffa601888bd in poll () at ../sysdeps/unix/syscall-template.S:84
>>
>> #1  0x00007ffa5e3aa561 in poll (__timeout=-1, __nfds=2, __fds=0x7ffa30006dc0) at
>> /usr/include/bits/poll2.h:46
>> #2  poll_func (ufds=0x7ffa30006dc0, nfds=2, timeout=-1, userdata=0x557bd603eae0)
>> at
>> /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/thread-mainloop.c:69
>> #3  0x00007ffa5e39bbb1 in pa_mainloop_poll (m=m@entry=0x557bd60401f0) at
>> /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/mainloop.c:844
>> #4  0x00007ffa5e39c24e in pa_mainloop_iterate (m=0x557bd60401f0,
>> block=<optimized out>, retval=0x0) at
>> /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/mainloop.c:926
>> #5  0x00007ffa5e39c300 in pa_mainloop_run (m=0x557bd60401f0,
>> retval=retval@entry=0x0) at
>> /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/mainloop.c:944
>>
>> #6  0x00007ffa5e3aa4a9 in thread (userdata=0x557bd60400f0) at
>> /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/thread-mainloop.c:100
>>
>> #7  0x00007ffa599eea38 in internal_thread_func (userdata=0x557bd603e090) at
>> /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulsecore/thread-posix.c:81
>>
>> #8  0x00007ffa60453657 in start_thread (arg=0x7ffa37f10700) at
>> pthread_create.c:456
>>
>> #9  0x00007ffa60193c5f in clone () at
>> ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
>>
>>
>>
>>
>>
>> Thread 3 (Thread 0x7ffa4adff700 (LWP 20715)):
>>
>>
>> #0  0x00007ffa53e51caf in code_gen_buffer ()
>>
> 
> Well, it's not locked up servicing any flush tasks, as it's executing
> code. Maybe the guest code is spinning on something?
> 

Indeed, I should have used 'exec' instead of 'in_asm'.

> In the monitor:
> 
>   info registers
> 
> will show you where things are; see if the PC is moving each time. You
> can also do a disassembly dump from there to see what code it is stuck
> on.
> 

I've attached GDB to QEMU to see where it got stuck. It turned out to be
caused by the Linux kernel's CONFIG_STRICT_KERNEL_RWX=y. Upon boot completion
the kernel changes memory permissions, and that change is executed on a
dedicated CPU while the other CPUs are 'stopped' in a busy loop.

This patch introduced a noticeable performance regression for single-threaded
TCG, which is probably fine since MTTCG is the default now. Thank you very
much for the suggestions and all your work on MTTCG!
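As a back-of-the-envelope illustration of that regression (made-up numbers,
not a measurement): in round-robin mode every synced flush forces a full
scheduling pass over all vCPUs, and the busy-waiting vCPUs burn their whole
slice on each pass, so the cost grows with the vCPU count:

```python
# Rough cost model for the single-threaded (round-robin) slowdown:
# after each synced flush the working vCPU yields, and every spinning
# vCPU consumes a scheduling slice before the worker runs again.

def rr_slices(n_cpus, n_flushes):
    # One slice per vCPU per flush in round-robin mode.
    return n_flushes * n_cpus

assert rr_slices(2, 1000) == 2000   # 2 vCPUs: two slices per flush
assert rr_slices(4, 1000) == 4000   # more vCPUs, proportionally slower
```

Under MTTCG the spinning vCPUs run on their own host threads, so the
flushing vCPU is not serialized behind them in the same way.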

-- 
Dmitry

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [Qemu-devel] [PULL 22/24] target-arm: ensure all cross vCPUs TLB flushes complete
  2017-09-18 12:23           ` Dmitry Osipenko
@ 2017-09-18 14:00             ` Alex Bennée
  2017-09-18 15:32               ` Dmitry Osipenko
  0 siblings, 1 reply; 55+ messages in thread
From: Alex Bennée @ 2017-09-18 14:00 UTC (permalink / raw)
  To: Dmitry Osipenko; +Cc: peter.maydell, open list:ARM, qemu-devel


Dmitry Osipenko <digetx@gmail.com> writes:

> On 18.09.2017 13:10, Alex Bennée wrote:
>>
>> Dmitry Osipenko <digetx@gmail.com> writes:
>>
>>> On 17.09.2017 16:22, Alex Bennée wrote:
>>>>
>>>> Dmitry Osipenko <digetx@gmail.com> writes:
>>>>
>>>>> On 24.02.2017 14:21, Alex Bennée wrote:
>>>>>> Previously flushes on other vCPUs would only get serviced when they
>>>>>> exited their TranslationBlocks. While this isn't overly problematic it
>>>>>> violates the semantics of TLB flush from the point of view of source
>>>>>> vCPU.
>>>>>>
>>>>>> To solve this we call the cputlb *_all_cpus_synced() functions to do
>>>>>> the flushes which ensures all flushes are completed by the time the
>>>>>> vCPU next schedules its own work. As the TLB instructions are modelled
>>>>>> as CP writes the TB ends at this point meaning cpu->exit_request will
>>>>>> be checked before the next instruction is executed.
>>>>>>
>>>>>> Deferring the work until the architectural sync point is a possible
>>>>>> future optimisation.
>>>>>>
>>>>>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>>>>>> Reviewed-by: Richard Henderson <rth@twiddle.net>
>>>>>> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
>>>>>> ---
>>>>>>  target/arm/helper.c | 165 ++++++++++++++++++++++------------------------------
>>>>>>  1 file changed, 69 insertions(+), 96 deletions(-)
>>>>>>
>>>>>
>>>>> Hello,
>>>>>
>>>>> I have an issue with Linux kernel stopping to boot on a SMP 32bit ARM (haven't
>>>>> checked 64bit) in a single-threaded TCG mode. Kernel reaches point where it
>>>>> should mount rootfs over NFS and vCPUs stop. This issue is reproducible with any
>>>>> 32bit ARM machine type. Kernel boots fine with a MTTCG accel, only
>>>>> single-threaded TCG is affected. Git bisection lead to this patch, any
>>>>> ideas?
>>>>
>>>> It shouldn't cause a problem but can you obtain a backtrace of the
>>>> system when hung?
>>>>
>>>
>>> Actually, it looks like TCG enters infinite loop. Do you mean backtrace of QEMU
>>> by 'backtrace of the system'? If so, here it is:
>>>
>>> Thread 4 (Thread 0x7ffa37f10700 (LWP 20716)):
>>>
>>> #0  0x00007ffa601888bd in poll () at ../sysdeps/unix/syscall-template.S:84
>>>
>>> #1  0x00007ffa5e3aa561 in poll (__timeout=-1, __nfds=2, __fds=0x7ffa30006dc0) at
>>> /usr/include/bits/poll2.h:46
>>> #2  poll_func (ufds=0x7ffa30006dc0, nfds=2, timeout=-1, userdata=0x557bd603eae0)
>>> at
>>> /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/thread-mainloop.c:69
>>> #3  0x00007ffa5e39bbb1 in pa_mainloop_poll (m=m@entry=0x557bd60401f0) at
>>> /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/mainloop.c:844
>>> #4  0x00007ffa5e39c24e in pa_mainloop_iterate (m=0x557bd60401f0,
>>> block=<optimized out>, retval=0x0) at
>>> /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/mainloop.c:926
>>> #5  0x00007ffa5e39c300 in pa_mainloop_run (m=0x557bd60401f0,
>>> retval=retval@entry=0x0) at
>>> /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/mainloop.c:944
>>>
>>> #6  0x00007ffa5e3aa4a9 in thread (userdata=0x557bd60400f0) at
>>> /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/thread-mainloop.c:100
>>>
>>> #7  0x00007ffa599eea38 in internal_thread_func (userdata=0x557bd603e090) at
>>> /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulsecore/thread-posix.c:81
>>>
>>> #8  0x00007ffa60453657 in start_thread (arg=0x7ffa37f10700) at
>>> pthread_create.c:456
>>>
>>> #9  0x00007ffa60193c5f in clone () at
>>> ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
>>>
>>>
>>>
>>>
>>>
>>> Thread 3 (Thread 0x7ffa4adff700 (LWP 20715)):
>>>
>>>
>>> #0  0x00007ffa53e51caf in code_gen_buffer ()
>>>
>>
>> Well it's not locked up in servicing any flush tasks as it's executing
>> code. Maybe the guest code is spinning on something?
>>
>
> Indeed, I should have used 'exec' instead of 'in_asm'.
>
>> In the monitor:
>>
>>   info registers
>>
>> Will show you where things are, see if the ip is moving each time. Also
>> you can do a disassemble dump from there to see what code it is stuck
>> on.
>>
>
> I attached GDB to QEMU to see where it got stuck. It turned out to be caused
> by CONFIG_STRICT_KERNEL_RWX=y in the Linux kernel. Upon boot completion the
> kernel changes memory permissions, and that change is executed on a dedicated
> CPU while the other CPUs are 'stopped' in a busy loop.
>
> This patch introduced a noticeable performance regression for single-threaded
> TCG, which is probably fine since MTTCG is the default now.
> Thank you very much for the suggestions and all your work on MTTCG!

Hmm, well, it would be nice to know the exact mechanism of that failure.
If we just end up with a very long list of tasks in
cpu->queued_work_first then I guess that explains it, but it would be
nice to quantify the problem.

I had trouble seeing where this loop is in the kernel code, got a pointer?

--
Alex Bennée

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [Qemu-devel] [PULL 22/24] target-arm: ensure all cross vCPUs TLB flushes complete
  2017-09-18 14:00             ` Alex Bennée
@ 2017-09-18 15:32               ` Dmitry Osipenko
  0 siblings, 0 replies; 55+ messages in thread
From: Dmitry Osipenko @ 2017-09-18 15:32 UTC (permalink / raw)
  To: Alex Bennée; +Cc: peter.maydell, open list:ARM, qemu-devel

On 18.09.2017 17:00, Alex Bennée wrote:
> 
> Dmitry Osipenko <digetx@gmail.com> writes:
> 
>> On 18.09.2017 13:10, Alex Bennée wrote:
>>>
>>> Dmitry Osipenko <digetx@gmail.com> writes:
>>>
>>>> On 17.09.2017 16:22, Alex Bennée wrote:
>>>>>
>>>>> Dmitry Osipenko <digetx@gmail.com> writes:
>>>>>
>>>>>> On 24.02.2017 14:21, Alex Bennée wrote:
>>>>>>> Previously flushes on other vCPUs would only get serviced when they
>>>>>>> exited their TranslationBlocks. While this isn't overly problematic it
>>>>>>> violates the semantics of TLB flush from the point of view of source
>>>>>>> vCPU.
>>>>>>>
>>>>>>> To solve this we call the cputlb *_all_cpus_synced() functions to do
>>>>>>> the flushes which ensures all flushes are completed by the time the
>>>>>>> vCPU next schedules its own work. As the TLB instructions are modelled
>>>>>>> as CP writes the TB ends at this point meaning cpu->exit_request will
>>>>>>> be checked before the next instruction is executed.
>>>>>>>
>>>>>>> Deferring the work until the architectural sync point is a possible
>>>>>>> future optimisation.
>>>>>>>
>>>>>>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>>>>>>> Reviewed-by: Richard Henderson <rth@twiddle.net>
>>>>>>> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
>>>>>>> ---
>>>>>>>  target/arm/helper.c | 165 ++++++++++++++++++++++------------------------------
>>>>>>>  1 file changed, 69 insertions(+), 96 deletions(-)
>>>>>>>
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I have an issue with Linux kernel stopping to boot on a SMP 32bit ARM (haven't
>>>>>> checked 64bit) in a single-threaded TCG mode. Kernel reaches point where it
>>>>>> should mount rootfs over NFS and vCPUs stop. This issue is reproducible with any
>>>>>> 32bit ARM machine type. Kernel boots fine with a MTTCG accel, only
>>>>>> single-threaded TCG is affected. Git bisection lead to this patch, any
>>>>>> ideas?
>>>>>
>>>>> It shouldn't cause a problem but can you obtain a backtrace of the
>>>>> system when hung?
>>>>>
>>>>
>>>> Actually, it looks like TCG enters infinite loop. Do you mean backtrace of QEMU
>>>> by 'backtrace of the system'? If so, here it is:
>>>>
>>>> Thread 4 (Thread 0x7ffa37f10700 (LWP 20716)):
>>>>
>>>> #0  0x00007ffa601888bd in poll () at ../sysdeps/unix/syscall-template.S:84
>>>>
>>>> #1  0x00007ffa5e3aa561 in poll (__timeout=-1, __nfds=2, __fds=0x7ffa30006dc0) at
>>>> /usr/include/bits/poll2.h:46
>>>> #2  poll_func (ufds=0x7ffa30006dc0, nfds=2, timeout=-1, userdata=0x557bd603eae0)
>>>> at
>>>> /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/thread-mainloop.c:69
>>>> #3  0x00007ffa5e39bbb1 in pa_mainloop_poll (m=m@entry=0x557bd60401f0) at
>>>> /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/mainloop.c:844
>>>> #4  0x00007ffa5e39c24e in pa_mainloop_iterate (m=0x557bd60401f0,
>>>> block=<optimized out>, retval=0x0) at
>>>> /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/mainloop.c:926
>>>> #5  0x00007ffa5e39c300 in pa_mainloop_run (m=0x557bd60401f0,
>>>> retval=retval@entry=0x0) at
>>>> /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/mainloop.c:944
>>>>
>>>> #6  0x00007ffa5e3aa4a9 in thread (userdata=0x557bd60400f0) at
>>>> /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulse/thread-mainloop.c:100
>>>>
>>>> #7  0x00007ffa599eea38 in internal_thread_func (userdata=0x557bd603e090) at
>>>> /var/tmp/portage/media-sound/pulseaudio-10.0/work/pulseaudio-10.0/src/pulsecore/thread-posix.c:81
>>>>
>>>> #8  0x00007ffa60453657 in start_thread (arg=0x7ffa37f10700) at
>>>> pthread_create.c:456
>>>>
>>>> #9  0x00007ffa60193c5f in clone () at
>>>> ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Thread 3 (Thread 0x7ffa4adff700 (LWP 20715)):
>>>>
>>>>
>>>> #0  0x00007ffa53e51caf in code_gen_buffer ()
>>>>
>>>
>>> Well it's not locked up in servicing any flush tasks as it's executing
>>> code. Maybe the guest code is spinning on something?
>>>
>>
>> Indeed, I should have used 'exec' instead of 'in_asm'.
>>
>>> In the monitor:
>>>
>>>   info registers
>>>
>>> Will show you where things are, see if the ip is moving each time. Also
>>> you can do a disassemble dump from there to see what code it is stuck
>>> on.
>>>
>>
>> I attached GDB to QEMU to see where it got stuck. It turned out to be caused
>> by CONFIG_STRICT_KERNEL_RWX=y in the Linux kernel. Upon boot completion the
>> kernel changes memory permissions, and that change is executed on a dedicated
>> CPU while the other CPUs are 'stopped' in a busy loop.
>>
>> This patch introduced a noticeable performance regression for single-threaded
>> TCG, which is probably fine since MTTCG is the default now.
>> Thank you very much for the suggestions and all your work on MTTCG!
> 
> Hmm, well, it would be nice to know the exact mechanism of that failure.
> If we just end up with a very long list of tasks in
> cpu->queued_work_first then I guess that explains it, but it would be
> nice to quantify the problem.
> 
> I had trouble seeing where this loop is in the kernel code, got a pointer?
> 
The memory permission change starts here:

http://elixir.free-electrons.com/linux/v4.14-rc1/source/arch/arm/mm/init.c#L739

The busy loop is here:

http://elixir.free-electrons.com/linux/v4.14-rc1/source/kernel/stop_machine.c#L195

Interestingly, I tried attaching to a 'hung' QEMU another time and landed in
different code. That code has the same pattern: one CPU flushes the cache a lot
in shmem_rename2()->need_update()->memchr_inv() while the others are executing
something.

So it seems the busy loop isn't the problem; it's just that TLB flushing is
very expensive in TCG. On the other hand, I don't see such a problem with
MTTCG, so I'm not sure what's going on with single-threaded TCG.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 55+ messages in thread

end of thread, other threads:[~2017-09-18 15:32 UTC | newest]

Thread overview: 55+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-24 11:20 [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Alex Bennée
2017-02-24 11:20 ` [Qemu-devel] [PULL 01/24] docs: new design document multi-thread-tcg.txt Alex Bennée
2017-02-24 11:20 ` [Qemu-devel] [PULL 02/24] mttcg: translate-all: Enable locking debug in a debug build Alex Bennée
2017-02-24 11:20 ` [Qemu-devel] [PULL 03/24] mttcg: Add missing tb_lock/unlock() in cpu_exec_step() Alex Bennée
2017-02-24 11:20 ` [Qemu-devel] [PULL 04/24] tcg: move TCG_MO/BAR types into own file Alex Bennée
2017-02-24 11:20 ` [Qemu-devel] [PULL 05/24] tcg: add options for enabling MTTCG Alex Bennée
2017-02-24 11:20 ` [Qemu-devel] [PULL 06/24] tcg: add kick timer for single-threaded vCPU emulation Alex Bennée
2017-02-24 11:20 ` [Qemu-devel] [PULL 07/24] tcg: rename tcg_current_cpu to tcg_current_rr_cpu Alex Bennée
2017-02-24 11:20 ` [Qemu-devel] [PULL 08/24] tcg: drop global lock during TCG code execution Alex Bennée
2017-02-27 12:48   ` Laurent Desnogues
2017-02-27 14:39     ` Alex Bennée
2017-03-03 20:59       ` Aaron Lindsay
2017-03-03 21:08         ` Alex Bennée
2017-02-24 11:20 ` [Qemu-devel] [PULL 09/24] tcg: remove global exit_request Alex Bennée
2017-02-24 11:20 ` [Qemu-devel] [PULL 10/24] tcg: enable tb_lock() for SoftMMU Alex Bennée
2017-02-24 11:20 ` [Qemu-devel] [PULL 11/24] tcg: enable thread-per-vCPU Alex Bennée
2017-02-27 12:48   ` Laurent Vivier
2017-02-27 14:38     ` Alex Bennée
2017-03-13 14:03       ` Laurent Vivier
2017-03-13 16:58         ` Alex Bennée
2017-03-13 18:21           ` Laurent Vivier
2017-03-16 17:31         ` Alex Bennée
2017-03-16 18:36           ` Laurent Vivier
2017-03-17 20:43         ` Alex Bennée
2017-03-18 11:19           ` Laurent Vivier
2017-03-20 11:19           ` Paolo Bonzini
2017-03-20 11:47             ` Alex Bennée
2017-02-24 11:20 ` [Qemu-devel] [PULL 12/24] tcg: handle EXCP_ATOMIC exception for system emulation Alex Bennée
2017-02-24 11:20 ` [Qemu-devel] [PULL 13/24] cputlb: add assert_cpu_is_self checks Alex Bennée
2017-02-24 11:20 ` [Qemu-devel] [PULL 14/24] cputlb: tweak qemu_ram_addr_from_host_nofail reporting Alex Bennée
2017-02-24 11:21 ` [Qemu-devel] [PULL 15/24] cputlb: introduce tlb_flush_* async work Alex Bennée
2017-02-24 11:21 ` [Qemu-devel] [PULL 16/24] cputlb and arm/sparc targets: convert mmuidx flushes from varg to bitmap Alex Bennée
2017-02-24 11:21 ` [Qemu-devel] [PULL 17/24] cputlb: add tlb_flush_by_mmuidx async routines Alex Bennée
2017-02-24 11:21 ` [Qemu-devel] [PULL 18/24] cputlb: atomically update tlb fields used by tlb_reset_dirty Alex Bennée
2017-02-24 11:21 ` [Qemu-devel] [PULL 19/24] cputlb: introduce tlb_flush_*_all_cpus[_synced] Alex Bennée
2017-02-24 11:21 ` [Qemu-devel] [PULL 20/24] target-arm/powerctl: defer cpu reset work to CPU context Alex Bennée
2017-02-24 11:21 ` [Qemu-devel] [PULL 21/24] target-arm: don't generate WFE/YIELD calls for MTTCG Alex Bennée
2017-02-24 11:21 ` [Qemu-devel] [PULL 22/24] target-arm: ensure all cross vCPUs TLB flushes complete Alex Bennée
2017-09-17 13:07   ` Dmitry Osipenko
2017-09-17 13:22     ` Alex Bennée
2017-09-17 13:46       ` Dmitry Osipenko
2017-09-18 10:10         ` Alex Bennée
2017-09-18 12:23           ` Dmitry Osipenko
2017-09-18 14:00             ` Alex Bennée
2017-09-18 15:32               ` Dmitry Osipenko
2017-02-24 11:21 ` [Qemu-devel] [PULL 23/24] hw/misc/imx6_src: defer clearing of SRC_SCR reset bits Alex Bennée
2017-02-24 11:21 ` [Qemu-devel] [PULL 24/24] tcg: enable MTTCG by default for ARM on x86 hosts Alex Bennée
2017-02-25 21:14 ` [Qemu-devel] [PULL 00/24] MTTCG Base enabling patches with ARM enablement Peter Maydell
2017-02-27  8:48   ` Christian Borntraeger
2017-02-27  9:11     ` Alex Bennée
2017-02-27  9:25       ` Christian Borntraeger
2017-02-27  9:35       ` Christian Borntraeger
2017-02-27 12:39 ` Paolo Bonzini
2017-02-27 15:48   ` Alex Bennée
2017-02-27 16:17     ` Paolo Bonzini
