* [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement
@ 2016-11-09 14:57 Alex Bennée
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 01/19] docs: new design document multi-thread-tcg.txt Alex Bennée
                   ` (20 more replies)
  0 siblings, 21 replies; 50+ messages in thread
From: Alex Bennée @ 2016-11-09 14:57 UTC (permalink / raw)
  To: pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, rth, peter.maydell,
	claudio.fontana, Alex Bennée

Hi,

A chunk of the MTTCG work was merged for 2.8 so this constitutes what
is left for the next cycle. The changes are minor except for a new
patch to reduce BQL contention on ARM during yield/wfi instructions.

I've also taken the time to update the design document which now
covers all the solutions for the various design requirements in the
document.

The one outstanding question is how to deal with the TLB flush
semantics of the various guest architectures. Currently flushes to
other vCPUs will happen at the end of their currently executing
Translation Block which could mean the originating vCPU makes
assumptions about flushes having been completed when they haven't. In
practice this hasn't been a problem and I haven't been able to
construct a test case so far that would fail in such a case. This is
probably because most tear downs of the other vCPU TLBs tend to be
done while the other vCPUs are not doing much. If anyone can come up
with a test case that would fail if this assumption isn't met then
please let me know.

We already have all the tools to meet these requirements if we want,
by scheduling safe async work. However this might slow things down if
these sorts of flushes are frequent.
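
To make the concern concrete, here is a minimal single-threaded sketch in C of the deferral pattern described above. It is not the cputlb.c code: ToyVCPU, toy_queue_flush() and toy_finish_tb() are hypothetical names, and a real implementation would need atomics or locking around the pending flag. It only shows that a flush queued by another vCPU does not take effect until the target finishes its current Translation Block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_TLB_SIZE 4

/* Hypothetical toy vCPU: a tiny TLB plus a flag for a deferred flush. */
typedef struct ToyVCPU {
    bool tlb_valid[TOY_TLB_SIZE];
    bool flush_pending;
} ToyVCPU;

/* Mark every TLB entry valid, as if the guest had touched those pages. */
void toy_fill_tlb(ToyVCPU *cpu)
{
    assert(cpu != NULL);
    for (size_t i = 0; i < TOY_TLB_SIZE; i++) {
        cpu->tlb_valid[i] = true;
    }
}

/* Count the entries that are still valid. */
int toy_tlb_entries(const ToyVCPU *cpu)
{
    int n = 0;
    for (size_t i = 0; i < TOY_TLB_SIZE; i++) {
        n += cpu->tlb_valid[i] ? 1 : 0;
    }
    return n;
}

/* Called on behalf of the originating vCPU: only records the request. */
void toy_queue_flush(ToyVCPU *target)
{
    target->flush_pending = true;
}

/* Models the target leaving its current TB and draining deferred work. */
void toy_finish_tb(ToyVCPU *cpu)
{
    if (cpu->flush_pending) {
        for (size_t i = 0; i < TOY_TLB_SIZE; i++) {
            cpu->tlb_valid[i] = false;
        }
        cpu->flush_pending = false;
    }
}
```

Between toy_queue_flush() and toy_finish_tb() the target's entries are still valid, which is the window in which the originating vCPU could wrongly assume the flush has completed.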

These patches apply cleanly on top of the current master. Please
review the code and I look forward to seeing other architectures
enable MTTCG on top of this series.

Alex Bennée (15):
  docs: new design document multi-thread-tcg.txt
  tcg: add kick timer for single-threaded vCPU emulation
  tcg: rename tcg_current_cpu to tcg_current_rr_cpu
  tcg: remove global exit_request
  tcg: enable tb_lock() for SoftMMU
  tcg: enable thread-per-vCPU
  cputlb: add assert_cpu_is_self checks
  cputlb: tweak qemu_ram_addr_from_host_nofail reporting
  cputlb: atomically update tlb fields used by tlb_reset_dirty
  target-arm/powerctl: defer cpu reset work to CPU context
  target-arm/cpu: don't reset TLB structures, use cputlb to do it
  target-arm: ensure BQL taken for ARM_CP_IO register access
  target-arm: helpers which may affect global state need the BQL
  target-arm: don't generate WFE/YIELD calls for MTTCG
  tcg: enable MTTCG by default for ARM on x86 hosts

Jan Kiszka (1):
  tcg: drop global lock during TCG code execution

KONRAD Frederic (2):
  tcg: add options for enabling MTTCG
  cputlb: introduce tlb_flush_* async work.

Pranith Kumar (1):
  tcg: handle EXCP_ATOMIC exception for system emulation

 configure                       |  12 ++
 cpu-exec-common.c               |   3 -
 cpu-exec.c                      |  37 ++--
 cpus.c                          | 314 ++++++++++++++++++++++++--------
 cputlb.c                        | 386 +++++++++++++++++++++++++++++++---------
 default-configs/arm-softmmu.mak |   2 +
 docs/multi-thread-tcg.txt       | 343 +++++++++++++++++++++++++++++++++++
 exec.c                          |  12 +-
 hw/core/irq.c                   |   1 +
 hw/i386/kvmvapic.c              |   4 +-
 hw/intc/arm_gicv3_cpuif.c       |   3 +
 hw/ppc/spapr.c                  |   3 +
 include/exec/cputlb.h           |   2 -
 include/exec/exec-all.h         |   5 +-
 include/qom/cpu.h               |  16 ++
 include/sysemu/cpus.h           |   2 +
 memory.c                        |   2 +
 qemu-options.hx                 |  20 +++
 qom/cpu.c                       |  10 ++
 target-arm/arm-powerctl.c       | 144 +++++++++------
 target-arm/cpu.c                |   6 +
 target-arm/helper.c             |   6 +
 target-arm/op_helper.c          |  50 +++++-
 target-arm/translate-a64.c      |   8 +-
 target-arm/translate.c          |  20 ++-
 target-i386/smm_helper.c        |   7 +
 target-s390x/misc_helper.c      |   5 +-
 translate-all.c                 |  27 ++-
 translate-common.c              |  21 +--
 vl.c                            |  49 ++++-
 30 files changed, 1241 insertions(+), 279 deletions(-)
 create mode 100644 docs/multi-thread-tcg.txt

-- 
2.10.1


* [Qemu-devel] [PATCH v6 01/19] docs: new design document multi-thread-tcg.txt
  2016-11-09 14:57 [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement Alex Bennée
@ 2016-11-09 14:57 ` Alex Bennée
  2016-11-10 15:00   ` Richard Henderson
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 02/19] tcg: add options for enabling MTTCG Alex Bennée
                   ` (19 subsequent siblings)
  20 siblings, 1 reply; 50+ messages in thread
From: Alex Bennée @ 2016-11-09 14:57 UTC (permalink / raw)
  To: pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, rth, peter.maydell,
	claudio.fontana, Alex Bennée

This documents the current design for upgrading TCG emulation to take
advantage of modern CPUs by running a thread-per-CPU. The document goes
through the various areas of the code affected by such a change and
proposes design requirements for each part of the solution.

Text marked with (Current solution[s]) documents the approaches
currently in use.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

---
v1
  - initial version
v2
  - update discussion on locks
  - bit more detail on vCPU scheduling
  - explicitly mention Translation Blocks
  - emulated hardware state already covered by iomutex
  - a few minor rewords
v3
  - mention this covers system-mode
  - describe main-loop and lookup hot-path
  - mention multi-concurrent-reader lookups
  - enumerate reasons for invalidation
  - add more details on lookup structures
  - describe the softmmu hot-path better
  - mention store-after-load barrier problem
v4
  - mention some cross-over between linux-user/system emulation
  - various minor grammar and scanning fixes
  - fix reference to tb_ctx.htable
  - describe the solution for hot-path
  - more detail on TB flushing and invalidation
  - add (Current solution) following design requirements
  - more detail on iothread/BQL mutex
  - mention implicit memory barriers
  - add links to current LL/SC and cmpxchg patch sets
  - add TLB flag setting as an additional requirement
v6
 - remove DRAFTING, update copyright dates
 - document current solutions to each design requirement
   - tb_lock() serialisation for codegen/patch
   - cputlb changes to defer cross-vCPU flushes
   - cputlb atomic updates for slow-path
   - BQL usage for hardware serialisation
   - cmpxchg as initial atomic/synchronisation support mechanism
---
 docs/multi-thread-tcg.txt | 343 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 343 insertions(+)
 create mode 100644 docs/multi-thread-tcg.txt

diff --git a/docs/multi-thread-tcg.txt b/docs/multi-thread-tcg.txt
new file mode 100644
index 0000000..9e73aac
--- /dev/null
+++ b/docs/multi-thread-tcg.txt
@@ -0,0 +1,343 @@
+Copyright (c) 2015-2016 Linaro Ltd.
+
+This work is licensed under the terms of the GNU GPL, version 2 or later.  See
+the COPYING file in the top-level directory.
+
+Introduction
+============
+
+This document outlines the design for multi-threaded TCG system-mode
+emulation. The current user-mode emulation mirrors the thread
+structure of the translated executable. Some of the work will be
+applicable to both system and linux-user emulation.
+
+The original system-mode TCG implementation was single threaded and
+dealt with multiple CPUs with simple round-robin scheduling. This
+simplified a lot of things but became increasingly limited as systems
+being emulated gained additional cores and per-core performance gains
+for host systems started to level off.
+
+vCPU Scheduling
+===============
+
+We introduce a new running mode where each vCPU will run on its own
+user-space thread. This will be enabled by default for all FE/BE
+combinations that have had the required work done to support this
+safely.
+
+In the general case of running translated code there should be no
+inter-vCPU dependencies and all vCPUs should be able to run at full
+speed. Synchronisation will only be required while accessing internal
+shared data structures or when the emulated architecture requires a
+coherent representation of the emulated machine state.
+
+Shared Data Structures
+======================
+
+Main Run Loop
+-------------
+
+Even when there is no code being generated there are a number of
+structures associated with the hot-path through the main run-loop.
+These are associated with looking up the next translation block to
+execute. These include:
+
+    tb_jmp_cache (per-vCPU, cache of recent jumps)
+    tb_ctx.htable (global hash table, phys address->tb lookup)
+
+As TB linking only occurs when blocks are in the same page this code
+is critical to performance as looking up the next TB to execute is the
+most common reason to exit the generated code.
+
+DESIGN REQUIREMENT: Make access to lookup structures safe with
+multiple reader/writer threads. Minimise any lock contention to do it.
+
+The hot-path avoids using locks where possible. The tb_jmp_cache is
+updated with atomic accesses to ensure consistent results. The fall
+back QHT based hash table is also designed for lockless lookups. Locks
+are only taken when code generation is required or TranslationBlocks
+have their block-to-block jumps patched.
+
+Global TCG State
+----------------
+
+We need to protect the entire code generation cycle including any post
+generation patching of the translated code. This also implies a shared
+translation buffer which contains code running on all cores. Any
+execution path that comes to the main run loop will need to hold a
+mutex for code generation. This also includes times when we need to flush
+code or entries from any shared lookups/caches. Structures held on a
+per-vCPU basis won't need locking unless other vCPUs will need to
+modify them.
+
+DESIGN REQUIREMENT: Add locking around all code generation and TB
+patching.
+
+(Current solution)
+
+Mainly as part of the linux-user work all code generation is
+serialised with a tb_lock(). For the SoftMMU tb_lock() also takes the
+place of mmap_lock() in linux-user.
+
+Translation Blocks
+------------------
+
+Currently the whole system shares a single code generation buffer
+which when full will force a flush of all translations and start from
+scratch again. Some operations also force a full flush of translations
+including:
+
+  - debugging operations (breakpoint insertion/removal)
+  - some CPU helper functions
+
+This is done with the async_safe_run_on_cpu() mechanism to ensure all
+vCPUs are quiescent when changes are being made to shared global
+structures.
+
+More granular translation invalidation events are typically due
+to a change of the state of a physical page:
+
+  - code modification (self-modifying code, patching code)
+  - page changes (new page mapping in linux-user mode)
+
+While setting the invalid flag in a TranslationBlock will stop it
+being used when looked up in the hot-path there are a number of other
+book-keeping structures that need to be safely cleared.
+
+Any TranslationBlocks which have been patched to jump directly to the
+now invalid blocks need the jump patches reversing so they will return
+to the C code.
+
+There are a number of look-up caches that need to be properly updated
+including the:
+
+  - jump lookup cache
+  - the physical-to-tb lookup hash table
+  - the global page table
+
+The global page table (l1_map) provides a multi-level look-up for
+PageDesc structures, which contain pointers to the start of a linked
+list of all Translation Blocks in that page (see page_next).
+
+Both the jump patching and the page cache involve linked lists that
+the invalidated TranslationBlock needs to be removed from.
+
+DESIGN REQUIREMENT: Safely handle invalidation of TBs
+                      - safely patch/revert direct jumps
+                      - remove central PageDesc lookup entries
+                      - ensure lookup caches/hashes are safely updated
+
+(Current solution)
+
+The direct jumps themselves are updated atomically by the TCG
+tb_set_jmp_target() code. Modifications to the linked lists that allow
+searching for linked pages are done under the protection of the
+tb_lock().
+
+The global page table is protected by the tb_lock() in system-mode and
+mmap_lock() in linux-user mode.
+
+The lookup caches are updated atomically and the lookup hash uses QHT
+which is designed for concurrent safe lookup.
+
+
+Memory maps and TLBs
+--------------------
+
+The memory handling code is fairly critical to the speed of memory
+access in the emulated system. The SoftMMU code is designed so the
+hot-path can be handled entirely within translated code. This is
+handled with a per-vCPU TLB structure which once populated will allow
+a series of accesses to the page to occur without exiting the
+translated code. It is possible to set flags in the TLB address which
+will ensure the slow-path is taken for each access. This can be done
+to support:
+
+  - Memory regions (dividing up access to PIO, MMIO and RAM)
+  - Dirty page tracking (for code gen, SMC detection, migration and display)
+  - Virtual TLB (for translating guest address->real address)
+
+When the TLB tables are updated by a vCPU thread other than their own
+we need to ensure it is done in a safe way so no inconsistent state is
+seen by the vCPU thread.
+
+Some operations require updating a number of vCPUs TLBs at the same
+time in a synchronised manner.
+
+DESIGN REQUIREMENTS:
+
+  - TLB Flush All/Page
+    - can be across-vCPUs
+    - cross vCPU TLB flush may need other vCPU brought to halt
+    - change may need to be visible to the calling vCPU immediately
+  - TLB Flag Update
+    - usually cross-vCPU
+    - want change to be visible as soon as possible
+  - TLB Update (update a CPUTLBEntry, via tlb_set_page_with_attrs)
+    - This is a per-vCPU table - by definition can't race
+    - updated by its own thread when the slow-path is forced
+
+(Current solution)
+
+We have updated cputlb.c to defer cross-vCPU operations by way of
+async_run_on_cpu(). This ensures each vCPU sees a coherent state
+when it next runs its deferred work (within a few instructions'
+time).
+
+TLB flag updates are all done atomically and are also protected by the
+tb_lock() which is used by the functions that update the TLB in bulk.
+
+(Known limitation)
+
+Currently there is no mechanism to ensure the vCPU that sourced the
+flush request can be notified of completion. However the sourcing vCPU
+could use async_safe_run_on_cpu() and exit the run loop at that point
+to execute the deferred work immediately. The deferred work could then
+operate on all vCPUs while they were in a quiescent state.
+
+Emulated hardware state
+-----------------------
+
+Currently thanks to KVM work any access to IO memory is automatically
+protected by the global iothread mutex, also known as the BQL (Big
+Qemu Lock). Any IO region that doesn't use the global mutex is expected to
+do its own locking.
+
+However IO memory isn't the only way emulated hardware state can be
+modified. Some architectures have model specific registers that
+trigger hardware emulation features. Generally any translation helper
+that needs to update more than a single vCPU's state should take the
+BQL.
+
+As the BQL, or global iothread mutex, is shared across the system we
+push the use of the lock as far down into the TCG code as possible to
+minimise contention.
+
+(Current solution)
+
+MMIO access automatically serialises hardware emulation by way of the
+BQL. Currently ARM targets serialise all ARM_CP_IO register accesses
+and also defer the reset/startup of vCPUs to the vCPU context by way
+of async_run_on_cpu().
+
+Memory Consistency
+==================
+
+Between emulated guests and host systems there are a range of memory
+consistency models. Even emulating weakly ordered systems on strongly
+ordered hosts needs to ensure things like store-after-load re-ordering
+can be prevented when the guest wants to.
+
+Memory Barriers
+---------------
+
+Barriers (sometimes known as fences) provide a mechanism for software
+to enforce a particular ordering of memory operations from the point
+of view of external observers (e.g. another processor core). They can
+apply to any memory operations as well as just loads or stores.
+
+The Linux kernel has an excellent write-up on the various forms of
+memory barrier and the guarantees they can provide [1].
+
+Barriers are often wrapped around synchronisation primitives to
+provide explicit memory ordering semantics. However they can be used
+by themselves to provide safe lockless access by ensuring for example
+a change to a signal flag will only be visible once the changes to
+payload are.
+
+DESIGN REQUIREMENT: Add a new tcg_memory_barrier op
+
+This would enforce a strong load/store ordering so all loads/stores
+complete at the memory barrier. On single-core non-SMP strongly
+ordered backends this could become a NOP.
+
+Aside from explicit standalone memory barrier instructions there are
+also implicit memory ordering semantics which come with each guest
+memory access instruction. For example all x86 load/stores come with
+fairly strong guarantees of sequential consistency whereas ARM has
+special variants of load/store instructions that imply acquire/release
+semantics.
+
+In the case of a strongly ordered guest architecture being emulated on
+a weakly ordered host the scope for a heavy performance impact is
+quite high.
+
+DESIGN REQUIREMENTS: Be efficient with use of memory barriers
+       - host systems with stronger implied guarantees can skip some barriers
+       - merge consecutive barriers to the strongest one
+
+(Current solution)
+
+The system currently has a tcg_gen_mb() which will add memory barrier
+operations if code generation is being done in a parallel context. The
+tcg_optimize() function attempts to merge barriers up to their
+strongest form before any load/store operations. The solution was
+originally developed and tested for linux-user based systems. All
+backends have been converted to emit fences when required. So far the
+following front-ends have been updated to emit fences when required:
+
+    - target-i386
+    - target-arm
+    - target-aarch64
+    - target-alpha
+
+Memory Control and Maintenance
+------------------------------
+
+This includes a class of instructions for controlling system cache
+behaviour. While QEMU doesn't model cache behaviour these instructions
+are often seen when code modification has taken place to ensure the
+changes take effect.
+
+Synchronisation Primitives
+--------------------------
+
+There are two broad types of synchronisation primitives found in
+modern ISAs: atomic instructions and exclusive regions.
+
+The first type offers a simple atomic instruction which will guarantee
+some sort of test and conditional store will be truly atomic w.r.t.
+other cores sharing access to the memory. The classic example is the
+x86 cmpxchg instruction.
+
+The second type offers a pair of load/store instructions which offer a
+guarantee that a region of memory has not been touched between the
+load and store instructions. An example of this is ARM's ldrex/strex
+pair where the strex instruction will return a flag indicating a
+successful store only if no other CPU has accessed the memory region
+since the ldrex.
+
+Traditionally TCG has generated a series of operations that work
+because they are within the context of a single translation block so
+will have completed before another CPU is scheduled. However with
+the ability to have multiple threads running to emulate multiple CPUs
+we will need to explicitly expose these semantics.
+
+DESIGN REQUIREMENTS:
+  - Support classic atomic instructions
+  - Support load/store exclusive (or load link/store conditional) pairs
+  - Generic enough infrastructure to support all guest architectures
+CURRENT OPEN QUESTIONS:
+  - How problematic is the ABA problem in general?
+
+(Current solution)
+
+The TCG provides a number of atomic helpers (tcg_gen_atomic_*) which
+can be used directly or combined to emulate other instructions like
+ARM's ldrex/strex instructions. While they are susceptible to the ABA
+problem so far common guests have not implemented patterns where
+this may be a problem - typically presenting a locking ABI which
+assumes cmpxchg like semantics.
+
+The code also includes a fall-back for cases where multi-threaded TCG
+ops can't work (e.g. guest atomic width > host atomic width). In this
+case an EXCP_ATOMIC exit occurs and the instruction is emulated with
+an exclusive lock which ensures all emulation is serialised.
+
+While the atomic helpers look good enough for now there may be a need
+to look at solutions that can more closely model the guest
+architecture's semantics.
+
+==========
+
+[1] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/plain/Documentation/memory-barriers.txt
-- 
2.10.1


* [Qemu-devel] [PATCH v6 02/19] tcg: add options for enabling MTTCG
  2016-11-09 14:57 [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement Alex Bennée
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 01/19] docs: new design document multi-thread-tcg.txt Alex Bennée
@ 2016-11-09 14:57 ` Alex Bennée
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 03/19] tcg: add kick timer for single-threaded vCPU emulation Alex Bennée
                   ` (18 subsequent siblings)
  20 siblings, 0 replies; 50+ messages in thread
From: Alex Bennée @ 2016-11-09 14:57 UTC (permalink / raw)
  To: pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, rth, peter.maydell,
	claudio.fontana, Alex Bennée, Peter Crosthwaite

From: KONRAD Frederic <fred.konrad@greensocs.com>

We know there will be cases where MTTCG won't work until additional work
is done in the front/back ends to support it. It will however be useful to
be able to turn it on.

As a result MTTCG will default to off unless the combination is
supported. However the user can turn it on for the sake of testing.
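
As a rough illustration of the resulting behaviour, the decision can be written as a stand-alone C function. This is a sketch rather than the code in the patch: resolve_mttcg() and its parameters are hypothetical stand-ins for the QemuOpts lookup, the icount check and the CONFIG_MTTCG_TARGET/CONFIG_MTTCG_HOST flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical mirror of the option handling, without QemuOpts.
 *
 * thread_opt: value of thread= (NULL if the user did not give one)
 * icount_rr:  stands in for the icount check in the default path
 * target_ok:  stands in for CONFIG_MTTCG_TARGET being defined
 * host_ok:    stands in for CONFIG_MTTCG_HOST being defined
 *
 * Returns 0 and sets *enabled on success, -1 for an unknown thread= value.
 */
int resolve_mttcg(const char *thread_opt, bool icount_rr,
                  bool target_ok, bool host_ok, bool *enabled)
{
    assert(enabled != NULL);

    if (thread_opt == NULL) {
        /* No explicit request: only default to on when it is known safe. */
        *enabled = !icount_rr && target_ok && host_ok;
        return 0;
    }
    if (strcmp(thread_opt, "multi") == 0) {
        *enabled = true;    /* forced on, even if the combination is untested */
        return 0;
    }
    if (strcmp(thread_opt, "single") == 0) {
        *enabled = false;
        return 0;
    }
    return -1;              /* the patch reports this through Error **errp */
}
```

In other words an explicit -accel tcg,thread=multi always wins over the default, which is what allows testing on combinations that are not yet marked as supported.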

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
[AJB: move to -accel tcg,thread=multi|single, defaults]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
v1:
  - merge with add mttcg option.
  - update commit message
v2:
  - machine_init->opts_init
v3:
  - moved from -tcg to -accel tcg,thread=single|multi
  - fix checkpatch warnings
v4:
  - make mttcg_enabled extern, qemu_tcg_mttcg_enabled() now just macro
  - qemu_tcg_configure now propagates Error instead of exiting
  - better error checking of thread=foo
  - use CONFIG flags for default_mttcg_enabled()
  - disable mttcg with icount, error if both forced on
---
 cpus.c                | 43 +++++++++++++++++++++++++++++++++++++++++++
 include/qom/cpu.h     |  9 +++++++++
 include/sysemu/cpus.h |  2 ++
 qemu-options.hx       | 20 ++++++++++++++++++++
 vl.c                  | 49 ++++++++++++++++++++++++++++++++++++++++++++++++-
 5 files changed, 122 insertions(+), 1 deletion(-)

diff --git a/cpus.c b/cpus.c
index 5213351..73ff851 100644
--- a/cpus.c
+++ b/cpus.c
@@ -25,6 +25,7 @@
 /* Needed early for CONFIG_BSD etc. */
 #include "qemu/osdep.h"
 #include "qemu-common.h"
+#include "qemu/config-file.h"
 #include "cpu.h"
 #include "monitor/monitor.h"
 #include "qapi/qmp/qerror.h"
@@ -148,6 +149,48 @@ typedef struct TimersState {
 } TimersState;
 
 static TimersState timers_state;
+bool mttcg_enabled;
+
+/*
+ * We default to false if we know other options have been enabled
+ * which are currently incompatible with MTTCG. Otherwise when each
+ * guest (target) and host has been updated to support:
+ *   - atomic instructions
+ *   - memory ordering
+ * they can set the appropriate CONFIG flags in ${target}-softmmu.mak
+ * and generated config-host.mak fragments.
+ */
+static bool default_mttcg_enabled(void)
+{
+    QemuOpts *icount_opts = qemu_find_opts_singleton("icount");
+    const char *rr = qemu_opt_get(icount_opts, "rr");
+
+    if (rr) {
+        return false;
+    } else {
+#if defined(CONFIG_MTTCG_TARGET) && defined(CONFIG_MTTCG_HOST)
+        return true;
+#else
+        return false;
+#endif
+    }
+}
+
+void qemu_tcg_configure(QemuOpts *opts, Error **errp)
+{
+    const char *t = qemu_opt_get(opts, "thread");
+    if (t) {
+        if (strcmp(t, "multi") == 0) {
+            mttcg_enabled = true;
+        } else if (strcmp(t, "single") == 0) {
+            mttcg_enabled = false;
+        } else {
+            error_setg(errp, "Invalid 'thread' setting %s", t);
+        }
+    } else {
+        mttcg_enabled = default_mttcg_enabled();
+    }
+}
 
 int64_t cpu_get_icount_raw(void)
 {
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 3f79a8e..541785a 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -407,6 +407,15 @@ extern struct CPUTailQ cpus;
 extern __thread CPUState *current_cpu;
 
 /**
+ * qemu_tcg_mttcg_enabled:
+ * Check whether we are running MultiThread TCG or not.
+ *
+ * Returns: %true if we are in MTTCG mode %false otherwise.
+ */
+extern bool mttcg_enabled;
+#define qemu_tcg_mttcg_enabled() (mttcg_enabled)
+
+/**
  * cpu_paging_enabled:
  * @cpu: The CPU whose state is to be inspected.
  *
diff --git a/include/sysemu/cpus.h b/include/sysemu/cpus.h
index 3728a1e..a73b5d4 100644
--- a/include/sysemu/cpus.h
+++ b/include/sysemu/cpus.h
@@ -36,4 +36,6 @@ extern int smp_threads;
 
 void list_cpus(FILE *f, fprintf_function cpu_fprintf, const char *optarg);
 
+void qemu_tcg_configure(QemuOpts *opts, Error **errp);
+
 #endif
diff --git a/qemu-options.hx b/qemu-options.hx
index 4536e18..7857a5a 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -96,6 +96,26 @@ STEXI
 Select CPU model (@code{-cpu help} for list and additional feature selection)
 ETEXI
 
+DEF("accel", HAS_ARG, QEMU_OPTION_accel,
+    "-accel [accel=]accelerator[,thread=single|multi]\n"
+    "               select accelerator ('-accel help' for list)\n"
+    "               thread=single|multi (enable multi-threaded TCG)", QEMU_ARCH_ALL)
+STEXI
+@item -accel @var{name}[,prop=@var{value}[,...]]
+@findex -accel
+This is used to enable an accelerator. Depending on the target architecture,
+kvm, xen, or tcg can be available. By default, tcg is used. If there is more
+than one accelerator specified, the next one is used if the previous one fails
+to initialize.
+@table @option
+@item thread=single|multi
+Controls number of TCG threads. When the TCG is multi-threaded there will be one
+thread per vCPU therefore taking advantage of additional host cores. The default
+is to enable multi-threading where both the back-end and front-ends support it and
+no incompatible TCG features have been enabled (e.g. icount/replay).
+@end table
+ETEXI
+
 DEF("smp", HAS_ARG, QEMU_OPTION_smp,
     "-smp [cpus=]n[,maxcpus=cpus][,cores=cores][,threads=threads][,sockets=sockets]\n"
     "                set the number of CPUs to 'n' [default=1]\n"
diff --git a/vl.c b/vl.c
index 319f641..a285a58 100644
--- a/vl.c
+++ b/vl.c
@@ -296,6 +296,26 @@ static QemuOptsList qemu_machine_opts = {
     },
 };
 
+static QemuOptsList qemu_accel_opts = {
+    .name = "accel",
+    .implied_opt_name = "accel",
+    .head = QTAILQ_HEAD_INITIALIZER(qemu_accel_opts.head),
+    .merge_lists = true,
+    .desc = {
+        {
+            .name = "accel",
+            .type = QEMU_OPT_STRING,
+            .help = "Select the type of accelerator",
+        },
+        {
+            .name = "thread",
+            .type = QEMU_OPT_STRING,
+            .help = "Enable/disable multi-threaded TCG",
+        },
+        { /* end of list */ }
+    },
+};
+
 static QemuOptsList qemu_boot_opts = {
     .name = "boot-opts",
     .implied_opt_name = "order",
@@ -3004,7 +3024,8 @@ int main(int argc, char **argv, char **envp)
     const char *boot_once = NULL;
     DisplayState *ds;
     int cyls, heads, secs, translation;
-    QemuOpts *hda_opts = NULL, *opts, *machine_opts, *icount_opts = NULL;
+    QemuOpts *opts, *machine_opts;
+    QemuOpts *hda_opts = NULL, *icount_opts = NULL, *accel_opts = NULL;
     QemuOptsList *olist;
     int optind;
     const char *optarg;
@@ -3059,6 +3080,7 @@ int main(int argc, char **argv, char **envp)
     qemu_add_opts(&qemu_trace_opts);
     qemu_add_opts(&qemu_option_rom_opts);
     qemu_add_opts(&qemu_machine_opts);
+    qemu_add_opts(&qemu_accel_opts);
     qemu_add_opts(&qemu_mem_opts);
     qemu_add_opts(&qemu_smp_opts);
     qemu_add_opts(&qemu_boot_opts);
@@ -3752,6 +3774,26 @@ int main(int argc, char **argv, char **envp)
                 qdev_prop_register_global(&kvm_pit_lost_tick_policy);
                 break;
             }
+            case QEMU_OPTION_accel:
+                accel_opts = qemu_opts_parse_noisily(qemu_find_opts("accel"),
+                                                     optarg, true);
+                optarg = qemu_opt_get(accel_opts, "accel");
+
+                olist = qemu_find_opts("machine");
+                if (strcmp("kvm", optarg) == 0) {
+                    qemu_opts_parse_noisily(olist, "accel=kvm", false);
+                } else if (strcmp("xen", optarg) == 0) {
+                    qemu_opts_parse_noisily(olist, "accel=xen", false);
+                } else if (strcmp("tcg", optarg) == 0) {
+                    qemu_opts_parse_noisily(olist, "accel=tcg", false);
+                } else {
+                    if (!is_help_option(optarg)) {
+                        error_printf("Unknown accelerator: %s\n", optarg);
+                    }
+                    error_printf("Supported accelerators: kvm, xen, tcg\n");
+                    exit(1);
+                }
+                break;
             case QEMU_OPTION_usb:
                 olist = qemu_find_opts("machine");
                 qemu_opts_parse_noisily(olist, "usb=on", false);
@@ -4057,6 +4099,8 @@ int main(int argc, char **argv, char **envp)
 
     replay_configure(icount_opts);
 
+    qemu_tcg_configure(accel_opts, &error_fatal);
+
     machine_class = select_machine();
 
     set_memory_options(&ram_slots, &maxram_size, machine_class);
@@ -4421,6 +4465,9 @@ int main(int argc, char **argv, char **envp)
         if (kvm_enabled() || xen_enabled()) {
             error_report("-icount is not allowed with kvm or xen");
             exit(1);
+        } else if (qemu_tcg_mttcg_enabled()) {
+            error_report("-icount does not currently work with MTTCG");
+            exit(1);
         }
         configure_icount(icount_opts, &error_abort);
         qemu_opts_del(icount_opts);
-- 
2.10.1


* [Qemu-devel] [PATCH v6 03/19] tcg: add kick timer for single-threaded vCPU emulation
  2016-11-09 14:57 [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement Alex Bennée
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 01/19] docs: new design document multi-thread-tcg.txt Alex Bennée
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 02/19] tcg: add options for enabling MTTCG Alex Bennée
@ 2016-11-09 14:57 ` Alex Bennée
  2016-11-10 15:10   ` Richard Henderson
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 04/19] tcg: rename tcg_current_cpu to tcg_current_rr_cpu Alex Bennée
                   ` (17 subsequent siblings)
  20 siblings, 1 reply; 50+ messages in thread
From: Alex Bennée @ 2016-11-09 14:57 UTC (permalink / raw)
  To: pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, rth, peter.maydell,
	claudio.fontana, Alex Bennée, Peter Crosthwaite

Currently we rely on the side effect of the main loop grabbing the
iothread_mutex to give any long running basic block chains a kick to
ensure the next vCPU is scheduled. As this code is being re-factored and
rationalised we now do it explicitly here.
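
As a rough illustration of the idea (not the code in this patch), the
sketch below simulates round-robin scheduling with a periodic kick. The
names vcpu, KICK_PERIOD, kick() and run_round_robin() are made up for
the example, and the timer is folded into the run loop on a simulated
clock instead of being a real QEMUTimer:

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_VCPUS     3
#define KICK_PERIOD  100   /* virtual ticks between kicks (hypothetical) */

struct vcpu {
    bool exit_request;     /* set by the kick, polled by the vCPU */
    uint64_t ran;          /* ticks this vCPU has executed */
};

static struct vcpu vcpus[NR_VCPUS];
static uint64_t now;       /* simulated virtual clock */
static uint64_t next_kick; /* deadline of the pending kick */

/* Timer callback: re-arm first, then ask the running vCPU to yield. */
static void kick(struct vcpu *running)
{
    next_kick = now + KICK_PERIOD;
    running->exit_request = true;
}

/* Run one vCPU until it is kicked. The "guest" is a tight loop that
 * would never give up the thread on its own. */
static void run_vcpu(struct vcpu *cpu)
{
    cpu->exit_request = false;
    while (!cpu->exit_request) {
        now++;
        cpu->ran++;
        if (now >= next_kick) {
            kick(cpu);
        }
    }
}

/* Round-robin over all vCPUs for a number of full passes. */
static void run_round_robin(int passes)
{
    next_kick = now + KICK_PERIOD;
    for (int p = 0; p < passes; p++) {
        for (int i = 0; i < NR_VCPUS; i++) {
            run_vcpu(&vcpus[i]);
        }
    }
}
```

Without the kick, the first vCPU would spin forever and the others
would never be scheduled; with it, each one gets a KICK_PERIOD slice.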

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

---
v2
  - re-base fixes
  - get_ticks_per_sec() -> NANOSECONDS_PER_SEC
v3
  - add define for TCG_KICK_FREQ
  - fix checkpatch warning
v4
  - wrap next calc in inline qemu_tcg_next_kick() instead of macro
v5
  - move all kick code into own section
  - use global for timer
  - add helper functions to start/stop timer
  - stop timer when all cores paused
---
 cpus.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 60 insertions(+)

diff --git a/cpus.c b/cpus.c
index 73ff851..b3bf7b8 100644
--- a/cpus.c
+++ b/cpus.c
@@ -736,6 +736,52 @@ void configure_icount(QemuOpts *opts, Error **errp)
 }
 
 /***********************************************************/
+/* TCG vCPU kick timer
+ *
+ * The kick timer is responsible for moving single threaded vCPU
+ * emulation on to the next vCPU. If more than one vCPU is running a
+ * timer event will force a cpu->exit so the next vCPU can get
+ * scheduled.
+ *
+ * The timer is removed if all vCPUs are idle and restarted again once
+ * idleness is complete.
+ */
+
+static QEMUTimer *tcg_kick_vcpu_timer;
+
+static void qemu_cpu_kick_no_halt(void);
+
+#define TCG_KICK_PERIOD (NANOSECONDS_PER_SECOND / 10)
+
+static inline int64_t qemu_tcg_next_kick(void)
+{
+    return qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + TCG_KICK_PERIOD;
+}
+
+static void kick_tcg_thread(void *opaque)
+{
+    timer_mod(tcg_kick_vcpu_timer, qemu_tcg_next_kick());
+    qemu_cpu_kick_no_halt();
+}
+
+static void start_tcg_kick_timer(void)
+{
+    if (!tcg_kick_vcpu_timer && CPU_NEXT(first_cpu)) {
+        tcg_kick_vcpu_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, kick_tcg_thread, NULL);
+        timer_mod(tcg_kick_vcpu_timer, qemu_tcg_next_kick());
+    }
+}
+
+static void stop_tcg_kick_timer(void)
+{
+    if (tcg_kick_vcpu_timer) {
+        timer_del(tcg_kick_vcpu_timer);
+        tcg_kick_vcpu_timer = NULL;
+    }
+}
+
+
+/***********************************************************/
 void hw_error(const char *fmt, ...)
 {
     va_list ap;
@@ -989,9 +1035,12 @@ static void qemu_wait_io_event_common(CPUState *cpu)
 static void qemu_tcg_wait_io_event(CPUState *cpu)
 {
     while (all_cpu_threads_idle()) {
+        stop_tcg_kick_timer();
         qemu_cond_wait(cpu->halt_cond, &qemu_global_mutex);
     }
 
+    start_tcg_kick_timer();
+
     while (iothread_requesting_mutex) {
         qemu_cond_wait(&qemu_io_proceeded_cond, &qemu_global_mutex);
     }
@@ -1191,6 +1240,15 @@ static void deal_with_unplugged_cpus(void)
     }
 }
 
+/* Single-threaded TCG
+ *
+ * In the single-threaded case each vCPU is simulated in turn. If
+ * there is more than a single vCPU we create a simple timer to kick
+ * the vCPU and ensure we don't get stuck in a tight loop in one vCPU.
+ * This is done explicitly rather than relying on side-effects
+ * elsewhere.
+ */
+
 static void *qemu_tcg_cpu_thread_fn(void *arg)
 {
     CPUState *cpu = arg;
@@ -1217,6 +1275,8 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
         }
     }
 
+    start_tcg_kick_timer();
+
     /* process any pending work */
     atomic_mb_set(&exit_request, 1);
 
-- 
2.10.1

* [Qemu-devel] [PATCH v6 04/19] tcg: rename tcg_current_cpu to tcg_current_rr_cpu
  2016-11-09 14:57 [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement Alex Bennée
                   ` (2 preceding siblings ...)
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 03/19] tcg: add kick timer for single-threaded vCPU emulation Alex Bennée
@ 2016-11-09 14:57 ` Alex Bennée
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 05/19] tcg: drop global lock during TCG code execution Alex Bennée
                   ` (16 subsequent siblings)
  20 siblings, 0 replies; 50+ messages in thread
From: Alex Bennée @ 2016-11-09 14:57 UTC (permalink / raw)
  To: pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, rth, peter.maydell,
	claudio.fontana, Alex Bennée, Peter Crosthwaite

..and make the definition local to cpus. In preparation for MTTCG the
concept of a global tcg_current_cpu will no longer make sense. However
we still need to keep track of it in the single-threaded case to be able
to exit quickly when required.

qemu_cpu_kick_no_halt() moves and becomes qemu_cpu_kick_rr_cpu() to
emphasise its use-case. qemu_cpu_kick() now kicks the relevant cpu and
also calls qemu_cpu_kick_rr_cpu(), which will become a no-op in MTTCG.

For the time being the setting of the global exit_request remains.
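
To show why the kick re-reads the current-vCPU pointer in a loop, here
is a stand-alone sketch using C11 atomics in place of QEMU's
atomic_mb_read()/atomic_mb_set(). The struct, vcpu_exit() and the
return value are assumptions made for the example only:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct vcpu {
    atomic_bool exit_request;
};

/* Written by the round-robin thread, read by whoever wants to kick. */
static _Atomic(struct vcpu *) current_rr_cpu;

/* Stand-in for cpu_exit(): flag the vCPU so it leaves its run loop. */
static void vcpu_exit(struct vcpu *cpu)
{
    atomic_store(&cpu->exit_request, true);
}

/* Kick whichever vCPU is scheduled right now. If the scheduler moved
 * on to another vCPU between the read and the kick, the pointer will
 * have changed, so go round again and kick the new one as well.
 * Returns how many kicks were delivered (for demonstration only). */
static int kick_rr_cpu(void)
{
    struct vcpu *cpu;
    int kicked = 0;

    do {
        cpu = atomic_load(&current_rr_cpu);
        if (cpu) {
            vcpu_exit(cpu);
            kicked++;
        }
    } while (cpu != atomic_load(&current_rr_cpu));

    return kicked;
}
```

A kick that lands on a vCPU which has just been descheduled is
harmless: it only causes one spurious early exit later on.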

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
v4:
  - keep global exit_request setting for now
  - fix merge conflicts
v5:
  - merge conflicts with kick changes
---
 cpu-exec-common.c       |  1 -
 cpu-exec.c              |  3 ---
 cpus.c                  | 41 ++++++++++++++++++++++-------------------
 include/exec/exec-all.h |  1 -
 4 files changed, 22 insertions(+), 24 deletions(-)

diff --git a/cpu-exec-common.c b/cpu-exec-common.c
index 767d9c6..e2bc053 100644
--- a/cpu-exec-common.c
+++ b/cpu-exec-common.c
@@ -24,7 +24,6 @@
 #include "exec/memory-internal.h"
 
 bool exit_request;
-CPUState *tcg_current_cpu;
 
 /* exit the current TB, but without causing any exception to be raised */
 void cpu_loop_exit_noexc(CPUState *cpu)
diff --git a/cpu-exec.c b/cpu-exec.c
index 4188fed..49191b5 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -605,7 +605,6 @@ int cpu_exec(CPUState *cpu)
         return EXCP_HALTED;
     }
 
-    atomic_mb_set(&tcg_current_cpu, cpu);
     rcu_read_lock();
 
     if (unlikely(atomic_mb_read(&exit_request))) {
@@ -664,7 +663,5 @@ int cpu_exec(CPUState *cpu)
     /* fail safe : never use current_cpu outside cpu_exec() */
     current_cpu = NULL;
 
-    /* Does not need atomic_mb_set because a spurious wakeup is okay.  */
-    atomic_set(&tcg_current_cpu, NULL);
     return ret;
 }
diff --git a/cpus.c b/cpus.c
index b3bf7b8..485c2e7 100644
--- a/cpus.c
+++ b/cpus.c
@@ -748,8 +748,7 @@ void configure_icount(QemuOpts *opts, Error **errp)
  */
 
 static QEMUTimer *tcg_kick_vcpu_timer;
-
-static void qemu_cpu_kick_no_halt(void);
+static CPUState *tcg_current_rr_cpu;
 
 #define TCG_KICK_PERIOD (NANOSECONDS_PER_SECOND / 10)
 
@@ -758,10 +757,23 @@ static inline int64_t qemu_tcg_next_kick(void)
     return qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + TCG_KICK_PERIOD;
 }
 
+/* Kick the currently round-robin scheduled vCPU */
+static void qemu_cpu_kick_rr_cpu(void)
+{
+    CPUState *cpu;
+    atomic_mb_set(&exit_request, 1);
+    do {
+        cpu = atomic_mb_read(&tcg_current_rr_cpu);
+        if (cpu) {
+            cpu_exit(cpu);
+        }
+    } while (cpu != atomic_mb_read(&tcg_current_rr_cpu));
+}
+
 static void kick_tcg_thread(void *opaque)
 {
     timer_mod(tcg_kick_vcpu_timer, qemu_tcg_next_kick());
-    qemu_cpu_kick_no_halt();
+    qemu_cpu_kick_rr_cpu();
 }
 
 static void start_tcg_kick_timer(void)
@@ -780,7 +792,6 @@ static void stop_tcg_kick_timer(void)
     }
 }
 
-
 /***********************************************************/
 void hw_error(const char *fmt, ...)
 {
@@ -1291,6 +1302,7 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
         }
 
         for (; cpu != NULL && !exit_request; cpu = CPU_NEXT(cpu)) {
+            atomic_mb_set(&tcg_current_rr_cpu, cpu);
 
             qemu_clock_enable(QEMU_CLOCK_VIRTUAL,
                               (cpu->singlestep_enabled & SSTEP_NOTIMER) == 0);
@@ -1310,6 +1322,8 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
             }
 
         } /* for cpu.. */
+        /* Does not need atomic_mb_set because a spurious wakeup is okay.  */
+        atomic_set(&tcg_current_rr_cpu, NULL);
 
         /* Pairs with smp_wmb in qemu_cpu_kick.  */
         atomic_mb_set(&exit_request, 0);
@@ -1342,24 +1356,13 @@ static void qemu_cpu_kick_thread(CPUState *cpu)
 #endif
 }
 
-static void qemu_cpu_kick_no_halt(void)
-{
-    CPUState *cpu;
-    /* Ensure whatever caused the exit has reached the CPU threads before
-     * writing exit_request.
-     */
-    atomic_mb_set(&exit_request, 1);
-    cpu = atomic_mb_read(&tcg_current_cpu);
-    if (cpu) {
-        cpu_exit(cpu);
-    }
-}
-
 void qemu_cpu_kick(CPUState *cpu)
 {
     qemu_cond_broadcast(cpu->halt_cond);
     if (tcg_enabled()) {
-        qemu_cpu_kick_no_halt();
+        cpu_exit(cpu);
+        /* Also ensure current RR cpu is kicked */
+        qemu_cpu_kick_rr_cpu();
     } else {
         qemu_cpu_kick_thread(cpu);
     }
@@ -1400,7 +1403,7 @@ void qemu_mutex_lock_iothread(void)
         atomic_dec(&iothread_requesting_mutex);
     } else {
         if (qemu_mutex_trylock(&qemu_global_mutex)) {
-            qemu_cpu_kick_no_halt();
+            qemu_cpu_kick_rr_cpu();
             qemu_mutex_lock(&qemu_global_mutex);
         }
         atomic_dec(&iothread_requesting_mutex);
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index a8c13ce..5a1b3a3 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -406,7 +406,6 @@ bool memory_region_is_unassigned(MemoryRegion *mr);
 extern int singlestep;
 
 /* cpu-exec.c, accessed with atomic_mb_read/atomic_mb_set */
-extern CPUState *tcg_current_cpu;
 extern bool exit_request;
 
 #endif
-- 
2.10.1

* [Qemu-devel] [PATCH v6 05/19] tcg: drop global lock during TCG code execution
  2016-11-09 14:57 [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement Alex Bennée
                   ` (3 preceding siblings ...)
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 04/19] tcg: rename tcg_current_cpu to tcg_current_rr_cpu Alex Bennée
@ 2016-11-09 14:57 ` Alex Bennée
  2016-11-10 15:18   ` Richard Henderson
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 06/19] tcg: remove global exit_request Alex Bennée
                   ` (15 subsequent siblings)
  20 siblings, 1 reply; 50+ messages in thread
From: Alex Bennée @ 2016-11-09 14:57 UTC (permalink / raw)
  To: pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, rth, peter.maydell,
	claudio.fontana, Alex Bennée, Peter Crosthwaite,
	Michael S. Tsirkin, Eduardo Habkost, David Gibson,
	Alexander Graf, open list:sPAPR

From: Jan Kiszka <jan.kiszka@siemens.com>

This finally allows TCG to benefit from the iothread introduction: Drop
the global mutex while running pure TCG CPU code. Reacquire the lock
when entering MMIO or PIO emulation, or when leaving the TCG loop.

We have to revert a few optimizations for the current TCG threading
model, namely kicking the TCG thread in qemu_mutex_lock_iothread and not
kicking it in qemu_cpu_kick. We also need to disable RAM block
reordering until we have a more efficient locking mechanism at hand.

Still, a Linux x86 UP guest and my Musicpal ARM model boot fine here.
These numbers demonstrate where we gain something:

20338 jan       20   0  331m  75m 6904 R   99  0.9   0:50.95 qemu-system-arm
20337 jan       20   0  331m  75m 6904 S   20  0.9   0:26.50 qemu-system-arm

The guest CPU was fully loaded, but the iothread could still run mostly
independently on a second core. Without the patch we don't get beyond

32206 jan       20   0  330m  73m 7036 R   82  0.9   1:06.00 qemu-system-arm
32204 jan       20   0  330m  73m 7036 S   21  0.9   0:17.03 qemu-system-arm

We don't benefit significantly, though, when the guest is not fully
loading a host CPU.
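
The locking pattern described above can be sketched outside of QEMU as
follows. This is only an illustration under stated assumptions: a
pthread mutex stands in for the global mutex, and big_lock_*(),
mmio_read(), run_guest() and exec_one_slice() are hypothetical names,
not functions from this series:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static _Thread_local bool big_lock_held;  /* per-thread ownership flag */

static void big_lock_acquire(void)
{
    assert(!big_lock_held);   /* catch recursive locking early */
    pthread_mutex_lock(&big_lock);
    big_lock_held = true;
}

static void big_lock_release(void)
{
    assert(big_lock_held);    /* catch unbalanced unlocks early */
    big_lock_held = false;
    pthread_mutex_unlock(&big_lock);
}

static uint32_t device_reg;   /* device state, protected by big_lock */

/* Device access: take the lock only for the duration of the access. */
static uint32_t mmio_read(void)
{
    big_lock_acquire();
    uint32_t val = device_reg++;
    big_lock_release();
    return val;
}

/* "Guest code": plain computation runs unlocked; every fourth step
 * touches the device and so goes through mmio_read(). */
static uint32_t run_guest(int steps)
{
    uint32_t acc = 0;

    assert(!big_lock_held);
    for (int i = 0; i < steps; i++) {
        acc += i;
        if (i % 4 == 0) {
            acc ^= mmio_read();
        }
    }
    return acc;
}

/* The outer loop owns the lock except while guest code is running. */
static uint32_t exec_one_slice(int steps)
{
    uint32_t ret;

    assert(big_lock_held);
    big_lock_release();
    ret = run_guest(steps);
    big_lock_acquire();
    return ret;
}
```

The asserts play the same role as the g_assert() calls added to the
lock/unlock functions in this patch: a lock-state mistake aborts
immediately instead of hanging.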

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Message-Id: <1439220437-23957-10-git-send-email-fred.konrad@greensocs.com>
[FK: Rebase, fix qemu_devices_reset deadlock, rm address_space_* mutex]
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
[EGC: fixed iothread lock for cpu-exec IRQ handling]
Signed-off-by: Emilio G. Cota <cota@braap.org>
[AJB: -smp single-threaded fix, rm old info from commit msg, review updates]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

---
v5 (ajb, base patches):
 - added an assert to BQL unlock/lock functions instead of hanging
 - ensure all cpu->interrupt_requests *modifications* protected by BQL
 - add a re-read on cpu->interrupt_request for correctness
 - BQL fixes for:
   - assert BQL held for PPC hypercalls (emulate_spapr_hypercall)
   - SCLP service calls on s390x
 - merge conflict with kick timer patch
v4 (ajb, base patches):
 - protect cpu->interrupt updates with BQL
 - fix wording io_mem_notdirty calls
 - s/we/with/
v3 (ajb, base-patches):
  - stale iothread_unlocks removed (cpu_exit/resume_from_signal deals
  with it in the longjmp).
  - fix re-base conflicts
v2 (ajb):
  - merge with tcg: grab iothread lock in cpu-exec interrupt handling
  - use existing fns for tracking lock state
  - lock iothread for mem_region
    - add assert on mem region modification
    - ensure smm_helper holds iothread
  - Add JK s-o-b
  - Fix-up FK s-o-b annotation
v1 (ajb, base-patches):
  - SMP failure now fixed by previous commit

Changes from Fred Konrad (mttcg-v7 via paolo):
  * Rebase on the current HEAD.
  * Fixes a deadlock in qemu_devices_reset().
  * Remove the mutex in address_space_*
---
 cpu-exec.c                 | 20 ++++++++++++++++++--
 cpus.c                     | 28 +++++-----------------------
 cputlb.c                   | 21 ++++++++++++++++++++-
 exec.c                     | 12 +++++++++---
 hw/core/irq.c              |  1 +
 hw/i386/kvmvapic.c         |  4 ++--
 hw/ppc/spapr.c             |  3 +++
 include/qom/cpu.h          |  1 +
 memory.c                   |  2 ++
 qom/cpu.c                  | 10 ++++++++++
 target-i386/smm_helper.c   |  7 +++++++
 target-s390x/misc_helper.c |  5 ++++-
 translate-all.c            |  9 +++++++--
 translate-common.c         | 21 +++++++++++----------
 14 files changed, 100 insertions(+), 44 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index 49191b5..59a9fc4 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -29,6 +29,7 @@
 #include "qemu/rcu.h"
 #include "exec/tb-hash.h"
 #include "exec/log.h"
+#include "qemu/main-loop.h"
 #if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY)
 #include "hw/i386/apic.h"
 #endif
@@ -384,8 +385,10 @@ static inline bool cpu_handle_halt(CPUState *cpu)
         if ((cpu->interrupt_request & CPU_INTERRUPT_POLL)
             && replay_interrupt()) {
             X86CPU *x86_cpu = X86_CPU(cpu);
+            qemu_mutex_lock_iothread();
             apic_poll_irq(x86_cpu->apic_state);
             cpu_reset_interrupt(cpu, CPU_INTERRUPT_POLL);
+            qemu_mutex_unlock_iothread();
         }
 #endif
         if (!cpu_has_work(cpu)) {
@@ -439,7 +442,9 @@ static inline bool cpu_handle_exception(CPUState *cpu, int *ret)
 #else
             if (replay_exception()) {
                 CPUClass *cc = CPU_GET_CLASS(cpu);
+                qemu_mutex_lock_iothread();
                 cc->do_interrupt(cpu);
+                qemu_mutex_unlock_iothread();
                 cpu->exception_index = -1;
             } else if (!replay_has_interrupt()) {
                 /* give a chance to iothread in replay mode */
@@ -465,9 +470,11 @@ static inline void cpu_handle_interrupt(CPUState *cpu,
                                         TranslationBlock **last_tb)
 {
     CPUClass *cc = CPU_GET_CLASS(cpu);
-    int interrupt_request = cpu->interrupt_request;
 
-    if (unlikely(interrupt_request)) {
+    if (unlikely(atomic_read(&cpu->interrupt_request))) {
+        int interrupt_request;
+        qemu_mutex_lock_iothread();
+        interrupt_request = cpu->interrupt_request;
         if (unlikely(cpu->singlestep_enabled & SSTEP_NOIRQ)) {
             /* Mask out external interrupts for this step. */
             interrupt_request &= ~CPU_INTERRUPT_SSTEP_MASK;
@@ -522,7 +529,12 @@ static inline void cpu_handle_interrupt(CPUState *cpu,
                the program flow was changed */
             *last_tb = NULL;
         }
+
+        /* If we exit via cpu_loop_exit/longjmp it is reset in cpu_exec */
+        qemu_mutex_unlock_iothread();
     }
+
+
     if (unlikely(atomic_read(&cpu->exit_request) || replay_has_interrupt())) {
         atomic_set(&cpu->exit_request, 0);
         cpu->exception_index = EXCP_INTERRUPT;
@@ -652,8 +664,12 @@ int cpu_exec(CPUState *cpu)
             g_assert(cpu == current_cpu);
             g_assert(cc == CPU_GET_CLASS(cpu));
 #endif /* buggy compiler */
+
             cpu->can_do_io = 1;
             tb_lock_reset();
+            if (qemu_mutex_iothread_locked()) {
+                qemu_mutex_unlock_iothread();
+            }
         }
     } /* for(;;) */
 
diff --git a/cpus.c b/cpus.c
index 485c2e7..7116ac6 100644
--- a/cpus.c
+++ b/cpus.c
@@ -994,8 +994,6 @@ static void qemu_kvm_init_cpu_signals(CPUState *cpu)
 #endif /* _WIN32 */
 
 static QemuMutex qemu_global_mutex;
-static QemuCond qemu_io_proceeded_cond;
-static unsigned iothread_requesting_mutex;
 
 static QemuThread io_thread;
 
@@ -1009,7 +1007,6 @@ void qemu_init_cpu_loop(void)
     qemu_init_sigbus();
     qemu_cond_init(&qemu_cpu_cond);
     qemu_cond_init(&qemu_pause_cond);
-    qemu_cond_init(&qemu_io_proceeded_cond);
     qemu_mutex_init(&qemu_global_mutex);
 
     qemu_thread_get_self(&io_thread);
@@ -1052,10 +1049,6 @@ static void qemu_tcg_wait_io_event(CPUState *cpu)
 
     start_tcg_kick_timer();
 
-    while (iothread_requesting_mutex) {
-        qemu_cond_wait(&qemu_io_proceeded_cond, &qemu_global_mutex);
-    }
-
     CPU_FOREACH(cpu) {
         qemu_wait_io_event_common(cpu);
     }
@@ -1216,9 +1209,11 @@ static int tcg_cpu_exec(CPUState *cpu)
         cpu->icount_decr.u16.low = decr;
         cpu->icount_extra = count;
     }
+    qemu_mutex_unlock_iothread();
     cpu_exec_start(cpu);
     ret = cpu_exec(cpu);
     cpu_exec_end(cpu);
+    qemu_mutex_lock_iothread();
 #ifdef CONFIG_PROFILER
     tcg_time += profile_getclock() - ti;
 #endif
@@ -1393,27 +1388,14 @@ bool qemu_mutex_iothread_locked(void)
 
 void qemu_mutex_lock_iothread(void)
 {
-    atomic_inc(&iothread_requesting_mutex);
-    /* In the simple case there is no need to bump the VCPU thread out of
-     * TCG code execution.
-     */
-    if (!tcg_enabled() || qemu_in_vcpu_thread() ||
-        !first_cpu || !first_cpu->created) {
-        qemu_mutex_lock(&qemu_global_mutex);
-        atomic_dec(&iothread_requesting_mutex);
-    } else {
-        if (qemu_mutex_trylock(&qemu_global_mutex)) {
-            qemu_cpu_kick_rr_cpu();
-            qemu_mutex_lock(&qemu_global_mutex);
-        }
-        atomic_dec(&iothread_requesting_mutex);
-        qemu_cond_broadcast(&qemu_io_proceeded_cond);
-    }
+    g_assert(!qemu_mutex_iothread_locked());
+    qemu_mutex_lock(&qemu_global_mutex);
     iothread_locked = true;
 }
 
 void qemu_mutex_unlock_iothread(void)
 {
+    g_assert(qemu_mutex_iothread_locked());
     iothread_locked = false;
     qemu_mutex_unlock(&qemu_global_mutex);
 }
diff --git a/cputlb.c b/cputlb.c
index 813279f..c6e34f4 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -18,6 +18,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/main-loop.h"
 #include "cpu.h"
 #include "exec/exec-all.h"
 #include "exec/memory.h"
@@ -504,6 +505,7 @@ static uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
     hwaddr physaddr = iotlbentry->addr;
     MemoryRegion *mr = iotlb_to_region(cpu, physaddr, iotlbentry->attrs);
     uint64_t val;
+    bool locked = false;
 
     physaddr = (physaddr & TARGET_PAGE_MASK) + addr;
     cpu->mem_io_pc = retaddr;
@@ -512,7 +514,16 @@ static uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
     }
 
     cpu->mem_io_vaddr = addr;
+
+    if (mr->global_locking) {
+        qemu_mutex_lock_iothread();
+        locked = true;
+    }
     memory_region_dispatch_read(mr, physaddr, &val, size, iotlbentry->attrs);
+    if (locked) {
+        qemu_mutex_unlock_iothread();
+    }
+
     return val;
 }
 
@@ -523,15 +534,23 @@ static void io_writex(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
     CPUState *cpu = ENV_GET_CPU(env);
     hwaddr physaddr = iotlbentry->addr;
     MemoryRegion *mr = iotlb_to_region(cpu, physaddr, iotlbentry->attrs);
+    bool locked = false;
 
     physaddr = (physaddr & TARGET_PAGE_MASK) + addr;
     if (mr != &io_mem_rom && mr != &io_mem_notdirty && !cpu->can_do_io) {
         cpu_io_recompile(cpu, retaddr);
     }
-
     cpu->mem_io_vaddr = addr;
     cpu->mem_io_pc = retaddr;
+
+    if (mr->global_locking) {
+        qemu_mutex_lock_iothread();
+        locked = true;
+    }
     memory_region_dispatch_write(mr, physaddr, val, size, iotlbentry->attrs);
+    if (locked) {
+        qemu_mutex_unlock_iothread();
+    }
 }
 
 /* Return true if ADDR is present in the victim tlb, and has been copied
diff --git a/exec.c b/exec.c
index 3d867f1..46e2044 100644
--- a/exec.c
+++ b/exec.c
@@ -2108,9 +2108,9 @@ static void check_watchpoint(int offset, int len, MemTxAttrs attrs, int flags)
                 }
                 cpu->watchpoint_hit = wp;
 
-                /* The tb_lock will be reset when cpu_loop_exit or
-                 * cpu_loop_exit_noexc longjmp back into the cpu_exec
-                 * main loop.
+                /* Both tb_lock and iothread_mutex will be reset when
+                 * cpu_loop_exit or cpu_loop_exit_noexc longjmp
+                 * back into the cpu_exec main loop.
                  */
                 tb_lock();
                 tb_check_watchpoint(cpu);
@@ -2345,8 +2345,14 @@ static void io_mem_init(void)
     memory_region_init_io(&io_mem_rom, NULL, &unassigned_mem_ops, NULL, NULL, UINT64_MAX);
     memory_region_init_io(&io_mem_unassigned, NULL, &unassigned_mem_ops, NULL,
                           NULL, UINT64_MAX);
+
+    /* io_mem_notdirty calls tb_invalidate_phys_page_fast,
+     * which can be called without the iothread mutex.
+     */
     memory_region_init_io(&io_mem_notdirty, NULL, &notdirty_mem_ops, NULL,
                           NULL, UINT64_MAX);
+    memory_region_clear_global_locking(&io_mem_notdirty);
+
     memory_region_init_io(&io_mem_watch, NULL, &watch_mem_ops, NULL,
                           NULL, UINT64_MAX);
 }
diff --git a/hw/core/irq.c b/hw/core/irq.c
index 49ff2e6..b98d1d6 100644
--- a/hw/core/irq.c
+++ b/hw/core/irq.c
@@ -22,6 +22,7 @@
  * THE SOFTWARE.
  */
 #include "qemu/osdep.h"
+#include "qemu/main-loop.h"
 #include "qemu-common.h"
 #include "hw/irq.h"
 #include "qom/object.h"
diff --git a/hw/i386/kvmvapic.c b/hw/i386/kvmvapic.c
index b30d1b9..c8d908e 100644
--- a/hw/i386/kvmvapic.c
+++ b/hw/i386/kvmvapic.c
@@ -450,8 +450,8 @@ static void patch_instruction(VAPICROMState *s, X86CPU *cpu, target_ulong ip)
     resume_all_vcpus();
 
     if (!kvm_enabled()) {
-        /* tb_lock will be reset when cpu_loop_exit_noexc longjmps
-         * back into the cpu_exec loop. */
+        /* Both tb_lock and iothread_mutex will be reset when
+         * cpu_loop_exit_noexc longjmps back into the cpu_exec loop. */
         tb_lock();
         tb_gen_code(cs, current_pc, current_cs_base, current_flags, 1);
         cpu_loop_exit_noexc(cs);
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 0cbab24..8035eab 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1012,6 +1012,9 @@ static void emulate_spapr_hypercall(PowerPCCPU *cpu)
 {
     CPUPPCState *env = &cpu->env;
 
+    /* The TCG path should also be holding the BQL at this point */
+    g_assert(qemu_mutex_iothread_locked());
+
     if (msr_pr) {
         hcall_dprintf("Hypercall made with MSR[PR]=1\n");
         env->gpr[3] = H_PRIVILEGE;
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 541785a..1735374 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -323,6 +323,7 @@ struct CPUState {
     bool unplug;
     bool crash_occurred;
     bool exit_request;
+    /* updates protected by BQL */
     uint32_t interrupt_request;
     int singlestep_enabled;
     int64_t icount_extra;
diff --git a/memory.c b/memory.c
index 33110e9..a62454b 100644
--- a/memory.c
+++ b/memory.c
@@ -917,6 +917,8 @@ void memory_region_transaction_commit(void)
     AddressSpace *as;
 
     assert(memory_region_transaction_depth);
+    assert(qemu_mutex_iothread_locked());
+
     --memory_region_transaction_depth;
     if (!memory_region_transaction_depth) {
         if (memory_region_update_pending) {
diff --git a/qom/cpu.c b/qom/cpu.c
index 03d9190..3563ef5 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -113,9 +113,19 @@ static void cpu_common_get_memory_mapping(CPUState *cpu,
     error_setg(errp, "Obtaining memory mappings is unsupported on this CPU.");
 }
 
+/* Resetting the IRQ comes from across the code base so we take the
+ * BQL here if we need to.  cpu_interrupt assumes it is held.*/
 void cpu_reset_interrupt(CPUState *cpu, int mask)
 {
+    bool need_lock = !qemu_mutex_iothread_locked();
+
+    if (need_lock) {
+        qemu_mutex_lock_iothread();
+    }
     cpu->interrupt_request &= ~mask;
+    if (need_lock) {
+        qemu_mutex_unlock_iothread();
+    }
 }
 
 void cpu_exit(CPUState *cpu)
diff --git a/target-i386/smm_helper.c b/target-i386/smm_helper.c
index 4dd6a2c..f051a77 100644
--- a/target-i386/smm_helper.c
+++ b/target-i386/smm_helper.c
@@ -18,6 +18,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/main-loop.h"
 #include "cpu.h"
 #include "exec/helper-proto.h"
 #include "exec/log.h"
@@ -42,11 +43,14 @@ void helper_rsm(CPUX86State *env)
 #define SMM_REVISION_ID 0x00020000
 #endif
 
+/* Called with iothread lock taken */
 void cpu_smm_update(X86CPU *cpu)
 {
     CPUX86State *env = &cpu->env;
     bool smm_enabled = (env->hflags & HF_SMM_MASK);
 
+    g_assert(qemu_mutex_iothread_locked());
+
     if (cpu->smram) {
         memory_region_set_enabled(cpu->smram, smm_enabled);
     }
@@ -333,7 +337,10 @@ void helper_rsm(CPUX86State *env)
     }
     env->hflags2 &= ~HF2_SMM_INSIDE_NMI_MASK;
     env->hflags &= ~HF_SMM_MASK;
+
+    qemu_mutex_lock_iothread();
     cpu_smm_update(cpu);
+    qemu_mutex_unlock_iothread();
 
     qemu_log_mask(CPU_LOG_INT, "SMM: after RSM\n");
     log_cpu_state_mask(CPU_LOG_INT, CPU(cpu), CPU_DUMP_CCOP);
diff --git a/target-s390x/misc_helper.c b/target-s390x/misc_helper.c
index c9604ea..3cb942e 100644
--- a/target-s390x/misc_helper.c
+++ b/target-s390x/misc_helper.c
@@ -25,6 +25,7 @@
 #include "exec/helper-proto.h"
 #include "sysemu/kvm.h"
 #include "qemu/timer.h"
+#include "qemu/main-loop.h"
 #include "exec/address-spaces.h"
 #ifdef CONFIG_KVM
 #include <linux/kvm.h>
@@ -109,11 +110,13 @@ void program_interrupt(CPUS390XState *env, uint32_t code, int ilen)
 /* SCLP service call */
 uint32_t HELPER(servc)(CPUS390XState *env, uint64_t r1, uint64_t r2)
 {
+    qemu_mutex_lock_iothread();
     int r = sclp_service_call(env, r1, r2);
     if (r < 0) {
         program_interrupt(env, -r, 4);
-        return 0;
+        r = 0;
     }
+    qemu_mutex_unlock_iothread();
     return r;
 }
 
diff --git a/translate-all.c b/translate-all.c
index 3dd9214..2c8baf5 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -55,6 +55,7 @@
 #include "translate-all.h"
 #include "qemu/bitmap.h"
 #include "qemu/timer.h"
+#include "qemu/main-loop.h"
 #include "exec/log.h"
 
 /* #define DEBUG_TB_INVALIDATE */
@@ -1541,7 +1542,7 @@ void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end,
 #ifdef CONFIG_SOFTMMU
 /* len must be <= 8 and start must be a multiple of len.
  * Called via softmmu_template.h when code areas are written to with
- * tb_lock held.
+ * iothread mutex not held.
  */
 void tb_invalidate_phys_page_fast(tb_page_addr_t start, int len)
 {
@@ -1743,7 +1744,10 @@ void tb_check_watchpoint(CPUState *cpu)
 
 #ifndef CONFIG_USER_ONLY
 /* in deterministic execution mode, instructions doing device I/Os
-   must be at the end of the TB */
+ * must be at the end of the TB.
+ *
+ * Called by softmmu_template.h, with iothread mutex not held.
+ */
 void cpu_io_recompile(CPUState *cpu, uintptr_t retaddr)
 {
 #if defined(TARGET_MIPS) || defined(TARGET_SH4)
@@ -1955,6 +1959,7 @@ void dump_opcount_info(FILE *f, fprintf_function cpu_fprintf)
 
 void cpu_interrupt(CPUState *cpu, int mask)
 {
+    g_assert(qemu_mutex_iothread_locked());
     cpu->interrupt_request |= mask;
     cpu->tcg_exit_req = 1;
 }
diff --git a/translate-common.c b/translate-common.c
index 5e989cd..d504dd0 100644
--- a/translate-common.c
+++ b/translate-common.c
@@ -21,6 +21,7 @@
 #include "qemu-common.h"
 #include "qom/cpu.h"
 #include "sysemu/cpus.h"
+#include "qemu/main-loop.h"
 
 uintptr_t qemu_real_host_page_size;
 intptr_t qemu_real_host_page_mask;
@@ -30,6 +31,7 @@ intptr_t qemu_real_host_page_mask;
 static void tcg_handle_interrupt(CPUState *cpu, int mask)
 {
     int old_mask;
+    g_assert(qemu_mutex_iothread_locked());
 
     old_mask = cpu->interrupt_request;
     cpu->interrupt_request |= mask;
@@ -40,17 +42,16 @@ static void tcg_handle_interrupt(CPUState *cpu, int mask)
      */
     if (!qemu_cpu_is_self(cpu)) {
         qemu_cpu_kick(cpu);
-        return;
-    }
-
-    if (use_icount) {
-        cpu->icount_decr.u16.high = 0xffff;
-        if (!cpu->can_do_io
-            && (mask & ~old_mask) != 0) {
-            cpu_abort(cpu, "Raised interrupt while not in I/O function");
-        }
     } else {
-        cpu->tcg_exit_req = 1;
+        if (use_icount) {
+            cpu->icount_decr.u16.high = 0xffff;
+            if (!cpu->can_do_io
+                && (mask & ~old_mask) != 0) {
+                cpu_abort(cpu, "Raised interrupt while not in I/O function");
+            }
+        } else {
+            cpu->tcg_exit_req = 1;
+        }
     }
 }
 
-- 
2.10.1

* [Qemu-devel] [PATCH v6 06/19] tcg: remove global exit_request
  2016-11-09 14:57 [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement Alex Bennée
                   ` (4 preceding siblings ...)
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 05/19] tcg: drop global lock during TCG code execution Alex Bennée
@ 2016-11-09 14:57 ` Alex Bennée
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 07/19] tcg: enable tb_lock() for SoftMMU Alex Bennée
                   ` (14 subsequent siblings)
  20 siblings, 0 replies; 50+ messages in thread
From: Alex Bennée @ 2016-11-09 14:57 UTC (permalink / raw)
  To: pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, rth, peter.maydell,
	claudio.fontana, Alex Bennée, Peter Crosthwaite

There are now only two uses of the global exit_request left.

The first ensures we exit the run_loop when we first start to process
pending work and in the kick handler. This is just as easily done by
setting the first_cpu->exit_request flag.

The second use is in the round robin kick routine. The global
exit_request ensured every vCPU would set its local exit_request and
cause a full exit of the loop. Now that the iothread mutex isn't held
while running we can just rely on the kick handler to push us out as
intended.

We lightly re-factor the main vCPU thread to ensure cpu->exit_requests
cause us to exit the main loop and process any IO requests that might
come along.
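
One way to picture a scheduling loop that relies only on the per-vCPU
flag is the sketch below. It is an assumption-laden illustration, not
the loop from this patch: struct vcpu, its slices counter and rr_pass()
are invented for the example, and C11 atomics replace QEMU's helpers:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct vcpu {
    atomic_bool exit_request;  /* per-vCPU flag, no global needed */
    int slices;                /* how many times this vCPU got to run */
    struct vcpu *next;
};

/* One round-robin pass. Stop at the first vCPU that has an exit
 * request pending, consume the flag and report that the pass was cut
 * short so the caller can go and service pending I/O or other work. */
static bool rr_pass(struct vcpu *first)
{
    for (struct vcpu *cpu = first; cpu; cpu = cpu->next) {
        /* Test and clear in one step so a request is never lost. */
        if (atomic_exchange(&cpu->exit_request, false)) {
            return true;
        }
        cpu->slices++;         /* stand-in for executing guest code */
    }
    return false;
}
```

Setting the flag on the first vCPU before the first pass gives the
"process pending work first" behaviour without any global state.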

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
v5
  - minor merge conflict with kick patch
v4
  - moved to after iothread unlocking patch
  - needed to remove kick exit_request as well.
  - remove extraneous cpu->exit_request check
  - remove stray exit_request setting
  - remove needless atomic operation
---
 cpu-exec-common.c       |  2 --
 cpu-exec.c              |  9 ++-------
 cpus.c                  | 18 ++++++++++--------
 include/exec/exec-all.h |  3 ---
 4 files changed, 12 insertions(+), 20 deletions(-)

diff --git a/cpu-exec-common.c b/cpu-exec-common.c
index e2bc053..0504a94 100644
--- a/cpu-exec-common.c
+++ b/cpu-exec-common.c
@@ -23,8 +23,6 @@
 #include "exec/exec-all.h"
 #include "exec/memory-internal.h"
 
-bool exit_request;
-
 /* exit the current TB, but without causing any exception to be raised */
 void cpu_loop_exit_noexc(CPUState *cpu)
 {
diff --git a/cpu-exec.c b/cpu-exec.c
index 59a9fc4..c4c7a06 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -561,9 +561,8 @@ static inline void cpu_loop_exec_tb(CPUState *cpu, TranslationBlock *tb,
         /* Something asked us to stop executing
          * chained TBs; just continue round the main
          * loop. Whatever requested the exit will also
-         * have set something else (eg exit_request or
-         * interrupt_request) which we will handle
-         * next time around the loop.  But we need to
+         * have set something else (eg interrupt_request) which we
+         * will handle next time around the loop.  But we need to
          * ensure the tcg_exit_req read in generated code
          * comes before the next read of cpu->exit_request
          * or cpu->interrupt_request.
@@ -619,10 +618,6 @@ int cpu_exec(CPUState *cpu)
 
     rcu_read_lock();
 
-    if (unlikely(atomic_mb_read(&exit_request))) {
-        cpu->exit_request = 1;
-    }
-
     cc->cpu_exec_enter(cpu);
 
     /* Calculate difference between guest clock and host clock.
diff --git a/cpus.c b/cpus.c
index 7116ac6..6d43f32 100644
--- a/cpus.c
+++ b/cpus.c
@@ -761,7 +761,6 @@ static inline int64_t qemu_tcg_next_kick(void)
 static void qemu_cpu_kick_rr_cpu(void)
 {
     CPUState *cpu;
-    atomic_mb_set(&exit_request, 1);
     do {
         cpu = atomic_mb_read(&tcg_current_rr_cpu);
         if (cpu) {
@@ -1283,11 +1282,11 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
 
     start_tcg_kick_timer();
 
-    /* process any pending work */
-    atomic_mb_set(&exit_request, 1);
-
     cpu = first_cpu;
 
+    /* process any pending work */
+    cpu->exit_request = 1;
+
     while (1) {
         /* Account partial waits to QEMU_CLOCK_VIRTUAL.  */
         qemu_account_warp_timer();
@@ -1296,7 +1295,7 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
             cpu = first_cpu;
         }
 
-        for (; cpu != NULL && !exit_request; cpu = CPU_NEXT(cpu)) {
+        while (cpu && !cpu->exit_request) {
             atomic_mb_set(&tcg_current_rr_cpu, cpu);
 
             qemu_clock_enable(QEMU_CLOCK_VIRTUAL,
@@ -1316,12 +1315,15 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
                 break;
             }
 
-        } /* for cpu.. */
+            cpu = CPU_NEXT(cpu);
+        } /* while (cpu && !cpu->exit_request).. */
+
         /* Does not need atomic_mb_set because a spurious wakeup is okay.  */
         atomic_set(&tcg_current_rr_cpu, NULL);
 
-        /* Pairs with smp_wmb in qemu_cpu_kick.  */
-        atomic_mb_set(&exit_request, 0);
+        if (cpu && cpu->exit_request) {
+            atomic_mb_set(&cpu->exit_request, 0);
+        }
 
         handle_icount_deadline();
 
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 5a1b3a3..37781e0 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -405,7 +405,4 @@ bool memory_region_is_unassigned(MemoryRegion *mr);
 /* vl.c */
 extern int singlestep;
 
-/* cpu-exec.c, accessed with atomic_mb_read/atomic_mb_set */
-extern bool exit_request;
-
 #endif
-- 
2.10.1

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [Qemu-devel] [PATCH v6 07/19] tcg: enable tb_lock() for SoftMMU
  2016-11-09 14:57 [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement Alex Bennée
                   ` (5 preceding siblings ...)
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 06/19] tcg: remove global exit_request Alex Bennée
@ 2016-11-09 14:57 ` Alex Bennée
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 08/19] tcg: enable thread-per-vCPU Alex Bennée
                   ` (13 subsequent siblings)
  20 siblings, 0 replies; 50+ messages in thread
From: Alex Bennée @ 2016-11-09 14:57 UTC (permalink / raw)
  To: pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, rth, peter.maydell,
	claudio.fontana, Alex Bennée, Peter Crosthwaite

tb_lock() has long been used for linux-user mode to protect code
generation. By enabling it now we prepare for MTTCG and ensure all code
generation is serialised by this lock. The other major structure that
needs protecting is the l1_map and its PageDesc structures. For the
SoftMMU case we also use tb_lock() to protect these structures instead
of the linux-user mmap_lock() which, as the name suggests, serialises
updates to the structure as a result of guest mmap operations.
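
The locking pattern described above (a mutex plus a per-thread "do I
hold it" flag that debug assertions can check) can be sketched outside
of QEMU as follows. All names here (code_gen_lock, have_lock,
translate_block) are invented for the sketch, and a plain pthread mutex
is assumed in place of QemuMutex.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t code_gen_lock = PTHREAD_MUTEX_INITIALIZER;

/* Per-thread record of whether this thread currently holds the lock. */
static _Thread_local int have_lock;

static void code_gen_lock_acquire(void)
{
    assert(!have_lock);                 /* the lock is not recursive */
    pthread_mutex_lock(&code_gen_lock);
    have_lock++;
}

static void code_gen_lock_release(void)
{
    assert(have_lock);
    have_lock--;
    pthread_mutex_unlock(&code_gen_lock);
}

/* Drop the lock only if this thread holds it, for paths that cannot
 * know whether the lock was taken before they were reached. */
static void code_gen_lock_reset(void)
{
    if (have_lock) {
        pthread_mutex_unlock(&code_gen_lock);
        have_lock = 0;
    }
}

static int translations;

/* Stand-in for code generation: must only run with the lock held. */
static int translate_block(void)
{
    assert(have_lock);
    return ++translations;
}
```

Because the flag is thread-local, the assertion in translate_block()
checks that the calling thread itself took the lock, not merely that
some thread holds it.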

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
v4
  - split from main tcg: enable thread-per-vCPU patch
---
 translate-all.c | 18 +++++-------------
 1 file changed, 5 insertions(+), 13 deletions(-)

diff --git a/translate-all.c b/translate-all.c
index 2c8baf5..cf828aa 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -82,7 +82,11 @@
 #endif
 
 #ifdef CONFIG_SOFTMMU
-#define assert_memory_lock() do { /* nothing */ } while (0)
+#define assert_memory_lock() do {           \
+        if (DEBUG_MEM_LOCKS) {              \
+            g_assert(have_tb_lock);         \
+        }                                   \
+    } while (0)
 #else
 #define assert_memory_lock() do {               \
         if (DEBUG_MEM_LOCKS) {                  \
@@ -146,9 +150,7 @@ TCGContext tcg_ctx;
 bool parallel_cpus;
 
 /* translation block context */
-#ifdef CONFIG_USER_ONLY
 __thread int have_tb_lock;
-#endif
 
 static void page_table_config_init(void)
 {
@@ -172,30 +174,24 @@ static void page_table_config_init(void)
 
 void tb_lock(void)
 {
-#ifdef CONFIG_USER_ONLY
     assert(!have_tb_lock);
     qemu_mutex_lock(&tcg_ctx.tb_ctx.tb_lock);
     have_tb_lock++;
-#endif
 }
 
 void tb_unlock(void)
 {
-#ifdef CONFIG_USER_ONLY
     assert(have_tb_lock);
     have_tb_lock--;
     qemu_mutex_unlock(&tcg_ctx.tb_ctx.tb_lock);
-#endif
 }
 
 void tb_lock_reset(void)
 {
-#ifdef CONFIG_USER_ONLY
     if (have_tb_lock) {
         qemu_mutex_unlock(&tcg_ctx.tb_ctx.tb_lock);
         have_tb_lock = 0;
     }
-#endif
 }
 
 #ifdef DEBUG_LOCKING
@@ -204,15 +200,11 @@ void tb_lock_reset(void)
 #define DEBUG_TB_LOCKS 0
 #endif
 
-#ifdef CONFIG_SOFTMMU
-#define assert_tb_lock() do { /* nothing */ } while (0)
-#else
 #define assert_tb_lock() do {               \
         if (DEBUG_TB_LOCKS) {               \
             g_assert(have_tb_lock);         \
         }                                   \
     } while (0)
-#endif
 
 
 static TranslationBlock *tb_find_pc(uintptr_t tc_ptr);
-- 
2.10.1

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [Qemu-devel] [PATCH v6 08/19] tcg: enable thread-per-vCPU
  2016-11-09 14:57 [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement Alex Bennée
                   ` (6 preceding siblings ...)
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 07/19] tcg: enable tb_lock() for SoftMMU Alex Bennée
@ 2016-11-09 14:57 ` Alex Bennée
  2016-11-10 16:35   ` Richard Henderson
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 09/19] tcg: handle EXCP_ATOMIC exception for system emulation Alex Bennée
                   ` (12 subsequent siblings)
  20 siblings, 1 reply; 50+ messages in thread
From: Alex Bennée @ 2016-11-09 14:57 UTC (permalink / raw)
  To: pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, rth, peter.maydell,
	claudio.fontana, Alex Bennée, Peter Crosthwaite

There are a couple of changes that occur at the same time here:

  - introduce a single vCPU qemu_tcg_cpu_thread_fn

  One of these is spawned per vCPU with its own Thread and Condition
  variables. qemu_tcg_rr_cpu_thread_fn is the new name for the old
  single threaded function.

  - the TLS current_cpu variable is now live for the lifetime of MTTCG
    vCPU threads. This is for future work where async jobs need to know
    the vCPU context they are operating in.

The user can now switch on multi-thread behaviour to spawn a thread
per-vCPU. For example, a simple kvm-unit-test like:

  ./arm/run ./arm/locking-test.flat -smp 4 -accel tcg,thread=multi

will now use 4 vCPU threads and have an expected FAIL (instead of the
unexpected PASS) as the default mode of the test has no protection when
incrementing a shared variable.

We enable the parallel_cpus flag to ensure we generate correct barrier
and atomic code if supported by the front and backends. As each back end
and front end is updated they can add CONFIG_MTTCG_TARGET and
CONFIG_MTTCG_HOST to their respective make configurations so
default_mttcg_enabled does the right thing.
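
One consequence of a thread per vCPU is that the decision to sleep
changes: a dedicated thread only has to look at its own vCPU, whereas
the shared round-robin thread may sleep only when every vCPU is idle.
The following standalone sketch models that choice. The VCpu struct and
its halted/has_work fields are simplifying assumptions and do not
reproduce the real idle checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced model of a vCPU. */
typedef struct VCpu {
    bool halted;        /* guest executed a halt/wfi style instruction */
    bool has_work;      /* an interrupt or other work is pending */
    struct VCpu *next;
} VCpu;

static bool vcpu_is_idle(const VCpu *cpu)
{
    return cpu->halted && !cpu->has_work;
}

static bool all_vcpus_idle(const VCpu *first)
{
    for (const VCpu *c = first; c; c = c->next) {
        if (!vcpu_is_idle(c)) {
            return false;
        }
    }
    return true;
}

/* multi_threaded: each vCPU thread sleeps on its own idleness.
 * otherwise: the single shared thread sleeps only if all are idle. */
static bool should_sleep(const VCpu *self, const VCpu *first,
                         bool multi_threaded)
{
    assert(self != NULL);
    return multi_threaded ? vcpu_is_idle(self) : all_vcpus_idle(first);
}
```

With one halted and one running vCPU, the halted vCPU's own thread may
sleep in the multi-threaded case, while the shared thread in the
single-threaded case has to keep going.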

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[AJB: Some fixes, conditionally, commit rewording]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

---
v1 (ajb):
  - fix merge conflicts
  - maintain single-thread approach
v2
  - re-base fixes (no longer has tb_find_fast lock tweak ahead)
  - remove bogus break condition on cpu->stop/stopped
  - only process exiting cpus exit_request
  - handle all cpus idle case (fixes shutdown issues)
  - sleep on EXCP_HALTED in mttcg mode (prevent crash on start-up)
  - move icount timer into helper
v3
  - update the commit message
  - rm kick_timer tweaks (move to earlier tcg_current_cpu tweaks)
  - ensure linux-user clears cpu->exit_request in loop
  - purging of global exit_request and tcg_current_cpu in earlier patches
  - fix checkpatch warnings
v4
  - don't break loop on stopped, we may never schedule next in RR mode
  - make sure we flush iorequests of current cpu if we exited on one
  - add tcg_cpu_exec_start/end wraps for async work functions
  - stop killing of current_cpu on loop exit
  - set current_cpu in the single thread function
  - remove sleep special case, add qemu_tcg_should_sleep() for mttcg
  - no need to atomic set cpu->exit_request going into the loop
  - removed extraneous setting of exit_request
  - split tb_lock() part of patch
  - rename single thread fn to qemu_tcg_rr_cpu_thread_fn
v5
  - enable parallel_cpus for MTTCG (for barriers/atomics)
  - expand on CONFIG_ flags in commit message
---
 cpu-exec.c |   5 ---
 cpus.c     | 135 +++++++++++++++++++++++++++++++++++++++++++++++--------------
 2 files changed, 104 insertions(+), 36 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index c4c7a06..aa8318d 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -392,7 +392,6 @@ static inline bool cpu_handle_halt(CPUState *cpu)
         }
 #endif
         if (!cpu_has_work(cpu)) {
-            current_cpu = NULL;
             return true;
         }
 
@@ -536,7 +535,6 @@ static inline void cpu_handle_interrupt(CPUState *cpu,
 
 
     if (unlikely(atomic_read(&cpu->exit_request) || replay_has_interrupt())) {
-        atomic_set(&cpu->exit_request, 0);
         cpu->exception_index = EXCP_INTERRUPT;
         cpu_loop_exit(cpu);
     }
@@ -671,8 +669,5 @@ int cpu_exec(CPUState *cpu)
     cc->cpu_exec_exit(cpu);
     rcu_read_unlock();
 
-    /* fail safe : never use current_cpu outside cpu_exec() */
-    current_cpu = NULL;
-
     return ret;
 }
diff --git a/cpus.c b/cpus.c
index 6d43f32..b8d8b87 100644
--- a/cpus.c
+++ b/cpus.c
@@ -44,6 +44,7 @@
 #include "qemu/main-loop.h"
 #include "qemu/bitmap.h"
 #include "qemu/seqlock.h"
+#include "tcg.h"
 #include "qapi-event.h"
 #include "hw/nmi.h"
 #include "sysemu/replay.h"
@@ -777,7 +778,7 @@ static void kick_tcg_thread(void *opaque)
 
 static void start_tcg_kick_timer(void)
 {
-    if (!tcg_kick_vcpu_timer && CPU_NEXT(first_cpu)) {
+    if (!mttcg_enabled && !tcg_kick_vcpu_timer && CPU_NEXT(first_cpu)) {
         tcg_kick_vcpu_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL,  kick_tcg_thread, NULL);
         timer_mod(tcg_kick_vcpu_timer, qemu_tcg_next_kick());
     }
@@ -1030,27 +1031,34 @@ static void qemu_tcg_destroy_vcpu(CPUState *cpu)
 
 static void qemu_wait_io_event_common(CPUState *cpu)
 {
+    atomic_mb_set(&cpu->thread_kicked, false);
     if (cpu->stop) {
         cpu->stop = false;
         cpu->stopped = true;
         qemu_cond_broadcast(&qemu_pause_cond);
     }
     process_queued_cpu_work(cpu);
-    cpu->thread_kicked = false;
+}
+
+static bool qemu_tcg_should_sleep(CPUState *cpu)
+{
+    if (mttcg_enabled) {
+        return cpu_thread_is_idle(cpu);
+    } else {
+        return all_cpu_threads_idle();
+    }
 }
 
 static void qemu_tcg_wait_io_event(CPUState *cpu)
 {
-    while (all_cpu_threads_idle()) {
+    while (qemu_tcg_should_sleep(cpu)) {
         stop_tcg_kick_timer();
         qemu_cond_wait(cpu->halt_cond, &qemu_global_mutex);
     }
 
     start_tcg_kick_timer();
 
-    CPU_FOREACH(cpu) {
-        qemu_wait_io_event_common(cpu);
-    }
+    qemu_wait_io_event_common(cpu);
 }
 
 static void qemu_kvm_wait_io_event(CPUState *cpu)
@@ -1121,6 +1129,7 @@ static void *qemu_dummy_cpu_thread_fn(void *arg)
     qemu_thread_get_self(cpu->thread);
     cpu->thread_id = qemu_get_thread_id();
     cpu->can_do_io = 1;
+    current_cpu = cpu;
 
     sigemptyset(&waitset);
     sigaddset(&waitset, SIG_IPI);
@@ -1129,9 +1138,7 @@ static void *qemu_dummy_cpu_thread_fn(void *arg)
     cpu->created = true;
     qemu_cond_signal(&qemu_cpu_cond);
 
-    current_cpu = cpu;
     while (1) {
-        current_cpu = NULL;
         qemu_mutex_unlock_iothread();
         do {
             int sig;
@@ -1142,7 +1149,6 @@ static void *qemu_dummy_cpu_thread_fn(void *arg)
             exit(1);
         }
         qemu_mutex_lock_iothread();
-        current_cpu = cpu;
         qemu_wait_io_event_common(cpu);
     }
 
@@ -1254,7 +1260,7 @@ static void deal_with_unplugged_cpus(void)
  * elsewhere.
  */
 
-static void *qemu_tcg_cpu_thread_fn(void *arg)
+static void *qemu_tcg_rr_cpu_thread_fn(void *arg)
 {
     CPUState *cpu = arg;
 
@@ -1276,6 +1282,7 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
 
         /* process any pending work */
         CPU_FOREACH(cpu) {
+            current_cpu = cpu;
             qemu_wait_io_event_common(cpu);
         }
     }
@@ -1297,6 +1304,7 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
 
         while (cpu && !cpu->exit_request) {
             atomic_mb_set(&tcg_current_rr_cpu, cpu);
+            current_cpu = cpu;
 
             qemu_clock_enable(QEMU_CLOCK_VIRTUAL,
                               (cpu->singlestep_enabled & SSTEP_NOTIMER) == 0);
@@ -1308,7 +1316,7 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
                     cpu_handle_guest_debug(cpu);
                     break;
                 }
-            } else if (cpu->stop || cpu->stopped) {
+            } else if (cpu->stop) {
                 if (cpu->unplug) {
                     cpu = CPU_NEXT(cpu);
                 }
@@ -1327,13 +1335,71 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
 
         handle_icount_deadline();
 
-        qemu_tcg_wait_io_event(QTAILQ_FIRST(&cpus));
+        qemu_tcg_wait_io_event(cpu ? cpu : QTAILQ_FIRST(&cpus));
         deal_with_unplugged_cpus();
     }
 
     return NULL;
 }
 
+/* Multi-threaded TCG
+ *
+ * In the multi-threaded case each vCPU has its own thread. The TLS
+ * variable current_cpu can be used deep in the code to find the
+ * current CPUState for a given thread.
+ */
+
+static void *qemu_tcg_cpu_thread_fn(void *arg)
+{
+    CPUState *cpu = arg;
+
+    rcu_register_thread();
+
+    qemu_mutex_lock_iothread();
+    qemu_thread_get_self(cpu->thread);
+
+    cpu->thread_id = qemu_get_thread_id();
+    cpu->created = true;
+    cpu->can_do_io = 1;
+    current_cpu = cpu;
+    qemu_cond_signal(&qemu_cpu_cond);
+
+    /* process any pending work */
+    cpu->exit_request = 1;
+
+    while (1) {
+        if (cpu_can_run(cpu)) {
+            int r;
+            r = tcg_cpu_exec(cpu);
+            switch (r) {
+            case EXCP_DEBUG:
+                cpu_handle_guest_debug(cpu);
+                break;
+            case EXCP_HALTED:
+                /* during start-up the vCPU is reset and the thread is
+                 * kicked several times. If we don't ensure we go back
+                 * to sleep in the halted state we won't cleanly
+                 * start-up when the vCPU is enabled.
+                 *
+                 * cpu->halted should ensure we sleep in wait_io_event
+                 */
+                g_assert(cpu->halted);
+                break;
+            default:
+                /* Ignore everything else? */
+                break;
+            }
+        }
+
+        handle_icount_deadline();
+
+        atomic_mb_set(&cpu->exit_request, 0);
+        qemu_tcg_wait_io_event(cpu);
+    }
+
+    return NULL;
+}
+
 static void qemu_cpu_kick_thread(CPUState *cpu)
 {
 #ifndef _WIN32
@@ -1358,7 +1424,7 @@ void qemu_cpu_kick(CPUState *cpu)
     qemu_cond_broadcast(cpu->halt_cond);
     if (tcg_enabled()) {
         cpu_exit(cpu);
-        /* Also ensure current RR cpu is kicked */
+        /* NOP unless doing single-thread RR */
         qemu_cpu_kick_rr_cpu();
     } else {
         qemu_cpu_kick_thread(cpu);
@@ -1427,13 +1493,6 @@ void pause_all_vcpus(void)
 
     if (qemu_in_vcpu_thread()) {
         cpu_stop_current();
-        if (!kvm_enabled()) {
-            CPU_FOREACH(cpu) {
-                cpu->stop = false;
-                cpu->stopped = true;
-            }
-            return;
-        }
     }
 
     while (!all_vcpus_paused()) {
@@ -1482,29 +1541,43 @@ void cpu_remove_sync(CPUState *cpu)
 static void qemu_tcg_init_vcpu(CPUState *cpu)
 {
     char thread_name[VCPU_THREAD_NAME_SIZE];
-    static QemuCond *tcg_halt_cond;
-    static QemuThread *tcg_cpu_thread;
+    static QemuCond *single_tcg_halt_cond;
+    static QemuThread *single_tcg_cpu_thread;
 
-    /* share a single thread for all cpus with TCG */
-    if (!tcg_cpu_thread) {
+    if (qemu_tcg_mttcg_enabled() || !single_tcg_cpu_thread) {
+        parallel_cpus = true;
         cpu->thread = g_malloc0(sizeof(QemuThread));
         cpu->halt_cond = g_malloc0(sizeof(QemuCond));
         qemu_cond_init(cpu->halt_cond);
-        tcg_halt_cond = cpu->halt_cond;
-        snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG",
+
+        if (qemu_tcg_mttcg_enabled()) {
+            /* create a thread per vCPU with TCG (MTTCG) */
+            snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG",
                  cpu->cpu_index);
-        qemu_thread_create(cpu->thread, thread_name, qemu_tcg_cpu_thread_fn,
-                           cpu, QEMU_THREAD_JOINABLE);
+
+            qemu_thread_create(cpu->thread, thread_name, qemu_tcg_cpu_thread_fn,
+                               cpu, QEMU_THREAD_JOINABLE);
+
+        } else {
+            /* share a single thread for all cpus with TCG */
+            snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "ALL CPUs/TCG");
+            qemu_thread_create(cpu->thread, thread_name,
+                               qemu_tcg_rr_cpu_thread_fn,
+                               cpu, QEMU_THREAD_JOINABLE);
+
+            single_tcg_halt_cond = cpu->halt_cond;
+            single_tcg_cpu_thread = cpu->thread;
+        }
 #ifdef _WIN32
         cpu->hThread = qemu_thread_get_handle(cpu->thread);
 #endif
         while (!cpu->created) {
             qemu_cond_wait(&qemu_cpu_cond, &qemu_global_mutex);
         }
-        tcg_cpu_thread = cpu->thread;
     } else {
-        cpu->thread = tcg_cpu_thread;
-        cpu->halt_cond = tcg_halt_cond;
+        /* For non-MTTCG cases we share the thread */
+        cpu->thread = single_tcg_cpu_thread;
+        cpu->halt_cond = single_tcg_halt_cond;
     }
 }
 
-- 
2.10.1

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [Qemu-devel] [PATCH v6 09/19] tcg: handle EXCP_ATOMIC exception for system emulation
  2016-11-09 14:57 [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement Alex Bennée
                   ` (7 preceding siblings ...)
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 08/19] tcg: enable thread-per-vCPU Alex Bennée
@ 2016-11-09 14:57 ` Alex Bennée
  2016-11-10 16:36   ` Richard Henderson
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 10/19] cputlb: add assert_cpu_is_self checks Alex Bennée
                   ` (11 subsequent siblings)
  20 siblings, 1 reply; 50+ messages in thread
From: Alex Bennée @ 2016-11-09 14:57 UTC (permalink / raw)
  To: pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, rth, peter.maydell,
	claudio.fontana, Alex Bennée, Peter Crosthwaite

From: Pranith Kumar <bobby.prani@gmail.com>

The patch enables handling atomic code in the guest. This should
preferably be done in cpu_handle_exception(), but the current assumptions
regarding when we can execute atomic sections cause a deadlock.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
[AJB: tweak title]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 cpus.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/cpus.c b/cpus.c
index b8d8b87..1ebe518 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1315,6 +1315,11 @@ static void *qemu_tcg_rr_cpu_thread_fn(void *arg)
                 if (r == EXCP_DEBUG) {
                     cpu_handle_guest_debug(cpu);
                     break;
+                } else if (r == EXCP_ATOMIC) {
+                    qemu_mutex_unlock_iothread();
+                    cpu_exec_step_atomic(cpu);
+                    qemu_mutex_lock_iothread();
+                    break;
                 }
             } else if (cpu->stop) {
                 if (cpu->unplug) {
@@ -1385,6 +1390,10 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
                  */
                 g_assert(cpu->halted);
                 break;
+            case EXCP_ATOMIC:
+                qemu_mutex_unlock_iothread();
+                cpu_exec_step_atomic(cpu);
+                qemu_mutex_lock_iothread();
             default:
                 /* Ignore everything else? */
                 break;
-- 
2.10.1

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [Qemu-devel] [PATCH v6 10/19] cputlb: add assert_cpu_is_self checks
  2016-11-09 14:57 [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement Alex Bennée
                   ` (8 preceding siblings ...)
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 09/19] tcg: handle EXCP_ATOMIC exception for system emulation Alex Bennée
@ 2016-11-09 14:57 ` Alex Bennée
  2016-11-10 16:39   ` Richard Henderson
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 11/19] cputlb: introduce tlb_flush_* async work Alex Bennée
                   ` (10 subsequent siblings)
  20 siblings, 1 reply; 50+ messages in thread
From: Alex Bennée @ 2016-11-09 14:57 UTC (permalink / raw)
  To: pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, rth, peter.maydell,
	claudio.fontana, Alex Bennée, Peter Crosthwaite

For SoftMMU the TLB flushes are an example of a task that can be
triggered on one vCPU by another. To deal with this properly we need to
use safe work to ensure these changes are done safely. The new assert
can be enabled while debugging to catch these cases.
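
The shape of such a check, a debug-gated assertion that the caller is
the thread owning the vCPU (or that the vCPU has no thread yet), can be
sketched in plain C. DEBUG_OWNER_GATE, VCpu and vcpu_flush_local() are
hypothetical names for this sketch, and pthread_self()/pthread_equal()
are assumed as the way to identify the owning thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Compile-time gate: set to 0 to turn the check into dead code. */
#ifndef DEBUG_OWNER_GATE
#define DEBUG_OWNER_GATE 1
#endif

/* Hypothetical stand-in for CPUState. */
typedef struct VCpu {
    bool created;       /* false until the vCPU thread is running */
    pthread_t thread;   /* thread that owns this vCPU once created */
} VCpu;

static bool vcpu_is_self(const VCpu *cpu)
{
    return pthread_equal(cpu->thread, pthread_self()) != 0;
}

/* Note that the macro body uses its own parameter, not a variable that
 * happens to be in scope at the call site. */
#define assert_vcpu_is_self(this_cpu) do {                          \
        if (DEBUG_OWNER_GATE) {                                     \
            assert(!(this_cpu)->created ||                          \
                   vcpu_is_self(this_cpu));                         \
        }                                                           \
    } while (0)

static int flush_count;

/* Stand-in for an operation that must only run on the owning thread. */
static int vcpu_flush_local(VCpu *cpu)
{
    assert_vcpu_is_self(cpu);
    return ++flush_count;
}
```

A vCPU that is not yet created passes the check from any thread, which
is what allows set-up code to touch it before its thread starts.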

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 cputlb.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/cputlb.c b/cputlb.c
index c6e34f4..30c7c37 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -58,6 +58,12 @@
     } \
 } while (0)
 
+#define assert_cpu_is_self(this_cpu) do {                         \
+        if (DEBUG_TLB_GATE) {                                     \
+            g_assert(!(this_cpu)->created || qemu_cpu_is_self(this_cpu)); \
+        }                                                         \
+    } while (0)
+
 /* statistics */
 int tlb_flush_count;
 
@@ -77,6 +83,7 @@ void tlb_flush(CPUState *cpu, int flush_global)
 {
     CPUArchState *env = cpu->env_ptr;
 
+    assert_cpu_is_self(cpu);
     tlb_debug("(%d)\n", flush_global);
 
     memset(env->tlb_table, -1, sizeof(env->tlb_table));
@@ -93,6 +100,7 @@ static inline void v_tlb_flush_by_mmuidx(CPUState *cpu, va_list argp)
 {
     CPUArchState *env = cpu->env_ptr;
 
+    assert_cpu_is_self(cpu);
     tlb_debug("start\n");
 
     for (;;) {
@@ -137,6 +145,7 @@ void tlb_flush_page(CPUState *cpu, target_ulong addr)
     int i;
     int mmu_idx;
 
+    assert_cpu_is_self(cpu);
     tlb_debug("page :" TARGET_FMT_lx "\n", addr);
 
     /* Check if we need to flush due to large pages.  */
@@ -174,6 +183,7 @@ void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, ...)
 
     va_start(argp, addr);
 
+    assert_cpu_is_self(cpu);
     tlb_debug("addr "TARGET_FMT_lx"\n", addr);
 
     /* Check if we need to flush due to large pages.  */
@@ -262,6 +272,8 @@ void tlb_reset_dirty(CPUState *cpu, ram_addr_t start1, ram_addr_t length)
 
     int mmu_idx;
 
+    assert_cpu_is_self(cpu);
+
     env = cpu->env_ptr;
     for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
         unsigned int i;
@@ -293,6 +305,8 @@ void tlb_set_dirty(CPUState *cpu, target_ulong vaddr)
     int i;
     int mmu_idx;
 
+    assert_cpu_is_self(cpu);
+
     vaddr &= TARGET_PAGE_MASK;
     i = (vaddr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
     for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
@@ -352,6 +366,7 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
     unsigned vidx = env->vtlb_index++ % CPU_VTLB_SIZE;
     int asidx = cpu_asidx_from_attrs(cpu, attrs);
 
+    assert_cpu_is_self(cpu);
     assert(size >= TARGET_PAGE_SIZE);
     if (size != TARGET_PAGE_SIZE) {
         tlb_add_large_page(env, vaddr, size);
-- 
2.10.1

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [Qemu-devel] [PATCH v6 11/19] cputlb: introduce tlb_flush_* async work.
  2016-11-09 14:57 [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement Alex Bennée
                   ` (9 preceding siblings ...)
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 10/19] cputlb: add assert_cpu_is_self checks Alex Bennée
@ 2016-11-09 14:57 ` Alex Bennée
  2016-11-10 16:48   ` Richard Henderson
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 12/19] cputlb: tweak qemu_ram_addr_from_host_nofail reporting Alex Bennée
                   ` (9 subsequent siblings)
  20 siblings, 1 reply; 50+ messages in thread
From: Alex Bennée @ 2016-11-09 14:57 UTC (permalink / raw)
  To: pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, rth, peter.maydell,
	claudio.fontana, Alex Bennée, Peter Crosthwaite

From: KONRAD Frederic <fred.konrad@greensocs.com>

Some architectures allow a vCPU to flush the TLB of other vCPUs. This is
not a problem when we have only one thread for all vCPUs, but it
definitely needs to be done as asynchronous work when we are running
truly multithreaded.

We take the tb_lock() when doing this to avoid racing with other threads
which may be invalidating TB's at the same time. The alternative would
be to use proper atomic primitives to clear the tlb entries en masse.

This patch doesn't do anything to protect other cputlb functions that
make cross-vCPU changes when called in MTTCG mode.
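
The deferral scheme can be illustrated with a small standalone sketch:
a remote flush request queues work for the target only when a pending
flag goes from clear to set, and the target clears the flag once it has
done the flush. The names (VCpu, request_remote_flush, run_queued_flush)
are hypothetical, the "queue" is reduced to a counter, and C11 atomics
are assumed in place of QEMU's atomic helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for CPUState. */
typedef struct VCpu {
    atomic_bool pending_flush;  /* a flush is already queued */
    int queued_jobs;            /* stands in for the async work queue */
    int flushes_done;
} VCpu;

/* Called from another thread's point of view. Returns true if this
 * call queued new work, false if a flush was already pending. */
static bool request_remote_flush(VCpu *target)
{
    bool expected = false;

    /* Succeeds only when the flag was clear, so repeated requests
     * collapse into a single queued flush. */
    if (atomic_compare_exchange_strong(&target->pending_flush,
                                       &expected, true)) {
        target->queued_jobs++;  /* a real queue needs its own locking */
        return true;
    }
    return false;
}

/* Runs on the target vCPU's own thread when it drains its queue. */
static void run_queued_flush(VCpu *self)
{
    assert(self->queued_jobs > 0);
    self->queued_jobs--;
    self->flushes_done++;       /* stands in for clearing the TLB */
    atomic_store(&self->pending_flush, false);
}
```

Clearing the flag only after the flush has run means a request arriving
during the flush may be dropped as "already pending"; whether that is
acceptable depends on the guest architecture's flush semantics.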

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
[AJB: remove need for g_malloc on defer, make check fixes, tb_lock]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

---
v6 (base patches)
  - don't use cmpxchg_bool (we drop it later anyway)
  - use RUN_ON_CPU macros instead of inlines
  - bug out of tlb_flush if !tcg_enabled() (MacOSX make check failure)
v5 (base patches)
  - take tb_lock() for memset
  - ensure tb_flush_page properly asyncs work for other vCPUs
  - use run_on_cpu_data
v4 (base_patches)
  - brought forward from arm enabling series
  - restore pending_tlb_flush flag
v1
  - Remove tlb_flush_all just do the check in tlb_flush.
  - remove the need to g_malloc
  - tlb_flush calls direct if !cpu->created

fixup! cputlb: introduce tlb_flush_* async work.
---
 cputlb.c                | 90 +++++++++++++++++++++++++++++++++++++++++--------
 include/exec/exec-all.h |  1 +
 include/qom/cpu.h       |  6 ++++
 3 files changed, 83 insertions(+), 14 deletions(-)

diff --git a/cputlb.c b/cputlb.c
index 30c7c37..d75bf8f 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -64,28 +64,29 @@
         }                                                         \
     } while (0)
 
+/* run_on_cpu_data.target_ptr should always be big enough for a
+ * target_ulong even on 32 bit builds */
+QEMU_BUILD_BUG_ON(sizeof(target_ulong) > sizeof(run_on_cpu_data));
+
 /* statistics */
 int tlb_flush_count;
 
-/* NOTE:
- * If flush_global is true (the usual case), flush all tlb entries.
- * If flush_global is false, flush (at least) all tlb entries not
- * marked global.
- *
- * Since QEMU doesn't currently implement a global/not-global flag
- * for tlb entries, at the moment tlb_flush() will also flush all
- * tlb entries in the flush_global == false case. This is OK because
- * CPU architectures generally permit an implementation to drop
- * entries from the TLB at any time, so flushing more entries than
- * required is only an efficiency issue, not a correctness issue.
- */
-void tlb_flush(CPUState *cpu, int flush_global)
+static void tlb_flush_nocheck(CPUState *cpu, int flush_global)
 {
     CPUArchState *env = cpu->env_ptr;
 
+    /* The QOM tests will trigger tlb_flushes without setting up TCG
+     * so we bug out here in that case.
+     */
+    if (!tcg_enabled()) {
+        return;
+    }
+
     assert_cpu_is_self(cpu);
     tlb_debug("(%d)\n", flush_global);
 
+    tb_lock();
+
     memset(env->tlb_table, -1, sizeof(env->tlb_table));
     memset(env->tlb_v_table, -1, sizeof(env->tlb_v_table));
     memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache));
@@ -94,6 +95,39 @@ void tlb_flush(CPUState *cpu, int flush_global)
     env->tlb_flush_addr = -1;
     env->tlb_flush_mask = 0;
     tlb_flush_count++;
+
+    tb_unlock();
+
+    atomic_mb_set(&cpu->pending_tlb_flush, false);
+}
+
+static void tlb_flush_global_async_work(CPUState *cpu, run_on_cpu_data data)
+{
+    tlb_flush_nocheck(cpu, data.host_int);
+}
+
+/* NOTE:
+ * If flush_global is true (the usual case), flush all tlb entries.
+ * If flush_global is false, flush (at least) all tlb entries not
+ * marked global.
+ *
+ * Since QEMU doesn't currently implement a global/not-global flag
+ * for tlb entries, at the moment tlb_flush() will also flush all
+ * tlb entries in the flush_global == false case. This is OK because
+ * CPU architectures generally permit an implementation to drop
+ * entries from the TLB at any time, so flushing more entries than
+ * required is only an efficiency issue, not a correctness issue.
+ */
+void tlb_flush(CPUState *cpu, int flush_global)
+{
+    if (cpu->created && !qemu_cpu_is_self(cpu)) {
+        if (atomic_cmpxchg(&cpu->pending_tlb_flush, false, true) == false) {
+            async_run_on_cpu(cpu, tlb_flush_global_async_work,
+                             RUN_ON_CPU_HOST_INT(flush_global));
+        }
+    } else {
+        tlb_flush_nocheck(cpu, flush_global);
+    }
 }
 
 static inline void v_tlb_flush_by_mmuidx(CPUState *cpu, va_list argp)
@@ -103,6 +137,8 @@ static inline void v_tlb_flush_by_mmuidx(CPUState *cpu, va_list argp)
     assert_cpu_is_self(cpu);
     tlb_debug("start\n");
 
+    tb_lock();
+
     for (;;) {
         int mmu_idx = va_arg(argp, int);
 
@@ -117,6 +153,8 @@ static inline void v_tlb_flush_by_mmuidx(CPUState *cpu, va_list argp)
     }
 
     memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache));
+
+    tb_unlock();
 }
 
 void tlb_flush_by_mmuidx(CPUState *cpu, ...)
@@ -139,13 +177,15 @@ static inline void tlb_flush_entry(CPUTLBEntry *tlb_entry, target_ulong addr)
     }
 }
 
-void tlb_flush_page(CPUState *cpu, target_ulong addr)
+static void tlb_flush_page_async_work(CPUState *cpu, run_on_cpu_data data)
 {
     CPUArchState *env = cpu->env_ptr;
+    target_ulong addr = (target_ulong) data.target_ptr;
     int i;
     int mmu_idx;
 
     assert_cpu_is_self(cpu);
+
     tlb_debug("page :" TARGET_FMT_lx "\n", addr);
 
     /* Check if we need to flush due to large pages.  */
@@ -175,6 +215,18 @@ void tlb_flush_page(CPUState *cpu, target_ulong addr)
     tb_flush_jmp_cache(cpu, addr);
 }
 
+void tlb_flush_page(CPUState *cpu, target_ulong addr)
+{
+    tlb_debug("page :" TARGET_FMT_lx "\n", addr);
+
+    if (!qemu_cpu_is_self(cpu)) {
+        async_run_on_cpu(cpu, tlb_flush_page_async_work,
+                         RUN_ON_CPU_TARGET_PTR(addr));
+    } else {
+        tlb_flush_page_async_work(cpu, RUN_ON_CPU_TARGET_PTR(addr));
+    }
+}
+
 void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, ...)
 {
     CPUArchState *env = cpu->env_ptr;
@@ -221,6 +273,16 @@ void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, ...)
     tb_flush_jmp_cache(cpu, addr);
 }
 
+void tlb_flush_page_all(target_ulong addr)
+{
+    CPUState *cpu;
+
+    CPU_FOREACH(cpu) {
+        async_run_on_cpu(cpu, tlb_flush_page_async_work,
+                         RUN_ON_CPU_TARGET_PTR(addr));
+    }
+}
+
 /* update the TLBs so that writes to code in the virtual page 'addr'
    can be detected */
 void tlb_protect_code(ram_addr_t ram_addr)
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 37781e0..e4f7839 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -160,6 +160,7 @@ void tlb_set_page(CPUState *cpu, target_ulong vaddr,
 void tb_invalidate_phys_addr(AddressSpace *as, hwaddr addr);
 void probe_write(CPUArchState *env, target_ulong addr, int mmu_idx,
                  uintptr_t retaddr);
+void tlb_flush_page_all(target_ulong addr);
 #else
 static inline void tlb_flush_page(CPUState *cpu, target_ulong addr)
 {
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 1735374..880ba42 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -393,6 +393,12 @@ struct CPUState {
        (absolute value) offset as small as possible.  This reduces code
        size, especially for hosts without large memory offsets.  */
     uint32_t tcg_exit_req;
+
+    /* The pending_tlb_flush flag is set and cleared atomically to
+     * avoid potential races. The aim of the flag is to avoid
+     * unnecessary flushes.
+     */
+    bool pending_tlb_flush;
 };
 
 QTAILQ_HEAD(CPUTailQ, CPUState);
-- 
2.10.1

* [Qemu-devel] [PATCH v6 12/19] cputlb: tweak qemu_ram_addr_from_host_nofail reporting
  2016-11-09 14:57 [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement Alex Bennée
                   ` (10 preceding siblings ...)
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 11/19] cputlb: introduce tlb_flush_* async work Alex Bennée
@ 2016-11-09 14:57 ` Alex Bennée
  2016-11-10 16:51   ` Richard Henderson
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 13/19] cputlb: atomically update tlb fields used by tlb_reset_dirty Alex Bennée
                   ` (8 subsequent siblings)
  20 siblings, 1 reply; 50+ messages in thread
From: Alex Bennée @ 2016-11-09 14:57 UTC (permalink / raw)
  To: pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, rth, peter.maydell,
	claudio.fontana, Alex Bennée, Peter Crosthwaite

This moves the helper function closer to where it is called and updates
the error message to report via error_report instead of the deprecated
fprintf.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 cputlb.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/cputlb.c b/cputlb.c
index d75bf8f..cd1ff71 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -316,18 +316,6 @@ void tlb_reset_dirty_range(CPUTLBEntry *tlb_entry, uintptr_t start,
     }
 }
 
-static inline ram_addr_t qemu_ram_addr_from_host_nofail(void *ptr)
-{
-    ram_addr_t ram_addr;
-
-    ram_addr = qemu_ram_addr_from_host(ptr);
-    if (ram_addr == RAM_ADDR_INVALID) {
-        fprintf(stderr, "Bad ram pointer %p\n", ptr);
-        abort();
-    }
-    return ram_addr;
-}
-
 void tlb_reset_dirty(CPUState *cpu, ram_addr_t start1, ram_addr_t length)
 {
     CPUArchState *env;
@@ -539,6 +527,18 @@ static void report_bad_exec(CPUState *cpu, target_ulong addr)
     log_cpu_state_mask(LOG_GUEST_ERROR, cpu, CPU_DUMP_FPU | CPU_DUMP_CCOP);
 }
 
+static inline ram_addr_t qemu_ram_addr_from_host_nofail(void *ptr)
+{
+    ram_addr_t ram_addr;
+
+    ram_addr = qemu_ram_addr_from_host(ptr);
+    if (ram_addr == RAM_ADDR_INVALID) {
+        error_report("Bad ram pointer %p", ptr);
+        abort();
+    }
+    return ram_addr;
+}
+
 /* NOTE: this function can trigger an exception */
 /* NOTE2: the returned address is not exactly the physical address: it
  * is actually a ram_addr_t (in system mode; the user mode emulation
-- 
2.10.1

* [Qemu-devel] [PATCH v6 13/19] cputlb: atomically update tlb fields used by tlb_reset_dirty
  2016-11-09 14:57 [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement Alex Bennée
                   ` (11 preceding siblings ...)
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 12/19] cputlb: tweak qemu_ram_addr_from_host_nofail reporting Alex Bennée
@ 2016-11-09 14:57 ` Alex Bennée
  2016-11-09 19:36   ` Pranith Kumar
  2016-11-10 17:23   ` Richard Henderson
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 14/19] target-arm/powerctl: defer cpu reset work to CPU context Alex Bennée
                   ` (7 subsequent siblings)
  20 siblings, 2 replies; 50+ messages in thread
From: Alex Bennée @ 2016-11-09 14:57 UTC (permalink / raw)
  To: pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, rth, peter.maydell,
	claudio.fontana, Alex Bennée, Peter Crosthwaite

The main use case for tlb_reset_dirty is to set the TLB_NOTDIRTY flag
in TLB entries to force the slow path on writes. This is used to mark
page ranges containing translated code so that the code can be
invalidated if it is written to. To do this safely we need to ensure
the relevant TLB entries for all vCPUs are updated before we attempt to
run the code, otherwise a race could be introduced.

To achieve this we atomically set the flag in tlb_reset_dirty_range and
take care when setting it as the TLB entry is filled.

The helper function is made static as it isn't used outside of cputlb.
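
As a rough illustration of the approach (not the QEMU code itself), the
cross-thread flag update can be sketched with C11 atomics. The DEMO_*
names and demo_mark_notdirty() are hypothetical stand-ins for the TLB
flag bits and for tlb_reset_dirty_range; the point is only that the flag
is written if, and only if, the entry still holds the value that was
sampled earlier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for TLB_INVALID_MASK, TLB_NOTDIRTY and TLB_MMIO. */
#define DEMO_INVALID  ((uintptr_t)1 << 3)
#define DEMO_NOTDIRTY ((uintptr_t)1 << 4)
#define DEMO_MMIO     ((uintptr_t)1 << 5)

/*
 * Try to mark an entry not-dirty from another thread. "snapshot" is the
 * value the caller read earlier. The flag is only written if the entry
 * still holds that value, so a concurrent refill by the owning thread
 * always wins. Returns true if the flag was written.
 */
static bool demo_mark_notdirty(_Atomic uintptr_t *addr_write,
                               uintptr_t snapshot)
{
    if (snapshot & (DEMO_INVALID | DEMO_MMIO | DEMO_NOTDIRTY)) {
        return false; /* writes already take the slow path */
    }
    return atomic_compare_exchange_strong(addr_write, &snapshot,
                                          snapshot | DEMO_NOTDIRTY);
}
```

If the owning vCPU refills the entry between the read and the
compare-and-swap, the swap fails and the update is simply skipped.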

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

---
v6
  - use TARGET_PAGE_BITS_MIN
  - use run_on_cpu helpers
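
For reviewers, a minimal standalone sketch of why the build-time check
against TARGET_PAGE_BITS_MIN matters: the mmuidx bitmap rides in the low
bits of a page-aligned address so that both fit in one run_on_cpu_data
word. All DEMO_* constants and demo_* helpers below are hypothetical;
the real values are per-target build constants.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values chosen for illustration only. */
#define DEMO_PAGE_BITS    10
#define DEMO_PAGE_MASK    (~(((uint64_t)1 << DEMO_PAGE_BITS) - 1))
#define DEMO_NB_MMU_MODES 7
#define DEMO_ALL_MMUIDX   ((((uint64_t)1) << DEMO_NB_MMU_MODES) - 1)

/* The bitmap must fit entirely below the page number bits. */
_Static_assert(DEMO_NB_MMU_MODES <= DEMO_PAGE_BITS,
               "mmuidx bitmap would overlap the page number");

/* Combine a page address and an mmuidx bitmap into one word. */
static uint64_t demo_pack(uint64_t addr, uint64_t mmu_idx_bitmap)
{
    return (addr & DEMO_PAGE_MASK) | (mmu_idx_bitmap & DEMO_ALL_MMUIDX);
}

/* Recover the page address from a packed word. */
static uint64_t demo_unpack_addr(uint64_t packed)
{
    return packed & DEMO_PAGE_MASK;
}

/* Recover the mmuidx bitmap from a packed word. */
static uint64_t demo_unpack_bitmap(uint64_t packed)
{
    return packed & DEMO_ALL_MMUIDX;
}
```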
---
 cputlb.c              | 250 +++++++++++++++++++++++++++++++++++++-------------
 include/exec/cputlb.h |   2 -
 include/qom/cpu.h     |  12 +--
 3 files changed, 194 insertions(+), 70 deletions(-)

diff --git a/cputlb.c b/cputlb.c
index cd1ff71..ae94b7f 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -68,6 +68,11 @@
  * target_ulong even on 32 bit builds */
 QEMU_BUILD_BUG_ON(sizeof(target_ulong) > sizeof(run_on_cpu_data));
 
+/* We currently can't handle more than 16 bits in the MMUIDX bitmask.
+ */
+QEMU_BUILD_BUG_ON(NB_MMU_MODES > 16);
+#define ALL_MMUIDX_BITS ((1 << NB_MMU_MODES) - 1)
+
 /* statistics */
 int tlb_flush_count;
 
@@ -98,7 +103,7 @@ static void tlb_flush_nocheck(CPUState *cpu, int flush_global)
 
     tb_unlock();
 
-    atomic_mb_set(&cpu->pending_tlb_flush, false);
+    atomic_mb_set(&cpu->pending_tlb_flush, 0);
 }
 
 static void tlb_flush_global_async_work(CPUState *cpu, run_on_cpu_data data)
@@ -121,7 +126,8 @@ static void tlb_flush_global_async_work(CPUState *cpu, run_on_cpu_data data)
 void tlb_flush(CPUState *cpu, int flush_global)
 {
     if (cpu->created && !qemu_cpu_is_self(cpu)) {
-        if (atomic_cmpxchg(&cpu->pending_tlb_flush, false, true) == true) {
+        if (atomic_mb_read(&cpu->pending_tlb_flush) != ALL_MMUIDX_BITS) {
+            atomic_mb_set(&cpu->pending_tlb_flush, ALL_MMUIDX_BITS);
             async_run_on_cpu(cpu, tlb_flush_global_async_work,
                              RUN_ON_CPU_HOST_INT(flush_global));
         }
@@ -130,39 +136,78 @@ void tlb_flush(CPUState *cpu, int flush_global)
     }
 }
 
-static inline void v_tlb_flush_by_mmuidx(CPUState *cpu, va_list argp)
+static void tlb_flush_by_mmuidx_async_work(CPUState *cpu, run_on_cpu_data data)
 {
     CPUArchState *env = cpu->env_ptr;
+    unsigned long mmu_idx_bitmask = data.host_ulong;
+    int mmu_idx;
 
     assert_cpu_is_self(cpu);
-    tlb_debug("start\n");
 
     tb_lock();
 
-    for (;;) {
-        int mmu_idx = va_arg(argp, int);
+    tlb_debug("start: mmu_idx:0x%04lx\n", mmu_idx_bitmask);
 
-        if (mmu_idx < 0) {
-            break;
-        }
+    for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
 
-        tlb_debug("%d\n", mmu_idx);
+        if (test_bit(mmu_idx, &mmu_idx_bitmask)) {
+            tlb_debug("%d\n", mmu_idx);
 
-        memset(env->tlb_table[mmu_idx], -1, sizeof(env->tlb_table[0]));
-        memset(env->tlb_v_table[mmu_idx], -1, sizeof(env->tlb_v_table[0]));
+            memset(env->tlb_table[mmu_idx], -1, sizeof(env->tlb_table[0]));
+            memset(env->tlb_v_table[mmu_idx], -1, sizeof(env->tlb_v_table[0]));
+        }
     }
 
     memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache));
 
+    tlb_debug("done\n");
+
     tb_unlock();
 }
 
+/* Helper function to slurp va_args list into a bitmap
+ */
+static inline unsigned long make_mmu_index_bitmap(va_list args)
+{
+    unsigned long bitmap = 0;
+    int mmu_index = va_arg(args, int);
+
+    /* An empty va_list would be a bad call */
+    g_assert(mmu_index >= 0);
+
+    do {
+        set_bit(mmu_index, &bitmap);
+        mmu_index = va_arg(args, int);
+    } while (mmu_index >= 0);
+
+    return bitmap;
+}
+
 void tlb_flush_by_mmuidx(CPUState *cpu, ...)
 {
     va_list argp;
+    unsigned long mmu_idx_bitmap;
+
     va_start(argp, cpu);
-    v_tlb_flush_by_mmuidx(cpu, argp);
+    mmu_idx_bitmap = make_mmu_index_bitmap(argp);
     va_end(argp);
+
+    tlb_debug("mmu_idx: 0x%04lx\n", mmu_idx_bitmap);
+
+    if (!qemu_cpu_is_self(cpu)) {
+        uint16_t pending_flushes =
+            mmu_idx_bitmap & ~atomic_mb_read(&cpu->pending_tlb_flush);
+        if (pending_flushes) {
+            tlb_debug("reduced mmu_idx: 0x%" PRIx16 "\n", pending_flushes);
+
+            atomic_or(&cpu->pending_tlb_flush, pending_flushes);
+            async_run_on_cpu(cpu, tlb_flush_by_mmuidx_async_work,
+                             RUN_ON_CPU_HOST_ULONG(pending_flushes));
+        }
+    } else {
+        tlb_flush_by_mmuidx_async_work(cpu,
+                                       RUN_ON_CPU_HOST_ULONG(mmu_idx_bitmap));
+    }
 }
 
 static inline void tlb_flush_entry(CPUTLBEntry *tlb_entry, target_ulong addr)
@@ -227,16 +272,50 @@ void tlb_flush_page(CPUState *cpu, target_ulong addr)
     }
 }
 
-void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, ...)
+/* As we are going to hijack the bottom bits of the page address for a
+ * mmuidx bit mask we need to fail to build if we can't do that
+ */
+QEMU_BUILD_BUG_ON(NB_MMU_MODES > TARGET_PAGE_BITS_MIN);
+
+static void tlb_flush_page_by_mmuidx_async_work(CPUState *cpu,
+                                                run_on_cpu_data data)
 {
     CPUArchState *env = cpu->env_ptr;
-    int i, k;
-    va_list argp;
-
-    va_start(argp, addr);
+    target_ulong addr_and_mmuidx = (target_ulong) data.target_ptr;
+    target_ulong addr = addr_and_mmuidx & TARGET_PAGE_MASK;
+    unsigned long mmu_idx_bitmap = addr_and_mmuidx & ALL_MMUIDX_BITS;
+    int page = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
+    int mmu_idx;
+    int i;
 
     assert_cpu_is_self(cpu);
-    tlb_debug("addr "TARGET_FMT_lx"\n", addr);
+
+    tlb_debug("page:%d addr:"TARGET_FMT_lx" mmu_idx:0x%lx\n",
+              page, addr, mmu_idx_bitmap);
+
+    for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
+        if (test_bit(mmu_idx, &mmu_idx_bitmap)) {
+            tlb_flush_entry(&env->tlb_table[mmu_idx][page], addr);
+
+            /* check whether there are vtlb entries that need to be flushed */
+            for (i = 0; i < CPU_VTLB_SIZE; i++) {
+                tlb_flush_entry(&env->tlb_v_table[mmu_idx][i], addr);
+            }
+        }
+    }
+
+    tb_flush_jmp_cache(cpu, addr);
+}
+
+static void tlb_check_page_and_flush_by_mmuidx_async_work(CPUState *cpu,
+                                                          run_on_cpu_data data)
+{
+    CPUArchState *env = cpu->env_ptr;
+    target_ulong addr_and_mmuidx = (target_ulong) data.target_ptr;
+    target_ulong addr = addr_and_mmuidx & TARGET_PAGE_MASK;
+    unsigned long mmu_idx_bitmap = addr_and_mmuidx & ALL_MMUIDX_BITS;
+
+    tlb_debug("addr:"TARGET_FMT_lx" mmu_idx: %04lx\n", addr, mmu_idx_bitmap);
 
     /* Check if we need to flush due to large pages.  */
     if ((addr & env->tlb_flush_mask) == env->tlb_flush_addr) {
@@ -244,33 +323,35 @@ void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, ...)
                   TARGET_FMT_lx "/" TARGET_FMT_lx ")\n",
                   env->tlb_flush_addr, env->tlb_flush_mask);
 
-        v_tlb_flush_by_mmuidx(cpu, argp);
-        va_end(argp);
-        return;
+        tlb_flush_by_mmuidx_async_work(cpu, RUN_ON_CPU_HOST_ULONG(mmu_idx_bitmap));
+    } else {
+        tlb_flush_page_by_mmuidx_async_work(cpu, data);
     }
+}
 
-    addr &= TARGET_PAGE_MASK;
-    i = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
-
-    for (;;) {
-        int mmu_idx = va_arg(argp, int);
+void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, ...)
+{
+    unsigned long mmu_idx_bitmap;
+    target_ulong addr_and_mmu_idx;
+    va_list argp;
 
-        if (mmu_idx < 0) {
-            break;
-        }
+    va_start(argp, addr);
+    mmu_idx_bitmap = make_mmu_index_bitmap(argp);
+    va_end(argp);
 
-        tlb_debug("idx %d\n", mmu_idx);
+    tlb_debug("addr: "TARGET_FMT_lx" mmu_idx:%lx\n", addr, mmu_idx_bitmap);
 
-        tlb_flush_entry(&env->tlb_table[mmu_idx][i], addr);
+    /* This should already be page aligned */
+    addr_and_mmu_idx = addr & TARGET_PAGE_MASK;
+    addr_and_mmu_idx |= mmu_idx_bitmap;
 
-        /* check whether there are vltb entries that need to be flushed */
-        for (k = 0; k < CPU_VTLB_SIZE; k++) {
-            tlb_flush_entry(&env->tlb_v_table[mmu_idx][k], addr);
-        }
+    if (!qemu_cpu_is_self(cpu)) {
+        async_run_on_cpu(cpu, tlb_check_page_and_flush_by_mmuidx_async_work,
+                         RUN_ON_CPU_TARGET_PTR(addr_and_mmu_idx));
+    } else {
+        tlb_check_page_and_flush_by_mmuidx_async_work(
+            cpu, RUN_ON_CPU_TARGET_PTR(addr_and_mmu_idx));
     }
-    va_end(argp);
-
-    tb_flush_jmp_cache(cpu, addr);
 }
 
 void tlb_flush_page_all(target_ulong addr)
@@ -298,32 +379,50 @@ void tlb_unprotect_code(ram_addr_t ram_addr)
     cpu_physical_memory_set_dirty_flag(ram_addr, DIRTY_MEMORY_CODE);
 }
 
-static bool tlb_is_dirty_ram(CPUTLBEntry *tlbe)
-{
-    return (tlbe->addr_write & (TLB_INVALID_MASK|TLB_MMIO|TLB_NOTDIRTY)) == 0;
-}
 
-void tlb_reset_dirty_range(CPUTLBEntry *tlb_entry, uintptr_t start,
+/*
+ * Dirty write flag handling
+ *
+ * When the TCG code writes to a location it looks up the address in
+ * the TLB and uses that data to compute the final address. If any of
+ * the lower bits of the address are set then the slow path is forced.
+ * There are a number of reasons to do this but for normal RAM the
+ * most usual is detecting writes to code regions which may invalidate
+ * generated code.
+ *
+ * Because we want other vCPUs to respond to changes straight away we
+ * update the te->addr_write field atomically. If the TLB entry has
+ * been changed by the vCPU in the meantime we skip the update.
+ */
+
+static void tlb_reset_dirty_range(CPUTLBEntry *tlb_entry, uintptr_t start,
                            uintptr_t length)
 {
-    uintptr_t addr;
+    /* paired with atomic_mb_set in tlb_set_page_with_attrs */
+    uintptr_t orig_addr = atomic_mb_read(&tlb_entry->addr_write);
+    uintptr_t addr = orig_addr;
 
-    if (tlb_is_dirty_ram(tlb_entry)) {
-        addr = (tlb_entry->addr_write & TARGET_PAGE_MASK) + tlb_entry->addend;
+    if ((addr & (TLB_INVALID_MASK | TLB_MMIO | TLB_NOTDIRTY)) == 0) {
+        addr &= TARGET_PAGE_MASK;
+        addr += atomic_read(&tlb_entry->addend);
         if ((addr - start) < length) {
-            tlb_entry->addr_write |= TLB_NOTDIRTY;
+            uintptr_t notdirty_addr = orig_addr | TLB_NOTDIRTY;
+            atomic_cmpxchg(&tlb_entry->addr_write, orig_addr, notdirty_addr);
         }
     }
 }
 
+/* This is a cross vCPU call (i.e. another vCPU resetting the flags of
+ * the target vCPU). As such care needs to be taken that we don't
+ * dangerously race with another vCPU update. The only thing actually
+ * updated is the target TLB entry ->addr_write flags.
+ */
 void tlb_reset_dirty(CPUState *cpu, ram_addr_t start1, ram_addr_t length)
 {
     CPUArchState *env;
 
     int mmu_idx;
 
-    assert_cpu_is_self(cpu);
-
     env = cpu->env_ptr;
     for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
         unsigned int i;
@@ -409,9 +508,9 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
     MemoryRegionSection *section;
     unsigned int index;
     target_ulong address;
-    target_ulong code_address;
+    target_ulong code_address, write_address;
     uintptr_t addend;
-    CPUTLBEntry *te;
+    CPUTLBEntry *te, *tv;
     hwaddr iotlb, xlat, sz;
     unsigned vidx = env->vtlb_index++ % CPU_VTLB_SIZE;
     int asidx = cpu_asidx_from_attrs(cpu, attrs);
@@ -446,15 +545,21 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
 
     index = (vaddr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
     te = &env->tlb_table[mmu_idx][index];
-
     /* do not discard the translation in te, evict it into a victim tlb */
-    env->tlb_v_table[mmu_idx][vidx] = *te;
+    tv = &env->tlb_v_table[mmu_idx][vidx];
+
+    /* addr_write can race with tlb_reset_dirty_range_all */
+    tv->addr_read = te->addr_read;
+    atomic_set(&tv->addr_write, atomic_read(&te->addr_write));
+    tv->addr_code = te->addr_code;
+    atomic_set(&tv->addend, atomic_read(&te->addend));
+
     env->iotlb_v[mmu_idx][vidx] = env->iotlb[mmu_idx][index];
 
     /* refill the tlb */
     env->iotlb[mmu_idx][index].addr = iotlb - vaddr;
     env->iotlb[mmu_idx][index].attrs = attrs;
-    te->addend = addend - vaddr;
+    atomic_set(&te->addend, addend - vaddr);
     if (prot & PAGE_READ) {
         te->addr_read = address;
     } else {
@@ -466,21 +571,24 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
     } else {
         te->addr_code = -1;
     }
+
+    write_address = -1;
     if (prot & PAGE_WRITE) {
         if ((memory_region_is_ram(section->mr) && section->readonly)
             || memory_region_is_romd(section->mr)) {
             /* Write access calls the I/O callback.  */
-            te->addr_write = address | TLB_MMIO;
+            write_address = address | TLB_MMIO;
         } else if (memory_region_is_ram(section->mr)
                    && cpu_physical_memory_is_clean(
                         memory_region_get_ram_addr(section->mr) + xlat)) {
-            te->addr_write = address | TLB_NOTDIRTY;
+            write_address = address | TLB_NOTDIRTY;
         } else {
-            te->addr_write = address;
+            write_address = address;
         }
-    } else {
-        te->addr_write = -1;
     }
+
+    /* Pairs with flag setting in tlb_reset_dirty_range */
+    atomic_mb_set(&te->addr_write, write_address);
 }
 
 /* Add a new TLB entry, but without specifying the memory
@@ -643,10 +751,28 @@ static bool victim_tlb_hit(CPUArchState *env, size_t mmu_idx, size_t index,
         if (cmp == page) {
             /* Found entry in victim tlb, swap tlb and iotlb.  */
             CPUTLBEntry tmptlb, *tlb = &env->tlb_table[mmu_idx][index];
+
+            /* tmptlb = *tlb; */
+            /* addr_write can race with tlb_reset_dirty_range_all */
+            tmptlb.addr_read = tlb->addr_read;
+            tmptlb.addr_write = atomic_read(&tlb->addr_write);
+            tmptlb.addr_code = tlb->addr_code;
+            tmptlb.addend = atomic_read(&tlb->addend);
+
+            /* *tlb = *vtlb; */
+            tlb->addr_read = vtlb->addr_read;
+            atomic_set(&tlb->addr_write, atomic_read(&vtlb->addr_write));
+            tlb->addr_code = vtlb->addr_code;
+            atomic_set(&tlb->addend, atomic_read(&vtlb->addend));
+
+            /* *vtlb = tmptlb; */
+            vtlb->addr_read = tmptlb.addr_read;
+            atomic_set(&vtlb->addr_write, tmptlb.addr_write);
+            vtlb->addr_code = tmptlb.addr_code;
+            atomic_set(&vtlb->addend, tmptlb.addend);
+
             CPUIOTLBEntry tmpio, *io = &env->iotlb[mmu_idx][index];
             CPUIOTLBEntry *vio = &env->iotlb_v[mmu_idx][vidx];
-
-            tmptlb = *tlb; *tlb = *vtlb; *vtlb = tmptlb;
             tmpio = *io; *io = *vio; *vio = tmpio;
             return true;
         }
diff --git a/include/exec/cputlb.h b/include/exec/cputlb.h
index d454c00..3f94178 100644
--- a/include/exec/cputlb.h
+++ b/include/exec/cputlb.h
@@ -23,8 +23,6 @@
 /* cputlb.c */
 void tlb_protect_code(ram_addr_t ram_addr);
 void tlb_unprotect_code(ram_addr_t ram_addr);
-void tlb_reset_dirty_range(CPUTLBEntry *tlb_entry, uintptr_t start,
-                           uintptr_t length);
 extern int tlb_flush_count;
 
 #endif
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 880ba42..d945221 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -388,17 +388,17 @@ struct CPUState {
      */
     bool throttle_thread_scheduled;
 
+    /* The pending_tlb_flush flag is set and cleared atomically to
+     * avoid potential races. The aim of the flag is to avoid
+     * unnecessary flushes.
+     */
+    uint16_t pending_tlb_flush;
+
     /* Note that this is accessed at the start of every TB via a negative
        offset from AREG0.  Leave this field at the end so as to make the
        (absolute value) offset as small as possible.  This reduces code
        size, especially for hosts without large memory offsets.  */
     uint32_t tcg_exit_req;
-
-    /* The pending_tlb_flush flag is set and cleared atomically to
-     * avoid potential races. The aim of the flag is to avoid
-     * unnecessary flushes.
-     */
-    bool pending_tlb_flush;
 };
 
 QTAILQ_HEAD(CPUTailQ, CPUState);
-- 
2.10.1

* [Qemu-devel] [PATCH v6 14/19] target-arm/powerctl: defer cpu reset work to CPU context
  2016-11-09 14:57 [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement Alex Bennée
                   ` (12 preceding siblings ...)
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 13/19] cputlb: atomically update tlb fields used by tlb_reset_dirty Alex Bennée
@ 2016-11-09 14:57 ` Alex Bennée
  2016-11-10 17:35   ` Richard Henderson
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 15/19] target-arm/cpu: don't reset TLB structures, use cputlb to do it Alex Bennée
                   ` (6 subsequent siblings)
  20 siblings, 1 reply; 50+ messages in thread
From: Alex Bennée @ 2016-11-09 14:57 UTC (permalink / raw)
  To: pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, rth, peter.maydell,
	claudio.fontana, Alex Bennée, open list:ARM

When switching a new vCPU on we want to complete a bunch of the setup
work before we start scheduling the vCPU thread. To do this cleanly we
defer the vCPU setup to async work which will run in the vCPU's
execution context as the thread is woken up. Scheduling the work will
kick the vCPU awake.

This avoids potential races in MTTCG system emulation.
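
The hand-off pattern can be sketched in isolation as follows. This is
an illustrative sketch only: the single-slot queue and every demo_* name
are hypothetical and merely stand in for async_run_on_cpu() and the
target vCPU. The caller heap-allocates the parameters, queues the work,
and the work item frees them once it has run in the target's context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

struct demo_cpu;
typedef void (*demo_work_fn)(struct demo_cpu *cpu, void *data);

/* Hypothetical vCPU with a single-slot work queue. */
struct demo_cpu {
    uint64_t pc;
    bool powered_off;
    demo_work_fn pending_fn;
    void *pending_data;
};

/* Parameters handed from the requesting thread to the target. */
struct demo_on_info {
    uint64_t entry;
};

/* Stand-in for async_run_on_cpu(): just record the work item. */
static void demo_queue_work(struct demo_cpu *cpu, demo_work_fn fn, void *data)
{
    cpu->pending_fn = fn;
    cpu->pending_data = data;
}

/* What the target thread would do when it is woken up. */
static void demo_run_pending(struct demo_cpu *cpu)
{
    if (cpu->pending_fn) {
        demo_work_fn fn = cpu->pending_fn;
        cpu->pending_fn = NULL;
        fn(cpu, cpu->pending_data);
    }
}

/* The deferred setup: touches CPU state, then frees the parameters. */
static void demo_cpu_on_work(struct demo_cpu *cpu, void *data)
{
    struct demo_on_info *info = data;

    cpu->powered_off = false;
    cpu->pc = info->entry;
    free(info);
}

/* The requester only packages the parameters and queues the work. */
static int demo_cpu_on(struct demo_cpu *cpu, uint64_t entry)
{
    struct demo_on_info *info = malloc(sizeof(*info));

    if (!info) {
        return -1;
    }
    info->entry = entry;
    demo_queue_work(cpu, demo_cpu_on_work, info);
    return 0;
}
```

Nothing in the requesting thread writes to the target's register state,
which is the property the patch is after.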

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 target-arm/arm-powerctl.c | 144 +++++++++++++++++++++++++++-------------------
 1 file changed, 86 insertions(+), 58 deletions(-)

diff --git a/target-arm/arm-powerctl.c b/target-arm/arm-powerctl.c
index fbb7a15..0ef4b29 100644
--- a/target-arm/arm-powerctl.c
+++ b/target-arm/arm-powerctl.c
@@ -48,11 +48,85 @@ CPUState *arm_get_cpu_by_id(uint64_t id)
     return NULL;
 }
 
+struct cpu_on_info {
+    uint64_t entry;
+    uint64_t context_id;
+    uint32_t target_el;
+    bool target_aa64;
+};
+
+
+static void arm_set_cpu_on_async_work(CPUState *target_cpu_state,
+                                      run_on_cpu_data data)
+{
+    ARMCPU *target_cpu = ARM_CPU(target_cpu_state);
+    struct cpu_on_info *info = (struct cpu_on_info *) data.host_ptr;
+
+    /* Initialize the cpu we are turning on */
+    cpu_reset(target_cpu_state);
+    target_cpu->powered_off = false;
+    target_cpu_state->halted = 0;
+
+    if (info->target_aa64) {
+        if ((info->target_el < 3) && arm_feature(&target_cpu->env, ARM_FEATURE_EL3)) {
+            /*
+             * As target mode is AArch64, we need to set lower
+             * exception level (the requested level 2) to AArch64
+             */
+            target_cpu->env.cp15.scr_el3 |= SCR_RW;
+        }
+
+        if ((info->target_el < 2) && arm_feature(&target_cpu->env, ARM_FEATURE_EL2)) {
+            /*
+             * As target mode is AArch64, we need to set lower
+             * exception level (the requested level 1) to AArch64
+             */
+            target_cpu->env.cp15.hcr_el2 |= HCR_RW;
+        }
+
+        target_cpu->env.pstate = aarch64_pstate_mode(info->target_el, true);
+    } else {
+        /* We are requested to boot in AArch32 mode */
+        static uint32_t mode_for_el[] = { 0,
+                                          ARM_CPU_MODE_SVC,
+                                          ARM_CPU_MODE_HYP,
+                                          ARM_CPU_MODE_SVC };
+
+        cpsr_write(&target_cpu->env, mode_for_el[info->target_el], CPSR_M,
+                   CPSRWriteRaw);
+    }
+
+    if (info->target_el == 3) {
+        /* Processor is in secure mode */
+        target_cpu->env.cp15.scr_el3 &= ~SCR_NS;
+    } else {
+        /* Processor is not in secure mode */
+        target_cpu->env.cp15.scr_el3 |= SCR_NS;
+    }
+
+    /* We check if the started CPU is now at the correct level */
+    assert(info->target_el == arm_current_el(&target_cpu->env));
+
+    if (info->target_aa64) {
+        target_cpu->env.xregs[0] = info->context_id;
+        target_cpu->env.thumb = false;
+    } else {
+        target_cpu->env.regs[0] = info->context_id;
+        target_cpu->env.thumb = info->entry & 1;
+        info->entry &= 0xfffffffe;
+    }
+
+    /* Start the new CPU at the requested address */
+    cpu_set_pc(target_cpu_state, info->entry);
+    g_free(info);
+}
+
 int arm_set_cpu_on(uint64_t cpuid, uint64_t entry, uint64_t context_id,
                    uint32_t target_el, bool target_aa64)
 {
     CPUState *target_cpu_state;
     ARMCPU *target_cpu;
+    struct cpu_on_info *info;
 
     DPRINTF("cpu %" PRId64 " (EL %d, %s) @ 0x%" PRIx64 " with R0 = 0x%" PRIx64
             "\n", cpuid, target_el, target_aa64 ? "aarch64" : "aarch32", entry,
@@ -109,64 +183,18 @@ int arm_set_cpu_on(uint64_t cpuid, uint64_t entry, uint64_t context_id,
         return QEMU_ARM_POWERCTL_INVALID_PARAM;
     }
 
-    /* Initialize the cpu we are turning on */
-    cpu_reset(target_cpu_state);
-    target_cpu->powered_off = false;
-    target_cpu_state->halted = 0;
-
-    if (target_aa64) {
-        if ((target_el < 3) && arm_feature(&target_cpu->env, ARM_FEATURE_EL3)) {
-            /*
-             * As target mode is AArch64, we need to set lower
-             * exception level (the requested level 2) to AArch64
-             */
-            target_cpu->env.cp15.scr_el3 |= SCR_RW;
-        }
-
-        if ((target_el < 2) && arm_feature(&target_cpu->env, ARM_FEATURE_EL2)) {
-            /*
-             * As target mode is AArch64, we need to set lower
-             * exception level (the requested level 1) to AArch64
-             */
-            target_cpu->env.cp15.hcr_el2 |= HCR_RW;
-        }
-
-        target_cpu->env.pstate = aarch64_pstate_mode(target_el, true);
-    } else {
-        /* We are requested to boot in AArch32 mode */
-        static uint32_t mode_for_el[] = { 0,
-                                          ARM_CPU_MODE_SVC,
-                                          ARM_CPU_MODE_HYP,
-                                          ARM_CPU_MODE_SVC };
-
-        cpsr_write(&target_cpu->env, mode_for_el[target_el], CPSR_M,
-                   CPSRWriteRaw);
-    }
-
-    if (target_el == 3) {
-        /* Processor is in secure mode */
-        target_cpu->env.cp15.scr_el3 &= ~SCR_NS;
-    } else {
-        /* Processor is not in secure mode */
-        target_cpu->env.cp15.scr_el3 |= SCR_NS;
-    }
-
-    /* We check if the started CPU is now at the correct level */
-    assert(target_el == arm_current_el(&target_cpu->env));
-
-    if (target_aa64) {
-        target_cpu->env.xregs[0] = context_id;
-        target_cpu->env.thumb = false;
-    } else {
-        target_cpu->env.regs[0] = context_id;
-        target_cpu->env.thumb = entry & 1;
-        entry &= 0xfffffffe;
-    }
-
-    /* Start the new CPU at the requested address */
-    cpu_set_pc(target_cpu_state, entry);
-
-    qemu_cpu_kick(target_cpu_state);
+    /* To avoid racing with a CPU we are just kicking off we do the
+     * final bit of preparation for the work in the target CPU's
+     * context.
+     */
+    info = g_new(struct cpu_on_info, 1);
+    info->entry = entry;
+    info->context_id = context_id;
+    info->target_el = target_el;
+    info->target_aa64 = target_aa64;
+
+    async_run_on_cpu(target_cpu_state, arm_set_cpu_on_async_work,
+                     RUN_ON_CPU_HOST_PTR(info));
 
     /* We are good to go */
     return QEMU_ARM_POWERCTL_RET_SUCCESS;
-- 
2.10.1

* [Qemu-devel] [PATCH v6 15/19] target-arm/cpu: don't reset TLB structures, use cputlb to do it
  2016-11-09 14:57 [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement Alex Bennée
                   ` (13 preceding siblings ...)
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 14/19] target-arm/powerctl: defer cpu reset work to CPU context Alex Bennée
@ 2016-11-09 14:57 ` Alex Bennée
  2016-11-10 17:48   ` Richard Henderson
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 16/19] target-arm: ensure BQL taken for ARM_CP_IO register access Alex Bennée
                   ` (5 subsequent siblings)
  20 siblings, 1 reply; 50+ messages in thread
From: Alex Bennée @ 2016-11-09 14:57 UTC (permalink / raw)
  To: pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, rth, peter.maydell,
	claudio.fontana, Alex Bennée, open list:ARM

cputlb owns the TLB entries and knows how to safely update them in
MTTCG.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 target-arm/cpu.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/target-arm/cpu.c b/target-arm/cpu.c
index 99f0dbe..990bcb1 100644
--- a/target-arm/cpu.c
+++ b/target-arm/cpu.c
@@ -122,7 +122,13 @@ static void arm_cpu_reset(CPUState *s)
 
     acc->parent_reset(s);
 
+#ifdef CONFIG_SOFTMMU
+    memset(env, 0, offsetof(CPUARMState, tlb_table));
+    tlb_flush(s, 0);
+#else
     memset(env, 0, offsetof(CPUARMState, features));
+#endif
+
     g_hash_table_foreach(cpu->cp_regs, cp_reg_reset, cpu);
     g_hash_table_foreach(cpu->cp_regs, cp_reg_check_reset, cpu);
 
-- 
2.10.1

* [Qemu-devel] [PATCH v6 16/19] target-arm: ensure BQL taken for ARM_CP_IO register access
  2016-11-09 14:57 [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement Alex Bennée
                   ` (14 preceding siblings ...)
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 15/19] target-arm/cpu: don't reset TLB structures, use cputlb to do it Alex Bennée
@ 2016-11-09 14:57 ` Alex Bennée
  2016-11-10 17:54   ` Richard Henderson
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 17/19] target-arm: helpers which may affect global state need the BQL Alex Bennée
                   ` (4 subsequent siblings)
  20 siblings, 1 reply; 50+ messages in thread
From: Alex Bennée @ 2016-11-09 14:57 UTC (permalink / raw)
  To: pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, rth, peter.maydell,
	claudio.fontana, Alex Bennée, open list:ARM cores

Most ARMCPRegInfo structures just allow updating of the CPU field.
However some have more complex operations that *may* have cross-vCPU
effects and therefore need to be serialised. The most obvious examples at
the moment are things that affect the GICv3 IRQ controller. To avoid
applying this requirement to all registers with custom access functions
we check whether the type is marked ARM_CP_IO.

By default all MMIO access to devices already takes the BQL to serialise
hardware emulation.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
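Note: a minimal sketch of the "lock only for flagged registers" pattern,
written against plain pthreads rather than QEMU. TOY_REG_IO, ToyRegInfo,
toy_reg_write and big_lock are hypothetical stand-ins for ARM_CP_IO,
ARMCPRegInfo, the set_cp_reg helper and the BQL; lock_count exists only
so the behaviour can be observed.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_REG_IO 0x1   /* hypothetical stand-in for ARM_CP_IO */

typedef struct ToyRegInfo ToyRegInfo;
struct ToyRegInfo {
    int type;                                          /* flag bits */
    void (*writefn)(const ToyRegInfo *ri, uint64_t value);
    uint64_t *field;                                   /* backing state */
};

static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static int lock_count;   /* how many writes took the lock */

static void toy_plain_write(const ToyRegInfo *ri, uint64_t value)
{
    *ri->field = value;
}

/* Serialise only the registers marked as having cross-vCPU effects;
 * purely per-CPU registers skip the global lock entirely. */
static void toy_reg_write(const ToyRegInfo *ri, uint64_t value)
{
    assert(ri != NULL && ri->writefn != NULL);
    if (ri->type & TOY_REG_IO) {
        pthread_mutex_lock(&big_lock);
        lock_count++;
        ri->writefn(ri, value);
        pthread_mutex_unlock(&big_lock);
    } else {
        ri->writefn(ri, value);
    }
}
```

Keeping the check on a type flag means registers with custom access
functions but no shared side effects do not pay for the lock.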
 hw/intc/arm_gicv3_cpuif.c |  3 +++
 target-arm/op_helper.c    | 39 +++++++++++++++++++++++++++++++++++----
 2 files changed, 38 insertions(+), 4 deletions(-)

diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
index bca30c4..8ea4b5b 100644
--- a/hw/intc/arm_gicv3_cpuif.c
+++ b/hw/intc/arm_gicv3_cpuif.c
@@ -13,6 +13,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/main-loop.h"
 #include "trace.h"
 #include "gicv3_internal.h"
 #include "cpu.h"
@@ -128,6 +129,8 @@ void gicv3_cpuif_update(GICv3CPUState *cs)
     ARMCPU *cpu = ARM_CPU(cs->cpu);
     CPUARMState *env = &cpu->env;
 
+    g_assert(qemu_mutex_iothread_locked());
+
     trace_gicv3_cpuif_update(gicv3_redist_affid(cs), cs->hppi.irq,
                              cs->hppi.grp, cs->hppi.prio);
 
diff --git a/target-arm/op_helper.c b/target-arm/op_helper.c
index cd94216..4f0c754 100644
--- a/target-arm/op_helper.c
+++ b/target-arm/op_helper.c
@@ -17,6 +17,7 @@
  * License along with this library; if not, see <http://www.gnu.org/licenses/>.
  */
 #include "qemu/osdep.h"
+#include "qemu/main-loop.h"
 #include "cpu.h"
 #include "exec/helper-proto.h"
 #include "internals.h"
@@ -734,28 +735,58 @@ void HELPER(set_cp_reg)(CPUARMState *env, void *rip, uint32_t value)
 {
     const ARMCPRegInfo *ri = rip;
 
-    ri->writefn(env, ri, value);
+    if (ri->type & ARM_CP_IO) {
+        qemu_mutex_lock_iothread();
+        ri->writefn(env, ri, value);
+        qemu_mutex_unlock_iothread();
+    } else {
+        ri->writefn(env, ri, value);
+    }
 }
 
 uint32_t HELPER(get_cp_reg)(CPUARMState *env, void *rip)
 {
     const ARMCPRegInfo *ri = rip;
+    uint32_t res;
 
-    return ri->readfn(env, ri);
+    if (ri->type & ARM_CP_IO) {
+        qemu_mutex_lock_iothread();
+        res = ri->readfn(env, ri);
+        qemu_mutex_unlock_iothread();
+    } else {
+        res = ri->readfn(env, ri);
+    }
+
+    return res;
 }
 
 void HELPER(set_cp_reg64)(CPUARMState *env, void *rip, uint64_t value)
 {
     const ARMCPRegInfo *ri = rip;
 
-    ri->writefn(env, ri, value);
+    if (ri->type & ARM_CP_IO) {
+        qemu_mutex_lock_iothread();
+        ri->writefn(env, ri, value);
+        qemu_mutex_unlock_iothread();
+    } else {
+        ri->writefn(env, ri, value);
+    }
 }
 
 uint64_t HELPER(get_cp_reg64)(CPUARMState *env, void *rip)
 {
     const ARMCPRegInfo *ri = rip;
+    uint64_t res;
+
+    if (ri->type & ARM_CP_IO) {
+        qemu_mutex_lock_iothread();
+        res = ri->readfn(env, ri);
+        qemu_mutex_unlock_iothread();
+    } else {
+        res = ri->readfn(env, ri);
+    }
 
-    return ri->readfn(env, ri);
+    return res;
 }
 
 void HELPER(msr_i_pstate)(CPUARMState *env, uint32_t op, uint32_t imm)
-- 
2.10.1

* [Qemu-devel] [PATCH v6 17/19] target-arm: helpers which may affect global state need the BQL
  2016-11-09 14:57 [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement Alex Bennée
                   ` (15 preceding siblings ...)
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 16/19] target-arm: ensure BQL taken for ARM_CP_IO register access Alex Bennée
@ 2016-11-09 14:57 ` Alex Bennée
  2016-11-10 17:56   ` Richard Henderson
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 18/19] target-arm: don't generate WFE/YIELD calls for MTTCG Alex Bennée
                   ` (3 subsequent siblings)
  20 siblings, 1 reply; 50+ messages in thread
From: Alex Bennée @ 2016-11-09 14:57 UTC (permalink / raw)
  To: pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, rth, peter.maydell,
	claudio.fontana, Alex Bennée, open list:ARM

As the arm_call_el_change_hook may affect global state (for example with
updating the global GIC state) we need to assert/take the BQL.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 target-arm/helper.c    | 6 ++++++
 target-arm/op_helper.c | 4 ++++
 2 files changed, 10 insertions(+)

diff --git a/target-arm/helper.c b/target-arm/helper.c
index b5b65ca..3f47fa7 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -6669,6 +6669,12 @@ void arm_cpu_do_interrupt(CPUState *cs)
         arm_cpu_do_interrupt_aarch32(cs);
     }
 
+    /* Hooks may change global state so BQL should be held, also the
+     * BQL needs to be held for any modification of
+     * cs->interrupt_request.
+     */
+    g_assert(qemu_mutex_iothread_locked());
+
     arm_call_el_change_hook(cpu);
 
     if (!kvm_enabled()) {
diff --git a/target-arm/op_helper.c b/target-arm/op_helper.c
index 4f0c754..41beabc 100644
--- a/target-arm/op_helper.c
+++ b/target-arm/op_helper.c
@@ -487,7 +487,9 @@ void HELPER(cpsr_write_eret)(CPUARMState *env, uint32_t val)
      */
     env->regs[15] &= (env->thumb ? ~1 : ~3);
 
+    qemu_mutex_lock_iothread();
     arm_call_el_change_hook(arm_env_get_cpu(env));
+    qemu_mutex_unlock_iothread();
 }
 
 /* Access to user mode registers from privileged modes.  */
@@ -1013,7 +1015,9 @@ void HELPER(exception_return)(CPUARMState *env)
         env->pc = env->elr_el[cur_el];
     }
 
+    qemu_mutex_lock_iothread();
     arm_call_el_change_hook(arm_env_get_cpu(env));
+    qemu_mutex_unlock_iothread();
 
     return;
 
-- 
2.10.1

* [Qemu-devel] [PATCH v6 18/19] target-arm: don't generate WFE/YIELD calls for MTTCG
  2016-11-09 14:57 [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement Alex Bennée
                   ` (16 preceding siblings ...)
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 17/19] target-arm: helpers which may affect global state need the BQL Alex Bennée
@ 2016-11-09 14:57 ` Alex Bennée
  2016-11-10 17:59   ` Richard Henderson
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 19/19] tcg: enable MTTCG by default for ARM on x86 hosts Alex Bennée
                   ` (2 subsequent siblings)
  20 siblings, 1 reply; 50+ messages in thread
From: Alex Bennée @ 2016-11-09 14:57 UTC (permalink / raw)
  To: pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, rth, peter.maydell,
	claudio.fontana, Alex Bennée, open list:ARM

The WFE and YIELD instructions are really only hints and in TCG's case
they were useful to move the scheduling on from one vCPU to the next. In
the parallel context (MTTCG) this just causes an unnecessary cpu_exit
and contention on the BQL.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
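Note: the translation-time decision made by this patch can be modelled
as a small pure function. This is a sketch under assumptions: the TOY_*
values and toy_decode_hint are hypothetical, and the hint numbers follow
the 32-bit decoder in the diff below (1 yield, 2 wfe, 3 wfi).

```c
#include <assert.h>
#include <stdbool.h>

enum {
    TOY_DISAS_NEXT,    /* carry on translating, hint treated as a nop */
    TOY_DISAS_WFI,
    TOY_DISAS_YIELD,
    TOY_DISAS_WFE
};

/* Decide how a hint instruction ends the translation block. When vCPUs
 * run in parallel, YIELD and WFE are skipped because leaving the block
 * would not help any other vCPU get scheduled. WFI always halts. */
static int toy_decode_hint(int val, bool parallel)
{
    assert(val >= 0);
    switch (val) {
    case 1: /* yield */
        return parallel ? TOY_DISAS_NEXT : TOY_DISAS_YIELD;
    case 2: /* wfe */
        return parallel ? TOY_DISAS_NEXT : TOY_DISAS_WFE;
    case 3: /* wfi */
        return TOY_DISAS_WFI;
    default:
        return TOY_DISAS_NEXT;
    }
}
```

Because the choice is made when the block is translated, the helper is
never reached in parallel mode, which is what the assertion added to
the yield helper checks.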
 target-arm/op_helper.c     |  7 +++++++
 target-arm/translate-a64.c |  8 ++++++--
 target-arm/translate.c     | 20 ++++++++++++++++----
 3 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/target-arm/op_helper.c b/target-arm/op_helper.c
index 41beabc..3a36bf9 100644
--- a/target-arm/op_helper.c
+++ b/target-arm/op_helper.c
@@ -435,6 +435,13 @@ void HELPER(yield)(CPUARMState *env)
     ARMCPU *cpu = arm_env_get_cpu(env);
     CPUState *cs = CPU(cpu);
 
+    /* When running in MTTCG we don't generate jumps to the yield and
+     * WFE helpers as it won't affect the scheduling of other vCPUs.
+     * If we wanted to more completely model WFE/SEV so we don't busy
+     * spin unnecessarily we would need to do something more involved.
+     */
+    g_assert(!parallel_cpus);
+
     /* This is a non-trappable hint instruction that generally indicates
      * that the guest is currently busy-looping. Yield control back to the
      * top level loop so that a more deserving VCPU has a chance to run.
diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index de48747..6e44838 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -1341,10 +1341,14 @@ static void handle_hint(DisasContext *s, uint32_t insn,
         s->is_jmp = DISAS_WFI;
         return;
     case 1: /* YIELD */
-        s->is_jmp = DISAS_YIELD;
+        if (!parallel_cpus) {
+            s->is_jmp = DISAS_YIELD;
+        }
         return;
     case 2: /* WFE */
-        s->is_jmp = DISAS_WFE;
+        if (!parallel_cpus) {
+            s->is_jmp = DISAS_WFE;
+        }
         return;
     case 4: /* SEV */
     case 5: /* SEVL */
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 0ad9070..9417e8e 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -4368,20 +4368,32 @@ static void gen_exception_return(DisasContext *s, TCGv_i32 pc)
     gen_rfe(s, pc, load_cpu_field(spsr));
 }
 
+/*
+ * For WFI we will halt the vCPU until an IRQ. For WFE and YIELD we
+ * only call the helper when running single threaded TCG code to ensure
+ * the next round-robin scheduled vCPU gets a crack. In MTTCG mode we
+ * just skip this instruction. Currently the SEV/SEVL instructions
+ * which are *one* of many ways to wake the CPU from WFE are not
+ * implemented so we can't sleep like WFI does.
+ */
 static void gen_nop_hint(DisasContext *s, int val)
 {
     switch (val) {
     case 1: /* yield */
-        gen_set_pc_im(s, s->pc);
-        s->is_jmp = DISAS_YIELD;
+        if (!parallel_cpus) {
+            gen_set_pc_im(s, s->pc);
+            s->is_jmp = DISAS_YIELD;
+        }
         break;
     case 3: /* wfi */
         gen_set_pc_im(s, s->pc);
         s->is_jmp = DISAS_WFI;
         break;
     case 2: /* wfe */
-        gen_set_pc_im(s, s->pc);
-        s->is_jmp = DISAS_WFE;
+        if (!parallel_cpus) {
+            gen_set_pc_im(s, s->pc);
+            s->is_jmp = DISAS_WFE;
+        }
         break;
     case 4: /* sev */
     case 5: /* sevl */
-- 
2.10.1

* [Qemu-devel] [PATCH v6 19/19] tcg: enable MTTCG by default for ARM on x86 hosts
  2016-11-09 14:57 [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement Alex Bennée
                   ` (17 preceding siblings ...)
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 18/19] target-arm: don't generate WFE/YIELD calls for MTTCG Alex Bennée
@ 2016-11-09 14:57 ` Alex Bennée
  2016-11-10 18:00   ` Richard Henderson
  2016-11-09 15:11 ` [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement Paolo Bonzini
  2016-11-13  5:50 ` no-reply
  20 siblings, 1 reply; 50+ messages in thread
From: Alex Bennée @ 2016-11-09 14:57 UTC (permalink / raw)
  To: pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, rth, peter.maydell,
	claudio.fontana, Alex Bennée

This enables the multi-threaded system emulation by default for ARMv7
and ARMv8 guests using the x86_64 TCG backend. This means:

  - The x86_64 TCG backend supports cmpxchg based atomic ops
  - The x86_64 TCG backend emits barriers for barrier ops

And on the guest side:

  - The ARM translate.c/translate-a64.c have been converted to
    - use MTTCG safe atomic primitives
    - emit the appropriate barrier ops
  - The ARM machine has been updated to
    - hold the BQL when modifying shared cross-vCPU state
    - defer cpu_reset to async safe work

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 configure                       | 12 ++++++++++++
 default-configs/arm-softmmu.mak |  2 ++
 2 files changed, 14 insertions(+)

diff --git a/configure b/configure
index fd6f898..685eb03 100755
--- a/configure
+++ b/configure
@@ -516,6 +516,7 @@ else
 fi
 
 ARCH=
+host_mttcg_support=
 # Normalise host CPU name and set ARCH.
 # Note that this case should only have supported host CPUs, not guests.
 case "$cpu" in
@@ -527,6 +528,7 @@ case "$cpu" in
   ;;
   x86_64|amd64)
     cpu="x86_64"
+    host_mttcg_support=yes
   ;;
   armv*b|armv*l|arm)
     cpu="arm"
@@ -5703,6 +5705,10 @@ if test "$pthread_setname_np" = "yes" ; then
   echo "CONFIG_PTHREAD_SETNAME_NP=y" >> $config_host_mak
 fi
 
+if test "$host_mttcg_support" = "yes" ; then
+  echo "CONFIG_MTTCG_HOST=y" >> $config_host_mak
+fi
+
 if test "$tcg_interpreter" = "yes"; then
   QEMU_INCLUDES="-I\$(SRC_PATH)/tcg/tci $QEMU_INCLUDES"
 elif test "$ARCH" = "sparc64" ; then
@@ -5815,6 +5821,7 @@ target_dir="$target"
 config_target_mak=$target_dir/config-target.mak
 target_name=$(echo $target | cut -d '-' -f 1)
 target_bigendian="no"
+target_mttcg_support="no"
 
 case "$target_name" in
   armeb|lm32|m68k|microblaze|mips|mipsn32|mips64|moxie|or32|ppc|ppcemb|ppc64|ppc64abi32|s390x|sh4eb|sparc|sparc64|sparc32plus|xtensaeb)
@@ -5872,11 +5879,13 @@ case "$target_name" in
     TARGET_ARCH=arm
     bflt="yes"
     gdb_xml_files="arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml"
+    target_mttcg_support="yes"
   ;;
   aarch64)
     TARGET_BASE_ARCH=arm
     bflt="yes"
     gdb_xml_files="aarch64-core.xml aarch64-fpu.xml arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml"
+    target_mttcg_support="yes"
   ;;
   cris)
   ;;
@@ -6027,6 +6036,9 @@ if test "$target_bigendian" = "yes" ; then
 fi
 if test "$target_softmmu" = "yes" ; then
   echo "CONFIG_SOFTMMU=y" >> $config_target_mak
+  if test "$target_mttcg_support" = "yes" ; then
+    echo "CONFIG_MTTCG_TARGET=y" >> $config_target_mak
+  fi
 fi
 if test "$target_user_only" = "yes" ; then
   echo "CONFIG_USER_ONLY=y" >> $config_target_mak
diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
index 6de3e16..007f751 100644
--- a/default-configs/arm-softmmu.mak
+++ b/default-configs/arm-softmmu.mak
@@ -115,3 +115,5 @@ CONFIG_ACPI=y
 CONFIG_SMBIOS=y
 CONFIG_ASPEED_SOC=y
 CONFIG_GPIO_KEY=y
+
+CONFIG_MTTCG_TARGET=y
-- 
2.10.1

* Re: [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement
  2016-11-09 14:57 [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement Alex Bennée
                   ` (18 preceding siblings ...)
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 19/19] tcg: enable MTTCG by default for ARM on x86 hosts Alex Bennée
@ 2016-11-09 15:11 ` Paolo Bonzini
  2016-11-09 18:38   ` Alex Bennée
  2016-11-13  5:50 ` no-reply
  20 siblings, 1 reply; 50+ messages in thread
From: Paolo Bonzini @ 2016-11-09 15:11 UTC (permalink / raw)
  To: Alex Bennée
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, rth, peter.maydell,
	claudio.fontana



On 09/11/2016 15:57, Alex Bennée wrote:
> The one outstanding question is how to deal with the TLB flush
> semantics of the various guest architectures. Currently flushes to
> other vCPUs will happen at the end of their currently executing
> Translation Block which could mean the originating vCPU makes
> assumptions about flushes having been completed when they haven't. In
> practice this hasn't been a problem and I haven't been able to
> construct a test case so far that would fail in such a case. This is
> probably because most tear downs of the other vCPU TLBs tend to be
> done while the other vCPUs are not doing much. If anyone can come up
> with a test case that would fail if this assumption isn't met then
> please let me know.

Have you tried implementing ARM's DMB semantics correctly?

Paolo

* Re: [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement
  2016-11-09 15:11 ` [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement Paolo Bonzini
@ 2016-11-09 18:38   ` Alex Bennée
  0 siblings, 0 replies; 50+ messages in thread
From: Alex Bennée @ 2016-11-09 18:38 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, rth, peter.maydell,
	claudio.fontana


Paolo Bonzini <pbonzini@redhat.com> writes:

> On 09/11/2016 15:57, Alex Bennée wrote:
>> The one outstanding question is how to deal with the TLB flush
>> semantics of the various guest architectures. Currently flushes to
>> other vCPUs will happen at the end of their currently executing
>> Translation Block which could mean the originating vCPU makes
>> assumptions about flushes having been completed when they haven't. In
>> practice this hasn't been a problem and I haven't been able to
>> construct a test case so far that would fail in such a case. This is
>> probably because most tear downs of the other vCPU TLBs tend to be
>> done while the other vCPUs are not doing much. If anyone can come up
>> with a test case that would fail if this assumption isn't met then
>> please let me know.
>
> Have you tried implementing ARM's DMB semantics correctly?

I've implemented stricter semantics with the proof-of-concept patch
below.

I'm not sure how to do it on the DMB instruction itself, as the
concept of a pending flush is a run-time rather than a translation-time
concept. Is this the sort of state that could be pushed into the
Translation flags? I suspect forcing generation of safe work
synchronisation points for all DMBs would slow things down a lot.
Usually the DMBs will be right after the flushes, but not always, so I
doubt you can guarantee they will be in the same basic block.

Thoughts?


--8<---------------cut here---------------start------------->8---

target-arm: ensure tlbi_aa64_vae1is_write completes (POC)

Previously flushes on other vCPUs would only get serviced when they
exited their TranslationBlocks. While this isn't overly problematic it
violates the semantics of a TLB flush from the point of view of the
source vCPU.

This proof-of-concept solves this by introducing a new
tlb_flush_all_page_by_mmuidx which ensures all TLB flushes are completed
by the time execution continues on the vCPU. It does this by creating
a synchronisation point by scheduling its own flush via async_safe_work
and exiting the execution loop. Once the safe work has executed all TLBs
will have been updated.

4 files changed, 53 insertions(+), 9 deletions(-)
cputlb.c                   | 33 +++++++++++++++++++++++++++++++++
include/exec/exec-all.h    | 11 +++++++++++
target-arm/helper.c        | 17 ++++++++---------
target-arm/translate-a64.c |  1 +

modified   cputlb.c
@@ -354,6 +354,39 @@ void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, ...)
     }
 }

+/* This function affects all vCPUs and will ensure all work is
+ * complete by the time the loop restarts
+ */
+void tlb_flush_all_page_by_mmuidx(CPUState *src_cpu, target_ulong addr, ...)
+{
+    unsigned long mmu_idx_bitmap;
+    target_ulong addr_and_mmu_idx;
+    va_list argp;
+    CPUState *other_cs;
+
+    va_start(argp, addr);
+    mmu_idx_bitmap = make_mmu_index_bitmap(argp);
+    va_end(argp);
+
+    tlb_debug("addr: "TARGET_FMT_lx" mmu_idx:%lx\n", addr, mmu_idx_bitmap);
+
+    /* This should already be page aligned */
+    addr_and_mmu_idx = addr & TARGET_PAGE_MASK;
+    addr_and_mmu_idx |= mmu_idx_bitmap;
+
+    CPU_FOREACH(other_cs) {
+        if (other_cs != src_cpu) {
+            async_run_on_cpu(other_cs, tlb_check_page_and_flush_by_mmuidx_async_work,
+                             RUN_ON_CPU_TARGET_PTR(addr_and_mmu_idx));
+        } else {
+            async_safe_run_on_cpu(other_cs, tlb_check_page_and_flush_by_mmuidx_async_work,
+                                  RUN_ON_CPU_TARGET_PTR(addr_and_mmu_idx));
+        }
+    }
+
+    cpu_loop_exit(src_cpu);
+}
+
 void tlb_flush_page_all(target_ulong addr)
 {
     CPUState *cpu;
modified   include/exec/exec-all.h
@@ -115,6 +115,17 @@ void tlb_flush(CPUState *cpu, int flush_global);
  */
 void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, ...);
 /**
+ * tlb_flush_all_page_by_mmuidx:
+ * @cpu: Originating CPU of the flush
+ * @addr: virtual address of page to be flushed
+ * @...: list of MMU indexes to flush, terminated by a negative value
+ *
+ * Flush one page from the TLB of all CPUs, for the specified
+ * MMU indexes. This function does not return, the run loop will exit
+ * and restart once the flush is completed.
+ */
+void tlb_flush_all_page_by_mmuidx(CPUState *cpu, target_ulong addr, ...);
+/**
  * tlb_flush_by_mmuidx:
  * @cpu: CPU whose TLB should be flushed
  * @...: list of MMU indexes to flush, terminated by a negative value
modified   target-arm/helper.c
@@ -3047,20 +3047,19 @@ static void tlbi_aa64_vae3_write(CPUARMState *env, const ARMCPRegInfo *ri,
--8<---------------cut here---------------end--------------->8---
 static void tlbi_aa64_vae1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
                                    uint64_t value)
 {
+    ARMCPU *cpu = arm_env_get_cpu(env);
+    CPUState *cs = CPU(cpu);
     bool sec = arm_is_secure_below_el3(env);
-    CPUState *other_cs;
     uint64_t pageaddr = sextract64(value << 12, 0, 56);

     fprintf(stderr,"%s: dbg\n", __func__);

-    CPU_FOREACH(other_cs) {
-        if (sec) {
-            tlb_flush_page_by_mmuidx(other_cs, pageaddr, ARMMMUIdx_S1SE1,
-                                     ARMMMUIdx_S1SE0, -1);
-        } else {
-            tlb_flush_page_by_mmuidx(other_cs, pageaddr, ARMMMUIdx_S12NSE1,
-                                     ARMMMUIdx_S12NSE0, -1);
-        }
+    if (sec) {
+        tlb_flush_all_page_by_mmuidx(cs, pageaddr, ARMMMUIdx_S1SE1,
+                                 ARMMMUIdx_S1SE0, -1);
+    } else {
+        tlb_flush_all_page_by_mmuidx(cs, pageaddr, ARMMMUIdx_S12NSE1,
+                                 ARMMMUIdx_S12NSE0, -1);
     }
 }

modified   target-arm/translate-a64.c
@@ -1588,6 +1588,7 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
         } else if (ri->writefn) {
             TCGv_ptr tmpptr;
             tmpptr = tcg_const_ptr(ri);
+            gen_a64_set_pc_im(s->pc);
             gen_helper_set_cp_reg64(cpu_env, tmpptr, tcg_rt);
             tcg_temp_free_ptr(tmpptr);
         } else {
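
Note: the POC hands the page address and the MMU index bitmap to the
work item as a single word, relying on the low bits of a page-aligned
address being free. A standalone sketch of that packing follows; the
TOY_* constants (12-bit pages, 7 MMU modes) and function names are
assumptions chosen for illustration, not QEMU's values.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_BITS       12
#define TOY_PAGE_MASK       (~(((uint64_t)1 << TOY_PAGE_BITS) - 1))
#define TOY_NB_MMU_MODES    7
#define TOY_ALL_MMUIDX_BITS (((uint64_t)1 << TOY_NB_MMU_MODES) - 1)

/* The bitmap lives in the page-offset bits, so it must fit there. */
_Static_assert(TOY_NB_MMU_MODES <= TOY_PAGE_BITS,
               "MMU index bitmap must fit below the page offset");

/* Combine a page address and an MMU index bitmap into one word. */
static uint64_t toy_pack(uint64_t addr, uint64_t mmu_idx_bitmap)
{
    assert((mmu_idx_bitmap & ~TOY_ALL_MMUIDX_BITS) == 0);
    return (addr & TOY_PAGE_MASK) | mmu_idx_bitmap;
}

/* Recover the page-aligned address from the packed word. */
static uint64_t toy_unpack_addr(uint64_t packed)
{
    return packed & TOY_PAGE_MASK;
}

/* Recover the MMU index bitmap from the packed word. */
static uint64_t toy_unpack_bitmap(uint64_t packed)
{
    return packed & TOY_ALL_MMUIDX_BITS;
}
```

Packing into one word lets the request travel through a single
pointer-sized work argument without allocating a separate structure.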

--
Alex Bennée

* Re: [Qemu-devel] [PATCH v6 13/19] cputlb: atomically update tlb fields used by tlb_reset_dirty
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 13/19] cputlb: atomically update tlb fields used by tlb_reset_dirty Alex Bennée
@ 2016-11-09 19:36   ` Pranith Kumar
  2016-11-10 16:14     ` Alex Bennée
  2016-11-10 17:23   ` Richard Henderson
  1 sibling, 1 reply; 50+ messages in thread
From: Pranith Kumar @ 2016-11-09 19:36 UTC (permalink / raw)
  To: Alex Bennée
  Cc: pbonzini, qemu-devel, mttcg, fred.konrad, a.rigo, cota, nikunj,
	mark.burton, jan.kiszka, serge.fdrv, rth, peter.maydell,
	claudio.fontana, Peter Crosthwaite


Hi Alex,

This patch is causing some build errors on a 32-bit box:

In file included from /home/pranith/qemu/include/exec/exec-all.h:44:0,
                 from /home/pranith/qemu/cputlb.c:23:
/home/pranith/qemu/cputlb.c: In function ‘tlb_flush_page_by_mmuidx_async_work’:
/home/pranith/qemu/cputlb.c:54:36: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 5 has type ‘long unsigned int’ [-Werror=format=]
         qemu_log_mask(CPU_LOG_MMU, "%s: " fmt, __func__, \
                                    ^
/home/pranith/qemu/include/qemu/log.h:94:22: note: in definition of macro ‘qemu_log_mask’
             qemu_log(FMT, ## __VA_ARGS__);              \
                      ^~~
/home/pranith/qemu/cputlb.c:286:5: note: in expansion of macro ‘tlb_debug’
     tlb_debug("page:%d addr:"TARGET_FMT_lx" mmu_idx%" PRIxPTR "\n",
     ^~~~~~~~~
/home/pranith/qemu/cputlb.c:57:25: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 6 has type ‘long unsigned int’ [-Werror=format=]
         fprintf(stderr, "%s: " fmt, __func__, ## __VA_ARGS__); \
                         ^
/home/pranith/qemu/cputlb.c:286:5: note: in expansion of macro ‘tlb_debug’
     tlb_debug("page:%d addr:"TARGET_FMT_lx" mmu_idx%" PRIxPTR "\n",
     ^~~~~~~~~
cc1: all warnings being treated as errors
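
Note: the mismatch in the log above comes from pairing PRIxPTR (which
on this 32-bit host expands to a plain %x conversion) with an argument
of type unsigned long. One way to make the pair agree, assuming the
variable stays unsigned long, is to print it with %lx. A minimal
sketch; toy_format_mmu_idx is a hypothetical name, not part of the
patch:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a page index and an MMU index bitmap into buf. The length
 * modifier in %lx matches unsigned long on both 32-bit and 64-bit
 * hosts, so -Werror=format has nothing to complain about. */
static int toy_format_mmu_idx(char *buf, size_t len, int page,
                              unsigned long mmu_idx_bitmap)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "page:%d mmu_idx:%lx", page, mmu_idx_bitmap);
}
```

The alternative is to keep PRIxPTR and cast the argument to uintptr_t;
either way the conversion and the argument type have to match.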

Thanks,

Alex Bennée writes:

> The main use case for tlb_reset_dirty is to set the TLB_NOTDIRTY flags
> in TLB entries to force the slow-path on writes. This is used to mark
> page ranges containing code which has been translated so it can be
> invalidated if written to. To do this safely we need to ensure the TLB
> entries in question for all vCPUs are updated before we attempt to run
> the code otherwise a race could be introduced.
>
> To achieve this we atomically set the flag in tlb_reset_dirty_range and
> take care when setting it when the TLB entry is filled.
>
> The helper function is made static as it isn't used outside of cputlb.
>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>
> ---
> v6
>   - use TARGET_PAGE_BITS_MIN
>   - use run_on_cpu helpers
> ---
>  cputlb.c              | 250 +++++++++++++++++++++++++++++++++++++-------------
>  include/exec/cputlb.h |   2 -
>  include/qom/cpu.h     |  12 +--
>  3 files changed, 194 insertions(+), 70 deletions(-)
>
> diff --git a/cputlb.c b/cputlb.c
> index cd1ff71..ae94b7f 100644
> --- a/cputlb.c
> +++ b/cputlb.c
> @@ -68,6 +68,11 @@
>   * target_ulong even on 32 bit builds */
>  QEMU_BUILD_BUG_ON(sizeof(target_ulong) > sizeof(run_on_cpu_data));
>  
> +/* We currently can't handle more than 16 bits in the MMUIDX bitmask.
> + */
> +QEMU_BUILD_BUG_ON(NB_MMU_MODES > 16);
> +#define ALL_MMUIDX_BITS ((1 << NB_MMU_MODES) - 1)
> +
>  /* statistics */
>  int tlb_flush_count;
>  
> @@ -98,7 +103,7 @@ static void tlb_flush_nocheck(CPUState *cpu, int flush_global)
>  
>      tb_unlock();
>  
> -    atomic_mb_set(&cpu->pending_tlb_flush, false);
> +    atomic_mb_set(&cpu->pending_tlb_flush, 0);
>  }
>  
>  static void tlb_flush_global_async_work(CPUState *cpu, run_on_cpu_data data)
> @@ -121,7 +126,8 @@ static void tlb_flush_global_async_work(CPUState *cpu, run_on_cpu_data data)
>  void tlb_flush(CPUState *cpu, int flush_global)
>  {
>      if (cpu->created && !qemu_cpu_is_self(cpu)) {
> -        if (atomic_cmpxchg(&cpu->pending_tlb_flush, false, true) == true) {
> +        if (atomic_mb_read(&cpu->pending_tlb_flush) != ALL_MMUIDX_BITS) {
> +            atomic_mb_set(&cpu->pending_tlb_flush, ALL_MMUIDX_BITS);
>              async_run_on_cpu(cpu, tlb_flush_global_async_work,
>                               RUN_ON_CPU_HOST_INT(flush_global));
>          }
> @@ -130,39 +136,78 @@ void tlb_flush(CPUState *cpu, int flush_global)
>      }
>  }
>  
> -static inline void v_tlb_flush_by_mmuidx(CPUState *cpu, va_list argp)
> +static void tlb_flush_by_mmuidx_async_work(CPUState *cpu, run_on_cpu_data data)
>  {
>      CPUArchState *env = cpu->env_ptr;
> +    unsigned long mmu_idx_bitmask = data.host_ulong;
> +    int mmu_idx;
>  
>      assert_cpu_is_self(cpu);
> -    tlb_debug("start\n");
>  
>      tb_lock();
>  
> -    for (;;) {
> -        int mmu_idx = va_arg(argp, int);
> +    tlb_debug("start: mmu_idx:0x%04lx\n", mmu_idx_bitmask);
>  
> -        if (mmu_idx < 0) {
> -            break;
> -        }
> +    for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
>  
> -        tlb_debug("%d\n", mmu_idx);
> +        if (test_bit(mmu_idx, &mmu_idx_bitmask)) {
> +            tlb_debug("%d\n", mmu_idx);
>  
> -        memset(env->tlb_table[mmu_idx], -1, sizeof(env->tlb_table[0]));
> -        memset(env->tlb_v_table[mmu_idx], -1, sizeof(env->tlb_v_table[0]));
> +            memset(env->tlb_table[mmu_idx], -1, sizeof(env->tlb_table[0]));
> +            memset(env->tlb_v_table[mmu_idx], -1, sizeof(env->tlb_v_table[0]));
> +        }
>      }
>  
>      memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache));
>  
> +    tlb_debug("done\n");
> +
>      tb_unlock();
>  }
>  
> +/* Helper function to slurp va_args list into a bitmap
> + */
> +static inline unsigned long make_mmu_index_bitmap(va_list args)
> +{
> +    unsigned long bitmap = 0;
> +    int mmu_index = va_arg(args, int);
> +
> +    /* An empty va_list would be a bad call */
> +    g_assert(mmu_index > 0);
> +
> +    do {
> +        set_bit(mmu_index, &bitmap);
> +        mmu_index = va_arg(args, int);
> +    } while (mmu_index >= 0);
> +
> +    return bitmap;
> +}
> +
>  void tlb_flush_by_mmuidx(CPUState *cpu, ...)
>  {
>      va_list argp;
> +    unsigned long mmu_idx_bitmap;
> +
>      va_start(argp, cpu);
> -    v_tlb_flush_by_mmuidx(cpu, argp);
> +    mmu_idx_bitmap = make_mmu_index_bitmap(argp);
>      va_end(argp);
> +
> +    tlb_debug("mmu_idx: 0x%04lx\n", mmu_idx_bitmap);
> +
> +    if (!qemu_cpu_is_self(cpu)) {
> +        uint16_t pending_flushes =
> +            mmu_idx_bitmap & ~atomic_mb_read(&cpu->pending_tlb_flush);
> +        if (pending_flushes) {
> +            tlb_debug("reduced mmu_idx: 0x%" PRIx16 "\n", pending_flushes);
> +
> +            atomic_or(&cpu->pending_tlb_flush, pending_flushes);
> +            async_run_on_cpu(cpu, tlb_flush_by_mmuidx_async_work,
> +                             RUN_ON_CPU_HOST_INT(pending_flushes));
> +        }
> +    } else {
> +        tlb_flush_by_mmuidx_async_work(cpu,
> +                                       RUN_ON_CPU_HOST_ULONG(mmu_idx_bitmap));
> +    }
>  }
>  
>  static inline void tlb_flush_entry(CPUTLBEntry *tlb_entry, target_ulong addr)
> @@ -227,16 +272,50 @@ void tlb_flush_page(CPUState *cpu, target_ulong addr)
>      }
>  }
>  
> -void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, ...)
> +/* As we are going to hijack the bottom bits of the page address for a
> + * mmuidx bit mask we need to fail to build if we can't do that
> + */
> +QEMU_BUILD_BUG_ON(NB_MMU_MODES > TARGET_PAGE_BITS_MIN);
> +
> +static void tlb_flush_page_by_mmuidx_async_work(CPUState *cpu,
> +                                                run_on_cpu_data data)
>  {
>      CPUArchState *env = cpu->env_ptr;
> -    int i, k;
> -    va_list argp;
> -
> -    va_start(argp, addr);
> +    target_ulong addr_and_mmuidx = (target_ulong) data.target_ptr;
> +    target_ulong addr = addr_and_mmuidx & TARGET_PAGE_MASK;
> +    unsigned long mmu_idx_bitmap = addr_and_mmuidx & ALL_MMUIDX_BITS;
> +    int page = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
> +    int mmu_idx;
> +    int i;
>  
>      assert_cpu_is_self(cpu);
> -    tlb_debug("addr "TARGET_FMT_lx"\n", addr);
> +
> +    tlb_debug("page:%d addr:"TARGET_FMT_lx" mmu_idx%" PRIxPTR "\n",
> +              page, addr, mmu_idx_bitmap);
> +
> +    for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
> +        if (test_bit(mmu_idx, &mmu_idx_bitmap)) {
> +            tlb_flush_entry(&env->tlb_table[mmu_idx][page], addr);
> +
> +            /* check whether there are vltb entries that need to be flushed */
> +            for (i = 0; i < CPU_VTLB_SIZE; i++) {
> +                tlb_flush_entry(&env->tlb_v_table[mmu_idx][i], addr);
> +            }
> +        }
> +    }
> +
> +    tb_flush_jmp_cache(cpu, addr);
> +}
> +
> +static void tlb_check_page_and_flush_by_mmuidx_async_work(CPUState *cpu,
> +                                                          run_on_cpu_data data)
> +{
> +    CPUArchState *env = cpu->env_ptr;
> +    target_ulong addr_and_mmuidx = (target_ulong) data.target_ptr;
> +    target_ulong addr = addr_and_mmuidx & TARGET_PAGE_MASK;
> +    unsigned long mmu_idx_bitmap = addr_and_mmuidx & ALL_MMUIDX_BITS;
> +
> +    tlb_debug("addr:"TARGET_FMT_lx" mmu_idx: %04lx\n", addr, mmu_idx_bitmap);
>  
>      /* Check if we need to flush due to large pages.  */
>      if ((addr & env->tlb_flush_mask) == env->tlb_flush_addr) {
> @@ -244,33 +323,35 @@ void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, ...)
>                    TARGET_FMT_lx "/" TARGET_FMT_lx ")\n",
>                    env->tlb_flush_addr, env->tlb_flush_mask);
>  
> -        v_tlb_flush_by_mmuidx(cpu, argp);
> -        va_end(argp);
> -        return;
> +        tlb_flush_by_mmuidx_async_work(cpu, RUN_ON_CPU_HOST_ULONG(mmu_idx_bitmap));
> +    } else {
> +        tlb_flush_page_by_mmuidx_async_work(cpu, data);
>      }
> +}
>  
> -    addr &= TARGET_PAGE_MASK;
> -    i = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
> -
> -    for (;;) {
> -        int mmu_idx = va_arg(argp, int);
> +void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, ...)
> +{
> +    unsigned long mmu_idx_bitmap;
> +    target_ulong addr_and_mmu_idx;
> +    va_list argp;
>  
> -        if (mmu_idx < 0) {
> -            break;
> -        }
> +    va_start(argp, addr);
> +    mmu_idx_bitmap = make_mmu_index_bitmap(argp);
> +    va_end(argp);
>  
> -        tlb_debug("idx %d\n", mmu_idx);
> +    tlb_debug("addr: "TARGET_FMT_lx" mmu_idx:%lx\n", addr, mmu_idx_bitmap);
>  
> -        tlb_flush_entry(&env->tlb_table[mmu_idx][i], addr);
> +    /* This should already be page aligned */
> +    addr_and_mmu_idx = addr & TARGET_PAGE_MASK;
> +    addr_and_mmu_idx |= mmu_idx_bitmap;
>  
> -        /* check whether there are vltb entries that need to be flushed */
> -        for (k = 0; k < CPU_VTLB_SIZE; k++) {
> -            tlb_flush_entry(&env->tlb_v_table[mmu_idx][k], addr);
> -        }
> +    if (!qemu_cpu_is_self(cpu)) {
> +        async_run_on_cpu(cpu, tlb_check_page_and_flush_by_mmuidx_async_work,
> +                         RUN_ON_CPU_TARGET_PTR(addr_and_mmu_idx));
> +    } else {
> +        tlb_check_page_and_flush_by_mmuidx_async_work(
> +            cpu, RUN_ON_CPU_TARGET_PTR(addr_and_mmu_idx));
>      }
> -    va_end(argp);
> -
> -    tb_flush_jmp_cache(cpu, addr);
>  }
>  
>  void tlb_flush_page_all(target_ulong addr)
> @@ -298,32 +379,50 @@ void tlb_unprotect_code(ram_addr_t ram_addr)
>      cpu_physical_memory_set_dirty_flag(ram_addr, DIRTY_MEMORY_CODE);
>  }
>  
> -static bool tlb_is_dirty_ram(CPUTLBEntry *tlbe)
> -{
> -    return (tlbe->addr_write & (TLB_INVALID_MASK|TLB_MMIO|TLB_NOTDIRTY)) == 0;
> -}
>  
> -void tlb_reset_dirty_range(CPUTLBEntry *tlb_entry, uintptr_t start,
> +/*
> + * Dirty write flag handling
> + *
> + * When the TCG code writes to a location it looks up the address in
> + * the TLB and uses that data to compute the final address. If any of
> + * the lower bits of the address are set then the slow path is forced.
> + * There are a number of reasons to do this but for normal RAM the
> + * most usual is detecting writes to code regions which may invalidate
> + * generated code.
> + *
> + * Because we want other vCPUs to respond to changes straight away we
> + * update the te->addr_write field atomically. If the TLB entry has
> + * been changed by the vCPU in the mean time we skip the update.
> + */
> +
> +static void tlb_reset_dirty_range(CPUTLBEntry *tlb_entry, uintptr_t start,
>                             uintptr_t length)
>  {
> -    uintptr_t addr;
> +    /* paired with atomic_mb_set in tlb_set_page_with_attrs */
> +    uintptr_t orig_addr = atomic_mb_read(&tlb_entry->addr_write);
> +    uintptr_t addr = orig_addr;
>  
> -    if (tlb_is_dirty_ram(tlb_entry)) {
> -        addr = (tlb_entry->addr_write & TARGET_PAGE_MASK) + tlb_entry->addend;
> +    if ((addr & (TLB_INVALID_MASK | TLB_MMIO | TLB_NOTDIRTY)) == 0) {
> +        addr &= TARGET_PAGE_MASK;
> +        addr += atomic_read(&tlb_entry->addend);
>          if ((addr - start) < length) {
> -            tlb_entry->addr_write |= TLB_NOTDIRTY;
> +            uintptr_t notdirty_addr = orig_addr | TLB_NOTDIRTY;
> +            atomic_cmpxchg(&tlb_entry->addr_write, orig_addr, notdirty_addr);
>          }
>      }
>  }
>  
> +/* This is a cross vCPU call (i.e. another vCPU resetting the flags of
> + * the target vCPU). As such care needs to be taken that we don't
> + * dangerously race with another vCPU update. The only thing actually
> + * updated is the target TLB entry ->addr_write flags.
> + */
>  void tlb_reset_dirty(CPUState *cpu, ram_addr_t start1, ram_addr_t length)
>  {
>      CPUArchState *env;
>  
>      int mmu_idx;
>  
> -    assert_cpu_is_self(cpu);
> -
>      env = cpu->env_ptr;
>      for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
>          unsigned int i;
> @@ -409,9 +508,9 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
>      MemoryRegionSection *section;
>      unsigned int index;
>      target_ulong address;
> -    target_ulong code_address;
> +    target_ulong code_address, write_address;
>      uintptr_t addend;
> -    CPUTLBEntry *te;
> +    CPUTLBEntry *te, *tv;
>      hwaddr iotlb, xlat, sz;
>      unsigned vidx = env->vtlb_index++ % CPU_VTLB_SIZE;
>      int asidx = cpu_asidx_from_attrs(cpu, attrs);
> @@ -446,15 +545,21 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
>  
>      index = (vaddr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
>      te = &env->tlb_table[mmu_idx][index];
> -
>      /* do not discard the translation in te, evict it into a victim tlb */
> -    env->tlb_v_table[mmu_idx][vidx] = *te;
> +    tv = &env->tlb_v_table[mmu_idx][vidx];
> +
> +    /* addr_write can race with tlb_reset_dirty_range_all */
> +    tv->addr_read = te->addr_read;
> +    atomic_set(&tv->addr_write, atomic_read(&te->addr_write));
> +    tv->addr_code = te->addr_code;
> +    atomic_set(&tv->addend, atomic_read(&te->addend));
> +
>      env->iotlb_v[mmu_idx][vidx] = env->iotlb[mmu_idx][index];
>  
>      /* refill the tlb */
>      env->iotlb[mmu_idx][index].addr = iotlb - vaddr;
>      env->iotlb[mmu_idx][index].attrs = attrs;
> -    te->addend = addend - vaddr;
> +    atomic_set(&te->addend, addend - vaddr);
>      if (prot & PAGE_READ) {
>          te->addr_read = address;
>      } else {
> @@ -466,21 +571,24 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
>      } else {
>          te->addr_code = -1;
>      }
> +
> +    write_address = -1;
>      if (prot & PAGE_WRITE) {
>          if ((memory_region_is_ram(section->mr) && section->readonly)
>              || memory_region_is_romd(section->mr)) {
>              /* Write access calls the I/O callback.  */
> -            te->addr_write = address | TLB_MMIO;
> +            write_address = address | TLB_MMIO;
>          } else if (memory_region_is_ram(section->mr)
>                     && cpu_physical_memory_is_clean(
>                          memory_region_get_ram_addr(section->mr) + xlat)) {
> -            te->addr_write = address | TLB_NOTDIRTY;
> +            write_address = address | TLB_NOTDIRTY;
>          } else {
> -            te->addr_write = address;
> +            write_address = address;
>          }
> -    } else {
> -        te->addr_write = -1;
>      }
> +
> +    /* Pairs with flag setting in tlb_reset_dirty_range */
> +    atomic_mb_set(&te->addr_write, write_address);
>  }
>  
>  /* Add a new TLB entry, but without specifying the memory
> @@ -643,10 +751,28 @@ static bool victim_tlb_hit(CPUArchState *env, size_t mmu_idx, size_t index,
>          if (cmp == page) {
>              /* Found entry in victim tlb, swap tlb and iotlb.  */
>              CPUTLBEntry tmptlb, *tlb = &env->tlb_table[mmu_idx][index];
> +
> +            /* tmptlb = *tlb; */
> +            /* addr_write can race with tlb_reset_dirty_range_all */
> +            tmptlb.addr_read = tlb->addr_read;
> +            tmptlb.addr_write = atomic_read(&tlb->addr_write);
> +            tmptlb.addr_code = tlb->addr_code;
> +            tmptlb.addend = atomic_read(&tlb->addend);
> +
> +            /* *tlb = *vtlb; */
> +            tlb->addr_read = vtlb->addr_read;
> +            atomic_set(&tlb->addr_write, atomic_read(&vtlb->addr_write));
> +            tlb->addr_code = vtlb->addr_code;
> +            atomic_set(&tlb->addend, atomic_read(&vtlb->addend));
> +
> +            /* *vtlb = tmptlb; */
> +            vtlb->addr_read = tmptlb.addr_read;
> +            atomic_set(&vtlb->addr_write, tmptlb.addr_write);
> +            vtlb->addr_code = tmptlb.addr_code;
> +            atomic_set(&vtlb->addend, tmptlb.addend);
> +
>              CPUIOTLBEntry tmpio, *io = &env->iotlb[mmu_idx][index];
>              CPUIOTLBEntry *vio = &env->iotlb_v[mmu_idx][vidx];
> -
> -            tmptlb = *tlb; *tlb = *vtlb; *vtlb = tmptlb;
>              tmpio = *io; *io = *vio; *vio = tmpio;
>              return true;
>          }
> diff --git a/include/exec/cputlb.h b/include/exec/cputlb.h
> index d454c00..3f94178 100644
> --- a/include/exec/cputlb.h
> +++ b/include/exec/cputlb.h
> @@ -23,8 +23,6 @@
>  /* cputlb.c */
>  void tlb_protect_code(ram_addr_t ram_addr);
>  void tlb_unprotect_code(ram_addr_t ram_addr);
> -void tlb_reset_dirty_range(CPUTLBEntry *tlb_entry, uintptr_t start,
> -                           uintptr_t length);
>  extern int tlb_flush_count;
>  
>  #endif
> diff --git a/include/qom/cpu.h b/include/qom/cpu.h
> index 880ba42..d945221 100644
> --- a/include/qom/cpu.h
> +++ b/include/qom/cpu.h
> @@ -388,17 +388,17 @@ struct CPUState {
>       */
>      bool throttle_thread_scheduled;
>  
> +    /* The pending_tlb_flush flag is set and cleared atomically to
> +     * avoid potential races. The aim of the flag is to avoid
> +     * unnecessary flushes.
> +     */
> +    uint16_t pending_tlb_flush;
> +
>      /* Note that this is accessed at the start of every TB via a negative
>         offset from AREG0.  Leave this field at the end so as to make the
>         (absolute value) offset as small as possible.  This reduces code
>         size, especially for hosts without large memory offsets.  */
>      uint32_t tcg_exit_req;
> -
> -    /* The pending_tlb_flush flag is set and cleared atomically to
> -     * avoid potential races. The aim of the flag is to avoid
> -     * unnecessary flushes.
> -     */
> -    bool pending_tlb_flush;
>  };
>  
>  QTAILQ_HEAD(CPUTailQ, CPUState);
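
For anyone reading the quoted tlb_flush_page_by_mmuidx hunk above: the trick of carrying the mmuidx bitmap in the low bits of a page-aligned address can be sketched stand-alone as below. This is only an illustration, not the QEMU code; PAGE_BITS, NB_MODES and the helper names are made-up stand-ins for TARGET_PAGE_BITS_MIN, NB_MMU_MODES and the patch's inline masking.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only; in QEMU these are per-target constants. */
#define PAGE_BITS     10
#define PAGE_MASK     (~(((uint64_t)1 << PAGE_BITS) - 1))
#define NB_MODES      7
#define ALL_MODE_BITS (((uint64_t)1 << NB_MODES) - 1)

/* Same idea as the QEMU_BUILD_BUG_ON in the patch: refuse to build if
 * the bitmap would not fit in the bits below the page boundary. */
_Static_assert(NB_MODES <= PAGE_BITS,
               "mmuidx bitmap must fit below the page bits");

/* Combine a page address and an mmuidx bitmap into one word. */
static uint64_t pack_addr_and_mmuidx(uint64_t addr, unsigned bitmap)
{
    return (addr & PAGE_MASK) | (bitmap & ALL_MODE_BITS);
}

/* Recover the page address on the receiving side. */
static uint64_t unpack_addr(uint64_t packed)
{
    return packed & PAGE_MASK;
}

/* Recover the mmuidx bitmap on the receiving side. */
static unsigned unpack_mmuidx(uint64_t packed)
{
    return (unsigned)(packed & ALL_MODE_BITS);
}
```

Packing both values into one word is what lets the request travel as a single run_on_cpu_data argument, at the cost of the build-time restriction on the number of MMU modes.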

-- 
Pranith


* Re: [Qemu-devel] [PATCH v6 01/19] docs: new design document multi-thread-tcg.txt
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 01/19] docs: new design document multi-thread-tcg.txt Alex Bennée
@ 2016-11-10 15:00   ` Richard Henderson
  0 siblings, 0 replies; 50+ messages in thread
From: Richard Henderson @ 2016-11-10 15:00 UTC (permalink / raw)
  To: Alex Bennée, pbonzini
  Cc: mttcg, peter.maydell, claudio.fontana, nikunj, jan.kiszka,
	mark.burton, a.rigo, qemu-devel, cota, serge.fdrv, bobby.prani,
	fred.konrad

On 11/09/2016 03:57 PM, Alex Bennée wrote:
> This documents the current design for upgrading TCG emulation to take
> advantage of modern CPUs by running a thread-per-CPU. The document goes
> through the various areas of the code affected by such a change and
> proposes design requirements for each part of the solution.
>
> The text marked with (Current solution[s]) documents the approaches
> currently being used.
>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

Reviewed-by: Richard Henderson <rth@twiddle.net>

> +    - target-i386
> +    - target-arm
> +    - target-aarch64
> +    - target-alpha

If you're going to list these, target-mips is also updated.


r~


* Re: [Qemu-devel] [PATCH v6 03/19] tcg: add kick timer for single-threaded vCPU emulation
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 03/19] tcg: add kick timer for single-threaded vCPU emulation Alex Bennée
@ 2016-11-10 15:10   ` Richard Henderson
  0 siblings, 0 replies; 50+ messages in thread
From: Richard Henderson @ 2016-11-10 15:10 UTC (permalink / raw)
  To: Alex Bennée, pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, peter.maydell,
	claudio.fontana, Peter Crosthwaite

On 11/09/2016 03:57 PM, Alex Bennée wrote:
> Currently we rely on the side effect of the main loop grabbing the
> iothread_mutex to give any long running basic block chains a kick to
> ensure the next vCPU is scheduled. As this code is being re-factored and
> rationalised we now do it explicitly here.
>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~


* Re: [Qemu-devel] [PATCH v6 05/19] tcg: drop global lock during TCG code execution
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 05/19] tcg: drop global lock during TCG code execution Alex Bennée
@ 2016-11-10 15:18   ` Richard Henderson
  0 siblings, 0 replies; 50+ messages in thread
From: Richard Henderson @ 2016-11-10 15:18 UTC (permalink / raw)
  To: Alex Bennée, pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, peter.maydell,
	claudio.fontana, Peter Crosthwaite, Michael S. Tsirkin,
	Eduardo Habkost, David Gibson, Alexander Graf, open list:sPAPR

On 11/09/2016 03:57 PM, Alex Bennée wrote:
> From: Jan Kiszka <jan.kiszka@siemens.com>
>
> This finally allows TCG to benefit from the iothread introduction: Drop
> the global mutex while running pure TCG CPU code. Reacquire the lock
> when entering MMIO or PIO emulation, or when leaving the TCG loop.
>
> We have to revert a few optimizations for the current TCG threading
> model, namely kicking the TCG thread in qemu_mutex_lock_iothread and not
> kicking it in qemu_cpu_kick. We also need to disable RAM block
> reordering until we have a more efficient locking mechanism at hand.
>
> Still, a Linux x86 UP guest and my Musicpal ARM model boot fine here.
> These numbers demonstrate where we gain something:
>
> 20338 jan       20   0  331m  75m 6904 R   99  0.9   0:50.95 qemu-system-arm
> 20337 jan       20   0  331m  75m 6904 S   20  0.9   0:26.50 qemu-system-arm
>
> The guest CPU was fully loaded, but the iothread could still run mostly
> independently on a second core. Without the patch we don't get beyond
>
> 32206 jan       20   0  330m  73m 7036 R   82  0.9   1:06.00 qemu-system-arm
> 32204 jan       20   0  330m  73m 7036 S   21  0.9   0:17.03 qemu-system-arm
>
> We don't benefit significantly, though, when the guest is not fully
> loading a host CPU.
>
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> Message-Id: <1439220437-23957-10-git-send-email-fred.konrad@greensocs.com>
> [FK: Rebase, fix qemu_devices_reset deadlock, rm address_space_* mutex]
> Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
> [EGC: fixed iothread lock for cpu-exec IRQ handling]
> Signed-off-by: Emilio G. Cota <cota@braap.org>
> [AJB: -smp single-threaded fix, rm old info from commit msg, review updates]
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>
> ---

Reviewed-by: Richard Henderson <rth@twiddle.net>

r~


* Re: [Qemu-devel] [PATCH v6 13/19] cputlb: atomically update tlb fields used by tlb_reset_dirty
  2016-11-09 19:36   ` Pranith Kumar
@ 2016-11-10 16:14     ` Alex Bennée
  2016-11-10 17:27       ` Richard Henderson
  0 siblings, 1 reply; 50+ messages in thread
From: Alex Bennée @ 2016-11-10 16:14 UTC (permalink / raw)
  To: Pranith Kumar
  Cc: pbonzini, qemu-devel, mttcg, fred.konrad, a.rigo, cota, nikunj,
	mark.burton, jan.kiszka, serge.fdrv, rth, peter.maydell,
	claudio.fontana, Peter Crosthwaite


Pranith Kumar <bobby.prani@gmail.com> writes:

> Hi Alex,
>
> This patch is causing some build errors on a 32-bit box:
>
> In file included from /home/pranith/qemu/include/exec/exec-all.h:44:0,
>                  from /home/pranith/qemu/cputlb.c:23:
> /home/pranith/qemu/cputlb.c: In function ‘tlb_flush_page_by_mmuidx_async_work’:
> /home/pranith/qemu/cputlb.c:54:36: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 5 has type ‘long unsigned int’ [-Werror=format=]
>          qemu_log_mask(CPU_LOG_MMU, "%s: " fmt, __func__, \
>                                     ^
> /home/pranith/qemu/include/qemu/log.h:94:22: note: in definition of macro ‘qemu_log_mask’
>              qemu_log(FMT, ## __VA_ARGS__);              \
>                       ^~~
> /home/pranith/qemu/cputlb.c:286:5: note: in expansion of macro ‘tlb_debug’
>      tlb_debug("page:%d addr:"TARGET_FMT_lx" mmu_idx%" PRIxPTR "\n",
>      ^~~~~~~~~
> /home/pranith/qemu/cputlb.c:57:25: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 6 has type ‘long unsigned int’ [-Werror=format=]
>          fprintf(stderr, "%s: " fmt, __func__, ## __VA_ARGS__); \
>                          ^
> /home/pranith/qemu/cputlb.c:286:5: note: in expansion of macro ‘tlb_debug’
>      tlb_debug("page:%d addr:"TARGET_FMT_lx" mmu_idx%" PRIxPTR "\n",
>      ^~~~~~~~~
> cc1: all warnings being treated as errors
>
> Thanks,
>
<snip>
>> +/*
>> + * Dirty write flag handling
>> + *
>> + * When the TCG code writes to a location it looks up the address in
>> + * the TLB and uses that data to compute the final address. If any of
>> + * the lower bits of the address are set then the slow path is forced.
>> + * There are a number of reasons to do this but for normal RAM the
>> + * most usual is detecting writes to code regions which may invalidate
>> + * generated code.
>> + *
>> + * Because we want other vCPUs to respond to changes straight away we
>> + * update the te->addr_write field atomically. If the TLB entry has
>> + * been changed by the vCPU in the mean time we skip the update.
>> + */
>> +
>> +static void tlb_reset_dirty_range(CPUTLBEntry *tlb_entry, uintptr_t start,
>>                             uintptr_t length)
>>  {
>> -    uintptr_t addr;
>> +    /* paired with atomic_mb_set in tlb_set_page_with_attrs */
>> +    uintptr_t orig_addr = atomic_mb_read(&tlb_entry->addr_write);
>> +    uintptr_t addr = orig_addr;
>>
>> -    if (tlb_is_dirty_ram(tlb_entry)) {
>> -        addr = (tlb_entry->addr_write & TARGET_PAGE_MASK) + tlb_entry->addend;
>> +    if ((addr & (TLB_INVALID_MASK | TLB_MMIO | TLB_NOTDIRTY)) == 0) {
>> +        addr &= TARGET_PAGE_MASK;
>> +        addr += atomic_read(&tlb_entry->addend);
>>          if ((addr - start) < length) {
>> -            tlb_entry->addr_write |= TLB_NOTDIRTY;
>> +            uintptr_t notdirty_addr = orig_addr | TLB_NOTDIRTY;
>> +            atomic_cmpxchg(&tlb_entry->addr_write, orig_addr, notdirty_addr);
>>          }
>>      }
>>  }

Even worse than that we trip up the atomic.h QEMU_BUILD_BUG_ON with the
atomic_cmpxchg. Now I believe we can use atomic_cmpxchg__nocheck without
too much issue on x86 but we'll need to #ifdef it on detection of wide
atomics.
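
To make the "skip the update if the entry changed" idea concrete, here is a minimal stand-alone sketch using C11 atomics rather than QEMU's atomic.h helpers. FLAG_NOTDIRTY and set_notdirty_if_unchanged are hypothetical names for illustration. The sketch deliberately uses a host-pointer-sized word, which sidesteps the width problem; as I understand it the build failure comes from addr_write being wider than a host pointer when a 64-bit guest is built on a 32-bit host.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit standing in for TLB_NOTDIRTY. */
#define FLAG_NOTDIRTY ((uintptr_t)1 << 4)

/*
 * Set FLAG_NOTDIRTY only if the word still holds the value we sampled
 * earlier. If the owning vCPU rewrote the entry in the meantime the
 * compare fails and the word is left alone.
 * Returns true if the flag was applied.
 */
static bool set_notdirty_if_unchanged(_Atomic uintptr_t *word,
                                      uintptr_t sampled)
{
    uintptr_t expected = sampled;

    return atomic_compare_exchange_strong(word, &expected,
                                          sampled | FLAG_NOTDIRTY);
}
```

A failed exchange is not retried here, which matches the behaviour described in the comment block of the patch: a concurrent refill wins and the stale update is dropped.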

--
Alex Bennée


* Re: [Qemu-devel] [PATCH v6 08/19] tcg: enable thread-per-vCPU
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 08/19] tcg: enable thread-per-vCPU Alex Bennée
@ 2016-11-10 16:35   ` Richard Henderson
  2016-11-10 16:46     ` Alex Bennée
  0 siblings, 1 reply; 50+ messages in thread
From: Richard Henderson @ 2016-11-10 16:35 UTC (permalink / raw)
  To: Alex Bennée, pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, peter.maydell,
	claudio.fontana, Peter Crosthwaite

On 11/09/2016 03:57 PM, Alex Bennée wrote:
> +    if (qemu_tcg_mttcg_enabled() || !single_tcg_cpu_thread) {
> +        parallel_cpus = true;

Why are we setting this here,

>          cpu->thread = g_malloc0(sizeof(QemuThread));
>          cpu->halt_cond = g_malloc0(sizeof(QemuCond));
>          qemu_cond_init(cpu->halt_cond);
> +
> +        if (qemu_tcg_mttcg_enabled()) {
> +            /* create a thread per vCPU with TCG (MTTCG) */

and not here?

> +            snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG",
>                   cpu->cpu_index);
> -        qemu_thread_create(cpu->thread, thread_name, qemu_tcg_cpu_thread_fn,
> -                           cpu, QEMU_THREAD_JOINABLE);
> +
> +            qemu_thread_create(cpu->thread, thread_name, qemu_tcg_cpu_thread_fn,
> +                               cpu, QEMU_THREAD_JOINABLE);
> +
> +        } else {
> +            /* share a single thread for all cpus with TCG */
> +            snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "ALL CPUs/TCG");
> +            qemu_thread_create(cpu->thread, thread_name,

Otherwise,

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~


* Re: [Qemu-devel] [PATCH v6 09/19] tcg: handle EXCP_ATOMIC exception for system emulation
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 09/19] tcg: handle EXCP_ATOMIC exception for system emulation Alex Bennée
@ 2016-11-10 16:36   ` Richard Henderson
  0 siblings, 0 replies; 50+ messages in thread
From: Richard Henderson @ 2016-11-10 16:36 UTC (permalink / raw)
  To: Alex Bennée, pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, peter.maydell,
	claudio.fontana, Peter Crosthwaite

On 11/09/2016 03:57 PM, Alex Bennée wrote:
> From: Pranith Kumar <bobby.prani@gmail.com>
>
> The patch enables handling atomic code in the guest. This should be
> preferably done in cpu_handle_exception(), but the current assumptions
> regarding when we can execute atomic sections cause a deadlock.
>
> Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
> [AJB: tweak title]
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  cpus.c | 9 +++++++++
>  1 file changed, 9 insertions(+)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~


* Re: [Qemu-devel] [PATCH v6 10/19] cputlb: add assert_cpu_is_self checks
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 10/19] cputlb: add assert_cpu_is_self checks Alex Bennée
@ 2016-11-10 16:39   ` Richard Henderson
  0 siblings, 0 replies; 50+ messages in thread
From: Richard Henderson @ 2016-11-10 16:39 UTC (permalink / raw)
  To: Alex Bennée, pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, peter.maydell,
	claudio.fontana, Peter Crosthwaite

On 11/09/2016 03:57 PM, Alex Bennée wrote:
> For SoftMMU the TLB flushes are an example of a task that can be
> triggered on one vCPU by another. To deal with this properly we need to
> use safe work to ensure these changes are done safely. The new assert
> can be enabled while debugging to catch these cases.
>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  cputlb.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~


* Re: [Qemu-devel] [PATCH v6 08/19] tcg: enable thread-per-vCPU
  2016-11-10 16:35   ` Richard Henderson
@ 2016-11-10 16:46     ` Alex Bennée
  0 siblings, 0 replies; 50+ messages in thread
From: Alex Bennée @ 2016-11-10 16:46 UTC (permalink / raw)
  To: Richard Henderson
  Cc: pbonzini, qemu-devel, mttcg, fred.konrad, a.rigo, cota,
	bobby.prani, nikunj, mark.burton, jan.kiszka, serge.fdrv,
	peter.maydell, claudio.fontana, Peter Crosthwaite


Richard Henderson <rth@twiddle.net> writes:

> On 11/09/2016 03:57 PM, Alex Bennée wrote:
>> +    if (qemu_tcg_mttcg_enabled() || !single_tcg_cpu_thread) {
>> +        parallel_cpus = true;
>
> Why are we setting this here,
>
>>          cpu->thread = g_malloc0(sizeof(QemuThread));
>>          cpu->halt_cond = g_malloc0(sizeof(QemuCond));
>>          qemu_cond_init(cpu->halt_cond);
>> +
>> +        if (qemu_tcg_mttcg_enabled()) {
>> +            /* create a thread per vCPU with TCG (MTTCG) */
>
> and not here?

Good point, I'll fix that.

>
>> +            snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG",
>>                   cpu->cpu_index);
>> -        qemu_thread_create(cpu->thread, thread_name, qemu_tcg_cpu_thread_fn,
>> -                           cpu, QEMU_THREAD_JOINABLE);
>> +
>> +            qemu_thread_create(cpu->thread, thread_name, qemu_tcg_cpu_thread_fn,
>> +                               cpu, QEMU_THREAD_JOINABLE);
>> +
>> +        } else {
>> +            /* share a single thread for all cpus with TCG */
>> +            snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "ALL CPUs/TCG");
>> +            qemu_thread_create(cpu->thread, thread_name,
>
> Otherwise,
>
> Reviewed-by: Richard Henderson <rth@twiddle.net>
>
>
> r~


--
Alex Bennée


* Re: [Qemu-devel] [PATCH v6 11/19] cputlb: introduce tlb_flush_* async work.
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 11/19] cputlb: introduce tlb_flush_* async work Alex Bennée
@ 2016-11-10 16:48   ` Richard Henderson
  2016-11-10 17:34     ` Alex Bennée
  0 siblings, 1 reply; 50+ messages in thread
From: Richard Henderson @ 2016-11-10 16:48 UTC (permalink / raw)
  To: Alex Bennée, pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, peter.maydell,
	claudio.fontana, Peter Crosthwaite

On 11/09/2016 03:57 PM, Alex Bennée wrote:
> +void tlb_flush_page_all(target_ulong addr)

It's a nit, but when I read this I think all pages, not all cpus.
Can we rename this tlb_flush_page_all_cpus?

Otherwise,

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~


* Re: [Qemu-devel] [PATCH v6 12/19] cputlb: tweak qemu_ram_addr_from_host_nofail reporting
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 12/19] cputlb: tweak qemu_ram_addr_from_host_nofail reporting Alex Bennée
@ 2016-11-10 16:51   ` Richard Henderson
  0 siblings, 0 replies; 50+ messages in thread
From: Richard Henderson @ 2016-11-10 16:51 UTC (permalink / raw)
  To: Alex Bennée, pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, peter.maydell,
	claudio.fontana, Peter Crosthwaite

On 11/09/2016 03:57 PM, Alex Bennée wrote:
> This moves the helper function closer to where it is called and updates
> the error message to report via error_report instead of the deprecated
> fprintf.
>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  cputlb.c | 24 ++++++++++++------------
>  1 file changed, 12 in

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~


* Re: [Qemu-devel] [PATCH v6 13/19] cputlb: atomically update tlb fields used by tlb_reset_dirty
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 13/19] cputlb: atomically update tlb fields used by tlb_reset_dirty Alex Bennée
  2016-11-09 19:36   ` Pranith Kumar
@ 2016-11-10 17:23   ` Richard Henderson
  2016-11-10 18:07     ` Alex Bennée
  1 sibling, 1 reply; 50+ messages in thread
From: Richard Henderson @ 2016-11-10 17:23 UTC (permalink / raw)
  To: Alex Bennée, pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, peter.maydell,
	claudio.fontana, Peter Crosthwaite

On 11/09/2016 03:57 PM, Alex Bennée wrote:
> +/* We currently can't handle more than 16 bits in the MMUIDX bitmask.
> + */
> +QEMU_BUILD_BUG_ON(NB_MMU_MODES > 16);

We already assert <= 12 in exec/cpu_ldst.h.  Although really any such assert 
belongs in exec/cpu-defs.h, where we define CPU_TLB_BITS et al.

That said, what's the technical restriction here?


r~


* Re: [Qemu-devel] [PATCH v6 13/19] cputlb: atomically update tlb fields used by tlb_reset_dirty
  2016-11-10 16:14     ` Alex Bennée
@ 2016-11-10 17:27       ` Richard Henderson
  2016-11-10 18:00         ` Alex Bennée
  0 siblings, 1 reply; 50+ messages in thread
From: Richard Henderson @ 2016-11-10 17:27 UTC (permalink / raw)
  To: Alex Bennée, Pranith Kumar
  Cc: pbonzini, qemu-devel, mttcg, fred.konrad, a.rigo, cota, nikunj,
	mark.burton, jan.kiszka, serge.fdrv, peter.maydell,
	claudio.fontana, Peter Crosthwaite

On 11/10/2016 05:14 PM, Alex Bennée wrote:
> Even worse than that we trip up the atomic.h QEMU_BUILD_BUG_ON with the
> atomic_cmpxchg. Now I believe we can use atomic_cmpxchg__nocheck without
> too much issue on x86 but we'll need to #ifdef it on detection of wide
> atomics.

You've already got CONFIG_ATOMIC64.  And what's the fallback?

We ought not be enabling mttcg for 32-bit host and 64-bit guest at all.  But 
that doesn't help much here, where we're otherwise guest width agnostic.


r~


* Re: [Qemu-devel] [PATCH v6 11/19] cputlb: introduce tlb_flush_* async work.
  2016-11-10 16:48   ` Richard Henderson
@ 2016-11-10 17:34     ` Alex Bennée
  2016-11-10 17:40       ` Richard Henderson
  0 siblings, 1 reply; 50+ messages in thread
From: Alex Bennée @ 2016-11-10 17:34 UTC (permalink / raw)
  To: Richard Henderson
  Cc: pbonzini, qemu-devel, mttcg, fred.konrad, a.rigo, cota,
	bobby.prani, nikunj, mark.burton, jan.kiszka, serge.fdrv,
	peter.maydell, claudio.fontana, Peter Crosthwaite


Richard Henderson <rth@twiddle.net> writes:

> On 11/09/2016 03:57 PM, Alex Bennée wrote:
>> +void tlb_flush_page_all(target_ulong addr)
>
> It's a nit, but when I read this I think all pages, not all cpus.
> Can we rename this tlb_flush_page_all_cpus?

So to properly support ARM TLB flush semantics I want to move some of
the looping in the helpers into cputlb.c so I'm thinking we'll have:

tlb_flush_page_all_cpus
tlb_flush_by_mmuidx_all_cpus
tlb_flush_page_by_mmuidx_all_cpus

Which will have the initial parameters of at least

  CPUState *src, bool sync

Where src is the source vCPU of the flush request and sync will cause
the source vCPU to schedule its work as safe work and do a
cpu_loop_exit. This will allow the helpers to ensure TLB flushes are in
a known state after executing the helper.

In fact for ARM we'll be able to put off the reckoning until a DMB
instruction comes along and we can force synchronisation at that point
but I'm assuming there must be other architectures with stricter
requirements.


>
> Otherwise,
>
> Reviewed-by: Richard Henderson <rth@twiddle.net>
>
>
> r~


--
Alex Bennée


* Re: [Qemu-devel] [PATCH v6 14/19] target-arm/powerctl: defer cpu reset work to CPU context
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 14/19] target-arm/powerctl: defer cpu reset work to CPU context Alex Bennée
@ 2016-11-10 17:35   ` Richard Henderson
  0 siblings, 0 replies; 50+ messages in thread
From: Richard Henderson @ 2016-11-10 17:35 UTC (permalink / raw)
  To: Alex Bennée, pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, peter.maydell,
	claudio.fontana, open list:ARM

On 11/09/2016 03:57 PM, Alex Bennée wrote:
> +        /* We are requested to boot in AArch32 mode */
> +        static uint32_t mode_for_el[] = { 0,
> +                                          ARM_CPU_MODE_SVC,
> +                                          ARM_CPU_MODE_HYP,
> +                                          ARM_CPU_MODE_SVC };

I know you're just moving most of this code, but, const.

Otherwise,

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~


* Re: [Qemu-devel] [PATCH v6 11/19] cputlb: introduce tlb_flush_* async work.
  2016-11-10 17:34     ` Alex Bennée
@ 2016-11-10 17:40       ` Richard Henderson
  0 siblings, 0 replies; 50+ messages in thread
From: Richard Henderson @ 2016-11-10 17:40 UTC (permalink / raw)
  To: Alex Bennée
  Cc: pbonzini, qemu-devel, mttcg, fred.konrad, a.rigo, cota,
	bobby.prani, nikunj, mark.burton, jan.kiszka, serge.fdrv,
	peter.maydell, claudio.fontana, Peter Crosthwaite

On 11/10/2016 06:34 PM, Alex Bennée wrote:
> So to properly support ARM TLB flush semantics I want to move some of
> the looping in the helpers into cputlb.c so I'm thinking we'll have:
>
> tlb_flush_page_all_cpus
> tlb_flush_by_mmuidx_all_cpus
> tlb_flush_page_by_mmuidx_all_cpus

Sounds good, thanks.

> In fact for ARM we'll be able to put off the reckoning until a DMB
> instruction comes along and we can force synchronisation at that point
> but I'm assuming there must be other architectures with stricter
> requirements.

Yes, I can think of at least one arch for which the cross-cpu flush must finish 
before the source cpu continues.


r~


* Re: [Qemu-devel] [PATCH v6 15/19] target-arm/cpu: don't reset TLB structures, use cputlb to do it
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 15/19] target-arm/cpu: don't reset TLB structures, use cputlb to do it Alex Bennée
@ 2016-11-10 17:48   ` Richard Henderson
  2016-11-10 18:08     ` Alex Bennée
  0 siblings, 1 reply; 50+ messages in thread
From: Richard Henderson @ 2016-11-10 17:48 UTC (permalink / raw)
  To: Alex Bennée, pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, peter.maydell,
	claudio.fontana, open list:ARM

On 11/09/2016 03:57 PM, Alex Bennée wrote:
> +#ifdef CONFIG_SOFTMMU
> +    memset(env, 0, offsetof(CPUARMState, tlb_table));
> +    tlb_flush(s, 0);
> +#else
>      memset(env, 0, offsetof(CPUARMState, features));
> +#endif

I'd really prefer to see the tlb_flush be moved into parent_reset, so that we 
handle it identically for all targets.

As for the memset, do we really need to distinguish softmmu?  I don't like you 
picking out a variable name within CPU_COMMON.  Better to use empty struct 
markers, like the

       struct {} start_init_save;

that x86 uses.
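A minimal sketch of the empty-struct-marker idea, for illustration only. ToyCPUState, its fields and the marker name end_reset_fields are hypothetical; the zero-size struct member relies on a GNU C extension that gcc and clang accept.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct ToyCPUState {
    /* Fields that are cleared on reset. */
    unsigned regs[16];
    unsigned pc;

    /* Zero-size marker: everything above this point is reset. */
    struct {} end_reset_fields;

    /* Fields that must survive reset. */
    unsigned long features;
    int tlb_generation;
} ToyCPUState;

static void toy_cpu_reset(ToyCPUState *env)
{
    assert(env != NULL);
    /* The marker, not a real field name, defines the boundary. */
    memset(env, 0, offsetof(ToyCPUState, end_reset_fields));
}
```

The reset code then no longer depends on which real field happens to come first after the cleared region, and the same memset works with or without softmmu.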


r~


* Re: [Qemu-devel] [PATCH v6 16/19] target-arm: ensure BQL taken for ARM_CP_IO register access
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 16/19] target-arm: ensure BQL taken for ARM_CP_IO register access Alex Bennée
@ 2016-11-10 17:54   ` Richard Henderson
  0 siblings, 0 replies; 50+ messages in thread
From: Richard Henderson @ 2016-11-10 17:54 UTC (permalink / raw)
  To: Alex Bennée, pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, peter.maydell,
	claudio.fontana, open list:ARM cores

On 11/09/2016 03:57 PM, Alex Bennée wrote:
> Most ARMCPRegInfo structures just allow updating of the CPU field.
> However some have more complex operations that *may* have cross vCPU
> effects and therefore need to be serialised. The most obvious examples at the
> moment are things that affect the GICv3 IRQ controller. To avoid
> applying this requirement to all registers with custom access functions
> we check whether the type is marked ARM_CP_IO.
>
> By default all MMIO access to devices already takes the BQL to serialise
> hardware emulation.
>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  hw/intc/arm_gicv3_cpuif.c |  3 +++
>  target-arm/op_helper.c    | 39 +++++++++++++++++++++++++++++++++++----
>  2 files changed, 38 insertions(+), 4 deletions(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~


* Re: [Qemu-devel] [PATCH v6 17/19] target-arm: helpers which may affect global state need the BQL
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 17/19] target-arm: helpers which may affect global state need the BQL Alex Bennée
@ 2016-11-10 17:56   ` Richard Henderson
  0 siblings, 0 replies; 50+ messages in thread
From: Richard Henderson @ 2016-11-10 17:56 UTC (permalink / raw)
  To: Alex Bennée, pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, peter.maydell,
	claudio.fontana, open list:ARM

On 11/09/2016 03:57 PM, Alex Bennée wrote:
> As the arm_call_el_change_hook may affect global state (for example with
> updating the global GIC state) we need to assert/take the BQL.
>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  target-arm/helper.c    | 6 ++++++
>  target-arm/op_helper.c | 4 ++++
>  2 files changed, 10 insertions(+)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~


* Re: [Qemu-devel] [PATCH v6 18/19] target-arm: don't generate WFE/YIELD calls for MTTCG
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 18/19] target-arm: don't generate WFE/YIELD calls for MTTCG Alex Bennée
@ 2016-11-10 17:59   ` Richard Henderson
  0 siblings, 0 replies; 50+ messages in thread
From: Richard Henderson @ 2016-11-10 17:59 UTC (permalink / raw)
  To: Alex Bennée, pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, peter.maydell,
	claudio.fontana, open list:ARM

On 11/09/2016 03:57 PM, Alex Bennée wrote:
> The WFE and YIELD instructions are really only hints and in TCG's case
> they were useful to move the scheduling on from one vCPU to the next. In
> the parallel context (MTTCG) this just causes an unnecessary cpu_exit
> and contention of the BQL.
>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  target-arm/op_helper.c     |  7 +++++++
>  target-arm/translate-a64.c |  8 ++++++--
>  target-arm/translate.c     | 20 ++++++++++++++++----
>  3 files changed, 29 insertions(+), 6 deletions(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~


* Re: [Qemu-devel] [PATCH v6 13/19] cputlb: atomically update tlb fields used by tlb_reset_dirty
  2016-11-10 17:27       ` Richard Henderson
@ 2016-11-10 18:00         ` Alex Bennée
  2016-11-10 18:32           ` Richard Henderson
  0 siblings, 1 reply; 50+ messages in thread
From: Alex Bennée @ 2016-11-10 18:00 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Pranith Kumar, pbonzini, qemu-devel, mttcg, fred.konrad, a.rigo,
	cota, nikunj, mark.burton, jan.kiszka, serge.fdrv, peter.maydell,
	claudio.fontana, Peter Crosthwaite


Richard Henderson <rth@twiddle.net> writes:

> On 11/10/2016 05:14 PM, Alex Bennée wrote:
>> Even worse than that we trip up the atomic.h QEMU_BUILD_BUG_ON with the
>> atomic_cmpxchg. Now I believe we can use atomic_cmpxchg__nocheck without
>> too much issue on x86 but we'll need to #ifdef it on detection of wide
>> atomics.
>
> You've already got CONFIG_ATOMIC64.  And what's the fallback?

I'm going to re-factor cputlb a bit so all the TLB reads and writes can
be done in helper functions so I don't scatter stuff around too much. I
was thinking something like:

#ifdef CONFIG_ATOMIC64
  .. as usual ..
#else
  assert(!parallel_cpus)
  .. non atomic update ..
#endif
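Fleshed out a little, the outline above might look like the sketch below. This is an assumption-laden illustration: toy_tlb_set_flag is a hypothetical helper, parallel_cpus is a local stand-in for the real flag, and the atomic path uses the GCC/clang __atomic builtins directly rather than QEMU's atomic.h wrappers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the flag that says vCPUs run in parallel threads. */
static bool parallel_cpus;

/* Hypothetical helper: set a flag bit in a 64-bit TLB address field. */
static inline void toy_tlb_set_flag(uint64_t *field, uint64_t flag)
{
#ifdef CONFIG_ATOMIC64
    /* Host has 64-bit atomics: compare-and-swap until the update lands. */
    uint64_t old = __atomic_load_n(field, __ATOMIC_RELAXED);
    while (!__atomic_compare_exchange_n(field, &old, old | flag, false,
                                        __ATOMIC_SEQ_CST, __ATOMIC_RELAXED)) {
        /* 'old' now holds the value another thread wrote; retry with it. */
    }
#else
    /* No wide atomics: only safe while a single thread runs all vCPUs. */
    assert(!parallel_cpus);
    *field |= flag;
#endif
}
```

Keeping the #ifdef inside one helper means the callers in cputlb stay identical for both kinds of host.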

> We ought not be enabling mttcg for 32-bit host and 64-bit guest at all.  But
> that doesn't help much here, where we're otherwise guest width
> agnostic.

Hmm well the most common case (any guest on x86) should work. Currently
the default mttcg code in cpus.c works when:

  #if defined(CONFIG_MTTCG_TARGET) && defined(CONFIG_MTTCG_HOST)

I should probably expand that to default to false in the case of (sizeof
target_ulong > sizeof void *) when we don't have CONFIG_ATOMIC64.

Then if the user does force mttcg on they will quickly get an assert
although maybe we want to report that in a nicer way?

>
>
> r~


--
Alex Bennée


* Re: [Qemu-devel] [PATCH v6 19/19] tcg: enable MTTCG by default for ARM on x86 hosts
  2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 19/19] tcg: enable MTTCG by default for ARM on x86 hosts Alex Bennée
@ 2016-11-10 18:00   ` Richard Henderson
  2016-11-10 18:13     ` Alex Bennée
  0 siblings, 1 reply; 50+ messages in thread
From: Richard Henderson @ 2016-11-10 18:00 UTC (permalink / raw)
  To: Alex Bennée, pbonzini
  Cc: qemu-devel, mttcg, fred.konrad, a.rigo, cota, bobby.prani,
	nikunj, mark.burton, jan.kiszka, serge.fdrv, peter.maydell,
	claudio.fontana

On 11/09/2016 03:57 PM, Alex Bennée wrote:
> This enables the multi-threaded system emulation by default for ARMv7
> and ARMv8 guests using the x86_64 TCG backend. This means:
>
>   - The x86_64 TCG backend supports cmpxchg based atomic ops
>   - The x86_64 TCG backend emits barriers for barrier ops

What tcg backend doesn't support what we need?  For a weakly ordered target, 
any of our hosts should work.


r~


* Re: [Qemu-devel] [PATCH v6 13/19] cputlb: atomically update tlb fields used by tlb_reset_dirty
  2016-11-10 17:23   ` Richard Henderson
@ 2016-11-10 18:07     ` Alex Bennée
  0 siblings, 0 replies; 50+ messages in thread
From: Alex Bennée @ 2016-11-10 18:07 UTC (permalink / raw)
  To: Richard Henderson
  Cc: pbonzini, qemu-devel, mttcg, fred.konrad, a.rigo, cota,
	bobby.prani, nikunj, mark.burton, jan.kiszka, serge.fdrv,
	peter.maydell, claudio.fontana, Peter Crosthwaite


Richard Henderson <rth@twiddle.net> writes:

> On 11/09/2016 03:57 PM, Alex Bennée wrote:
>> +/* We currently can't handle more than 16 bits in the MMUIDX bitmask.
>> + */
>> +QEMU_BUILD_BUG_ON(NB_MMU_MODES > 16);
>
> We already assert <= 12 in exec/cpu_ldst.h.  Although really any such assert
> belongs in exec/cpu-defs.h, where we define CPU_TLB_BITS et al.
>
> That said, what's the technical restriction here?

Really we just need to ensure that we don't run out of bits when we pack
the MMUIDX var args into the bottom bits of a page aligned address. We
already have:

  QEMU_BUILD_BUG_ON(NB_MMU_MODES > TARGET_PAGE_BITS_MIN);

So I guess I can drop the other one.
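To illustrate why the page-bits check is the one that matters, here is a small sketch of packing an MMU index bitmap into the low bits of a page-aligned address. The TOY_* constants and function names are made up for the example; the 10-bit page size is only an assumed minimum.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_BITS    10   /* assumed smallest page size: 1 KiB */
#define TOY_PAGE_MASK    (~(((uint64_t)1 << TOY_PAGE_BITS) - 1))
#define TOY_NB_MMU_MODES 7    /* assumed number of MMU indexes */

/* Same idea as the build-time check quoted above. */
_Static_assert(TOY_NB_MMU_MODES <= TOY_PAGE_BITS,
               "MMU index bitmap must fit below the page offset boundary");

/* Pack a page-aligned address and an MMU index bitmap into one word. */
static uint64_t toy_encode(uint64_t addr, unsigned idxmap)
{
    assert((addr & ~TOY_PAGE_MASK) == 0);          /* page aligned */
    assert(idxmap < (1u << TOY_NB_MMU_MODES));     /* only valid modes */
    return addr | idxmap;
}

static uint64_t toy_decode_addr(uint64_t packed)
{
    return packed & TOY_PAGE_MASK;
}

static unsigned toy_decode_idxmap(uint64_t packed)
{
    return (unsigned)(packed & ~TOY_PAGE_MASK);
}
```

Because the address is page aligned, its low TOY_PAGE_BITS bits are free, so one pointer-sized argument can carry both values to the async work function.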

--
Alex Bennée


* Re: [Qemu-devel] [PATCH v6 15/19] target-arm/cpu: don't reset TLB structures, use cputlb to do it
  2016-11-10 17:48   ` Richard Henderson
@ 2016-11-10 18:08     ` Alex Bennée
  0 siblings, 0 replies; 50+ messages in thread
From: Alex Bennée @ 2016-11-10 18:08 UTC (permalink / raw)
  To: Richard Henderson
  Cc: pbonzini, qemu-devel, mttcg, fred.konrad, a.rigo, cota,
	bobby.prani, nikunj, mark.burton, jan.kiszka, serge.fdrv,
	peter.maydell, claudio.fontana, open list:ARM


Richard Henderson <rth@twiddle.net> writes:

> On 11/09/2016 03:57 PM, Alex Bennée wrote:
>> +#ifdef CONFIG_SOFTMMU
>> +    memset(env, 0, offsetof(CPUARMState, tlb_table));
>> +    tlb_flush(s, 0);
>> +#else
>>      memset(env, 0, offsetof(CPUARMState, features));
>> +#endif
>
> I'd really prefer to see the tlb_flush be moved into parent_reset, so that we
> handle it identically for all targets.

Yeah I'll prepare a series to do that separate from MTTCG.

>
> As for the memset, do we really need to distinguish softmmu?  I don't like you
> picking out a variable name within CPU_COMMON.  Better to use empty struct
> markers, like the
>
>        struct {} start_init_save;
>
> that x86 uses.

OK fair enough.

--
Alex Bennée


* Re: [Qemu-devel] [PATCH v6 19/19] tcg: enable MTTCG by default for ARM on x86 hosts
  2016-11-10 18:00   ` Richard Henderson
@ 2016-11-10 18:13     ` Alex Bennée
  2016-11-10 18:41       ` Richard Henderson
  0 siblings, 1 reply; 50+ messages in thread
From: Alex Bennée @ 2016-11-10 18:13 UTC (permalink / raw)
  To: Richard Henderson
  Cc: pbonzini, qemu-devel, mttcg, fred.konrad, a.rigo, cota,
	bobby.prani, nikunj, mark.burton, jan.kiszka, serge.fdrv,
	peter.maydell, claudio.fontana


Richard Henderson <rth@twiddle.net> writes:

> On 11/09/2016 03:57 PM, Alex Bennée wrote:
>> This enables the multi-threaded system emulation by default for ARMv7
>> and ARMv8 guests using the x86_64 TCG backend. This means:
>>
>>   - The x86_64 TCG backend supports cmpxchg based atomic ops
>>   - The x86_64 TCG backend emits barriers for barrier ops
>
> What tcg backend doesn't support what we need?  For a weakly ordered target,
> any of our hosts should work.

True, but this comes with certification that I've tested it. But you are
right adding this to configure is fugly. Should I just drop the backend
config symbol requirement totally?

--
Alex Bennée


* Re: [Qemu-devel] [PATCH v6 13/19] cputlb: atomically update tlb fields used by tlb_reset_dirty
  2016-11-10 18:00         ` Alex Bennée
@ 2016-11-10 18:32           ` Richard Henderson
  0 siblings, 0 replies; 50+ messages in thread
From: Richard Henderson @ 2016-11-10 18:32 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Pranith Kumar, pbonzini, qemu-devel, mttcg, fred.konrad, a.rigo,
	cota, nikunj, mark.burton, jan.kiszka, serge.fdrv, peter.maydell,
	claudio.fontana, Peter Crosthwaite

On 11/10/2016 07:00 PM, Alex Bennée wrote:
> I should probably expand that to default to false in the case of (sizeof
> target_ulong > sizeof void *) when we don't have CONFIG_ATOMIC64.
>
> Then if the user does force mttcg on they will quickly get an assert
> although maybe we want to report that in a nicer way?

While forcing mttcg is good for testing, small hosts will definitely fail, so
there's no point in even trying.  We should report it in a nicer way.

We shouldn't be checking sizeof(void*), but checking TCG_TARGET_REG_BITS.  That 
says how wide the host registers actually are, as opposed to the memory model 
in effect -- think x86_64 in x32 mode and the like.

If the host register size is smaller than the guest register size, we should 
force disable mttcg, regardless of CONFIG_ATOMIC64, because e.g. normal 64 bit
loads and stores won't be atomic.
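As a rough sketch of that rule, with the two widths passed in as plain integers where real code would presumably compare TCG_TARGET_REG_BITS against TARGET_LONG_BITS. The function name and the message text are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns NULL when MTTCG may be enabled, otherwise a human-readable
 * reason that can be reported instead of tripping an assert later.
 */
static const char *toy_mttcg_refusal(int host_reg_bits, int guest_long_bits)
{
    assert(host_reg_bits == 32 || host_reg_bits == 64);
    assert(guest_long_bits == 32 || guest_long_bits == 64);

    if (host_reg_bits < guest_long_bits) {
        /* Guest-width loads and stores would be split on this host. */
        return "MTTCG disabled: guest registers are wider than host registers";
    }
    return NULL;
}
```

Returning a reason string rather than a bool gives the caller something to print when the user forces MTTCG on an unsuitable host.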


r~


* Re: [Qemu-devel] [PATCH v6 19/19] tcg: enable MTTCG by default for ARM on x86 hosts
  2016-11-10 18:13     ` Alex Bennée
@ 2016-11-10 18:41       ` Richard Henderson
  0 siblings, 0 replies; 50+ messages in thread
From: Richard Henderson @ 2016-11-10 18:41 UTC (permalink / raw)
  To: Alex Bennée
  Cc: pbonzini, qemu-devel, mttcg, fred.konrad, a.rigo, cota,
	bobby.prani, nikunj, mark.burton, jan.kiszka, serge.fdrv,
	peter.maydell, claudio.fontana

On 11/10/2016 07:13 PM, Alex Bennée wrote:
>
> Richard Henderson <rth@twiddle.net> writes:
>
>> On 11/09/2016 03:57 PM, Alex Bennée wrote:
>>> This enables the multi-threaded system emulation by default for ARMv7
>>> and ARMv8 guests using the x86_64 TCG backend. This means:
>>>
>>>   - The x86_64 TCG backend supports cmpxchg based atomic ops
>>>   - The x86_64 TCG backend emits barriers for barrier ops
>>
>> What tcg backend doesn't support what we need?  For a weakly ordered target,
>> any of our hosts should work.
>
> True, but this comes with certification that I've tested it. But you are
> right adding this to configure is fugly. Should I just drop the backend
> config symbol requirement totally?

I was thinking that a good backend config symbol would somehow indicate the 
memory ordering strength of the host.  Preferably in such a way that we can 
tell that host >= guest.

I dunno if we assign ordinal numbers in some arbitrary way, or try something 
more complex such as

   x86_64)
     HOST_MTTCG_MO='TCG_MO_ALL & ~TCG_MO_LD_ST'

Or maybe put this in tcg/*/tcg-target.h in preference to configure.

Then enable mttcg if the host memory-order is a superset of the guest,

     (GUEST_MTTCG_MO & ~HOST_MTTCG_MO) == 0
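The superset test can be tried out in isolation with a sketch like the following. The TOY_MO_* bit values are illustrative and not taken from tcg.h, and the host mask in the usage note simply mirrors the example given above rather than asserting what any real host guarantees.

```c
#include <assert.h>
#include <stdbool.h>

/* One bit per ordering a machine guarantees without explicit barriers. */
enum {
    TOY_MO_LD_LD = 1 << 0,
    TOY_MO_ST_LD = 1 << 1,
    TOY_MO_LD_ST = 1 << 2,
    TOY_MO_ST_ST = 1 << 3,
    TOY_MO_ALL   = 0xf
};

/* True when the host guarantees every ordering the guest relies on. */
static bool toy_host_covers_guest(unsigned guest_mo, unsigned host_mo)
{
    assert((guest_mo & ~TOY_MO_ALL) == 0);
    assert((host_mo & ~TOY_MO_ALL) == 0);
    return (guest_mo & ~host_mo) == 0;
}
```

A guest that requires no implicit ordering (mask 0) is covered by any host, while a guest requiring TOY_MO_ALL is rejected by a host mask of TOY_MO_ALL & ~TOY_MO_LD_ST.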



r~


* Re: [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement
  2016-11-09 14:57 [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement Alex Bennée
                   ` (19 preceding siblings ...)
  2016-11-09 15:11 ` [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement Paolo Bonzini
@ 2016-11-13  5:50 ` no-reply
  20 siblings, 0 replies; 50+ messages in thread
From: no-reply @ 2016-11-13  5:50 UTC (permalink / raw)
  To: alex.bennee
  Cc: famz, pbonzini, mttcg, peter.maydell, claudio.fontana, nikunj,
	jan.kiszka, mark.burton, a.rigo, qemu-devel, cota, serge.fdrv,
	bobby.prani, rth, fred.konrad

Hi,

Your series failed automatic build test. Please find the testing commands and
their output below. If you have docker installed, you can probably reproduce it
locally.

Type: series
Subject: [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement
Message-id: 20161109145748.27282-1-alex.bennee@linaro.org

=== TEST SCRIPT BEGIN ===
#!/bin/bash
set -e
git submodule update --init dtc
# Let docker tests dump environment info
export SHOW_ENV=1
export J=16
make docker-test-quick@centos6
make docker-test-mingw@fedora
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
8954aa8 tcg: enable MTTCG by default for ARM on x86 hosts
86cf5b8 target-arm: don't generate WFE/YIELD calls for MTTCG
3e3d30b target-arm: helpers which may affect global state need the BQL
359b902 target-arm: ensure BQL taken for ARM_CP_IO register access
7511fa1 target-arm/cpu: don't reset TLB structures, use cputlb to do it
7ac2e27 target-arm/powerctl: defer cpu reset work to CPU context
a736162 cputlb: atomically update tlb fields used by tlb_reset_dirty
7fc1b76 cputlb: tweak qemu_ram_addr_from_host_nofail reporting
4e48ca1 cputlb: introduce tlb_flush_* async work.
d579f6c cputlb: add assert_cpu_is_self checks
65d0035 tcg: handle EXCP_ATOMIC exception for system emulation
d43add5 tcg: enable thread-per-vCPU
5340981 tcg: enable tb_lock() for SoftMMU
3577374 tcg: remove global exit_request
15fa003 tcg: drop global lock during TCG code execution
65c12ae tcg: rename tcg_current_cpu to tcg_current_rr_cpu
2709113 tcg: add kick timer for single-threaded vCPU emulation
d5dda07 tcg: add options for enabling MTTCG
436cb3e docs: new design document multi-thread-tcg.txt

=== OUTPUT BEGIN ===
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into 'dtc'...
Submodule path 'dtc': checked out '65cc4d2748a2c2e6f27f1cf39e07a5dbabd80ebf'
  BUILD   centos6
make[1]: Entering directory `/var/tmp/patchew-tester-tmp-6smtmz0t/src'
  ARCHIVE qemu.tgz
  ARCHIVE dtc.tgz
  COPY    RUNNER
    RUN test-quick in qemu:centos6 
Packages installed:
SDL-devel-1.2.14-7.el6_7.1.x86_64
ccache-3.1.6-2.el6.x86_64
epel-release-6-8.noarch
gcc-4.4.7-17.el6.x86_64
git-1.7.1-4.el6_7.1.x86_64
glib2-devel-2.28.8-5.el6.x86_64
libfdt-devel-1.4.0-1.el6.x86_64
make-3.81-23.el6.x86_64
package g++ is not installed
pixman-devel-0.32.8-1.el6.x86_64
tar-1.23-15.el6_8.x86_64
zlib-devel-1.2.3-29.el6.x86_64

Environment variables:
PACKAGES=libfdt-devel ccache     tar git make gcc g++     zlib-devel glib2-devel SDL-devel pixman-devel     epel-release
HOSTNAME=69710984ee01
TERM=xterm
MAKEFLAGS= -j16
HISTSIZE=1000
J=16
USER=root
CCACHE_DIR=/var/tmp/ccache
EXTRA_CONFIGURE_OPTS=
V=
SHOW_ENV=1
MAIL=/var/spool/mail/root
PATH=/usr/lib/ccache:/usr/lib64/ccache:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/
LANG=en_US.UTF-8
TARGET_LIST=
HISTCONTROL=ignoredups
SHLVL=1
HOME=/root
TEST_DIR=/tmp/qemu-test
LOGNAME=root
LESSOPEN=||/usr/bin/lesspipe.sh %s
FEATURES= dtc
DEBUG=
G_BROKEN_FILENAMES=1
CCACHE_HASHDIR=
_=/usr/bin/env

Configure options:
--enable-werror --target-list=x86_64-softmmu,aarch64-softmmu --prefix=/var/tmp/qemu-build/install
No C++ compiler available; disabling C++ specific optional code
Install prefix    /var/tmp/qemu-build/install
BIOS directory    /var/tmp/qemu-build/install/share/qemu
binary directory  /var/tmp/qemu-build/install/bin
library directory /var/tmp/qemu-build/install/lib
module directory  /var/tmp/qemu-build/install/lib/qemu
libexec directory /var/tmp/qemu-build/install/libexec
include directory /var/tmp/qemu-build/install/include
config directory  /var/tmp/qemu-build/install/etc
local state directory   /var/tmp/qemu-build/install/var
Manual directory  /var/tmp/qemu-build/install/share/man
ELF interp prefix /usr/gnemul/qemu-%M
Source path       /tmp/qemu-test/src
C compiler        cc
Host C compiler   cc
C++ compiler      
Objective-C compiler cc
ARFLAGS           rv
CFLAGS            -O2 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -g 
QEMU_CFLAGS       -I/usr/include/pixman-1    -pthread -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include   -fPIE -DPIE -m64 -mcx16 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common -fwrapv  -Wendif-labels -Wmissing-include-dirs -Wempty-body -Wnested-externs -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wold-style-declaration -Wold-style-definition -Wtype-limits -fstack-protector-all
LDFLAGS           -Wl,--warn-common -Wl,-z,relro -Wl,-z,now -pie -m64 -g 
make              make
install           install
python            python -B
smbd              /usr/sbin/smbd
module support    no
host CPU          x86_64
host big endian   no
target list       x86_64-softmmu aarch64-softmmu
tcg debug enabled no
gprof enabled     no
sparse enabled    no
strip binaries    yes
profiler          no
static build      no
pixman            system
SDL support       yes (1.2.14)
GTK support       no 
GTK GL support    no
VTE support       no 
TLS priority      NORMAL
GNUTLS support    no
GNUTLS rnd        no
libgcrypt         no
libgcrypt kdf     no
nettle            no 
nettle kdf        no
libtasn1          no
curses support    no
virgl support     no
curl support      no
mingw32 support   no
Audio drivers     oss
Block whitelist (rw) 
Block whitelist (ro) 
VirtFS support    no
VNC support       yes
VNC SASL support  no
VNC JPEG support  no
VNC PNG support   no
xen support       no
brlapi support    no
bluez  support    no
Documentation     no
PIE               yes
vde support       no
netmap support    no
Linux AIO support no
ATTR/XATTR support yes
Install blobs     yes
KVM support       yes
COLO support      yes
RDMA support      no
TCG interpreter   no
fdt support       yes
preadv support    yes
fdatasync         yes
madvise           yes
posix_madvise     yes
libcap-ng support no
vhost-net support yes
vhost-scsi support yes
vhost-vsock support yes
Trace backends    log
spice support     no 
rbd support       no
xfsctl support    no
smartcard support no
libusb            no
usb net redir     no
OpenGL support    no
OpenGL dmabufs    no
libiscsi support  no
libnfs support    no
build guest agent yes
QGA VSS support   no
QGA w32 disk info no
QGA MSI support   no
seccomp support   no
coroutine backend ucontext
coroutine pool    yes
debug stack usage no
GlusterFS support no
Archipelago support no
gcov              gcov
gcov enabled      no
TPM support       yes
libssh2 support   no
TPM passthrough   yes
QOM debugging     yes
lzo support       no
snappy support    no
bzip2 support     no
NUMA host support no
tcmalloc support  no
jemalloc support  no
avx2 optimization no
replication support yes
  GEN     x86_64-softmmu/config-devices.mak.tmp
  GEN     aarch64-softmmu/config-devices.mak.tmp
  GEN     config-host.h
  GEN     qemu-options.def
  GEN     qmp-commands.h
  GEN     qapi-visit.h
  GEN     qapi-types.h
  GEN     qapi-event.h
  GEN     qmp-introspect.h
  GEN     x86_64-softmmu/config-devices.mak
  GEN     aarch64-softmmu/config-devices.mak
  GEN     module_block.h
  GEN     tests/test-qapi-types.h
  GEN     tests/test-qapi-visit.h
  GEN     tests/test-qmp-commands.h
  GEN     tests/test-qapi-event.h
  GEN     tests/test-qmp-introspect.h
  GEN     config-all-devices.mak
  GEN     trace/generated-tracers.h
  GEN     trace/generated-tcg-tracers.h
  GEN     trace/generated-helpers-wrappers.h
  GEN     trace/generated-helpers.h
  CC      tests/qemu-iotests/socket_scm_helper.o
  GEN     qga/qapi-generated/qga-qapi-types.h
  GEN     qga/qapi-generated/qga-qapi-visit.h
  GEN     qga/qapi-generated/qga-qmp-commands.h
  GEN     qga/qapi-generated/qga-qapi-types.c
  GEN     qga/qapi-generated/qga-qapi-visit.c
  GEN     qga/qapi-generated/qga-qmp-marshal.c
  GEN     qmp-introspect.c
  GEN     qapi-types.c
  GEN     qapi-visit.c
  GEN     qapi-event.c
  CC      qapi/qapi-visit-core.o
  CC      qapi/qapi-dealloc-visitor.o
  CC      qapi/qobject-input-visitor.o
  CC      qapi/qobject-output-visitor.o
  CC      qapi/qmp-registry.o
  CC      qapi/qmp-dispatch.o
  CC      qapi/string-input-visitor.o
  CC      qapi/string-output-visitor.o
  CC      qapi/opts-visitor.o
  CC      qapi/qapi-clone-visitor.o
  CC      qapi/qmp-event.o
  CC      qapi/qapi-util.o
  CC      qobject/qnull.o
  CC      qobject/qint.o
  CC      qobject/qstring.o
  CC      qobject/qdict.o
  CC      qobject/qlist.o
  CC      qobject/qfloat.o
  CC      qobject/qbool.o
  CC      qobject/qjson.o
  CC      qobject/qobject.o
  CC      qobject/json-lexer.o
  CC      qobject/json-streamer.o
  CC      qobject/json-parser.o
  GEN     trace/generated-tracers.c
  CC      trace/control.o
  CC      trace/qmp.o
  CC      util/osdep.o
  CC      util/cutils.o
  CC      util/unicode.o
  CC      util/qemu-timer-common.o
  CC      util/bufferiszero.o
  CC      util/compatfd.o
  CC      util/event_notifier-posix.o
  CC      util/mmap-alloc.o
  CC      util/oslib-posix.o
  CC      util/qemu-openpty.o
  CC      util/qemu-thread-posix.o
  CC      util/memfd.o
  CC      util/envlist.o
  CC      util/path.o
  CC      util/module.o
  CC      util/bitmap.o
  CC      util/bitops.o
  CC      util/hbitmap.o
  CC      util/fifo8.o
  CC      util/acl.o
  CC      util/error.o
  CC      util/qemu-error.o
  CC      util/id.o
  CC      util/iov.o
  CC      util/qemu-config.o
  CC      util/qemu-sockets.o
  CC      util/uri.o
  CC      util/notify.o
  CC      util/qemu-option.o
  CC      util/qemu-progress.o
  CC      util/hexdump.o
  CC      util/crc32c.o
  CC      util/uuid.o
  CC      util/throttle.o
  CC      util/readline.o
  CC      util/getauxval.o
  CC      util/rcu.o
  CC      util/qemu-coroutine.o
  CC      util/qemu-coroutine-lock.o
  CC      util/qemu-coroutine-io.o
  CC      util/qemu-coroutine-sleep.o
  CC      util/coroutine-ucontext.o
  CC      util/buffer.o
  CC      util/timed-average.o
  CC      util/base64.o
  CC      util/log.o
  CC      util/qdist.o
  CC      util/qht.o
  CC      util/range.o
  CC      crypto/pbkdf-stub.o
  CC      stubs/arch-query-cpu-def.o
  CC      stubs/arch-query-cpu-model-expansion.o
  CC      stubs/arch-query-cpu-model-comparison.o
  CC      stubs/arch-query-cpu-model-baseline.o
  CC      stubs/bdrv-next-monitor-owned.o
  CC      stubs/blk-commit-all.o
  CC      stubs/blockdev-close-all-bdrv-states.o
  CC      stubs/clock-warp.o
  CC      stubs/cpu-get-clock.o
  CC      stubs/cpu-get-icount.o
  CC      stubs/dump.o
  CC      stubs/error-printf.o
  CC      stubs/fdset-add-fd.o
  CC      stubs/fdset-find-fd.o
  CC      stubs/fdset-get-fd.o
  CC      stubs/fdset-remove-fd.o
  CC      stubs/gdbstub.o
  CC      stubs/get-fd.o
  CC      stubs/get-next-serial.o
  CC      stubs/get-vm-name.o
  CC      stubs/iothread.o
  CC      stubs/iothread-lock.o
  CC      stubs/is-daemonized.o
  CC      stubs/machine-init-done.o
  CC      stubs/migr-blocker.o
  CC      stubs/mon-is-qmp.o
  CC      stubs/monitor-init.o
  CC      stubs/notify-event.o
  CC      stubs/qtest.o
  CC      stubs/replay.o
  CC      stubs/replay-user.o
  CC      stubs/reset.o
  CC      stubs/runstate-check.o
  CC      stubs/set-fd-handler.o
  CC      stubs/slirp.o
  CC      stubs/sysbus.o
  CC      stubs/trace-control.o
  CC      stubs/uuid.o
  CC      stubs/vm-stop.o
  CC      stubs/vmstate.o
  CC      stubs/cpus.o
  CC      stubs/kvm.o
  CC      stubs/qmp_pc_dimm_device_list.o
  CC      stubs/target-monitor-defs.o
  CC      stubs/target-get-monitor-def.o
  CC      stubs/vhost.o
  CC      stubs/iohandler.o
  CC      stubs/smbios_type_38.o
  CC      stubs/ipmi.o
  CC      stubs/pc_madt_cpu_entry.o
  CC      stubs/migration-colo.o
  CC      contrib/ivshmem-client/ivshmem-client.o
  CC      contrib/ivshmem-client/main.o
  CC      contrib/ivshmem-server/ivshmem-server.o
  CC      contrib/ivshmem-server/main.o
  CC      qemu-nbd.o
  CC      async.o
  CC      thread-pool.o
  CC      block.o
  CC      blockjob.o
  CC      main-loop.o
  CC      iohandler.o
  CC      qemu-timer.o
  CC      aio-posix.o
  CC      qemu-io-cmds.o
  CC      replication.o
  CC      block/raw_bsd.o
  CC      block/qcow.o
  CC      block/vdi.o
  CC      block/vmdk.o
  CC      block/cloop.o
  CC      block/bochs.o
  CC      block/vpc.o
  CC      block/vvfat.o
  CC      block/dmg.o
  CC      block/qcow2.o
  CC      block/qcow2-refcount.o
  CC      block/qcow2-cluster.o
  CC      block/qcow2-snapshot.o
  CC      block/qcow2-cache.o
  CC      block/qed.o
  CC      block/qed-gencb.o
  CC      block/qed-l2-cache.o
  CC      block/qed-table.o
  CC      block/qed-cluster.o
  CC      block/qed-check.o
  CC      block/vhdx.o
  CC      block/vhdx-endian.o
  CC      block/vhdx-log.o
  CC      block/quorum.o
  CC      block/parallels.o
  CC      block/blkdebug.o
  CC      block/blkverify.o
  CC      block/blkreplay.o
  CC      block/block-backend.o
  CC      block/snapshot.o
  CC      block/qapi.o
  CC      block/raw-posix.o
  CC      block/null.o
  CC      block/mirror.o
  CC      block/commit.o
  CC      block/io.o
  CC      block/throttle-groups.o
  CC      block/nbd.o
  CC      block/nbd-client.o
  CC      block/sheepdog.o
  CC      block/accounting.o
  CC      block/dirty-bitmap.o
  CC      block/write-threshold.o
  CC      block/backup.o
  CC      block/replication.o
  CC      block/crypto.o
  CC      nbd/server.o
  CC      nbd/client.o
  CC      nbd/common.o
  CC      crypto/init.o
  CC      crypto/hash.o
  CC      crypto/hash-glib.o
  CC      crypto/aes.o
  CC      crypto/desrfb.o
  CC      crypto/cipher.o
  CC      crypto/tlscreds.o
  CC      crypto/tlscredsanon.o
  CC      crypto/tlscredsx509.o
  CC      crypto/tlssession.o
  CC      crypto/secret.o
  CC      crypto/random-platform.o
  CC      crypto/pbkdf.o
  CC      crypto/ivgen.o
  CC      crypto/ivgen-essiv.o
  CC      crypto/ivgen-plain.o
  CC      crypto/ivgen-plain64.o
  CC      crypto/xts.o
  CC      crypto/afsplit.o
  CC      crypto/block.o
  CC      crypto/block-qcow.o
  CC      crypto/block-luks.o
  CC      io/channel.o
  CC      io/channel-buffer.o
  CC      io/channel-command.o
  CC      io/channel-file.o
  CC      io/channel-socket.o
  CC      io/channel-tls.o
  CC      io/channel-watch.o
  CC      io/channel-websock.o
  CC      io/channel-util.o
  CC      io/task.o
  CC      qom/object.o
  CC      qom/container.o
  CC      qom/qom-qobject.o
  CC      qom/object_interfaces.o
  GEN     qemu-img-cmds.h
  CC      qemu-io.o
  CC      qemu-bridge-helper.o
  CC      blockdev.o
  CC      blockdev-nbd.o
  CC      iothread.o
  CC      device-hotplug.o
  CC      qdev-monitor.o
  CC      os-posix.o
  CC      qemu-char.o
  CC      page_cache.o
  CC      accel.o
  CC      bt-host.o
  CC      bt-vhci.o
  CC      dma-helpers.o
  CC      vl.o
  CC      tpm.o
  CC      device_tree.o
  GEN     qmp-marshal.c
  CC      qmp.o
  CC      hmp.o
  CC      cpus-common.o
  CC      audio/audio.o
  CC      audio/noaudio.o
  CC      audio/wavaudio.o
  CC      audio/mixeng.o
  CC      audio/sdlaudio.o
  CC      audio/ossaudio.o
  CC      audio/wavcapture.o
  CC      backends/rng.o
  CC      backends/rng-egd.o
  CC      backends/rng-random.o
  CC      backends/testdev.o
  CC      backends/msmouse.o
  CC      backends/tpm.o
  CC      backends/hostmem.o
  CC      backends/hostmem-ram.o
  CC      backends/hostmem-file.o
  CC      backends/cryptodev.o
  CC      backends/cryptodev-builtin.o
  CC      block/stream.o
  CC      disas/arm.o
  CC      disas/i386.o
  CC      fsdev/qemu-fsdev-dummy.o
  CC      fsdev/qemu-fsdev-opts.o
  CC      hw/acpi/core.o
  CC      hw/acpi/piix4.o
  CC      hw/acpi/pcihp.o
  CC      hw/acpi/ich9.o
  CC      hw/acpi/tco.o
  CC      hw/acpi/cpu_hotplug.o
  CC      hw/acpi/memory_hotplug.o
  CC      hw/acpi/memory_hotplug_acpi_table.o
  CC      hw/acpi/cpu.o
  CC      hw/acpi/nvdimm.o
  CC      hw/acpi/acpi_interface.o
  CC      hw/acpi/bios-linker-loader.o
  CC      hw/acpi/aml-build.o
  CC      hw/acpi/ipmi.o
  CC      hw/audio/sb16.o
  CC      hw/audio/es1370.o
  CC      hw/audio/ac97.o
  CC      hw/audio/adlib.o
  CC      hw/audio/fmopl.o
  CC      hw/audio/gus.o
  CC      hw/audio/gusemu_hal.o
  CC      hw/audio/gusemu_mixer.o
  CC      hw/audio/cs4231a.o
  CC      hw/audio/intel-hda.o
  CC      hw/audio/hda-codec.o
  CC      hw/audio/pcspk.o
  CC      hw/audio/wm8750.o
  CC      hw/audio/pl041.o
  CC      hw/audio/lm4549.o
  CC      hw/audio/marvell_88w8618.o
  CC      hw/block/block.o
  CC      hw/block/cdrom.o
  CC      hw/block/hd-geometry.o
  CC      hw/block/fdc.o
  CC      hw/block/m25p80.o
  CC      hw/block/nand.o
  CC      hw/block/pflash_cfi01.o
  CC      hw/block/pflash_cfi02.o
  CC      hw/block/ecc.o
  CC      hw/block/onenand.o
  CC      hw/block/nvme.o
  CC      hw/bt/core.o
  CC      hw/bt/l2cap.o
  CC      hw/bt/sdp.o
  CC      hw/bt/hci.o
  CC      hw/bt/hid.o
  CC      hw/bt/hci-csr.o
  CC      hw/char/ipoctal232.o
  CC      hw/char/parallel.o
  CC      hw/char/pl011.o
  CC      hw/char/serial.o
  CC      hw/char/serial-isa.o
  CC      hw/char/serial-pci.o
  CC      hw/char/virtio-console.o
  CC      hw/char/cadence_uart.o
  CC      hw/char/debugcon.o
  CC      hw/char/imx_serial.o
  CC      hw/core/qdev-properties.o
  CC      hw/core/qdev.o
  CC      hw/core/bus.o
  CC      hw/core/fw-path-provider.o
  CC      hw/core/irq.o
  CC      hw/core/hotplug.o
  CC      hw/core/ptimer.o
  CC      hw/core/sysbus.o
  CC      hw/core/machine.o
  CC      hw/core/null-machine.o
  CC      hw/core/loader.o
  CC      hw/core/qdev-properties-system.o
  CC      hw/core/register.o
  CC      hw/core/or-irq.o
  CC      hw/core/platform-bus.o
  CC      hw/display/ads7846.o
  CC      hw/display/cirrus_vga.o
  CC      hw/display/pl110.o
  CC      hw/display/ssd0303.o
  CC      hw/display/ssd0323.o
  CC      hw/display/vga-pci.o
  CC      hw/display/vga-isa.o
  CC      hw/display/vmware_vga.o
  CC      hw/display/blizzard.o
  CC      hw/display/exynos4210_fimd.o
  CC      hw/display/framebuffer.o
  CC      hw/display/tc6393xb.o
  CC      hw/dma/pl080.o
  CC      hw/dma/pl330.o
  CC      hw/dma/i8257.o
  CC      hw/dma/xlnx-zynq-devcfg.o
  CC      hw/gpio/max7310.o
  CC      hw/gpio/pl061.o
  CC      hw/gpio/zaurus.o
  CC      hw/gpio/gpio_key.o
  CC      hw/i2c/core.o
  CC      hw/i2c/smbus.o
  CC      hw/i2c/smbus_eeprom.o
  CC      hw/i2c/i2c-ddc.o
  CC      hw/i2c/versatile_i2c.o
  CC      hw/i2c/smbus_ich9.o
  CC      hw/i2c/pm_smbus.o
  CC      hw/i2c/bitbang_i2c.o
  CC      hw/i2c/exynos4210_i2c.o
  CC      hw/i2c/imx_i2c.o
  CC      hw/i2c/aspeed_i2c.o
  CC      hw/ide/core.o
  CC      hw/ide/atapi.o
  CC      hw/ide/qdev.o
  CC      hw/ide/pci.o
  CC      hw/ide/isa.o
  CC      hw/ide/piix.o
  CC      hw/ide/microdrive.o
  CC      hw/ide/ahci.o
  CC      hw/ide/ich.o
  CC      hw/input/hid.o
  CC      hw/input/lm832x.o
  CC      hw/input/pckbd.o
  CC      hw/input/pl050.o
  CC      hw/input/ps2.o
  CC      hw/input/stellaris_input.o
  CC      hw/input/tsc2005.o
  CC      hw/input/vmmouse.o
  CC      hw/input/virtio-input.o
  CC      hw/input/virtio-input-hid.o
  CC      hw/intc/i8259_common.o
  CC      hw/input/virtio-input-host.o
  CC      hw/intc/i8259.o
  CC      hw/intc/pl190.o
  CC      hw/intc/imx_avic.o
  CC      hw/intc/realview_gic.o
  CC      hw/intc/ioapic_common.o
  CC      hw/intc/arm_gic_common.o
  CC      hw/intc/arm_gic.o
  CC      hw/intc/arm_gicv2m.o
  CC      hw/intc/arm_gicv3_common.o
  CC      hw/intc/arm_gicv3.o
  CC      hw/intc/arm_gicv3_dist.o
  CC      hw/intc/arm_gicv3_redist.o
  CC      hw/intc/arm_gicv3_its_common.o
  CC      hw/intc/intc.o
  CC      hw/ipack/ipack.o
  CC      hw/ipack/tpci200.o
  CC      hw/ipmi/ipmi.o
  CC      hw/ipmi/ipmi_bmc_sim.o
  CC      hw/ipmi/ipmi_bmc_extern.o
  CC      hw/ipmi/isa_ipmi_kcs.o
  CC      hw/ipmi/isa_ipmi_bt.o
  CC      hw/isa/isa-bus.o
  CC      hw/isa/apm.o
  CC      hw/mem/pc-dimm.o
  CC      hw/mem/nvdimm.o
  CC      hw/misc/applesmc.o
  CC      hw/misc/max111x.o
  CC      hw/misc/tmp105.o
  CC      hw/misc/debugexit.o
  CC      hw/misc/sga.o
  CC      hw/misc/pc-testdev.o
  CC      hw/misc/pci-testdev.o
  CC      hw/misc/arm_l2x0.o
  CC      hw/misc/arm_integrator_debug.o
  CC      hw/misc/a9scu.o
  CC      hw/misc/arm11scu.o
  CC      hw/net/ne2000.o
  CC      hw/net/eepro100.o
  CC      hw/net/pcnet-pci.o
  CC      hw/net/pcnet.o
  CC      hw/net/e1000.o
  CC      hw/net/e1000x_common.o
  CC      hw/net/net_tx_pkt.o
  CC      hw/net/net_rx_pkt.o
  CC      hw/net/e1000e.o
  CC      hw/net/e1000e_core.o
  CC      hw/net/rtl8139.o
  CC      hw/net/vmxnet3.o
  CC      hw/net/smc91c111.o
  CC      hw/net/lan9118.o
  CC      hw/net/ne2000-isa.o
  CC      hw/net/xgmac.o
  CC      hw/net/allwinner_emac.o
  CC      hw/net/imx_fec.o
  CC      hw/net/cadence_gem.o
  CC      hw/net/stellaris_enet.o
  CC      hw/net/rocker/rocker.o
  CC      hw/net/rocker/rocker_fp.o
  CC      hw/net/rocker/rocker_desc.o
  CC      hw/net/rocker/rocker_world.o
  CC      hw/net/rocker/rocker_of_dpa.o
  CC      hw/nvram/eeprom93xx.o
  CC      hw/nvram/fw_cfg.o
  CC      hw/nvram/chrp_nvram.o
  CC      hw/pci-bridge/pci_bridge_dev.o
  CC      hw/pci-bridge/pci_expander_bridge.o
  CC      hw/pci-bridge/xio3130_upstream.o
  CC      hw/pci-bridge/xio3130_downstream.o
  CC      hw/pci-bridge/ioh3420.o
  CC      hw/pci-bridge/i82801b11.o
  CC      hw/pci-host/pam.o
  CC      hw/pci-host/versatile.o
  CC      hw/pci-host/piix.o
  CC      hw/pci-host/q35.o
  CC      hw/pci-host/gpex.o
  CC      hw/pci/pci.o
  CC      hw/pci/pci_bridge.o
  CC      hw/pci/msix.o
  CC      hw/pci/msi.o
  CC      hw/pci/shpc.o
  CC      hw/pci/slotid_cap.o
  CC      hw/pci/pci_host.o
  CC      hw/pci/pcie_host.o
  CC      hw/pci/pcie.o
  CC      hw/pci/pcie_aer.o
  CC      hw/pci/pcie_port.o
/tmp/qemu-test/src/hw/nvram/fw_cfg.c: In function ‘fw_cfg_dma_transfer’:
/tmp/qemu-test/src/hw/nvram/fw_cfg.c:329: warning: ‘read’ may be used uninitialized in this function
  CC      hw/pci/pci-stub.o
  CC      hw/pcmcia/pcmcia.o
  CC      hw/scsi/scsi-disk.o
  CC      hw/scsi/scsi-generic.o
  CC      hw/scsi/scsi-bus.o
  CC      hw/scsi/lsi53c895a.o
  CC      hw/scsi/mptsas.o
  CC      hw/scsi/mptconfig.o
  CC      hw/scsi/mptendian.o
  CC      hw/scsi/megasas.o
  CC      hw/scsi/vmw_pvscsi.o
  CC      hw/scsi/esp.o
  CC      hw/scsi/esp-pci.o
  CC      hw/sd/pl181.o
  CC      hw/sd/ssi-sd.o
  CC      hw/sd/sd.o
  CC      hw/sd/core.o
  CC      hw/sd/sdhci.o
  CC      hw/smbios/smbios.o
  CC      hw/smbios/smbios_type_38.o
  CC      hw/ssi/pl022.o
  CC      hw/ssi/ssi.o
  CC      hw/ssi/xilinx_spips.o
  CC      hw/ssi/aspeed_smc.o
  CC      hw/ssi/stm32f2xx_spi.o
  CC      hw/timer/arm_timer.o
  CC      hw/timer/arm_mptimer.o
  CC      hw/timer/a9gtimer.o
  CC      hw/timer/cadence_ttc.o
  CC      hw/timer/ds1338.o
  CC      hw/timer/hpet.o
  CC      hw/timer/i8254_common.o
  CC      hw/timer/i8254.o
  CC      hw/timer/pl031.o
  CC      hw/timer/twl92230.o
  CC      hw/timer/imx_epit.o
  CC      hw/timer/imx_gpt.o
  CC      hw/timer/stm32f2xx_timer.o
  CC      hw/timer/aspeed_timer.o
  CC      hw/tpm/tpm_passthrough.o
  CC      hw/tpm/tpm_tis.o
  CC      hw/tpm/tpm_util.o
  CC      hw/usb/core.o
  CC      hw/usb/combined-packet.o
  CC      hw/usb/bus.o
  CC      hw/usb/libhw.o
  CC      hw/usb/desc.o
  CC      hw/usb/desc-msos.o
  CC      hw/usb/hcd-uhci.o
  CC      hw/usb/hcd-ohci.o
  CC      hw/usb/hcd-ehci.o
  CC      hw/usb/hcd-ehci-pci.o
  CC      hw/usb/hcd-ehci-sysbus.o
  CC      hw/usb/hcd-xhci.o
  CC      hw/usb/hcd-musb.o
  CC      hw/usb/dev-hub.o
  CC      hw/usb/dev-hid.o
  CC      hw/usb/dev-wacom.o
  CC      hw/usb/dev-storage.o
  CC      hw/usb/dev-uas.o
  CC      hw/usb/dev-audio.o
  CC      hw/usb/dev-serial.o
  CC      hw/usb/dev-network.o
  CC      hw/usb/dev-bluetooth.o
  CC      hw/usb/dev-smartcard-reader.o
  CC      hw/usb/dev-mtp.o
  CC      hw/usb/host-stub.o
  CC      hw/virtio/virtio-rng.o
  CC      hw/virtio/virtio-pci.o
  CC      hw/virtio/virtio-bus.o
  CC      hw/virtio/virtio-mmio.o
  CC      hw/watchdog/watchdog.o
  CC      hw/watchdog/wdt_i6300esb.o
  CC      hw/watchdog/wdt_ib700.o
  CC      migration/migration.o
  CC      migration/socket.o
  CC      migration/fd.o
  CC      migration/exec.o
  CC      migration/tls.o
  CC      migration/colo-comm.o
  CC      migration/colo.o
  CC      migration/colo-failover.o
  CC      migration/vmstate.o
  CC      migration/qemu-file-channel.o
  CC      migration/qemu-file.o
  CC      migration/xbzrle.o
  CC      migration/postcopy-ram.o
  CC      migration/qjson.o
  CC      migration/block.o
  CC      net/net.o
  CC      net/queue.o
  CC      net/checksum.o
  CC      net/util.o
  CC      net/hub.o
  CC      net/socket.o
  CC      net/dump.o
  CC      net/eth.o
  CC      net/l2tpv3.o
  CC      net/tap.o
  CC      net/vhost-user.o
  CC      net/tap-linux.o
  CC      net/slirp.o
  CC      net/filter.o
  CC      net/filter-buffer.o
  CC      net/filter-mirror.o
  CC      net/colo-compare.o
  CC      net/colo.o
  CC      net/filter-rewriter.o
  CC      qom/cpu.o
  CC      replay/replay.o
  CC      replay/replay-internal.o
  CC      replay/replay-events.o
  CC      replay/replay-time.o
  CC      replay/replay-input.o
  CC      replay/replay-char.o
  CC      replay/replay-snapshot.o
  CC      slirp/cksum.o
  CC      slirp/if.o
  CC      slirp/ip_icmp.o
  CC      slirp/ip6_icmp.o
  CC      slirp/ip6_input.o
  CC      slirp/ip6_output.o
  CC      slirp/ip_input.o
  CC      slirp/ip_output.o
/tmp/qemu-test/src/replay/replay-internal.c: In function ‘replay_put_array’:
/tmp/qemu-test/src/replay/replay-internal.c:65: warning: ignoring return value of ‘fwrite’, declared with attribute warn_unused_result
  CC      slirp/dnssearch.o
  CC      slirp/dhcpv6.o
  CC      slirp/slirp.o
  CC      slirp/mbuf.o
  CC      slirp/misc.o
  CC      slirp/sbuf.o
  CC      slirp/socket.o
  CC      slirp/tcp_input.o
  CC      slirp/tcp_output.o
  CC      slirp/tcp_subr.o
  CC      slirp/tcp_timer.o
  CC      slirp/udp.o
  CC      slirp/udp6.o
  CC      slirp/bootp.o
/tmp/qemu-test/src/slirp/tcp_input.c: In function ‘tcp_input’:
/tmp/qemu-test/src/slirp/tcp_input.c:219: warning: ‘save_ip.ip_p’ may be used uninitialized in this function
/tmp/qemu-test/src/slirp/tcp_input.c:219: warning: ‘save_ip.ip_len’ may be used uninitialized in this function
/tmp/qemu-test/src/slirp/tcp_input.c:219: warning: ‘save_ip.ip_tos’ may be used uninitialized in this function
/tmp/qemu-test/src/slirp/tcp_input.c:219: warning: ‘save_ip.ip_id’ may be used uninitialized in this function
/tmp/qemu-test/src/slirp/tcp_input.c:219: warning: ‘save_ip.ip_off’ may be used uninitialized in this function
/tmp/qemu-test/src/slirp/tcp_input.c:219: warning: ‘save_ip.ip_ttl’ may be used uninitialized in this function
/tmp/qemu-test/src/slirp/tcp_input.c:219: warning: ‘save_ip.ip_sum’ may be used uninitialized in this function
/tmp/qemu-test/src/slirp/tcp_input.c:219: warning: ‘save_ip.ip_src.s_addr’ may be used uninitialized in this function
/tmp/qemu-test/src/slirp/tcp_input.c:219: warning: ‘save_ip.ip_dst.s_addr’ may be used uninitialized in this function
/tmp/qemu-test/src/slirp/tcp_input.c:220: warning: ‘save_ip6.ip_nh’ may be used uninitialized in this function
  CC      slirp/tftp.o
  CC      slirp/arp_table.o
  CC      slirp/ndp_table.o
  CC      ui/keymaps.o
  CC      ui/console.o
  CC      ui/cursor.o
  CC      ui/qemu-pixman.o
  CC      ui/input.o
  CC      ui/input-keymap.o
  CC      ui/input-legacy.o
  CC      ui/input-linux.o
  CC      ui/sdl.o
  CC      ui/sdl_zoom.o
  CC      ui/x_keymap.o
  CC      ui/vnc.o
  CC      ui/vnc-enc-zlib.o
  CC      ui/vnc-enc-hextile.o
  CC      ui/vnc-enc-tight.o
  CC      ui/vnc-palette.o
  CC      ui/vnc-enc-zrle.o
  CC      ui/vnc-auth-vencrypt.o
  CC      ui/vnc-ws.o
  CC      ui/vnc-jobs.o
  LINK    tests/qemu-iotests/socket_scm_helper
  CC      qga/commands.o
  CC      qga/guest-agent-command-state.o
  CC      qga/main.o
  CC      qga/commands-posix.o
  CC      qga/channel-posix.o
  CC      qga/qapi-generated/qga-qapi-types.o
  CC      qga/qapi-generated/qga-qapi-visit.o
  AS      optionrom/multiboot.o
  CC      qga/qapi-generated/qga-qmp-marshal.o
  AS      optionrom/linuxboot.o
  CC      optionrom/linuxboot_dma.o
  CC      qmp-introspect.o
cc: unrecognized option '-no-integrated-as'
cc: unrecognized option '-no-integrated-as'
  AS      optionrom/kvmvapic.o
  CC      qapi-types.o
  CC      qapi-visit.o
  BUILD   optionrom/multiboot.img
  BUILD   optionrom/linuxboot.img
  BUILD   optionrom/linuxboot_dma.img
  BUILD   optionrom/kvmvapic.img
  BUILD   optionrom/multiboot.raw
  BUILD   optionrom/linuxboot.raw
  CC      qapi-event.o
  BUILD   optionrom/linuxboot_dma.raw
  BUILD   optionrom/kvmvapic.raw
  SIGN    optionrom/multiboot.bin
  AR      libqemustub.a
  SIGN    optionrom/linuxboot.bin
  CC      qemu-img.o
  SIGN    optionrom/linuxboot_dma.bin
  SIGN    optionrom/kvmvapic.bin
  CC      qmp-marshal.o
  CC      trace/generated-tracers.o
  AR      libqemuutil.a
  LINK    qemu-ga
  LINK    ivshmem-client
  LINK    ivshmem-server
  LINK    qemu-nbd
  LINK    qemu-img
  LINK    qemu-io
  LINK    qemu-bridge-helper
  GEN     x86_64-softmmu/config-target.h
  GEN     x86_64-softmmu/hmp-commands.h
  GEN     x86_64-softmmu/hmp-commands-info.h
  GEN     aarch64-softmmu/hmp-commands-info.h
  GEN     aarch64-softmmu/hmp-commands.h
  GEN     aarch64-softmmu/config-target.h
  CC      x86_64-softmmu/exec.o
  CC      x86_64-softmmu/cpu-exec.o
  CC      x86_64-softmmu/translate-all.o
  CC      x86_64-softmmu/cpu-exec-common.o
  CC      x86_64-softmmu/translate-common.o
  CC      x86_64-softmmu/tcg/tcg.o
  CC      x86_64-softmmu/tcg/tcg-op.o
  CC      x86_64-softmmu/tcg/optimize.o
  CC      x86_64-softmmu/fpu/softfloat.o
  CC      x86_64-softmmu/tcg/tcg-common.o
  CC      x86_64-softmmu/disas.o
  CC      x86_64-softmmu/tcg-runtime.o
  CC      x86_64-softmmu/arch_init.o
  CC      x86_64-softmmu/cpus.o
  CC      x86_64-softmmu/monitor.o
  CC      aarch64-softmmu/exec.o
  CC      x86_64-softmmu/gdbstub.o
  CC      aarch64-softmmu/translate-all.o
  CC      aarch64-softmmu/cpu-exec.o
  CC      aarch64-softmmu/translate-common.o
  CC      aarch64-softmmu/cpu-exec-common.o
  CC      aarch64-softmmu/tcg/tcg.o
  CC      x86_64-softmmu/balloon.o
  CC      x86_64-softmmu/ioport.o
  CC      x86_64-softmmu/numa.o
  CC      x86_64-softmmu/qtest.o
  CC      x86_64-softmmu/bootdevice.o
  CC      aarch64-softmmu/tcg/tcg-op.o
  CC      aarch64-softmmu/tcg/optimize.o
  CC      aarch64-softmmu/tcg/tcg-common.o
  CC      x86_64-softmmu/kvm-all.o
  CC      x86_64-softmmu/memory.o
  CC      aarch64-softmmu/fpu/softfloat.o
  CC      aarch64-softmmu/disas.o
  CC      x86_64-softmmu/cputlb.o
  CC      x86_64-softmmu/memory_mapping.o
  CC      x86_64-softmmu/dump.o
  CC      aarch64-softmmu/tcg-runtime.o
  GEN     aarch64-softmmu/gdbstub-xml.c
  CC      x86_64-softmmu/migration/ram.o
  CC      aarch64-softmmu/kvm-stub.o
  CC      aarch64-softmmu/arch_init.o
  CC      x86_64-softmmu/migration/savevm.o
  CC      aarch64-softmmu/cpus.o
  CC      x86_64-softmmu/xen-common-stub.o
  CC      x86_64-softmmu/xen-hvm-stub.o
  CC      x86_64-softmmu/hw/block/virtio-blk.o
  CC      aarch64-softmmu/monitor.o
  CC      x86_64-softmmu/hw/block/dataplane/virtio-blk.o
  CC      x86_64-softmmu/hw/char/virtio-serial-bus.o
  CC      x86_64-softmmu/hw/core/nmi.o
  CC      x86_64-softmmu/hw/core/generic-loader.o
  CC      aarch64-softmmu/gdbstub.o
  CC      x86_64-softmmu/hw/cpu/core.o
  CC      aarch64-softmmu/balloon.o
  CC      aarch64-softmmu/ioport.o
  CC      aarch64-softmmu/numa.o
  CC      aarch64-softmmu/qtest.o
  CC      x86_64-softmmu/hw/display/vga.o
  CC      aarch64-softmmu/bootdevice.o
  CC      aarch64-softmmu/memory.o
  CC      aarch64-softmmu/cputlb.o
  CC      aarch64-softmmu/memory_mapping.o
  CC      aarch64-softmmu/dump.o
  CC      x86_64-softmmu/hw/display/virtio-gpu.o
  CC      x86_64-softmmu/hw/display/virtio-gpu-3d.o
  CC      x86_64-softmmu/hw/display/virtio-gpu-pci.o
  CC      aarch64-softmmu/migration/ram.o
  CC      aarch64-softmmu/migration/savevm.o
  CC      x86_64-softmmu/hw/display/virtio-vga.o
  CC      x86_64-softmmu/hw/intc/apic.o
  CC      aarch64-softmmu/xen-common-stub.o
  CC      aarch64-softmmu/xen-hvm-stub.o
  CC      aarch64-softmmu/hw/adc/stm32f2xx_adc.o
  CC      aarch64-softmmu/hw/block/virtio-blk.o
  CC      aarch64-softmmu/hw/block/dataplane/virtio-blk.o
  CC      x86_64-softmmu/hw/intc/apic_common.o
  CC      x86_64-softmmu/hw/intc/ioapic.o
  CC      x86_64-softmmu/hw/isa/lpc_ich9.o
  CC      aarch64-softmmu/hw/char/exynos4210_uart.o
  CC      x86_64-softmmu/hw/misc/vmport.o
  CC      x86_64-softmmu/hw/misc/ivshmem.o
  CC      aarch64-softmmu/hw/char/omap_uart.o
  CC      x86_64-softmmu/hw/misc/pvpanic.o
  CC      x86_64-softmmu/hw/misc/edu.o
  CC      x86_64-softmmu/hw/misc/hyperv_testdev.o
  CC      aarch64-softmmu/hw/char/digic-uart.o
  CC      x86_64-softmmu/hw/net/virtio-net.o
  CC      aarch64-softmmu/hw/char/stm32f2xx_usart.o
  CC      x86_64-softmmu/hw/net/vhost_net.o
  CC      aarch64-softmmu/hw/char/bcm2835_aux.o
  CC      x86_64-softmmu/hw/scsi/virtio-scsi.o
  CC      x86_64-softmmu/hw/scsi/virtio-scsi-dataplane.o
  CC      aarch64-softmmu/hw/char/virtio-serial-bus.o
  CC      aarch64-softmmu/hw/core/nmi.o
  CC      aarch64-softmmu/hw/cpu/arm11mpcore.o
  CC      aarch64-softmmu/hw/core/generic-loader.o
  CC      aarch64-softmmu/hw/cpu/realview_mpcore.o
  CC      aarch64-softmmu/hw/cpu/a9mpcore.o
  CC      x86_64-softmmu/hw/scsi/vhost-scsi.o
  CC      aarch64-softmmu/hw/cpu/a15mpcore.o
  CC      aarch64-softmmu/hw/cpu/core.o
  CC      aarch64-softmmu/hw/display/omap_dss.o
  CC      x86_64-softmmu/hw/timer/mc146818rtc.o
  CC      x86_64-softmmu/hw/vfio/common.o
  CC      x86_64-softmmu/hw/vfio/pci.o
  CC      aarch64-softmmu/hw/display/omap_lcdc.o
  CC      aarch64-softmmu/hw/display/pxa2xx_lcd.o
  CC      aarch64-softmmu/hw/display/bcm2835_fb.o
  CC      x86_64-softmmu/hw/vfio/pci-quirks.o
  CC      x86_64-softmmu/hw/vfio/platform.o
  CC      aarch64-softmmu/hw/display/vga.o
  CC      x86_64-softmmu/hw/vfio/calxeda-xgmac.o
  CC      x86_64-softmmu/hw/vfio/amd-xgbe.o
  CC      aarch64-softmmu/hw/display/virtio-gpu.o
  CC      aarch64-softmmu/hw/display/virtio-gpu-3d.o
  CC      aarch64-softmmu/hw/display/virtio-gpu-pci.o
  CC      x86_64-softmmu/hw/vfio/spapr.o
  CC      aarch64-softmmu/hw/display/dpcd.o
  CC      x86_64-softmmu/hw/virtio/virtio.o
  CC      aarch64-softmmu/hw/display/xlnx_dp.o
  CC      x86_64-softmmu/hw/virtio/virtio-balloon.o
  CC      aarch64-softmmu/hw/dma/xlnx_dpdma.o
  CC      aarch64-softmmu/hw/dma/omap_dma.o
  CC      x86_64-softmmu/hw/virtio/vhost.o
  CC      aarch64-softmmu/hw/dma/soc_dma.o
  CC      x86_64-softmmu/hw/virtio/vhost-backend.o
  CC      x86_64-softmmu/hw/virtio/vhost-user.o
  CC      x86_64-softmmu/hw/virtio/vhost-vsock.o
  CC      x86_64-softmmu/hw/virtio/virtio-crypto.o
  CC      x86_64-softmmu/hw/virtio/virtio-crypto-pci.o
  CC      aarch64-softmmu/hw/dma/pxa2xx_dma.o
  CC      aarch64-softmmu/hw/dma/bcm2835_dma.o
  CC      aarch64-softmmu/hw/gpio/omap_gpio.o
  CC      x86_64-softmmu/hw/i386/multiboot.o
  CC      aarch64-softmmu/hw/gpio/imx_gpio.o
  CC      aarch64-softmmu/hw/i2c/omap_i2c.o
  CC      aarch64-softmmu/hw/input/pxa2xx_keypad.o
  CC      aarch64-softmmu/hw/input/tsc210x.o
  CC      aarch64-softmmu/hw/intc/armv7m_nvic.o
  CC      x86_64-softmmu/hw/i386/pc.o
  CC      aarch64-softmmu/hw/intc/exynos4210_gic.o
  CC      aarch64-softmmu/hw/intc/exynos4210_combiner.o
  CC      aarch64-softmmu/hw/intc/omap_intc.o
  CC      x86_64-softmmu/hw/i386/pc_piix.o
  CC      x86_64-softmmu/hw/i386/pc_q35.o
  CC      x86_64-softmmu/hw/i386/pc_sysfw.o
  CC      aarch64-softmmu/hw/intc/bcm2835_ic.o
  CC      aarch64-softmmu/hw/intc/bcm2836_control.o
  CC      aarch64-softmmu/hw/intc/allwinner-a10-pic.o
  CC      aarch64-softmmu/hw/intc/aspeed_vic.o
  CC      x86_64-softmmu/hw/i386/x86-iommu.o
  CC      x86_64-softmmu/hw/i386/intel_iommu.o
  CC      x86_64-softmmu/hw/i386/amd_iommu.o
  CC      x86_64-softmmu/hw/i386/kvmvapic.o
  CC      aarch64-softmmu/hw/intc/arm_gicv3_cpuif.o
  CC      x86_64-softmmu/hw/i386/acpi-build.o
  CC      aarch64-softmmu/hw/misc/ivshmem.o
  CC      aarch64-softmmu/hw/misc/arm_sysctl.o
  CC      aarch64-softmmu/hw/misc/cbus.o
  CC      aarch64-softmmu/hw/misc/exynos4210_pmu.o
  CC      aarch64-softmmu/hw/misc/imx_ccm.o
  CC      x86_64-softmmu/hw/i386/pci-assign-load-rom.o
  CC      aarch64-softmmu/hw/misc/imx31_ccm.o
  CC      aarch64-softmmu/hw/misc/imx25_ccm.o
  CC      aarch64-softmmu/hw/misc/imx6_ccm.o
  CC      x86_64-softmmu/hw/i386/kvm/clock.o
  CC      aarch64-softmmu/hw/misc/imx6_src.o
  CC      x86_64-softmmu/hw/i386/kvm/apic.o
  CC      x86_64-softmmu/hw/i386/kvm/i8259.o
  CC      aarch64-softmmu/hw/misc/mst_fpga.o
  CC      aarch64-softmmu/hw/misc/omap_clk.o
  CC      aarch64-softmmu/hw/misc/omap_gpmc.o
  CC      aarch64-softmmu/hw/misc/omap_l4.o
  CC      aarch64-softmmu/hw/misc/omap_sdrc.o
/tmp/qemu-test/src/hw/i386/pc_piix.c: In function ‘igd_passthrough_isa_bridge_create’:
/tmp/qemu-test/src/hw/i386/pc_piix.c:1046: warning: ‘pch_rev_id’ may be used uninitialized in this function
  CC      aarch64-softmmu/hw/misc/omap_tap.o
  CC      aarch64-softmmu/hw/misc/bcm2835_mbox.o
  CC      aarch64-softmmu/hw/misc/bcm2835_property.o
  CC      x86_64-softmmu/hw/i386/kvm/ioapic.o
  CC      x86_64-softmmu/hw/i386/kvm/i8254.o
  CC      aarch64-softmmu/hw/misc/zynq_slcr.o
  CC      aarch64-softmmu/hw/misc/zynq-xadc.o
  CC      aarch64-softmmu/hw/misc/stm32f2xx_syscfg.o
  CC      x86_64-softmmu/hw/i386/kvm/pci-assign.o
/tmp/qemu-test/src/hw/i386/acpi-build.c: In function ‘build_append_pci_bus_devices’:
/tmp/qemu-test/src/hw/i386/acpi-build.c:501: warning: ‘notify_method’ may be used uninitialized in this function
  CC      aarch64-softmmu/hw/misc/edu.o
  CC      x86_64-softmmu/target-i386/translate.o
  CC      aarch64-softmmu/hw/misc/auxbus.o
  CC      aarch64-softmmu/hw/misc/aspeed_scu.o
  CC      x86_64-softmmu/target-i386/helper.o
  CC      x86_64-softmmu/target-i386/cpu.o
  CC      x86_64-softmmu/target-i386/bpt_helper.o
  CC      x86_64-softmmu/target-i386/excp_helper.o
  CC      aarch64-softmmu/hw/misc/aspeed_sdmc.o
  CC      x86_64-softmmu/target-i386/fpu_helper.o
  CC      x86_64-softmmu/target-i386/cc_helper.o
  CC      aarch64-softmmu/hw/net/virtio-net.o
  CC      aarch64-softmmu/hw/net/vhost_net.o
  CC      aarch64-softmmu/hw/pcmcia/pxa2xx.o
  CC      x86_64-softmmu/target-i386/int_helper.o
  CC      x86_64-softmmu/target-i386/svm_helper.o
  CC      aarch64-softmmu/hw/scsi/virtio-scsi.o
  CC      aarch64-softmmu/hw/scsi/virtio-scsi-dataplane.o
  CC      aarch64-softmmu/hw/scsi/vhost-scsi.o
  CC      aarch64-softmmu/hw/sd/omap_mmc.o
  CC      x86_64-softmmu/target-i386/smm_helper.o
  CC      x86_64-softmmu/target-i386/misc_helper.o
  CC      x86_64-softmmu/target-i386/mem_helper.o
  CC      aarch64-softmmu/hw/sd/pxa2xx_mmci.o
  CC      x86_64-softmmu/target-i386/seg_helper.o
  CC      x86_64-softmmu/target-i386/mpx_helper.o
  CC      aarch64-softmmu/hw/ssi/omap_spi.o
  CC      x86_64-softmmu/target-i386/gdbstub.o
  CC      x86_64-softmmu/target-i386/machine.o
  CC      aarch64-softmmu/hw/ssi/imx_spi.o
  CC      aarch64-softmmu/hw/timer/exynos4210_mct.o
  CC      aarch64-softmmu/hw/timer/exynos4210_pwm.o
  CC      aarch64-softmmu/hw/timer/exynos4210_rtc.o
  CC      aarch64-softmmu/hw/timer/omap_gptimer.o
  CC      aarch64-softmmu/hw/timer/omap_synctimer.o
  CC      aarch64-softmmu/hw/timer/pxa2xx_timer.o
  CC      aarch64-softmmu/hw/timer/digic-timer.o
  CC      x86_64-softmmu/target-i386/arch_memory_mapping.o
  CC      aarch64-softmmu/hw/timer/allwinner-a10-pit.o
  CC      x86_64-softmmu/target-i386/arch_dump.o
  CC      x86_64-softmmu/target-i386/monitor.o
  CC      aarch64-softmmu/hw/usb/tusb6010.o
  CC      x86_64-softmmu/target-i386/kvm.o
  CC      aarch64-softmmu/hw/vfio/common.o
  CC      aarch64-softmmu/hw/vfio/pci.o
  CC      aarch64-softmmu/hw/vfio/pci-quirks.o
  CC      aarch64-softmmu/hw/vfio/platform.o
  CC      x86_64-softmmu/target-i386/hyperv.o
  GEN     trace/generated-helpers.c
  CC      aarch64-softmmu/hw/vfio/calxeda-xgmac.o
  CC      x86_64-softmmu/trace/control-target.o
  CC      aarch64-softmmu/hw/vfio/amd-xgbe.o
  CC      aarch64-softmmu/hw/vfio/spapr.o
  CC      aarch64-softmmu/hw/virtio/virtio.o
  CC      aarch64-softmmu/hw/virtio/virtio-balloon.o
  CC      aarch64-softmmu/hw/virtio/vhost.o
  CC      aarch64-softmmu/hw/virtio/vhost-backend.o
  CC      aarch64-softmmu/hw/virtio/vhost-user.o
  CC      aarch64-softmmu/hw/virtio/vhost-vsock.o
  CC      aarch64-softmmu/hw/virtio/virtio-crypto.o
  CC      aarch64-softmmu/hw/virtio/virtio-crypto-pci.o
  CC      x86_64-softmmu/trace/generated-helpers.o
  CC      aarch64-softmmu/hw/arm/boot.o
  CC      aarch64-softmmu/hw/arm/collie.o
  CC      aarch64-softmmu/hw/arm/exynos4_boards.o
  CC      aarch64-softmmu/hw/arm/gumstix.o
  CC      aarch64-softmmu/hw/arm/highbank.o
  CC      aarch64-softmmu/hw/arm/digic_boards.o
  CC      aarch64-softmmu/hw/arm/integratorcp.o
  CC      aarch64-softmmu/hw/arm/mainstone.o
  CC      aarch64-softmmu/hw/arm/musicpal.o
  CC      aarch64-softmmu/hw/arm/nseries.o
  CC      aarch64-softmmu/hw/arm/omap_sx1.o
  CC      aarch64-softmmu/hw/arm/palm.o
  CC      aarch64-softmmu/hw/arm/realview.o
  CC      aarch64-softmmu/hw/arm/spitz.o
  CC      aarch64-softmmu/hw/arm/stellaris.o
  CC      aarch64-softmmu/hw/arm/tosa.o
  CC      aarch64-softmmu/hw/arm/versatilepb.o
  CC      aarch64-softmmu/hw/arm/vexpress.o
  CC      aarch64-softmmu/hw/arm/virt.o
  CC      aarch64-softmmu/hw/arm/xilinx_zynq.o
  CC      aarch64-softmmu/hw/arm/z2.o
  CC      aarch64-softmmu/hw/arm/virt-acpi-build.o
  CC      aarch64-softmmu/hw/arm/netduino2.o
  CC      aarch64-softmmu/hw/arm/sysbus-fdt.o
  CC      aarch64-softmmu/hw/arm/armv7m.o
  CC      aarch64-softmmu/hw/arm/exynos4210.o
  CC      aarch64-softmmu/hw/arm/pxa2xx.o
  CC      aarch64-softmmu/hw/arm/pxa2xx_gpio.o
  CC      aarch64-softmmu/hw/arm/pxa2xx_pic.o
  CC      aarch64-softmmu/hw/arm/digic.o
  CC      aarch64-softmmu/hw/arm/omap1.o
  CC      aarch64-softmmu/hw/arm/omap2.o
  CC      aarch64-softmmu/hw/arm/strongarm.o
  CC      aarch64-softmmu/hw/arm/bcm2835_peripherals.o
  CC      aarch64-softmmu/hw/arm/allwinner-a10.o
  CC      aarch64-softmmu/hw/arm/cubieboard.o
  CC      aarch64-softmmu/hw/arm/bcm2836.o
  CC      aarch64-softmmu/hw/arm/raspi.o
  CC      aarch64-softmmu/hw/arm/stm32f205_soc.o
  CC      aarch64-softmmu/hw/arm/xlnx-zynqmp.o
  CC      aarch64-softmmu/hw/arm/xlnx-ep108.o
  CC      aarch64-softmmu/hw/arm/fsl-imx25.o
  CC      aarch64-softmmu/hw/arm/imx25_pdk.o
  CC      aarch64-softmmu/hw/arm/fsl-imx31.o
  CC      aarch64-softmmu/hw/arm/fsl-imx6.o
  CC      aarch64-softmmu/hw/arm/sabrelite.o
  CC      aarch64-softmmu/hw/arm/kzm.o
  CC      aarch64-softmmu/hw/arm/aspeed_soc.o
  CC      aarch64-softmmu/hw/arm/aspeed.o
  CC      aarch64-softmmu/target-arm/arm-semi.o
  CC      aarch64-softmmu/target-arm/machine.o
  CC      aarch64-softmmu/target-arm/psci.o
  CC      aarch64-softmmu/target-arm/arch_dump.o
  CC      aarch64-softmmu/target-arm/monitor.o
  CC      aarch64-softmmu/target-arm/kvm-stub.o
  CC      aarch64-softmmu/target-arm/translate.o
  LINK    x86_64-softmmu/qemu-system-x86_64
  CC      aarch64-softmmu/target-arm/op_helper.o
  CC      aarch64-softmmu/target-arm/helper.o
  CC      aarch64-softmmu/target-arm/cpu.o
  CC      aarch64-softmmu/target-arm/neon_helper.o
  CC      aarch64-softmmu/target-arm/iwmmxt_helper.o
  CC      aarch64-softmmu/target-arm/gdbstub.o
  CC      aarch64-softmmu/target-arm/cpu64.o
  CC      aarch64-softmmu/target-arm/translate-a64.o
  CC      aarch64-softmmu/target-arm/helper-a64.o
  CC      aarch64-softmmu/target-arm/gdbstub64.o
  CC      aarch64-softmmu/target-arm/crypto_helper.o
  CC      aarch64-softmmu/target-arm/arm-powerctl.o
  GEN     trace/generated-helpers.c
  CC      aarch64-softmmu/trace/control-target.o
  CC      aarch64-softmmu/gdbstub-xml.o
  CC      aarch64-softmmu/trace/generated-helpers.o
/tmp/qemu-test/src/target-arm/translate-a64.c: In function ‘handle_shri_with_rndacc’:
/tmp/qemu-test/src/target-arm/translate-a64.c:6399: warning: ‘tcg_src_hi’ may be used uninitialized in this function
/tmp/qemu-test/src/target-arm/translate-a64.c: In function ‘disas_simd_scalar_two_reg_misc’:
/tmp/qemu-test/src/target-arm/translate-a64.c:8126: warning: ‘rmode’ may be used uninitialized in this function
  LINK    aarch64-softmmu/qemu-system-aarch64
  TEST    tests/qapi-schema/alternate-any.out
  TEST    tests/qapi-schema/alternate-array.out
  TEST    tests/qapi-schema/alternate-base.out
  TEST    tests/qapi-schema/alternate-clash.out
  TEST    tests/qapi-schema/alternate-conflict-dict.out
  TEST    tests/qapi-schema/alternate-conflict-string.out
  TEST    tests/qapi-schema/alternate-empty.out
  TEST    tests/qapi-schema/alternate-nested.out
  TEST    tests/qapi-schema/alternate-unknown.out
  TEST    tests/qapi-schema/args-alternate.out
  TEST    tests/qapi-schema/args-any.out
  TEST    tests/qapi-schema/args-array-unknown.out
  TEST    tests/qapi-schema/args-array-empty.out
  TEST    tests/qapi-schema/args-bad-boxed.out
  TEST    tests/qapi-schema/args-boxed-anon.out
  TEST    tests/qapi-schema/args-boxed-empty.out
  TEST    tests/qapi-schema/args-boxed-string.out
  TEST    tests/qapi-schema/args-int.out
  TEST    tests/qapi-schema/args-invalid.out
  TEST    tests/qapi-schema/args-member-array-bad.out
  TEST    tests/qapi-schema/args-member-case.out
  TEST    tests/qapi-schema/args-member-unknown.out
  TEST    tests/qapi-schema/args-name-clash.out
  TEST    tests/qapi-schema/args-union.out
  TEST    tests/qapi-schema/args-unknown.out
  TEST    tests/qapi-schema/bad-base.out
  TEST    tests/qapi-schema/bad-data.out
  TEST    tests/qapi-schema/bad-ident.out
  TEST    tests/qapi-schema/bad-type-bool.out
  TEST    tests/qapi-schema/bad-type-dict.out
  TEST    tests/qapi-schema/bad-type-int.out
  TEST    tests/qapi-schema/base-cycle-direct.out
  TEST    tests/qapi-schema/base-cycle-indirect.out
  TEST    tests/qapi-schema/command-int.out
  TEST    tests/qapi-schema/comments.out
  TEST    tests/qapi-schema/double-data.out
  TEST    tests/qapi-schema/double-type.out
  TEST    tests/qapi-schema/duplicate-key.out
  TEST    tests/qapi-schema/empty.out
  TEST    tests/qapi-schema/enum-bad-name.out
  TEST    tests/qapi-schema/enum-bad-prefix.out
  TEST    tests/qapi-schema/enum-clash-member.out
  TEST    tests/qapi-schema/enum-dict-member.out
  TEST    tests/qapi-schema/enum-int-member.out
  TEST    tests/qapi-schema/enum-member-case.out
  TEST    tests/qapi-schema/enum-missing-data.out
  TEST    tests/qapi-schema/enum-wrong-data.out
  TEST    tests/qapi-schema/escape-outside-string.out
  TEST    tests/qapi-schema/escape-too-big.out
  TEST    tests/qapi-schema/escape-too-short.out
  TEST    tests/qapi-schema/event-boxed-empty.out
  TEST    tests/qapi-schema/event-case.out
  TEST    tests/qapi-schema/event-nest-struct.out
  TEST    tests/qapi-schema/flat-union-array-branch.out
  TEST    tests/qapi-schema/flat-union-bad-base.out
  TEST    tests/qapi-schema/flat-union-bad-discriminator.out
  TEST    tests/qapi-schema/flat-union-base-any.out
  TEST    tests/qapi-schema/flat-union-base-union.out
  TEST    tests/qapi-schema/flat-union-clash-member.out
  TEST    tests/qapi-schema/flat-union-empty.out
  TEST    tests/qapi-schema/flat-union-incomplete-branch.out
  TEST    tests/qapi-schema/flat-union-int-branch.out
  TEST    tests/qapi-schema/flat-union-inline.out
  TEST    tests/qapi-schema/flat-union-invalid-branch-key.out
  TEST    tests/qapi-schema/flat-union-invalid-discriminator.out
  TEST    tests/qapi-schema/flat-union-no-base.out
  TEST    tests/qapi-schema/flat-union-optional-discriminator.out
  TEST    tests/qapi-schema/flat-union-string-discriminator.out
  TEST    tests/qapi-schema/funny-char.out
  TEST    tests/qapi-schema/ident-with-escape.out
  TEST    tests/qapi-schema/include-cycle.out
  TEST    tests/qapi-schema/include-before-err.out
  TEST    tests/qapi-schema/include-format-err.out
  TEST    tests/qapi-schema/include-nested-err.out
  TEST    tests/qapi-schema/include-no-file.out
  TEST    tests/qapi-schema/include-non-file.out
  TEST    tests/qapi-schema/include-relpath.out
  TEST    tests/qapi-schema/include-repetition.out
  TEST    tests/qapi-schema/include-self-cycle.out
  TEST    tests/qapi-schema/include-simple.out
  TEST    tests/qapi-schema/indented-expr.out
  TEST    tests/qapi-schema/leading-comma-list.out
  TEST    tests/qapi-schema/leading-comma-object.out
  TEST    tests/qapi-schema/missing-colon.out
  TEST    tests/qapi-schema/missing-comma-list.out
  TEST    tests/qapi-schema/missing-comma-object.out
  TEST    tests/qapi-schema/missing-type.out
  TEST    tests/qapi-schema/nested-struct-data.out
  TEST    tests/qapi-schema/non-objects.out
  TEST    tests/qapi-schema/qapi-schema-test.out
  TEST    tests/qapi-schema/quoted-structural-chars.out
  TEST    tests/qapi-schema/redefined-builtin.out
  TEST    tests/qapi-schema/redefined-command.out
  TEST    tests/qapi-schema/redefined-event.out
  TEST    tests/qapi-schema/redefined-type.out
  TEST    tests/qapi-schema/reserved-command-q.out
  TEST    tests/qapi-schema/reserved-enum-q.out
  TEST    tests/qapi-schema/reserved-member-has.out
  TEST    tests/qapi-schema/reserved-member-q.out
  TEST    tests/qapi-schema/reserved-member-u.out
  TEST    tests/qapi-schema/reserved-member-underscore.out
  TEST    tests/qapi-schema/reserved-type-kind.out
  TEST    tests/qapi-schema/reserved-type-list.out
  TEST    tests/qapi-schema/returns-alternate.out
  TEST    tests/qapi-schema/returns-array-bad.out
  TEST    tests/qapi-schema/returns-unknown.out
  TEST    tests/qapi-schema/returns-dict.out
  TEST    tests/qapi-schema/returns-whitelist.out
  TEST    tests/qapi-schema/struct-base-clash-deep.out
  TEST    tests/qapi-schema/struct-base-clash.out
  TEST    tests/qapi-schema/struct-data-invalid.out
  TEST    tests/qapi-schema/struct-member-invalid.out
  TEST    tests/qapi-schema/trailing-comma-list.out
  TEST    tests/qapi-schema/trailing-comma-object.out
  TEST    tests/qapi-schema/unclosed-object.out
  TEST    tests/qapi-schema/type-bypass-bad-gen.out
  TEST    tests/qapi-schema/unclosed-list.out
  TEST    tests/qapi-schema/unclosed-string.out
  TEST    tests/qapi-schema/unicode-str.out
  TEST    tests/qapi-schema/union-base-no-discriminator.out
  TEST    tests/qapi-schema/union-branch-case.out
  TEST    tests/qapi-schema/union-clash-branches.out
  TEST    tests/qapi-schema/union-empty.out
  TEST    tests/qapi-schema/union-invalid-base.out
  TEST    tests/qapi-schema/union-optional-branch.out
  TEST    tests/qapi-schema/union-unknown.out
  TEST    tests/qapi-schema/unknown-escape.out
  TEST    tests/qapi-schema/unknown-expr-key.out
  CC      tests/check-qdict.o
  CC      tests/test-char.o
  CC      tests/check-qfloat.o
  CC      tests/check-qint.o
  CC      tests/check-qstring.o
  CC      tests/check-qlist.o
  CC      tests/check-qjson.o
  CC      tests/check-qnull.o
  CC      tests/test-qobject-output-visitor.o
  GEN     tests/test-qapi-visit.c
  GEN     tests/test-qapi-types.c
  GEN     tests/test-qapi-event.c
  GEN     tests/test-qmp-introspect.c
  CC      tests/test-clone-visitor.o
  CC      tests/test-qobject-input-visitor.o
  CC      tests/test-qobject-input-strict.o
  CC      tests/test-qmp-commands.o
  GEN     tests/test-qmp-marshal.c
  CC      tests/test-string-input-visitor.o
  CC      tests/test-string-output-visitor.o
  CC      tests/test-qmp-event.o
  CC      tests/test-opts-visitor.o
  CC      tests/test-coroutine.o
  CC      tests/test-visitor-serialization.o
  CC      tests/test-iov.o
  CC      tests/test-aio.o
  CC      tests/test-throttle.o
  CC      tests/test-thread-pool.o
  CC      tests/test-hbitmap.o
  CC      tests/test-blockjob.o
  CC      tests/test-blockjob-txn.o
  CC      tests/test-x86-cpuid.o
  CC      tests/test-xbzrle.o
  CC      tests/test-vmstate.o
  CC      tests/test-cutils.o
  CC      tests/test-mul64.o
  CC      tests/test-int128.o
  CC      tests/rcutorture.o
  CC      tests/test-rcu-list.o
  CC      tests/test-qdist.o
  CC      tests/test-qht.o
  CC      tests/test-qht-par.o
  CC      tests/qht-bench.o
  CC      tests/test-bitops.o
/tmp/qemu-test/src/tests/test-int128.c:180: warning: ‘__noclone__’ attribute directive ignored
  CC      tests/check-qom-interface.o
  CC      tests/check-qom-proplist.o
  CC      tests/test-qemu-opts.o
  CC      tests/test-write-threshold.o
  CC      tests/test-crypto-hash.o
  CC      tests/test-crypto-cipher.o
  CC      tests/test-crypto-secret.o
  CC      tests/test-qga.o
  CC      tests/libqtest.o
  CC      tests/test-timed-average.o
  CC      tests/test-io-task.o
  CC      tests/test-io-channel-socket.o
  CC      tests/io-channel-helpers.o
  CC      tests/test-io-channel-file.o
  CC      tests/test-io-channel-command.o
  CC      tests/test-io-channel-buffer.o
  CC      tests/test-base64.o
  CC      tests/test-crypto-ivgen.o
  CC      tests/test-crypto-afsplit.o
  CC      tests/test-crypto-xts.o
  CC      tests/test-crypto-block.o
  CC      tests/test-logging.o
  CC      tests/test-replication.o
  CC      tests/test-bufferiszero.o
  CC      tests/test-uuid.o
  CC      tests/ptimer-test.o
  CC      tests/ptimer-test-stubs.o
  CC      tests/vhost-user-test.o
  CC      tests/libqos/pci.o
  CC      tests/libqos/fw_cfg.o
  CC      tests/libqos/malloc.o
  CC      tests/libqos/i2c.o
  CC      tests/libqos/libqos.o
  CC      tests/libqos/malloc-spapr.o
  CC      tests/libqos/libqos-spapr.o
  CC      tests/libqos/rtas.o
  CC      tests/libqos/pci-spapr.o
  CC      tests/libqos/pci-pc.o
  CC      tests/libqos/malloc-pc.o
  CC      tests/libqos/libqos-pc.o
  CC      tests/libqos/ahci.o
  CC      tests/libqos/virtio.o
  CC      tests/libqos/virtio-pci.o
  CC      tests/libqos/virtio-mmio.o
  CC      tests/libqos/malloc-generic.o
  CC      tests/endianness-test.o
  CC      tests/fdc-test.o
  CC      tests/ide-test.o
  CC      tests/ahci-test.o
  CC      tests/hd-geo-test.o
  CC      tests/boot-order-test.o
  CC      tests/bios-tables-test.o
  CC      tests/boot-sector.o
  CC      tests/boot-serial-test.o
  CC      tests/pxe-test.o
  CC      tests/rtc-test.o
  CC      tests/ipmi-kcs-test.o
  CC      tests/ipmi-bt-test.o
  CC      tests/i440fx-test.o
  CC      tests/fw_cfg-test.o
  CC      tests/drive_del-test.o
/tmp/qemu-test/src/tests/ide-test.c: In function ‘cdrom_pio_impl’:
/tmp/qemu-test/src/tests/ide-test.c:791: warning: ignoring return value of ‘fwrite’, declared with attribute warn_unused_result
/tmp/qemu-test/src/tests/ide-test.c: In function ‘test_cdrom_dma’:
/tmp/qemu-test/src/tests/ide-test.c:886: warning: ignoring return value of ‘fwrite’, declared with attribute warn_unused_result
  CC      tests/tco-test.o
  CC      tests/wdt_ib700-test.o
  CC      tests/e1000-test.o
  CC      tests/e1000e-test.o
  CC      tests/rtl8139-test.o
  CC      tests/pcnet-test.o
  CC      tests/eepro100-test.o
  CC      tests/ne2000-test.o
  CC      tests/nvme-test.o
  CC      tests/ac97-test.o
  CC      tests/es1370-test.o
  CC      tests/virtio-net-test.o
  CC      tests/virtio-balloon-test.o
  CC      tests/virtio-blk-test.o
  CC      tests/virtio-rng-test.o
  CC      tests/virtio-scsi-test.o
  CC      tests/virtio-serial-test.o
  CC      tests/virtio-console-test.o
  CC      tests/tpci200-test.o
  CC      tests/ipoctal232-test.o
  CC      tests/display-vga-test.o
  CC      tests/ivshmem-test.o
  CC      tests/intel-hda-test.o
  CC      tests/vmxnet3-test.o
  CC      tests/pvpanic-test.o
  CC      tests/i82801b11-test.o
  CC      tests/ioh3420-test.o
  CC      tests/usb-hcd-ohci-test.o
  CC      tests/libqos/usb.o
  CC      tests/usb-hcd-uhci-test.o
  CC      tests/usb-hcd-ehci-test.o
  CC      tests/usb-hcd-xhci-test.o
  CC      tests/pc-cpu-test.o
  CC      tests/test-netfilter.o
  CC      tests/q35-test.o
  CC      tests/test-filter-redirector.o
  CC      tests/test-filter-mirror.o
  CC      tests/postcopy-test.o
  CC      tests/test-x86-cpuid-compat.o
  CC      tests/device-introspect-test.o
  CC      tests/qom-test.o
  LINK    tests/check-qdict
  LINK    tests/test-char
  LINK    tests/check-qfloat
  LINK    tests/check-qint
  LINK    tests/check-qstring
  LINK    tests/check-qlist
  LINK    tests/check-qnull
  LINK    tests/check-qjson
  CC      tests/test-qapi-types.o
  CC      tests/test-qapi-visit.o
  CC      tests/test-qmp-introspect.o
  CC      tests/test-qmp-marshal.o
  CC      tests/test-qapi-event.o
  LINK    tests/test-coroutine
  LINK    tests/test-iov
  LINK    tests/test-aio
  LINK    tests/test-throttle
  LINK    tests/test-thread-pool
  LINK    tests/test-hbitmap
  LINK    tests/test-blockjob
  LINK    tests/test-blockjob-txn
  LINK    tests/test-x86-cpuid
  LINK    tests/test-xbzrle
  LINK    tests/test-vmstate
  LINK    tests/test-cutils
  LINK    tests/test-mul64
  LINK    tests/test-int128
  LINK    tests/rcutorture
  LINK    tests/test-rcu-list
  LINK    tests/test-qdist
  LINK    tests/test-qht
  LINK    tests/qht-bench
  LINK    tests/test-bitops
  LINK    tests/check-qom-interface
  LINK    tests/check-qom-proplist
  LINK    tests/test-qemu-opts
  LINK    tests/test-write-threshold
  LINK    tests/test-crypto-hash
  LINK    tests/test-crypto-cipher
  LINK    tests/test-crypto-secret
  LINK    tests/test-qga
  LINK    tests/test-timed-average
  LINK    tests/test-io-task
  LINK    tests/test-io-channel-socket
  LINK    tests/test-io-channel-file
  LINK    tests/test-io-channel-command
  LINK    tests/test-io-channel-buffer
  LINK    tests/test-base64
  LINK    tests/test-crypto-ivgen
  LINK    tests/test-crypto-afsplit
  LINK    tests/test-crypto-xts
  LINK    tests/test-crypto-block
  LINK    tests/test-logging
  LINK    tests/test-replication
  LINK    tests/test-bufferiszero
  LINK    tests/test-uuid
  LINK    tests/ptimer-test
  LINK    tests/vhost-user-test
  LINK    tests/endianness-test
  LINK    tests/fdc-test
  LINK    tests/ide-test
  LINK    tests/ahci-test
  LINK    tests/hd-geo-test
  LINK    tests/boot-order-test
  LINK    tests/bios-tables-test
  LINK    tests/boot-serial-test
  LINK    tests/pxe-test
  LINK    tests/rtc-test
  LINK    tests/ipmi-kcs-test
  LINK    tests/ipmi-bt-test
  LINK    tests/i440fx-test
  LINK    tests/fw_cfg-test
  LINK    tests/drive_del-test
  LINK    tests/wdt_ib700-test
  LINK    tests/tco-test
  LINK    tests/e1000-test
  LINK    tests/e1000e-test
  LINK    tests/rtl8139-test
  LINK    tests/pcnet-test
  LINK    tests/eepro100-test
  LINK    tests/ne2000-test
  LINK    tests/nvme-test
  LINK    tests/ac97-test
  LINK    tests/es1370-test
  LINK    tests/virtio-net-test
  LINK    tests/virtio-balloon-test
  LINK    tests/virtio-blk-test
  LINK    tests/virtio-rng-test
  LINK    tests/virtio-scsi-test
  LINK    tests/virtio-serial-test
  LINK    tests/virtio-console-test
  LINK    tests/tpci200-test
  LINK    tests/ipoctal232-test
  LINK    tests/display-vga-test
  LINK    tests/intel-hda-test
  LINK    tests/ivshmem-test
  LINK    tests/vmxnet3-test
  LINK    tests/pvpanic-test
  LINK    tests/i82801b11-test
  LINK    tests/ioh3420-test
  LINK    tests/usb-hcd-ohci-test
  LINK    tests/usb-hcd-uhci-test
  LINK    tests/usb-hcd-ehci-test
  LINK    tests/usb-hcd-xhci-test
  LINK    tests/pc-cpu-test
  LINK    tests/q35-test
  LINK    tests/test-netfilter
  LINK    tests/test-filter-mirror
  LINK    tests/test-filter-redirector
  LINK    tests/postcopy-test
  LINK    tests/test-x86-cpuid-compat
  LINK    tests/device-introspect-test
  LINK    tests/qom-test
  GTESTER tests/check-qdict
  GTESTER tests/test-char
  GTESTER tests/check-qfloat
  GTESTER tests/check-qint
  GTESTER tests/check-qstring
  GTESTER tests/check-qlist
  GTESTER tests/check-qnull
  GTESTER tests/check-qjson
  LINK    tests/test-qobject-output-visitor
  LINK    tests/test-clone-visitor
  LINK    tests/test-qobject-input-visitor
  LINK    tests/test-qobject-input-strict
  LINK    tests/test-qmp-commands
  LINK    tests/test-string-input-visitor
  LINK    tests/test-string-output-visitor
  LINK    tests/test-qmp-event
  LINK    tests/test-opts-visitor
  GTESTER tests/test-coroutine
  GTESTER tests/test-iov
  LINK    tests/test-visitor-serialization
  GTESTER tests/test-aio
  GTESTER tests/test-throttle
  GTESTER tests/test-thread-pool
  GTESTER tests/test-hbitmap
  GTESTER tests/test-blockjob
  GTESTER tests/test-xbzrle
  GTESTER tests/test-blockjob-txn
  GTESTER tests/test-x86-cpuid
  GTESTER tests/test-vmstate
  GTESTER tests/test-cutils
  GTESTER tests/test-mul64
  GTESTER tests/test-int128
Failed to load simple/primitive:b_1
Failed to load simple/primitive:i64_2
Failed to load simple/primitive:i32_1
Failed to load simple/primitive:i32_1
  GTESTER tests/rcutorture
  GTESTER tests/test-rcu-list
  GTESTER tests/test-qdist
  GTESTER tests/test-qht
  LINK    tests/test-qht-par
  GTESTER tests/test-bitops
  GTESTER tests/check-qom-interface
  GTESTER tests/check-qom-proplist
  GTESTER tests/test-qemu-opts
  GTESTER tests/test-write-threshold
  GTESTER tests/test-crypto-hash
  GTESTER tests/test-crypto-cipher
  GTESTER tests/test-crypto-secret
  GTESTER tests/test-qga
  GTESTER tests/test-timed-average
  GTESTER tests/test-io-task
  GTESTER tests/test-io-channel-socket
  GTESTER tests/test-io-channel-file
  GTESTER tests/test-io-channel-command
  GTESTER tests/test-io-channel-buffer
  GTESTER tests/test-base64
  GTESTER tests/test-crypto-ivgen
  GTESTER tests/test-crypto-afsplit
  GTESTER tests/test-crypto-xts
  GTESTER tests/test-crypto-block
  GTESTER tests/test-logging
  GTESTER tests/test-replication
  GTESTER tests/test-uuid
  GTESTER tests/test-bufferiszero
  GTESTER tests/ptimer-test
  GTESTER check-qtest-x86_64
  GTESTER check-qtest-aarch64
  GTESTER tests/test-qobject-output-visitor
  GTESTER tests/test-clone-visitor
  GTESTER tests/test-qobject-input-visitor
ftruncate: Permission denied
  GTESTER tests/test-qobject-input-strict
  GTESTER tests/test-qmp-commands
  GTESTER tests/test-string-input-visitor
  GTESTER tests/test-string-output-visitor
  GTESTER tests/test-qmp-event
  GTESTER tests/test-opts-visitor
  GTESTER tests/test-visitor-serialization
  GTESTER tests/test-qht-par
ftruncate: Permission denied
ftruncate: Permission denied
**
ERROR:/tmp/qemu-test/src/tests/vhost-user-test.c:668:test_migrate: assertion failed: (qdict_haskey(rsp, "return"))
GTester: last random seed: R02S5deea6b538a04947e7d4b442437f3be7
ftruncate: Permission denied
ftruncate: Permission denied
Could not access KVM kernel module: No such file or directory
failed to initialize KVM: No such file or directory
Back to tcg accelerator.
Could not access KVM kernel module: No such file or directory
failed to initialize KVM: No such file or directory
Back to tcg accelerator.
Could not access KVM kernel module: No such file or directory
failed to initialize KVM: No such file or directory
Back to tcg accelerator.
Could not access KVM kernel module: No such file or directory
failed to initialize KVM: No such file or directory
Back to tcg accelerator.
Could not access KVM kernel module: No such file or directory
failed to initialize KVM: No such file or directory
Back to tcg accelerator.
Could not access KVM kernel module: No such file or directory
failed to initialize KVM: No such file or directory
Back to tcg accelerator.
Could not access KVM kernel module: No such file or directory
failed to initialize KVM: No such file or directory
Back to tcg accelerator.
Could not access KVM kernel module: No such file or directory
failed to initialize KVM: No such file or directory
Back to tcg accelerator.
Could not access KVM kernel module: No such file or directory
failed to initialize KVM: No such file or directory
Back to tcg accelerator.
Could not access KVM kernel module: No such file or directory
failed to initialize KVM: No such file or directory
Back to tcg accelerator.
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-6smtmz0t/src'
  BUILD   fedora
make[1]: Entering directory `/var/tmp/patchew-tester-tmp-6smtmz0t/src'
  ARCHIVE qemu.tgz
  ARCHIVE dtc.tgz
  COPY    RUNNER
    RUN test-mingw in qemu:fedora 
Packages installed:
PyYAML-3.11-12.fc24.x86_64
SDL-devel-1.2.15-21.fc24.x86_64
bc-1.06.95-16.fc24.x86_64
bison-3.0.4-4.fc24.x86_64
ccache-3.3.2-1.fc24.x86_64
clang-3.8.0-2.fc24.x86_64
findutils-4.6.0-7.fc24.x86_64
flex-2.6.0-2.fc24.x86_64
gcc-6.2.1-2.fc24.x86_64
gcc-c++-6.2.1-2.fc24.x86_64
git-2.7.4-3.fc24.x86_64
glib2-devel-2.48.2-1.fc24.x86_64
libfdt-devel-1.4.2-1.fc24.x86_64
make-4.1-5.fc24.x86_64
mingw32-SDL-1.2.15-7.fc24.noarch
mingw32-bzip2-1.0.6-7.fc24.noarch
mingw32-curl-7.47.0-1.fc24.noarch
mingw32-glib2-2.48.2-1.fc24.noarch
mingw32-gmp-6.1.0-1.fc24.noarch
mingw32-gnutls-3.4.14-1.fc24.noarch
mingw32-gtk2-2.24.31-1.fc24.noarch
mingw32-gtk3-3.20.9-1.fc24.noarch
mingw32-libjpeg-turbo-1.5.0-1.fc24.noarch
mingw32-libpng-1.6.23-1.fc24.noarch
mingw32-libssh2-1.4.3-5.fc24.noarch
mingw32-libtasn1-4.5-2.fc24.noarch
mingw32-nettle-3.2-1.fc24.noarch
mingw32-pixman-0.34.0-1.fc24.noarch
mingw32-pkg-config-0.28-6.fc24.x86_64
mingw64-SDL-1.2.15-7.fc24.noarch
mingw64-bzip2-1.0.6-7.fc24.noarch
mingw64-curl-7.47.0-1.fc24.noarch
mingw64-glib2-2.48.2-1.fc24.noarch
mingw64-gmp-6.1.0-1.fc24.noarch
mingw64-gnutls-3.4.14-1.fc24.noarch
mingw64-gtk2-2.24.31-1.fc24.noarch
mingw64-gtk3-3.20.9-1.fc24.noarch
mingw64-libjpeg-turbo-1.5.0-1.fc24.noarch
mingw64-libpng-1.6.23-1.fc24.noarch
mingw64-libssh2-1.4.3-5.fc24.noarch
mingw64-libtasn1-4.5-2.fc24.noarch
mingw64-nettle-3.2-1.fc24.noarch
mingw64-pixman-0.34.0-1.fc24.noarch
mingw64-pkg-config-0.28-6.fc24.x86_64
perl-5.22.2-362.fc24.x86_64
pixman-devel-0.34.0-2.fc24.x86_64
sparse-0.5.0-7.fc24.x86_64
tar-1.28-7.fc24.x86_64
which-2.20-13.fc24.x86_64
zlib-devel-1.2.8-10.fc24.x86_64

Environment variables:
PACKAGES=ccache git tar PyYAML sparse flex bison     glib2-devel pixman-devel zlib-devel SDL-devel libfdt-devel     gcc gcc-c++ clang make perl which bc findutils     mingw32-pixman mingw32-glib2 mingw32-gmp mingw32-SDL mingw32-pkg-config     mingw32-gtk2 mingw32-gtk3 mingw32-gnutls mingw32-nettle mingw32-libtasn1     mingw32-libjpeg-turbo mingw32-libpng mingw32-curl mingw32-libssh2     mingw32-bzip2     mingw64-pixman mingw64-glib2 mingw64-gmp mingw64-SDL mingw64-pkg-config     mingw64-gtk2 mingw64-gtk3 mingw64-gnutls mingw64-nettle mingw64-libtasn1     mingw64-libjpeg-turbo mingw64-libpng mingw64-curl mingw64-libssh2     mingw64-bzip2
HOSTNAME=
TERM=xterm
MAKEFLAGS= -j16
HISTSIZE=1000
J=16
USER=root
CCACHE_DIR=/var/tmp/ccache
EXTRA_CONFIGURE_OPTS=
V=
SHOW_ENV=1
MAIL=/var/spool/mail/root
PATH=/usr/lib/ccache:/usr/lib64/ccache:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/
TARGET_LIST=
HISTCONTROL=ignoredups
SHLVL=1
HOME=/root
TEST_DIR=/tmp/qemu-test
LOGNAME=root
LESSOPEN=||/usr/bin/lesspipe.sh %s
FEATURES=mingw clang pyyaml dtc
DEBUG=
_=/usr/bin/env

Configure options:
--enable-werror --target-list=x86_64-softmmu,aarch64-softmmu --prefix=/var/tmp/qemu-build/install --cross-prefix=x86_64-w64-mingw32- --enable-trace-backends=simple --enable-debug --enable-gnutls --enable-nettle --enable-curl --enable-vnc --enable-bzip2 --enable-guest-agent --with-sdlabi=1.2 --with-gtkabi=2.0
Install prefix    /var/tmp/qemu-build/install
BIOS directory    /var/tmp/qemu-build/install
binary directory  /var/tmp/qemu-build/install
library directory /var/tmp/qemu-build/install/lib
module directory  /var/tmp/qemu-build/install/lib
libexec directory /var/tmp/qemu-build/install/libexec
include directory /var/tmp/qemu-build/install/include
config directory  /var/tmp/qemu-build/install
local state directory   queried at runtime
Windows SDK       no
Source path       /tmp/qemu-test/src
C compiler        x86_64-w64-mingw32-gcc
Host C compiler   cc
C++ compiler      x86_64-w64-mingw32-g++
Objective-C compiler clang
ARFLAGS           rv
CFLAGS            -g 
QEMU_CFLAGS       -I/usr/x86_64-w64-mingw32/sys-root/mingw/include/pixman-1  -I$(SRC_PATH)/dtc/libfdt -Werror -mms-bitfields -I/usr/x86_64-w64-mingw32/sys-root/mingw/include/glib-2.0 -I/usr/x86_64-w64-mingw32/sys-root/mingw/lib/glib-2.0/include -I/usr/x86_64-w64-mingw32/sys-root/mingw/include  -m64 -mcx16 -mthreads -D__USE_MINGW_ANSI_STDIO=1 -DWIN32_LEAN_AND_MEAN -DWINVER=0x501 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common -fwrapv  -Wendif-labels -Wno-shift-negative-value -Wmissing-include-dirs -Wempty-body -Wnested-externs -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wold-style-declaration -Wold-style-definition -Wtype-limits -fstack-protector-strong -I/usr/x86_64-w64-mingw32/sys-root/mingw/include -I/usr/x86_64-w64-mingw32/sys-root/mingw/include/p11-kit-1 -I/usr/x86_64-w64-mingw32/sys-root/mingw/include  -I/usr/x86_64-w64-mingw32/sys-root/mingw/include   -I/usr/x86_64-w64-mingw32/sys-root/mingw/include/libpng16 
LDFLAGS           -Wl,--nxcompat -Wl,--no-seh -Wl,--dynamicbase -Wl,--warn-common -m64 -g 
make              make
install           install
python            python -B
smbd              /usr/sbin/smbd
module support    no
host CPU          x86_64
host big endian   no
target list       x86_64-softmmu aarch64-softmmu
tcg debug enabled yes
gprof enabled     no
sparse enabled    no
strip binaries    no
profiler          no
static build      no
pixman            system
SDL support       yes (1.2.15)
GTK support       yes (2.24.31)
GTK GL support    no
VTE support       no 
TLS priority      NORMAL
GNUTLS support    yes
GNUTLS rnd        yes
libgcrypt         no
libgcrypt kdf     no
nettle            yes (3.2)
nettle kdf        yes
libtasn1          yes
curses support    no
virgl support     no
curl support      yes
mingw32 support   yes
Audio drivers     dsound
Block whitelist (rw) 
Block whitelist (ro) 
VirtFS support    no
VNC support       yes
VNC SASL support  no
VNC JPEG support  yes
VNC PNG support   yes
xen support       no
brlapi support    no
bluez  support    no
Documentation     no
PIE               no
vde support       no
netmap support    no
Linux AIO support no
ATTR/XATTR support no
Install blobs     yes
KVM support       no
COLO support      yes
RDMA support      no
TCG interpreter   no
fdt support       yes
preadv support    no
fdatasync         no
madvise           no
posix_madvise     no
libcap-ng support no
vhost-net support no
vhost-scsi support no
vhost-vsock support no
Trace backends    simple
Trace output file trace-<pid>
spice support     no 
rbd support       no
xfsctl support    no
smartcard support no
libusb            no
usb net redir     no
OpenGL support    no
OpenGL dmabufs    no
libiscsi support  no
libnfs support    no
build guest agent yes
QGA VSS support   no
QGA w32 disk info yes
QGA MSI support   no
seccomp support   no
coroutine backend win32
coroutine pool    yes
debug stack usage no
GlusterFS support no
Archipelago support no
gcov              gcov
gcov enabled      no
TPM support       yes
libssh2 support   yes
TPM passthrough   no
QOM debugging     yes
lzo support       no
snappy support    no
bzip2 support     yes
NUMA host support no
tcmalloc support  no
jemalloc support  no
avx2 optimization yes
replication support yes
mkdir -p dtc/libfdt
  GEN     x86_64-softmmu/config-devices.mak.tmp
mkdir -p dtc/tests
  GEN     aarch64-softmmu/config-devices.mak.tmp
  GEN     config-host.h
  GEN     qmp-commands.h
  GEN     qemu-options.def
  GEN     qapi-types.h
  GEN     qapi-visit.h
  GEN     qapi-event.h
  GEN     qmp-introspect.h
  GEN     module_block.h
  GEN     tests/test-qapi-types.h
  GEN     tests/test-qapi-visit.h
  GEN     tests/test-qmp-commands.h
  GEN     tests/test-qapi-event.h
  GEN     tests/test-qmp-introspect.h
  GEN     x86_64-softmmu/config-devices.mak
  GEN     aarch64-softmmu/config-devices.mak
  GEN     trace/generated-tracers.h
  GEN     trace/generated-tcg-tracers.h
	 DEP /tmp/qemu-test/src/dtc/tests/dumptrees.c
  GEN     trace/generated-helpers-wrappers.h
	 DEP /tmp/qemu-test/src/dtc/tests/trees.S
  GEN     trace/generated-helpers.h
	 DEP /tmp/qemu-test/src/dtc/tests/testutils.c
	 DEP /tmp/qemu-test/src/dtc/tests/value-labels.c
	 DEP /tmp/qemu-test/src/dtc/tests/asm_tree_dump.c
	 DEP /tmp/qemu-test/src/dtc/tests/truncated_property.c
  GEN     config-all-devices.mak
	 DEP /tmp/qemu-test/src/dtc/tests/subnode_iterate.c
	 DEP /tmp/qemu-test/src/dtc/tests/integer-expressions.c
	 DEP /tmp/qemu-test/src/dtc/tests/utilfdt_test.c
	 DEP /tmp/qemu-test/src/dtc/tests/path_offset_aliases.c
	 DEP /tmp/qemu-test/src/dtc/tests/add_subnode_with_nops.c
	 DEP /tmp/qemu-test/src/dtc/tests/dtbs_equal_unordered.c
	 DEP /tmp/qemu-test/src/dtc/tests/dtb_reverse.c
	 DEP /tmp/qemu-test/src/dtc/tests/dtbs_equal_ordered.c
	 DEP /tmp/qemu-test/src/dtc/tests/extra-terminating-null.c
	 DEP /tmp/qemu-test/src/dtc/tests/incbin.c
	 DEP /tmp/qemu-test/src/dtc/tests/boot-cpuid.c
	 DEP /tmp/qemu-test/src/dtc/tests/phandle_format.c
	 DEP /tmp/qemu-test/src/dtc/tests/path-references.c
	 DEP /tmp/qemu-test/src/dtc/tests/references.c
	 DEP /tmp/qemu-test/src/dtc/tests/string_escapes.c
	 DEP /tmp/qemu-test/src/dtc/tests/propname_escapes.c
	 DEP /tmp/qemu-test/src/dtc/tests/appendprop2.c
	 DEP /tmp/qemu-test/src/dtc/tests/appendprop1.c
	 DEP /tmp/qemu-test/src/dtc/tests/del_node.c
	 DEP /tmp/qemu-test/src/dtc/tests/del_property.c
	 DEP /tmp/qemu-test/src/dtc/tests/setprop.c
	 DEP /tmp/qemu-test/src/dtc/tests/set_name.c
	 DEP /tmp/qemu-test/src/dtc/tests/rw_tree1.c
	 DEP /tmp/qemu-test/src/dtc/tests/open_pack.c
	 DEP /tmp/qemu-test/src/dtc/tests/nopulate.c
	 DEP /tmp/qemu-test/src/dtc/tests/mangle-layout.c
	 DEP /tmp/qemu-test/src/dtc/tests/move_and_save.c
	 DEP /tmp/qemu-test/src/dtc/tests/sw_tree1.c
	 DEP /tmp/qemu-test/src/dtc/tests/nop_node.c
	 DEP /tmp/qemu-test/src/dtc/tests/nop_property.c
	 DEP /tmp/qemu-test/src/dtc/tests/setprop_inplace.c
	 DEP /tmp/qemu-test/src/dtc/tests/notfound.c
	 DEP /tmp/qemu-test/src/dtc/tests/sized_cells.c
	 DEP /tmp/qemu-test/src/dtc/tests/char_literal.c
	 DEP /tmp/qemu-test/src/dtc/tests/get_alias.c
	 DEP /tmp/qemu-test/src/dtc/tests/node_offset_by_compatible.c
	 DEP /tmp/qemu-test/src/dtc/tests/node_check_compatible.c
	 DEP /tmp/qemu-test/src/dtc/tests/node_offset_by_phandle.c
	 DEP /tmp/qemu-test/src/dtc/tests/node_offset_by_prop_value.c
	 DEP /tmp/qemu-test/src/dtc/tests/parent_offset.c
	 DEP /tmp/qemu-test/src/dtc/tests/supernode_atdepth_offset.c
	 DEP /tmp/qemu-test/src/dtc/tests/get_path.c
	 DEP /tmp/qemu-test/src/dtc/tests/get_phandle.c
	 DEP /tmp/qemu-test/src/dtc/tests/getprop.c
	 DEP /tmp/qemu-test/src/dtc/tests/get_name.c
	 DEP /tmp/qemu-test/src/dtc/tests/path_offset.c
	 DEP /tmp/qemu-test/src/dtc/tests/subnode_offset.c
	 DEP /tmp/qemu-test/src/dtc/tests/find_property.c
	 DEP /tmp/qemu-test/src/dtc/tests/root_node.c
	 DEP /tmp/qemu-test/src/dtc/tests/get_mem_rsv.c
	 DEP /tmp/qemu-test/src/dtc/libfdt/fdt_empty_tree.c
	 DEP /tmp/qemu-test/src/dtc/libfdt/fdt_strerror.c
	 DEP /tmp/qemu-test/src/dtc/libfdt/fdt_rw.c
	 DEP /tmp/qemu-test/src/dtc/libfdt/fdt_sw.c
	 DEP /tmp/qemu-test/src/dtc/libfdt/fdt_wip.c
	 DEP /tmp/qemu-test/src/dtc/libfdt/fdt_ro.c
	 DEP /tmp/qemu-test/src/dtc/libfdt/fdt.c
	 DEP /tmp/qemu-test/src/dtc/util.c
	 DEP /tmp/qemu-test/src/dtc/fdtput.c
	 DEP /tmp/qemu-test/src/dtc/fdtget.c
	 DEP /tmp/qemu-test/src/dtc/fdtdump.c
	 LEX convert-dtsv0-lexer.lex.c
	 DEP /tmp/qemu-test/src/dtc/srcpos.c
	 BISON dtc-parser.tab.c
	 LEX dtc-lexer.lex.c
	 DEP /tmp/qemu-test/src/dtc/treesource.c
	 DEP /tmp/qemu-test/src/dtc/livetree.c
	 DEP /tmp/qemu-test/src/dtc/fstree.c
	 DEP /tmp/qemu-test/src/dtc/flattree.c
	 DEP /tmp/qemu-test/src/dtc/dtc.c
	 DEP /tmp/qemu-test/src/dtc/data.c
	 DEP /tmp/qemu-test/src/dtc/checks.c
	 DEP convert-dtsv0-lexer.lex.c
	 DEP dtc-lexer.lex.c
	 DEP dtc-parser.tab.c
	CHK version_gen.h
	UPD version_gen.h
	 DEP /tmp/qemu-test/src/dtc/util.c
	 CC libfdt/fdt.o
	 CC libfdt/fdt_ro.o
	 CC libfdt/fdt_wip.o
	 CC libfdt/fdt_sw.o
	 CC libfdt/fdt_rw.o
	 CC libfdt/fdt_strerror.o
	 CC libfdt/fdt_empty_tree.o
	 AR libfdt/libfdt.a
x86_64-w64-mingw32-ar: creating libfdt/libfdt.a
a - libfdt/fdt.o
a - libfdt/fdt_ro.o
a - libfdt/fdt_wip.o
a - libfdt/fdt_sw.o
a - libfdt/fdt_rw.o
a - libfdt/fdt_strerror.o
a - libfdt/fdt_empty_tree.o
  RC      version.lo
  RC      version.o
  GEN     qga/qapi-generated/qga-qapi-types.h
  GEN     qga/qapi-generated/qga-qapi-visit.h
  GEN     qga/qapi-generated/qga-qmp-commands.h
  GEN     qga/qapi-generated/qga-qapi-types.c
  GEN     qga/qapi-generated/qga-qapi-visit.c
  GEN     qga/qapi-generated/qga-qmp-marshal.c
  GEN     qmp-introspect.c
  GEN     qapi-types.c
  GEN     qapi-visit.c
  GEN     qapi-event.c
  CC      qapi/qapi-visit-core.o
  CC      qapi/qapi-dealloc-visitor.o
  CC      qapi/qobject-input-visitor.o
  CC      qapi/qobject-output-visitor.o
  CC      qapi/qmp-registry.o
  CC      qapi/qmp-dispatch.o
  CC      qapi/string-input-visitor.o
  CC      qapi/string-output-visitor.o
  CC      qapi/opts-visitor.o
  CC      qapi/qapi-clone-visitor.o
  CC      qapi/qmp-event.o
  CC      qapi/qapi-util.o
  CC      qobject/qnull.o
  CC      qobject/qint.o
  CC      qobject/qstring.o
  CC      qobject/qdict.o
  CC      qobject/qlist.o
  CC      qobject/qfloat.o
  CC      qobject/qbool.o
  CC      qobject/qjson.o
  CC      qobject/qobject.o
  CC      qobject/json-lexer.o
  CC      qobject/json-streamer.o
  CC      qobject/json-parser.o
  GEN     trace/generated-tracers.c
  CC      trace/simple.o
  CC      trace/control.o
  CC      trace/qmp.o
  CC      util/osdep.o
  CC      util/cutils.o
  CC      util/unicode.o
  CC      util/qemu-timer-common.o
  CC      util/bufferiszero.o
  CC      util/event_notifier-win32.o
  CC      util/oslib-win32.o
  CC      util/qemu-thread-win32.o
  CC      util/envlist.o
  CC      util/path.o
  CC      util/module.o
  CC      util/bitmap.o
  CC      util/bitops.o
  CC      util/hbitmap.o
  CC      util/fifo8.o
  CC      util/acl.o
  CC      util/error.o
  CC      util/qemu-error.o
  CC      util/id.o
  CC      util/iov.o
  CC      util/qemu-config.o
  CC      util/qemu-sockets.o
  CC      util/uri.o
  CC      util/notify.o
  CC      util/qemu-option.o
  CC      util/qemu-progress.o
  CC      util/hexdump.o
  CC      util/crc32c.o
  CC      util/uuid.o
  CC      util/throttle.o
  CC      util/getauxval.o
  CC      util/readline.o
  CC      util/rcu.o
  CC      util/qemu-coroutine.o
  CC      util/qemu-coroutine-lock.o
  CC      util/qemu-coroutine-io.o
  CC      util/qemu-coroutine-sleep.o
  CC      util/coroutine-win32.o
  CC      util/buffer.o
  CC      util/timed-average.o
  CC      util/base64.o
  CC      util/log.o
  CC      util/qdist.o
  CC      util/qht.o
  CC      util/range.o
  CC      crypto/pbkdf-stub.o
  CC      stubs/arch-query-cpu-def.o
  CC      stubs/arch-query-cpu-model-expansion.o
  CC      stubs/arch-query-cpu-model-comparison.o
  CC      stubs/arch-query-cpu-model-baseline.o
  CC      stubs/bdrv-next-monitor-owned.o
  CC      stubs/blk-commit-all.o
  CC      stubs/blockdev-close-all-bdrv-states.o
  CC      stubs/clock-warp.o
  CC      stubs/cpu-get-clock.o
  CC      stubs/cpu-get-icount.o
  CC      stubs/dump.o
  CC      stubs/error-printf.o
  CC      stubs/fdset-add-fd.o
  CC      stubs/fdset-find-fd.o
  CC      stubs/fdset-get-fd.o
  CC      stubs/fdset-remove-fd.o
  CC      stubs/gdbstub.o
  CC      stubs/get-fd.o
  CC      stubs/get-next-serial.o
  CC      stubs/get-vm-name.o
  CC      stubs/iothread.o
  CC      stubs/iothread-lock.o
  CC      stubs/is-daemonized.o
  CC      stubs/machine-init-done.o
  CC      stubs/migr-blocker.o
  CC      stubs/mon-is-qmp.o
  CC      stubs/monitor-init.o
  CC      stubs/notify-event.o
  CC      stubs/qtest.o
  CC      stubs/replay.o
  CC      stubs/replay-user.o
  CC      stubs/reset.o
  CC      stubs/runstate-check.o
  CC      stubs/set-fd-handler.o
  CC      stubs/slirp.o
  CC      stubs/sysbus.o
  CC      stubs/trace-control.o
  CC      stubs/uuid.o
  CC      stubs/vm-stop.o
  CC      stubs/vmstate.o
  CC      stubs/fd-register.o
  CC      stubs/cpus.o
  CC      stubs/kvm.o
  CC      stubs/qmp_pc_dimm_device_list.o
  CC      stubs/target-monitor-defs.o
  CC      stubs/target-get-monitor-def.o
  CC      stubs/vhost.o
  CC      stubs/iohandler.o
  CC      stubs/smbios_type_38.o
  CC      stubs/ipmi.o
  CC      stubs/pc_madt_cpu_entry.o
  CC      stubs/migration-colo.o
  GEN     qemu-img-cmds.h
  CC      async.o
  CC      thread-pool.o
  CC      block.o
  CC      blockjob.o
  CC      main-loop.o
  CC      iohandler.o
  CC      qemu-timer.o
  CC      aio-win32.o
  CC      qemu-io-cmds.o
  CC      replication.o
  CC      block/raw_bsd.o
  CC      block/qcow.o
  CC      block/vdi.o
  CC      block/vmdk.o
  CC      block/cloop.o
  CC      block/bochs.o
  CC      block/vpc.o
  CC      block/vvfat.o
  CC      block/dmg.o
  CC      block/qcow2.o
  CC      block/qcow2-refcount.o
  CC      block/qcow2-cluster.o
  CC      block/qcow2-snapshot.o
  CC      block/qcow2-cache.o
  CC      block/qed.o
  CC      block/qed-gencb.o
  CC      block/qed-l2-cache.o
  CC      block/qed-table.o
  CC      block/qed-cluster.o
  CC      block/qed-check.o
  CC      block/vhdx.o
  CC      block/vhdx-endian.o
  CC      block/vhdx-log.o
  CC      block/quorum.o
  CC      block/parallels.o
  CC      block/blkdebug.o
  CC      block/blkverify.o
  CC      block/blkreplay.o
  CC      block/block-backend.o
  CC      block/snapshot.o
  CC      block/qapi.o
  CC      block/raw-win32.o
  CC      block/win32-aio.o
  CC      block/null.o
  CC      block/mirror.o
  CC      block/commit.o
  CC      block/io.o
  CC      block/throttle-groups.o
  CC      block/nbd.o
  CC      block/nbd-client.o
  CC      block/sheepdog.o
  CC      block/accounting.o
  CC      block/dirty-bitmap.o
  CC      block/write-threshold.o
  CC      block/backup.o
  CC      block/replication.o
  CC      block/crypto.o
  CC      nbd/server.o
  CC      nbd/client.o
  CC      nbd/common.o
  CC      block/curl.o
  CC      block/ssh.o
  CC      block/dmg-bz2.o
  CC      crypto/init.o
  CC      crypto/hash.o
  CC      crypto/hash-nettle.o
  CC      crypto/aes.o
  CC      crypto/desrfb.o
  CC      crypto/cipher.o
  CC      crypto/tlscreds.o
  CC      crypto/tlscredsanon.o
  CC      crypto/tlscredsx509.o
  CC      crypto/tlssession.o
  CC      crypto/secret.o
  CC      crypto/random-gnutls.o
  CC      crypto/pbkdf.o
  CC      crypto/ivgen.o
  CC      crypto/pbkdf-nettle.o
  CC      crypto/ivgen-essiv.o
  CC      crypto/ivgen-plain.o
  CC      crypto/ivgen-plain64.o
  CC      crypto/afsplit.o
  CC      crypto/xts.o
  CC      crypto/block.o
  CC      crypto/block-qcow.o
  CC      crypto/block-luks.o
  CC      io/channel.o
  CC      io/channel-buffer.o
  CC      io/channel-command.o
  CC      io/channel-file.o
  CC      io/channel-socket.o
  CC      io/channel-tls.o
  CC      io/channel-watch.o
  CC      io/channel-websock.o
  CC      io/channel-util.o
  CC      io/task.o
  CC      qom/object.o
  CC      qom/container.o
  CC      qom/qom-qobject.o
  CC      qom/object_interfaces.o
  CC      qemu-io.o
  CC      blockdev.o
  CC      blockdev-nbd.o
  CC      iothread.o
  CC      qdev-monitor.o
  CC      device-hotplug.o
  CC      os-win32.o
  CC      qemu-char.o
  CC      page_cache.o
  CC      accel.o
  CC      bt-host.o
  CC      bt-vhci.o
  CC      dma-helpers.o
  CC      vl.o
  CC      tpm.o
  CC      device_tree.o
  GEN     qmp-marshal.c
  CC      qmp.o
  CC      hmp.o
  CC      cpus-common.o
  CC      audio/audio.o
  CC      audio/noaudio.o
  CC      audio/wavaudio.o
  CC      audio/mixeng.o
  CC      audio/sdlaudio.o
  CC      audio/dsoundaudio.o
  CC      audio/audio_win_int.o
  CC      audio/wavcapture.o
  CC      backends/rng.o
  CC      backends/rng-egd.o
  CC      backends/msmouse.o
  CC      backends/testdev.o
  CC      backends/tpm.o
  CC      backends/hostmem.o
  CC      backends/hostmem-ram.o
  CC      backends/cryptodev.o
  CC      backends/cryptodev-builtin.o
  CC      block/stream.o
  CC      disas/arm.o
  CXX     disas/arm-a64.o
  CC      disas/i386.o
  CXX     disas/libvixl/vixl/utils.o
  CXX     disas/libvixl/vixl/compiler-intrinsics.o
  CXX     disas/libvixl/vixl/a64/instructions-a64.o
  CXX     disas/libvixl/vixl/a64/decoder-a64.o
  CXX     disas/libvixl/vixl/a64/disasm-a64.o
  CC      hw/acpi/core.o
  CC      hw/acpi/piix4.o
  CC      hw/acpi/pcihp.o
  CC      hw/acpi/ich9.o
  CC      hw/acpi/tco.o
  CC      hw/acpi/cpu_hotplug.o
  CC      hw/acpi/memory_hotplug.o
  CC      hw/acpi/memory_hotplug_acpi_table.o
  CC      hw/acpi/cpu.o
  CC      hw/acpi/nvdimm.o
  CC      hw/acpi/acpi_interface.o
  CC      hw/acpi/bios-linker-loader.o
  CC      hw/acpi/aml-build.o
  CC      hw/acpi/ipmi.o
  CC      hw/audio/sb16.o
  CC      hw/audio/es1370.o
  CC      hw/audio/ac97.o
  CC      hw/audio/fmopl.o
  CC      hw/audio/adlib.o
  CC      hw/audio/gus.o
  CC      hw/audio/gusemu_hal.o
  CC      hw/audio/gusemu_mixer.o
  CC      hw/audio/cs4231a.o
  CC      hw/audio/intel-hda.o
  CC      hw/audio/hda-codec.o
  CC      hw/audio/pcspk.o
  CC      hw/audio/wm8750.o
  CC      hw/audio/pl041.o
  CC      hw/audio/lm4549.o
  CC      hw/audio/marvell_88w8618.o
  CC      hw/block/block.o
  CC      hw/block/cdrom.o
  CC      hw/block/hd-geometry.o
  CC      hw/block/fdc.o
  CC      hw/block/m25p80.o
  CC      hw/block/nand.o
  CC      hw/block/pflash_cfi01.o
  CC      hw/block/pflash_cfi02.o
  CC      hw/block/ecc.o
  CC      hw/block/onenand.o
  CC      hw/block/nvme.o
  CC      hw/bt/core.o
  CC      hw/bt/l2cap.o
  CC      hw/bt/sdp.o
  CC      hw/bt/hci.o
  CC      hw/bt/hid.o
  CC      hw/bt/hci-csr.o
  CC      hw/char/ipoctal232.o
  CC      hw/char/parallel.o
  CC      hw/char/pl011.o
  CC      hw/char/serial.o
  CC      hw/char/serial-isa.o
  CC      hw/char/serial-pci.o
  CC      hw/char/virtio-console.o
  CC      hw/char/cadence_uart.o
  CC      hw/char/debugcon.o
  CC      hw/char/imx_serial.o
  CC      hw/core/qdev.o
  CC      hw/core/qdev-properties.o
  CC      hw/core/bus.o
  CC      hw/core/fw-path-provider.o
  CC      hw/core/irq.o
  CC      hw/core/hotplug.o
  CC      hw/core/ptimer.o
  CC      hw/core/sysbus.o
  CC      hw/core/machine.o
  CC      hw/core/null-machine.o
  CC      hw/core/loader.o
  CC      hw/core/qdev-properties-system.o
  CC      hw/core/register.o
  CC      hw/core/or-irq.o
  CC      hw/core/platform-bus.o
  CC      hw/display/ads7846.o
  CC      hw/display/cirrus_vga.o
  CC      hw/display/pl110.o
  CC      hw/display/ssd0303.o
  CC      hw/display/ssd0323.o
  CC      hw/display/vga-pci.o
  CC      hw/display/vga-isa.o
  CC      hw/display/vmware_vga.o
  CC      hw/display/blizzard.o
  CC      hw/display/exynos4210_fimd.o
  CC      hw/display/framebuffer.o
  CC      hw/display/tc6393xb.o
  CC      hw/dma/pl080.o
  CC      hw/dma/pl330.o
  CC      hw/dma/i8257.o
  CC      hw/dma/xlnx-zynq-devcfg.o
  CC      hw/gpio/max7310.o
  CC      hw/gpio/pl061.o
  CC      hw/gpio/zaurus.o
  CC      hw/gpio/gpio_key.o
  CC      hw/i2c/core.o
  CC      hw/i2c/smbus.o
  CC      hw/i2c/smbus_eeprom.o
  CC      hw/i2c/i2c-ddc.o
  CC      hw/i2c/versatile_i2c.o
  CC      hw/i2c/smbus_ich9.o
  CC      hw/i2c/pm_smbus.o
  CC      hw/i2c/bitbang_i2c.o
  CC      hw/i2c/exynos4210_i2c.o
  CC      hw/i2c/imx_i2c.o
  CC      hw/i2c/aspeed_i2c.o
  CC      hw/ide/core.o
  CC      hw/ide/atapi.o
  CC      hw/ide/qdev.o
  CC      hw/ide/pci.o
  CC      hw/ide/isa.o
  CC      hw/ide/piix.o
  CC      hw/ide/microdrive.o
  CC      hw/ide/ahci.o
  CC      hw/ide/ich.o
  CC      hw/input/hid.o
  CC      hw/input/lm832x.o
  CC      hw/input/pckbd.o
  CC      hw/input/pl050.o
  CC      hw/input/ps2.o
  CC      hw/input/stellaris_input.o
  CC      hw/input/tsc2005.o
  CC      hw/input/vmmouse.o
  CC      hw/input/virtio-input.o
  CC      hw/input/virtio-input-hid.o
  CC      hw/intc/i8259_common.o
  CC      hw/intc/i8259.o
  CC      hw/intc/pl190.o
  CC      hw/intc/imx_avic.o
  CC      hw/intc/realview_gic.o
  CC      hw/intc/ioapic_common.o
  CC      hw/intc/arm_gic_common.o
  CC      hw/intc/arm_gic.o
  CC      hw/intc/arm_gicv2m.o
  CC      hw/intc/arm_gicv3_common.o
  CC      hw/intc/arm_gicv3.o
  CC      hw/intc/arm_gicv3_dist.o
  CC      hw/intc/arm_gicv3_redist.o
  CC      hw/intc/arm_gicv3_its_common.o
  CC      hw/intc/intc.o
  CC      hw/ipack/ipack.o
  CC      hw/ipack/tpci200.o
  CC      hw/ipmi/ipmi.o
  CC      hw/ipmi/ipmi_bmc_sim.o
  CC      hw/ipmi/ipmi_bmc_extern.o
  CC      hw/ipmi/isa_ipmi_kcs.o
  CC      hw/ipmi/isa_ipmi_bt.o
  CC      hw/isa/isa-bus.o
  CC      hw/isa/apm.o
  CC      hw/mem/pc-dimm.o
  CC      hw/mem/nvdimm.o
  CC      hw/misc/applesmc.o
  CC      hw/misc/max111x.o
  CC      hw/misc/tmp105.o
  CC      hw/misc/debugexit.o
  CC      hw/misc/sga.o
  CC      hw/misc/pc-testdev.o
  CC      hw/misc/pci-testdev.o
  CC      hw/misc/arm_l2x0.o
  CC      hw/misc/arm_integrator_debug.o
  CC      hw/misc/a9scu.o
  CC      hw/misc/arm11scu.o
  CC      hw/net/ne2000.o
  CC      hw/net/eepro100.o
  CC      hw/net/pcnet-pci.o
  CC      hw/net/pcnet.o
  CC      hw/net/e1000.o
  CC      hw/net/e1000x_common.o
  CC      hw/net/net_tx_pkt.o
  CC      hw/net/net_rx_pkt.o
  CC      hw/net/e1000e.o
  CC      hw/net/e1000e_core.o
  CC      hw/net/rtl8139.o
  CC      hw/net/vmxnet3.o
  CC      hw/net/smc91c111.o
  CC      hw/net/lan9118.o
  CC      hw/net/ne2000-isa.o
  CC      hw/net/xgmac.o
  CC      hw/net/allwinner_emac.o
  CC      hw/net/imx_fec.o
  CC      hw/net/cadence_gem.o
  CC      hw/net/stellaris_enet.o
  CC      hw/net/rocker/rocker.o
  CC      hw/net/rocker/rocker_fp.o
  CC      hw/net/rocker/rocker_desc.o
  CC      hw/net/rocker/rocker_world.o
  CC      hw/net/rocker/rocker_of_dpa.o
  CC      hw/nvram/eeprom93xx.o
  CC      hw/nvram/fw_cfg.o
  CC      hw/nvram/chrp_nvram.o
  CC      hw/pci-bridge/pci_bridge_dev.o
  CC      hw/pci-bridge/pci_expander_bridge.o
  CC      hw/pci-bridge/xio3130_upstream.o
  CC      hw/pci-bridge/xio3130_downstream.o
  CC      hw/pci-bridge/ioh3420.o
  CC      hw/pci-bridge/i82801b11.o
  CC      hw/pci-host/pam.o
  CC      hw/pci-host/versatile.o
  CC      hw/pci-host/piix.o
  CC      hw/pci-host/q35.o
  CC      hw/pci-host/gpex.o
  CC      hw/pci/pci.o
  CC      hw/pci/pci_bridge.o
  CC      hw/pci/msix.o
  CC      hw/pci/msi.o
  CC      hw/pci/shpc.o
  CC      hw/pci/slotid_cap.o
  CC      hw/pci/pci_host.o
  CC      hw/pci/pcie_host.o
  CC      hw/pci/pcie.o
  CC      hw/pci/pcie_aer.o
  CC      hw/pci/pcie_port.o
  CC      hw/pci/pci-stub.o
  CC      hw/pcmcia/pcmcia.o
  CC      hw/scsi/scsi-disk.o
  CC      hw/scsi/scsi-generic.o
  CC      hw/scsi/scsi-bus.o
  CC      hw/scsi/lsi53c895a.o
  CC      hw/scsi/mptsas.o
  CC      hw/scsi/mptconfig.o
  CC      hw/scsi/mptendian.o
  CC      hw/scsi/megasas.o
  CC      hw/scsi/vmw_pvscsi.o
  CC      hw/scsi/esp.o
  CC      hw/scsi/esp-pci.o
  CC      hw/sd/pl181.o
  CC      hw/sd/ssi-sd.o
  CC      hw/sd/sd.o
  CC      hw/sd/core.o
  CC      hw/sd/sdhci.o
  CC      hw/smbios/smbios.o
  CC      hw/smbios/smbios_type_38.o
  CC      hw/ssi/pl022.o
  CC      hw/ssi/ssi.o
  CC      hw/ssi/xilinx_spips.o
  CC      hw/ssi/aspeed_smc.o
  CC      hw/ssi/stm32f2xx_spi.o
  CC      hw/timer/arm_timer.o
  CC      hw/timer/arm_mptimer.o
  CC      hw/timer/a9gtimer.o
  CC      hw/timer/cadence_ttc.o
  CC      hw/timer/ds1338.o
  CC      hw/timer/hpet.o
  CC      hw/timer/i8254_common.o
  CC      hw/timer/i8254.o
  CC      hw/timer/pl031.o
  CC      hw/timer/twl92230.o
  CC      hw/timer/imx_epit.o
  CC      hw/timer/imx_gpt.o
  CC      hw/timer/stm32f2xx_timer.o
  CC      hw/timer/aspeed_timer.o
  CC      hw/tpm/tpm_tis.o
  CC      hw/usb/core.o
  CC      hw/usb/combined-packet.o
  CC      hw/usb/bus.o
  CC      hw/usb/libhw.o
  CC      hw/usb/desc.o
  CC      hw/usb/desc-msos.o
  CC      hw/usb/hcd-uhci.o
  CC      hw/usb/hcd-ohci.o
  CC      hw/usb/hcd-ehci.o
  CC      hw/usb/hcd-ehci-pci.o
  CC      hw/usb/hcd-ehci-sysbus.o
  CC      hw/usb/hcd-xhci.o
  CC      hw/usb/hcd-musb.o
  CC      hw/usb/dev-hub.o
  CC      hw/usb/dev-hid.o
  CC      hw/usb/dev-storage.o
  CC      hw/usb/dev-wacom.o
  CC      hw/usb/dev-uas.o
  CC      hw/usb/dev-audio.o
  CC      hw/usb/dev-serial.o
  CC      hw/usb/dev-network.o
  CC      hw/usb/dev-bluetooth.o
  CC      hw/usb/dev-smartcard-reader.o
  CC      hw/usb/host-stub.o
  CC      hw/virtio/virtio-rng.o
  CC      hw/virtio/virtio-pci.o
  CC      hw/virtio/virtio-bus.o
  CC      hw/virtio/virtio-mmio.o
  CC      hw/watchdog/watchdog.o
  CC      hw/watchdog/wdt_i6300esb.o
  CC      hw/watchdog/wdt_ib700.o
  CC      migration/migration.o
  CC      migration/socket.o
  CC      migration/fd.o
  CC      migration/exec.o
  CC      migration/tls.o
  CC      migration/colo-comm.o
  CC      migration/colo.o
  CC      migration/colo-failover.o
  CC      migration/vmstate.o
  CC      migration/qemu-file.o
  CC      migration/qemu-file-channel.o
  CC      migration/xbzrle.o
  CC      migration/postcopy-ram.o
  CC      migration/qjson.o
  CC      migration/block.o
  CC      net/net.o
  CC      net/queue.o
  CC      net/checksum.o
  CC      net/util.o
  CC      net/hub.o
  CC      net/socket.o
  CC      net/dump.o
  CC      net/eth.o
  CC      net/tap-win32.o
  CC      net/slirp.o
  CC      net/filter.o
  CC      net/filter-buffer.o
  CC      net/filter-mirror.o
  CC      net/colo-compare.o
  CC      net/colo.o
  CC      net/filter-rewriter.o
  CC      qom/cpu.o
  CC      replay/replay.o
  CC      replay/replay-internal.o
  CC      replay/replay-events.o
  CC      replay/replay-time.o
  CC      replay/replay-input.o
  CC      replay/replay-char.o
  CC      replay/replay-snapshot.o
  CC      slirp/cksum.o
  CC      slirp/if.o
  CC      slirp/ip_icmp.o
  CC      slirp/ip6_icmp.o
  CC      slirp/ip6_input.o
  CC      slirp/ip6_output.o
  CC      slirp/ip_input.o
  CC      slirp/ip_output.o
  CC      slirp/dnssearch.o
  CC      slirp/dhcpv6.o
  CC      slirp/slirp.o
  CC      slirp/mbuf.o
  CC      slirp/misc.o
  CC      slirp/sbuf.o
  CC      slirp/socket.o
  CC      slirp/tcp_input.o
  CC      slirp/tcp_output.o
  CC      slirp/tcp_subr.o
  CC      slirp/tcp_timer.o
  CC      slirp/udp.o
  CC      slirp/udp6.o
  CC      slirp/bootp.o
  CC      slirp/tftp.o
  CC      slirp/arp_table.o
  CC      slirp/ndp_table.o
  CC      ui/keymaps.o
  CC      ui/console.o
  CC      ui/cursor.o
  CC      ui/qemu-pixman.o
  CC      ui/input.o
  CC      ui/input-keymap.o
  CC      ui/input-legacy.o
  CC      ui/sdl.o
  CC      ui/sdl_zoom.o
  CC      ui/x_keymap.o
  CC      ui/vnc.o
  CC      ui/vnc-enc-zlib.o
  CC      ui/vnc-enc-hextile.o
  CC      ui/vnc-enc-tight.o
  CC      ui/vnc-palette.o
  CC      ui/vnc-enc-zrle.o
  CC      ui/vnc-auth-vencrypt.o
  CC      ui/vnc-ws.o
  CC      ui/vnc-jobs.o
  CC      ui/gtk.o
  CC      qga/commands.o
  CC      qga/guest-agent-command-state.o
  CC      qga/main.o
  AS      optionrom/multiboot.o
  AS      optionrom/linuxboot.o
  CC      optionrom/linuxboot_dma.o
  AS      optionrom/kvmvapic.o
  BUILD   optionrom/multiboot.img
  CC      qga/commands-win32.o
  BUILD   optionrom/linuxboot.img
  BUILD   optionrom/linuxboot_dma.img
  CC      qga/channel-win32.o
  BUILD   optionrom/kvmvapic.img
  BUILD   optionrom/multiboot.raw
  BUILD   optionrom/linuxboot.raw
  BUILD   optionrom/linuxboot_dma.raw
  BUILD   optionrom/kvmvapic.raw
  SIGN    optionrom/multiboot.bin
  SIGN    optionrom/linuxboot.bin
  CC      qga/service-win32.o
  CC      qga/vss-win32.o
  SIGN    optionrom/linuxboot_dma.bin
  SIGN    optionrom/kvmvapic.bin
  CC      qga/qapi-generated/qga-qapi-types.o
  CC      qga/qapi-generated/qga-qapi-visit.o
  CC      qga/qapi-generated/qga-qmp-marshal.o
  CC      qmp-introspect.o
  CC      qapi-types.o
  CC      qapi-visit.o
  CC      qapi-event.o
  AR      libqemustub.a
  CC      qemu-img.o
  CC      qmp-marshal.o
  CC      trace/generated-tracers.o
  AR      libqemuutil.a
  LINK    qemu-ga.exe
  LINK    qemu-img.exe
  LINK    qemu-io.exe
  GEN     x86_64-softmmu/hmp-commands.h
  GEN     x86_64-softmmu/hmp-commands-info.h
  GEN     x86_64-softmmu/config-target.h
  CC      x86_64-softmmu/exec.o
  CC      x86_64-softmmu/translate-all.o
  CC      x86_64-softmmu/cpu-exec.o
  CC      x86_64-softmmu/translate-common.o
  CC      x86_64-softmmu/cpu-exec-common.o
  CC      x86_64-softmmu/tcg/tcg.o
  CC      x86_64-softmmu/tcg/tcg-op.o
  CC      x86_64-softmmu/tcg/optimize.o
  CC      x86_64-softmmu/tcg/tcg-common.o
  CC      x86_64-softmmu/fpu/softfloat.o
  CC      x86_64-softmmu/disas.o
  CC      x86_64-softmmu/tcg-runtime.o
  CC      x86_64-softmmu/kvm-stub.o
  CC      x86_64-softmmu/arch_init.o
  CC      x86_64-softmmu/cpus.o
  GEN     aarch64-softmmu/hmp-commands.h
  GEN     aarch64-softmmu/hmp-commands-info.h
  GEN     aarch64-softmmu/config-target.h
  CC      aarch64-softmmu/exec.o
  CC      aarch64-softmmu/translate-all.o
  CC      x86_64-softmmu/monitor.o
  CC      aarch64-softmmu/cpu-exec.o
  CC      aarch64-softmmu/translate-common.o
  CC      aarch64-softmmu/cpu-exec-common.o
  CC      aarch64-softmmu/tcg/tcg.o
  CC      x86_64-softmmu/gdbstub.o
  CC      aarch64-softmmu/tcg/tcg-op.o
  CC      aarch64-softmmu/tcg/optimize.o
  CC      aarch64-softmmu/tcg/tcg-common.o
  CC      aarch64-softmmu/fpu/softfloat.o
  CC      aarch64-softmmu/disas.o
  CC      aarch64-softmmu/tcg-runtime.o
  CC      x86_64-softmmu/balloon.o
  GEN     aarch64-softmmu/gdbstub-xml.c
  CC      aarch64-softmmu/kvm-stub.o
  CC      x86_64-softmmu/ioport.o
  CC      x86_64-softmmu/numa.o
  CC      aarch64-softmmu/arch_init.o
  CC      aarch64-softmmu/cpus.o
  CC      x86_64-softmmu/qtest.o
  CC      aarch64-softmmu/monitor.o
  CC      x86_64-softmmu/bootdevice.o
  CC      aarch64-softmmu/gdbstub.o
  CC      x86_64-softmmu/memory.o
  CC      aarch64-softmmu/balloon.o
  CC      aarch64-softmmu/ioport.o
  CC      aarch64-softmmu/numa.o
  CC      aarch64-softmmu/qtest.o
  CC      x86_64-softmmu/cputlb.o
  CC      x86_64-softmmu/memory_mapping.o
  CC      x86_64-softmmu/dump.o
  CC      x86_64-softmmu/migration/ram.o
  CC      aarch64-softmmu/bootdevice.o
  CC      x86_64-softmmu/migration/savevm.o
  CC      x86_64-softmmu/xen-common-stub.o
  CC      aarch64-softmmu/memory.o
  CC      aarch64-softmmu/cputlb.o
  CC      aarch64-softmmu/memory_mapping.o
  CC      x86_64-softmmu/xen-hvm-stub.o
  CC      x86_64-softmmu/hw/block/virtio-blk.o
  CC      aarch64-softmmu/dump.o
  CC      aarch64-softmmu/migration/ram.o
  CC      x86_64-softmmu/hw/block/dataplane/virtio-blk.o
  CC      aarch64-softmmu/migration/savevm.o
  CC      aarch64-softmmu/xen-common-stub.o
  CC      aarch64-softmmu/xen-hvm-stub.o
  CC      aarch64-softmmu/hw/adc/stm32f2xx_adc.o
  CC      aarch64-softmmu/hw/block/virtio-blk.o
  CC      x86_64-softmmu/hw/char/virtio-serial-bus.o
  CC      x86_64-softmmu/hw/core/nmi.o
  CC      x86_64-softmmu/hw/core/generic-loader.o
In file included from /tmp/qemu-test/src/include/exec/exec-all.h:44:0,
                 from /tmp/qemu-test/src/cputlb.c:23:
/tmp/qemu-test/src/cputlb.c: In function 'tlb_flush_page_by_mmuidx_async_work':
/tmp/qemu-test/src/cputlb.c:54:36: error: format '%llx' expects argument of type 'long long unsigned int', but argument 5 has type 'long unsigned int' [-Werror=format=]
         qemu_log_mask(CPU_LOG_MMU, "%s: " fmt, __func__, \
                                    ^
/tmp/qemu-test/src/include/qemu/log.h:94:22: note: in definition of macro 'qemu_log_mask'
             qemu_log(FMT, ## __VA_ARGS__);              \
                      ^~~
/tmp/qemu-test/src/cputlb.c:293:5: note: in expansion of macro 'tlb_debug'
     tlb_debug("page:%d addr:"TARGET_FMT_lx" mmu_idx%" PRIxPTR "\n",
     ^~~~~~~~~
/tmp/qemu-test/src/cputlb.c:57:25: error: format '%llx' expects argument of type 'long long unsigned int', but argument 6 has type 'long unsigned int' [-Werror=format=]
         fprintf(stderr, "%s: " fmt, __func__, ## __VA_ARGS__); \
                         ^
/tmp/qemu-test/src/cputlb.c:293:5: note: in expansion of macro 'tlb_debug'
     tlb_debug("page:%d addr:"TARGET_FMT_lx" mmu_idx%" PRIxPTR "\n",
     ^~~~~~~~~
cc1: all warnings being treated as errors
/tmp/qemu-test/src/rules.mak:60: recipe for target 'cputlb.o' failed
make[1]: *** [cputlb.o] Error 1
make[1]: *** Waiting for unfinished jobs....
  CC      aarch64-softmmu/hw/block/dataplane/virtio-blk.o
  CC      aarch64-softmmu/hw/char/exynos4210_uart.o
  CC      aarch64-softmmu/hw/char/omap_uart.o
  CC      aarch64-softmmu/hw/char/digic-uart.o
  CC      aarch64-softmmu/hw/char/stm32f2xx_usart.o
  CC      aarch64-softmmu/hw/char/bcm2835_aux.o
  CC      aarch64-softmmu/hw/char/virtio-serial-bus.o
  CC      aarch64-softmmu/hw/core/generic-loader.o
  CC      aarch64-softmmu/hw/core/nmi.o
In file included from /tmp/qemu-test/src/include/exec/exec-all.h:44:0,
                 from /tmp/qemu-test/src/cputlb.c:23:
/tmp/qemu-test/src/cputlb.c: In function 'tlb_flush_page_by_mmuidx_async_work':
/tmp/qemu-test/src/cputlb.c:54:36: error: format '%llx' expects argument of type 'long long unsigned int', but argument 5 has type 'long unsigned int' [-Werror=format=]
         qemu_log_mask(CPU_LOG_MMU, "%s: " fmt, __func__, \
                                    ^
/tmp/qemu-test/src/include/qemu/log.h:94:22: note: in definition of macro 'qemu_log_mask'
             qemu_log(FMT, ## __VA_ARGS__);              \
                      ^~~
/tmp/qemu-test/src/cputlb.c:293:5: note: in expansion of macro 'tlb_debug'
     tlb_debug("page:%d addr:"TARGET_FMT_lx" mmu_idx%" PRIxPTR "\n",
     ^~~~~~~~~
/tmp/qemu-test/src/cputlb.c:57:25: error: format '%llx' expects argument of type 'long long unsigned int', but argument 6 has type 'long unsigned int' [-Werror=format=]
         fprintf(stderr, "%s: " fmt, __func__, ## __VA_ARGS__); \
                         ^
/tmp/qemu-test/src/cputlb.c:293:5: note: in expansion of macro 'tlb_debug'
     tlb_debug("page:%d addr:"TARGET_FMT_lx" mmu_idx%" PRIxPTR "\n",
     ^~~~~~~~~
cc1: all warnings being treated as errors
/tmp/qemu-test/src/rules.mak:60: recipe for target 'cputlb.o' failed
make[1]: *** [cputlb.o] Error 1
make[1]: *** Waiting for unfinished jobs....
Makefile:202: recipe for target 'subdir-x86_64-softmmu' failed
make: *** [subdir-x86_64-softmmu] Error 2
make: *** Waiting for unfinished jobs....
Makefile:202: recipe for target 'subdir-aarch64-softmmu' failed
make: *** [subdir-aarch64-softmmu] Error 2
make[1]: *** [docker-run] Error 2
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-6smtmz0t/src'
make: *** [docker-run-test-mingw@fedora] Error 2
=== OUTPUT END ===

Test command exited with code: 2
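
Reading the diagnostics above: in both expansions of tlb_debug the
offending argument is the last one (argument 5 of qemu_log, argument 6 of
fprintf), which is the value paired with PRIxPTR. On this 64-bit mingw
target PRIxPTR evidently expands to an "llx" conversion, while the value
passed is an unsigned long, which is only 32 bits wide there. The
following is a minimal sketch of two portable ways to print such a value.
The function and parameter names are hypothetical, and this is not the fix
that was applied to the series.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a bitmap held in an unsigned long.
 * "%lx" is the conversion that matches unsigned long on every ABI,
 * including 64-bit Windows where long stays 32 bits wide.
 */
static int format_mmu_idx(char *buf, size_t len, unsigned long mmu_idx_bitmap)
{
    return snprintf(buf, len, "mmu_idx:%lx", mmu_idx_bitmap);
}

/*
 * Alternative: keep PRIxPTR in the format string and convert the
 * argument to uintptr_t so that format and argument type agree.
 */
static int format_mmu_idx_cast(char *buf, size_t len,
                               unsigned long mmu_idx_bitmap)
{
    return snprintf(buf, len, "mmu_idx:%" PRIxPTR,
                    (uintptr_t)mmu_idx_bitmap);
}
```

Either variant keeps the format string and the argument type in step, so
-Werror=format has nothing to complain about regardless of how wide long
and pointers are on the host.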


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@freelists.org

Thread overview: 50+ messages
2016-11-09 14:57 [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement Alex Bennée
2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 01/19] docs: new design document multi-thread-tcg.txt Alex Bennée
2016-11-10 15:00   ` Richard Henderson
2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 02/19] tcg: add options for enabling MTTCG Alex Bennée
2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 03/19] tcg: add kick timer for single-threaded vCPU emulation Alex Bennée
2016-11-10 15:10   ` Richard Henderson
2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 04/19] tcg: rename tcg_current_cpu to tcg_current_rr_cpu Alex Bennée
2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 05/19] tcg: drop global lock during TCG code execution Alex Bennée
2016-11-10 15:18   ` Richard Henderson
2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 06/19] tcg: remove global exit_request Alex Bennée
2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 07/19] tcg: enable tb_lock() for SoftMMU Alex Bennée
2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 08/19] tcg: enable thread-per-vCPU Alex Bennée
2016-11-10 16:35   ` Richard Henderson
2016-11-10 16:46     ` Alex Bennée
2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 09/19] tcg: handle EXCP_ATOMIC exception for system emulation Alex Bennée
2016-11-10 16:36   ` Richard Henderson
2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 10/19] cputlb: add assert_cpu_is_self checks Alex Bennée
2016-11-10 16:39   ` Richard Henderson
2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 11/19] cputlb: introduce tlb_flush_* async work Alex Bennée
2016-11-10 16:48   ` Richard Henderson
2016-11-10 17:34     ` Alex Bennée
2016-11-10 17:40       ` Richard Henderson
2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 12/19] cputlb: tweak qemu_ram_addr_from_host_nofail reporting Alex Bennée
2016-11-10 16:51   ` Richard Henderson
2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 13/19] cputlb: atomically update tlb fields used by tlb_reset_dirty Alex Bennée
2016-11-09 19:36   ` Pranith Kumar
2016-11-10 16:14     ` Alex Bennée
2016-11-10 17:27       ` Richard Henderson
2016-11-10 18:00         ` Alex Bennée
2016-11-10 18:32           ` Richard Henderson
2016-11-10 17:23   ` Richard Henderson
2016-11-10 18:07     ` Alex Bennée
2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 14/19] target-arm/powerctl: defer cpu reset work to CPU context Alex Bennée
2016-11-10 17:35   ` Richard Henderson
2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 15/19] target-arm/cpu: don't reset TLB structures, use cputlb to do it Alex Bennée
2016-11-10 17:48   ` Richard Henderson
2016-11-10 18:08     ` Alex Bennée
2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 16/19] target-arm: ensure BQL taken for ARM_CP_IO register access Alex Bennée
2016-11-10 17:54   ` Richard Henderson
2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 17/19] target-arm: helpers which may affect global state need the BQL Alex Bennée
2016-11-10 17:56   ` Richard Henderson
2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 18/19] target-arm: don't generate WFE/YIELD calls for MTTCG Alex Bennée
2016-11-10 17:59   ` Richard Henderson
2016-11-09 14:57 ` [Qemu-devel] [PATCH v6 19/19] tcg: enable MTTCG by default for ARM on x86 hosts Alex Bennée
2016-11-10 18:00   ` Richard Henderson
2016-11-10 18:13     ` Alex Bennée
2016-11-10 18:41       ` Richard Henderson
2016-11-09 15:11 ` [Qemu-devel] [PATCH v6 00/19] Remaining MTTCG Base patches and ARM enablement Paolo Bonzini
2016-11-09 18:38   ` Alex Bennée
2016-11-13  5:50 ` no-reply
