* [Qemu-devel] [PATCH for-2.4 0/9 v3] KVM: Do I/O outside BQL whenever possible
@ 2015-07-02  8:20 Paolo Bonzini
  2015-07-02  8:20 ` [Qemu-devel] [PATCH 1/9] main-loop: use qemu_mutex_lock_iothread consistently Paolo Bonzini
                   ` (8 more replies)
  0 siblings, 9 replies; 17+ messages in thread
From: Paolo Bonzini @ 2015-07-02  8:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: famz

This is the rebased and updated version of the patches I posted a
couple of months ago (well before soft freeze :)).

This version introduces a qemu_mutex_iothread_locked() primitive
in order to avoid recursive locking of the BQL.  The previous
attempts, which used functions such as address_space_rw_unlocked,
required the introduction of a multitude of *_unlocked functions
(e.g. address_space_ldl_unlocked or dma_buf_write_unlocked).
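
As an illustration, here is a minimal sketch of the pattern the new
primitive enables (loosely following prepare_mmio_access() from patch
5; the helper name is made up for this example):

    #include "qemu/main-loop.h"   /* qemu_mutex_iothread_locked() etc. */

    /* Take the BQL only if the caller does not already hold it and the
     * region actually needs it; return whether the caller must unlock. */
    static bool bql_lock_if_needed(bool needs_global_lock)
    {
        if (!qemu_mutex_iothread_locked() && needs_global_lock) {
            qemu_mutex_lock_iothread();
            return true;
        }
        return false;
    }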

Note that adding unlocked access to TCG would require reverting
commit 3b64349 (memory: Replace io_mem_read/write with
memory_region_dispatch_read/write, 2015-04-26).

Paolo

v2->v3: Fix grammar consistency in patch 3 [Fam]
        Fix bad rebase in patch 5 [Fam]

Jan Kiszka (4):
  memory: Add global-locking property to memory regions
  memory: let address_space_rw/ld*/st* run outside the BQL
  kvm: First step to push iothread lock out of inner run loop
  kvm: Switch to unlocked PIO

Paolo Bonzini (5):
  main-loop: use qemu_mutex_lock_iothread consistently
  main-loop: introduce qemu_mutex_iothread_locked
  exec: pull qemu_flush_coalesced_mmio_buffer() into
    address_space_rw/ld*/st*
  acpi: mark PMTIMER as unlocked
  kvm: Switch to unlocked MMIO

 cpus.c                   | 19 ++++++++++---
 exec.c                   | 69 ++++++++++++++++++++++++++++++++++++++++++++++++
 hw/acpi/core.c           |  1 +
 include/exec/memory.h    | 26 ++++++++++++++++++
 include/qemu/main-loop.h | 10 +++++++
 kvm-all.c                |  8 ++++--
 memory.c                 | 23 ++++++++--------
 stubs/iothread-lock.c    |  5 ++++
 target-i386/kvm.c        | 24 +++++++++++++++++
 target-mips/kvm.c        |  4 +++
 target-ppc/kvm.c         |  7 +++++
 target-s390x/kvm.c       |  3 +++
 12 files changed, 182 insertions(+), 17 deletions(-)

-- 
2.4.3

* [Qemu-devel] [PATCH 1/9] main-loop: use qemu_mutex_lock_iothread consistently
  2015-07-02  8:20 [Qemu-devel] [PATCH for-2.4 0/9 v3] KVM: Do I/O outside BQL whenever possible Paolo Bonzini
@ 2015-07-02  8:20 ` Paolo Bonzini
  2015-07-02  8:20 ` [Qemu-devel] [PATCH 2/9] main-loop: introduce qemu_mutex_iothread_locked Paolo Bonzini
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2015-07-02  8:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: famz, Frederic Konrad

The next patch will require the BQL to be always taken with
qemu_mutex_lock_iothread(), while right now this isn't the case.

Outside TCG mode this is not a problem.  In TCG mode, we need to be
careful and avoid the "prod out of compiled code" step if already
in a VCPU thread.  This is easily done with a check on current_cpu,
i.e. qemu_in_vcpu_thread().

Hopefully, multithreaded TCG will get rid of the whole logic to kick
VCPUs whenever an I/O event occurs!

Cc: Frederic Konrad <fred.konrad@greensocs.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 cpus.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/cpus.c b/cpus.c
index 4f0e54d..c09fbef 100644
--- a/cpus.c
+++ b/cpus.c
@@ -954,7 +954,7 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
     CPUState *cpu = arg;
     int r;
 
-    qemu_mutex_lock(&qemu_global_mutex);
+    qemu_mutex_lock_iothread();
     qemu_thread_get_self(cpu->thread);
     cpu->thread_id = qemu_get_thread_id();
     cpu->can_do_io = 1;
@@ -1034,10 +1034,10 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
 {
     CPUState *cpu = arg;
 
+    qemu_mutex_lock_iothread();
     qemu_tcg_init_cpu_signals();
     qemu_thread_get_self(cpu->thread);
 
-    qemu_mutex_lock(&qemu_global_mutex);
     CPU_FOREACH(cpu) {
         cpu->thread_id = qemu_get_thread_id();
         cpu->created = true;
@@ -1149,7 +1149,11 @@ bool qemu_in_vcpu_thread(void)
 void qemu_mutex_lock_iothread(void)
 {
     atomic_inc(&iothread_requesting_mutex);
-    if (!tcg_enabled() || !first_cpu || !first_cpu->thread) {
+    /* In the simple case there is no need to bump the VCPU thread out of
+     * TCG code execution.
+     */
+    if (!tcg_enabled() || qemu_in_vcpu_thread() ||
+        !first_cpu || !first_cpu->thread) {
         qemu_mutex_lock(&qemu_global_mutex);
         atomic_dec(&iothread_requesting_mutex);
     } else {
-- 
2.4.3

* [Qemu-devel] [PATCH 2/9] main-loop: introduce qemu_mutex_iothread_locked
  2015-07-02  8:20 [Qemu-devel] [PATCH for-2.4 0/9 v3] KVM: Do I/O outside BQL whenever possible Paolo Bonzini
  2015-07-02  8:20 ` [Qemu-devel] [PATCH 1/9] main-loop: use qemu_mutex_lock_iothread consistently Paolo Bonzini
@ 2015-07-02  8:20 ` Paolo Bonzini
  2015-07-02  8:20 ` [Qemu-devel] [PATCH 3/9] memory: Add global-locking property to memory regions Paolo Bonzini
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2015-07-02  8:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: famz, Frederic Konrad

This function will be used to avoid recursive locking of the iothread lock
whenever address_space_rw/ld*/st* are called with the BQL held, which is
almost always the case.

Tracking whether the iothread is owned is very cheap (just use a TLS
variable) but requires some care because now the lock must always be
taken with qemu_mutex_lock_iothread().  Previously this wasn't the case.
Outside TCG mode this is not a problem.  In TCG mode, we need to be
careful and avoid the "prod out of compiled code" step if already
in a VCPU thread.  This is easily done with a check on current_cpu,
i.e. qemu_in_vcpu_thread().
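
As a usage sketch (with_bql is a hypothetical helper, not part of this
series), the new primitive lets code take the BQL conditionally without
risking recursive locking:

    static void with_bql(void (*func)(void *), void *opaque)
    {
        bool was_locked = qemu_mutex_iothread_locked();

        if (!was_locked) {
            qemu_mutex_lock_iothread();
        }
        func(opaque);
        if (!was_locked) {
            qemu_mutex_unlock_iothread();
        }
    }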

Hopefully, multithreaded TCG will get rid of the whole logic to kick
VCPUs whenever an I/O event occurs!

Cc: Frederic Konrad <fred.konrad@greensocs.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 cpus.c                   |  9 +++++++++
 include/qemu/main-loop.h | 10 ++++++++++
 stubs/iothread-lock.c    |  5 +++++
 3 files changed, 24 insertions(+)

diff --git a/cpus.c b/cpus.c
index c09fbef..f547aeb 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1146,6 +1146,13 @@ bool qemu_in_vcpu_thread(void)
     return current_cpu && qemu_cpu_is_self(current_cpu);
 }
 
+static __thread bool iothread_locked = false;
+
+bool qemu_mutex_iothread_locked(void)
+{
+    return iothread_locked;
+}
+
 void qemu_mutex_lock_iothread(void)
 {
     atomic_inc(&iothread_requesting_mutex);
@@ -1164,10 +1171,12 @@ void qemu_mutex_lock_iothread(void)
         atomic_dec(&iothread_requesting_mutex);
         qemu_cond_broadcast(&qemu_io_proceeded_cond);
     }
+    iothread_locked = true;
 }
 
 void qemu_mutex_unlock_iothread(void)
 {
+    iothread_locked = false;
     qemu_mutex_unlock(&qemu_global_mutex);
 }
 
diff --git a/include/qemu/main-loop.h b/include/qemu/main-loop.h
index 0f4a0fd..bc18ca3 100644
--- a/include/qemu/main-loop.h
+++ b/include/qemu/main-loop.h
@@ -223,6 +223,16 @@ int qemu_add_child_watch(pid_t pid);
 #endif
 
 /**
+ * qemu_mutex_iothread_locked: Return lock status of the main loop mutex.
+ *
+ * The main loop mutex is the coarsest lock in QEMU, and as such it
+ * must always be taken outside other locks.  This function helps
+ * functions take different paths depending on whether the calling
+ * thread currently holds the main loop mutex.
+ */
+bool qemu_mutex_iothread_locked(void);
+
+/**
  * qemu_mutex_lock_iothread: Lock the main loop mutex.
  *
  * This function locks the main loop mutex.  The mutex is taken by
diff --git a/stubs/iothread-lock.c b/stubs/iothread-lock.c
index 5d8aca1..dda6f6b 100644
--- a/stubs/iothread-lock.c
+++ b/stubs/iothread-lock.c
@@ -1,6 +1,11 @@
 #include "qemu-common.h"
 #include "qemu/main-loop.h"
 
+bool qemu_mutex_iothread_locked(void)
+{
+    return true;
+}
+
 void qemu_mutex_lock_iothread(void)
 {
 }
-- 
2.4.3

* [Qemu-devel] [PATCH 3/9] memory: Add global-locking property to memory regions
  2015-07-02  8:20 [Qemu-devel] [PATCH for-2.4 0/9 v3] KVM: Do I/O outside BQL whenever possible Paolo Bonzini
  2015-07-02  8:20 ` [Qemu-devel] [PATCH 1/9] main-loop: use qemu_mutex_lock_iothread consistently Paolo Bonzini
  2015-07-02  8:20 ` [Qemu-devel] [PATCH 2/9] main-loop: introduce qemu_mutex_iothread_locked Paolo Bonzini
@ 2015-07-02  8:20 ` Paolo Bonzini
  2015-07-02  8:20 ` [Qemu-devel] [PATCH 4/9] exec: pull qemu_flush_coalesced_mmio_buffer() into address_space_rw/ld*/st* Paolo Bonzini
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2015-07-02  8:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Kiszka, famz, Frederic Konrad

From: Jan Kiszka <jan.kiszka@siemens.com>

This introduces the memory region property "global_locking". It is true
by default. By setting it to false, a device model can request BQL-free
dispatching of region accesses to its r/w handlers. The actual BQL
break-up will be provided in a separate patch.
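
As a usage sketch, a hypothetical device model (MyDevState and
mydev_ops are illustrative only; patch 8 converts a real device, the
ACPI PM timer) would opt out of BQL dispatch like this:

    static void mydev_init_mmio(MyDevState *s)
    {
        memory_region_init_io(&s->iomem, OBJECT(s), &mydev_ops, s,
                              "mydev", 0x100);
        /* accesses to this region now dispatch without the BQL; the
         * read/write handlers must do their own synchronization */
        memory_region_clear_global_locking(&s->iomem);
    }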

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Frederic Konrad <fred.konrad@greensocs.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/exec/memory.h | 26 ++++++++++++++++++++++++++
 memory.c              | 11 +++++++++++
 2 files changed, 37 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 8ae004e..0ebdc55 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -180,6 +180,7 @@ struct MemoryRegion {
     bool rom_device;
     bool warning_printed; /* For reservations */
     bool flush_coalesced_mmio;
+    bool global_locking;
     MemoryRegion *alias;
     hwaddr alias_offset;
     int32_t priority;
@@ -825,6 +826,31 @@ void memory_region_set_flush_coalesced(MemoryRegion *mr);
 void memory_region_clear_flush_coalesced(MemoryRegion *mr);
 
 /**
+ * memory_region_set_global_locking: Declares that access processing requires
+ *                                   QEMU's global lock.
+ *
+ * When this is invoked, accesses to the memory region will be processed while
+ * holding the global lock of QEMU. This is the default behavior of memory
+ * regions.
+ *
+ * @mr: the memory region to be updated.
+ */
+void memory_region_set_global_locking(MemoryRegion *mr);
+
+/**
+ * memory_region_clear_global_locking: Declares that access processing does
+ *                                     not depend on the QEMU global lock.
+ *
+ * By clearing this property, accesses to the memory region will be processed
+ * outside of QEMU's global lock (unless the lock is already held when
+ * the access request is issued). In this case, the device model
+ * implementing the access handlers is responsible for its own
+ * synchronization.
+ *
+ * @mr: the memory region to be updated.
+ */
+void memory_region_clear_global_locking(MemoryRegion *mr);
+
+/**
  * memory_region_add_eventfd: Request an eventfd to be triggered when a word
  *                            is written to a location.
  *
diff --git a/memory.c b/memory.c
index 3ac0bd2..b0b8860 100644
--- a/memory.c
+++ b/memory.c
@@ -1012,6 +1012,7 @@ static void memory_region_initfn(Object *obj)
     mr->ram_addr = RAM_ADDR_INVALID;
     mr->enabled = true;
     mr->romd_mode = true;
+    mr->global_locking = true;
     mr->destructor = memory_region_destructor_none;
     QTAILQ_INIT(&mr->subregions);
     QTAILQ_INIT(&mr->coalesced);
@@ -1646,6 +1647,16 @@ void memory_region_clear_flush_coalesced(MemoryRegion *mr)
     }
 }
 
+void memory_region_set_global_locking(MemoryRegion *mr)
+{
+    mr->global_locking = true;
+}
+
+void memory_region_clear_global_locking(MemoryRegion *mr)
+{
+    mr->global_locking = false;
+}
+
 void memory_region_add_eventfd(MemoryRegion *mr,
                                hwaddr addr,
                                unsigned size,
-- 
2.4.3

* [Qemu-devel] [PATCH 4/9] exec: pull qemu_flush_coalesced_mmio_buffer() into address_space_rw/ld*/st*
  2015-07-02  8:20 [Qemu-devel] [PATCH for-2.4 0/9 v3] KVM: Do I/O outside BQL whenever possible Paolo Bonzini
                   ` (2 preceding siblings ...)
  2015-07-02  8:20 ` [Qemu-devel] [PATCH 3/9] memory: Add global-locking property to memory regions Paolo Bonzini
@ 2015-07-02  8:20 ` Paolo Bonzini
  2015-07-02  8:20 ` [Qemu-devel] [PATCH 5/9] memory: let address_space_rw/ld*/st* run outside the BQL Paolo Bonzini
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2015-07-02  8:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: famz, Frederic Konrad

As memory_region_read/write_accessor will now also run without the BQL
held, we need to move coalesced MMIO flushing earlier in the dispatch
process.

Cc: Frederic Konrad <fred.konrad@greensocs.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c   | 21 +++++++++++++++++++++
 memory.c | 12 ------------
 2 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/exec.c b/exec.c
index f7883d2..f2e6603 100644
--- a/exec.c
+++ b/exec.c
@@ -2316,6 +2316,13 @@ static int memory_access_size(MemoryRegion *mr, unsigned l, hwaddr addr)
     return l;
 }
 
+static void prepare_mmio_access(MemoryRegion *mr)
+{
+    if (mr->flush_coalesced_mmio) {
+        qemu_flush_coalesced_mmio_buffer();
+    }
+}
+
 MemTxResult address_space_rw(AddressSpace *as, hwaddr addr, MemTxAttrs attrs,
                              uint8_t *buf, int len, bool is_write)
 {
@@ -2333,6 +2340,7 @@ MemTxResult address_space_rw(AddressSpace *as, hwaddr addr, MemTxAttrs attrs,
 
         if (is_write) {
             if (!memory_access_is_direct(mr, is_write)) {
+                prepare_mmio_access(mr);
                 l = memory_access_size(mr, l, addr1);
                 /* XXX: could force current_cpu to NULL to avoid
                    potential bugs */
@@ -2374,6 +2382,7 @@ MemTxResult address_space_rw(AddressSpace *as, hwaddr addr, MemTxAttrs attrs,
         } else {
             if (!memory_access_is_direct(mr, is_write)) {
                 /* I/O case */
+                prepare_mmio_access(mr);
                 l = memory_access_size(mr, l, addr1);
                 switch (l) {
                 case 8:
@@ -2739,6 +2748,8 @@ static inline uint32_t address_space_ldl_internal(AddressSpace *as, hwaddr addr,
     rcu_read_lock();
     mr = address_space_translate(as, addr, &addr1, &l, false);
     if (l < 4 || !memory_access_is_direct(mr, false)) {
+        prepare_mmio_access(mr);
+
         /* I/O case */
         r = memory_region_dispatch_read(mr, addr1, &val, 4, attrs);
 #if defined(TARGET_WORDS_BIGENDIAN)
@@ -2828,6 +2839,8 @@ static inline uint64_t address_space_ldq_internal(AddressSpace *as, hwaddr addr,
     mr = address_space_translate(as, addr, &addr1, &l,
                                  false);
     if (l < 8 || !memory_access_is_direct(mr, false)) {
+        prepare_mmio_access(mr);
+
         /* I/O case */
         r = memory_region_dispatch_read(mr, addr1, &val, 8, attrs);
 #if defined(TARGET_WORDS_BIGENDIAN)
@@ -2937,6 +2950,8 @@ static inline uint32_t address_space_lduw_internal(AddressSpace *as,
     mr = address_space_translate(as, addr, &addr1, &l,
                                  false);
     if (l < 2 || !memory_access_is_direct(mr, false)) {
+        prepare_mmio_access(mr);
+
         /* I/O case */
         r = memory_region_dispatch_read(mr, addr1, &val, 2, attrs);
 #if defined(TARGET_WORDS_BIGENDIAN)
@@ -3026,6 +3041,8 @@ void address_space_stl_notdirty(AddressSpace *as, hwaddr addr, uint32_t val,
     mr = address_space_translate(as, addr, &addr1, &l,
                                  true);
     if (l < 4 || !memory_access_is_direct(mr, true)) {
+        prepare_mmio_access(mr);
+
         r = memory_region_dispatch_write(mr, addr1, val, 4, attrs);
     } else {
         addr1 += memory_region_get_ram_addr(mr) & TARGET_PAGE_MASK;
@@ -3065,6 +3082,8 @@ static inline void address_space_stl_internal(AddressSpace *as,
     mr = address_space_translate(as, addr, &addr1, &l,
                                  true);
     if (l < 4 || !memory_access_is_direct(mr, true)) {
+        prepare_mmio_access(mr);
+
 #if defined(TARGET_WORDS_BIGENDIAN)
         if (endian == DEVICE_LITTLE_ENDIAN) {
             val = bswap32(val);
@@ -3169,6 +3188,8 @@ static inline void address_space_stw_internal(AddressSpace *as,
     rcu_read_lock();
     mr = address_space_translate(as, addr, &addr1, &l, true);
     if (l < 2 || !memory_access_is_direct(mr, true)) {
+        prepare_mmio_access(mr);
+
 #if defined(TARGET_WORDS_BIGENDIAN)
         if (endian == DEVICE_LITTLE_ENDIAN) {
             val = bswap16(val);
diff --git a/memory.c b/memory.c
index b0b8860..5a0cc66 100644
--- a/memory.c
+++ b/memory.c
@@ -396,9 +396,6 @@ static MemTxResult  memory_region_read_accessor(MemoryRegion *mr,
 {
     uint64_t tmp;
 
-    if (mr->flush_coalesced_mmio) {
-        qemu_flush_coalesced_mmio_buffer();
-    }
     tmp = mr->ops->read(mr->opaque, addr, size);
     trace_memory_region_ops_read(mr, addr, tmp, size);
     *value |= (tmp & mask) << shift;
@@ -416,9 +413,6 @@ static MemTxResult memory_region_read_with_attrs_accessor(MemoryRegion *mr,
     uint64_t tmp = 0;
     MemTxResult r;
 
-    if (mr->flush_coalesced_mmio) {
-        qemu_flush_coalesced_mmio_buffer();
-    }
     r = mr->ops->read_with_attrs(mr->opaque, addr, &tmp, size, attrs);
     trace_memory_region_ops_read(mr, addr, tmp, size);
     *value |= (tmp & mask) << shift;
@@ -451,9 +445,6 @@ static MemTxResult memory_region_write_accessor(MemoryRegion *mr,
 {
     uint64_t tmp;
 
-    if (mr->flush_coalesced_mmio) {
-        qemu_flush_coalesced_mmio_buffer();
-    }
     tmp = (*value >> shift) & mask;
     trace_memory_region_ops_write(mr, addr, tmp, size);
     mr->ops->write(mr->opaque, addr, tmp, size);
@@ -470,9 +461,6 @@ static MemTxResult memory_region_write_with_attrs_accessor(MemoryRegion *mr,
 {
     uint64_t tmp;
 
-    if (mr->flush_coalesced_mmio) {
-        qemu_flush_coalesced_mmio_buffer();
-    }
     tmp = (*value >> shift) & mask;
     trace_memory_region_ops_write(mr, addr, tmp, size);
     return mr->ops->write_with_attrs(mr->opaque, addr, tmp, size, attrs);
-- 
2.4.3

* [Qemu-devel] [PATCH 5/9] memory: let address_space_rw/ld*/st* run outside the BQL
  2015-07-02  8:20 [Qemu-devel] [PATCH for-2.4 0/9 v3] KVM: Do I/O outside BQL whenever possible Paolo Bonzini
                   ` (3 preceding siblings ...)
  2015-07-02  8:20 ` [Qemu-devel] [PATCH 4/9] exec: pull qemu_flush_coalesced_mmio_buffer() into address_space_rw/ld*/st* Paolo Bonzini
@ 2015-07-02  8:20 ` Paolo Bonzini
  2015-07-02  8:20 ` [Qemu-devel] [PATCH 6/9] kvm: First step to push iothread lock out of inner run loop Paolo Bonzini
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2015-07-02  8:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Kiszka, famz, Frederic Konrad

From: Jan Kiszka <jan.kiszka@siemens.com>

The MMIO case is further broken up into two cases: if the caller does not
hold the BQL on invocation, the unlocked path takes or avoids the BQL depending
on the locking strategy of the target memory region and its coalesced
MMIO handling.  In this case, the caller should not hold _any_ lock
(a friendly suggestion which is disregarded by virtio-scsi-dataplane).
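
Sketched caller contract for the unlocked case (unlocked_mmio_read is a
hypothetical helper; the KVM MMIO exit path in patch 9 becomes such a
caller):

    static void unlocked_mmio_read(hwaddr phys_addr, uint8_t *buf, int len)
    {
        /* The caller holds no lock; prepare_mmio_access() will take the
         * BQL internally for regions that still require global locking. */
        assert(!qemu_mutex_iothread_locked());
        address_space_rw(&address_space_memory, phys_addr,
                         MEMTXATTRS_UNSPECIFIED, buf, len, false);
    }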

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Frederic Konrad <fred.konrad@greensocs.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 57 insertions(+), 9 deletions(-)

diff --git a/exec.c b/exec.c
index f2e6603..3457f7e 100644
--- a/exec.c
+++ b/exec.c
@@ -48,6 +48,7 @@
 #endif
 #include "exec/cpu-all.h"
 #include "qemu/rcu_queue.h"
+#include "qemu/main-loop.h"
 #include "exec/cputlb.h"
 #include "translate-all.h"
 
@@ -2316,11 +2317,27 @@ static int memory_access_size(MemoryRegion *mr, unsigned l, hwaddr addr)
     return l;
 }
 
-static void prepare_mmio_access(MemoryRegion *mr)
+static bool prepare_mmio_access(MemoryRegion *mr)
 {
+    bool unlocked = !qemu_mutex_iothread_locked();
+    bool release_lock = false;
+
+    if (unlocked && mr->global_locking) {
+        qemu_mutex_lock_iothread();
+        unlocked = false;
+        release_lock = true;
+    }
     if (mr->flush_coalesced_mmio) {
+        if (unlocked) {
+            qemu_mutex_lock_iothread();
+        }
         qemu_flush_coalesced_mmio_buffer();
+        if (unlocked) {
+            qemu_mutex_unlock_iothread();
+        }
     }
+
+    return release_lock;
 }
 
 MemTxResult address_space_rw(AddressSpace *as, hwaddr addr, MemTxAttrs attrs,
@@ -2332,6 +2349,7 @@ MemTxResult address_space_rw(AddressSpace *as, hwaddr addr, MemTxAttrs attrs,
     hwaddr addr1;
     MemoryRegion *mr;
     MemTxResult result = MEMTX_OK;
+    bool release_lock = false;
 
     rcu_read_lock();
     while (len > 0) {
@@ -2340,7 +2358,7 @@ MemTxResult address_space_rw(AddressSpace *as, hwaddr addr, MemTxAttrs attrs,
 
         if (is_write) {
             if (!memory_access_is_direct(mr, is_write)) {
-                prepare_mmio_access(mr);
+                release_lock |= prepare_mmio_access(mr);
                 l = memory_access_size(mr, l, addr1);
                 /* XXX: could force current_cpu to NULL to avoid
                    potential bugs */
@@ -2382,7 +2400,7 @@ MemTxResult address_space_rw(AddressSpace *as, hwaddr addr, MemTxAttrs attrs,
         } else {
             if (!memory_access_is_direct(mr, is_write)) {
                 /* I/O case */
-                prepare_mmio_access(mr);
+                release_lock |= prepare_mmio_access(mr);
                 l = memory_access_size(mr, l, addr1);
                 switch (l) {
                 case 8:
@@ -2418,6 +2436,12 @@ MemTxResult address_space_rw(AddressSpace *as, hwaddr addr, MemTxAttrs attrs,
                 memcpy(buf, ptr, l);
             }
         }
+
+        if (release_lock) {
+            qemu_mutex_unlock_iothread();
+            release_lock = false;
+        }
+
         len -= l;
         buf += l;
         addr += l;
@@ -2744,11 +2768,12 @@ static inline uint32_t address_space_ldl_internal(AddressSpace *as, hwaddr addr,
     hwaddr l = 4;
     hwaddr addr1;
     MemTxResult r;
+    bool release_lock = false;
 
     rcu_read_lock();
     mr = address_space_translate(as, addr, &addr1, &l, false);
     if (l < 4 || !memory_access_is_direct(mr, false)) {
-        prepare_mmio_access(mr);
+        release_lock |= prepare_mmio_access(mr);
 
         /* I/O case */
         r = memory_region_dispatch_read(mr, addr1, &val, 4, attrs);
@@ -2782,6 +2807,9 @@ static inline uint32_t address_space_ldl_internal(AddressSpace *as, hwaddr addr,
     if (result) {
         *result = r;
     }
+    if (release_lock) {
+        qemu_mutex_unlock_iothread();
+    }
     rcu_read_unlock();
     return val;
 }
@@ -2834,12 +2862,13 @@ static inline uint64_t address_space_ldq_internal(AddressSpace *as, hwaddr addr,
     hwaddr l = 8;
     hwaddr addr1;
     MemTxResult r;
+    bool release_lock = false;
 
     rcu_read_lock();
     mr = address_space_translate(as, addr, &addr1, &l,
                                  false);
     if (l < 8 || !memory_access_is_direct(mr, false)) {
-        prepare_mmio_access(mr);
+        release_lock |= prepare_mmio_access(mr);
 
         /* I/O case */
         r = memory_region_dispatch_read(mr, addr1, &val, 8, attrs);
@@ -2873,6 +2902,9 @@ static inline uint64_t address_space_ldq_internal(AddressSpace *as, hwaddr addr,
     if (result) {
         *result = r;
     }
+    if (release_lock) {
+        qemu_mutex_unlock_iothread();
+    }
     rcu_read_unlock();
     return val;
 }
@@ -2945,12 +2977,13 @@ static inline uint32_t address_space_lduw_internal(AddressSpace *as,
     hwaddr l = 2;
     hwaddr addr1;
     MemTxResult r;
+    bool release_lock = false;
 
     rcu_read_lock();
     mr = address_space_translate(as, addr, &addr1, &l,
                                  false);
     if (l < 2 || !memory_access_is_direct(mr, false)) {
-        prepare_mmio_access(mr);
+        release_lock |= prepare_mmio_access(mr);
 
         /* I/O case */
         r = memory_region_dispatch_read(mr, addr1, &val, 2, attrs);
@@ -2984,6 +3017,9 @@ static inline uint32_t address_space_lduw_internal(AddressSpace *as,
     if (result) {
         *result = r;
     }
+    if (release_lock) {
+        qemu_mutex_unlock_iothread();
+    }
     rcu_read_unlock();
     return val;
 }
@@ -3036,12 +3072,13 @@ void address_space_stl_notdirty(AddressSpace *as, hwaddr addr, uint32_t val,
     hwaddr addr1;
     MemTxResult r;
     uint8_t dirty_log_mask;
+    bool release_lock = false;
 
     rcu_read_lock();
     mr = address_space_translate(as, addr, &addr1, &l,
                                  true);
     if (l < 4 || !memory_access_is_direct(mr, true)) {
-        prepare_mmio_access(mr);
+        release_lock |= prepare_mmio_access(mr);
 
         r = memory_region_dispatch_write(mr, addr1, val, 4, attrs);
     } else {
@@ -3057,6 +3094,9 @@ void address_space_stl_notdirty(AddressSpace *as, hwaddr addr, uint32_t val,
     if (result) {
         *result = r;
     }
+    if (release_lock) {
+        qemu_mutex_unlock_iothread();
+    }
     rcu_read_unlock();
 }
 
@@ -3077,12 +3117,13 @@ static inline void address_space_stl_internal(AddressSpace *as,
     hwaddr l = 4;
     hwaddr addr1;
     MemTxResult r;
+    bool release_lock = false;
 
     rcu_read_lock();
     mr = address_space_translate(as, addr, &addr1, &l,
                                  true);
     if (l < 4 || !memory_access_is_direct(mr, true)) {
-        prepare_mmio_access(mr);
+        release_lock |= prepare_mmio_access(mr);
 
 #if defined(TARGET_WORDS_BIGENDIAN)
         if (endian == DEVICE_LITTLE_ENDIAN) {
@@ -3115,6 +3156,9 @@ static inline void address_space_stl_internal(AddressSpace *as,
     if (result) {
         *result = r;
     }
+    if (release_lock) {
+        qemu_mutex_unlock_iothread();
+    }
     rcu_read_unlock();
 }
 
@@ -3184,11 +3228,12 @@ static inline void address_space_stw_internal(AddressSpace *as,
     hwaddr l = 2;
     hwaddr addr1;
     MemTxResult r;
+    bool release_lock = false;
 
     rcu_read_lock();
     mr = address_space_translate(as, addr, &addr1, &l, true);
     if (l < 2 || !memory_access_is_direct(mr, true)) {
-        prepare_mmio_access(mr);
+        release_lock |= prepare_mmio_access(mr);
 
 #if defined(TARGET_WORDS_BIGENDIAN)
         if (endian == DEVICE_LITTLE_ENDIAN) {
@@ -3221,6 +3266,9 @@ static inline void address_space_stw_internal(AddressSpace *as,
     if (result) {
         *result = r;
     }
+    if (release_lock) {
+        qemu_mutex_unlock_iothread();
+    }
     rcu_read_unlock();
 }
 
-- 
2.4.3

* [Qemu-devel] [PATCH 6/9] kvm: First step to push iothread lock out of inner run loop
  2015-07-02  8:20 [Qemu-devel] [PATCH for-2.4 0/9 v3] KVM: Do I/O outside BQL whenever possible Paolo Bonzini
                   ` (4 preceding siblings ...)
  2015-07-02  8:20 ` [Qemu-devel] [PATCH 5/9] memory: let address_space_rw/ld*/st* run outside the BQL Paolo Bonzini
@ 2015-07-02  8:20 ` Paolo Bonzini
  2015-07-02  8:20 ` [Qemu-devel] [PATCH 7/9] kvm: Switch to unlocked PIO Paolo Bonzini
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2015-07-02  8:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Kiszka, famz

From: Jan Kiszka <jan.kiszka@siemens.com>

This opens the path to get rid of the iothread lock on vmexits in KVM
mode. On x86, the in-kernel irqchip has to be used because we would
otherwise need to synchronize APIC and other per-cpu state accesses
that could change concurrently.

Regarding pre/post-run callbacks, s390x and ARM should be fine without
specific locking as the callbacks are empty. MIPS and POWER require
locking for the pre-run callback.

The handle_exit callback is non-empty on x86, POWER and s390.
Some POWER cases could do without the locking, but it is left in
place for now.
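
Heavily simplified, the run loop now has this shape (error handling and
most exit reasons elided; see the kvm-all.c hunk below for the real
code):

    qemu_mutex_unlock_iothread();           /* drop the BQL for the loop */
    do {
        kvm_arch_pre_run(cpu, run);         /* locks internally if needed */
        run_ret = kvm_vcpu_ioctl(cpu, KVM_RUN, 0);
        attrs = kvm_arch_post_run(cpu, run);
        switch (run->exit_reason) {
        case KVM_EXIT_IO:
            qemu_mutex_lock_iothread();     /* BQL only around the handler */
            kvm_handle_io(run->io.port, attrs,
                          (uint8_t *)run + run->io.data_offset,
                          run->io.direction, run->io.size, run->io.count);
            qemu_mutex_unlock_iothread();
            ret = 0;
            break;
        /* ... other exit reasons; arch handlers lock for themselves ... */
        }
    } while (ret == 0);
    qemu_mutex_lock_iothread();             /* retake the BQL on the way out */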

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 kvm-all.c          | 10 ++++++++--
 target-i386/kvm.c  | 24 ++++++++++++++++++++++++
 target-mips/kvm.c  |  4 ++++
 target-ppc/kvm.c   |  7 +++++++
 target-s390x/kvm.c |  3 +++
 5 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index e98b08d..ca428ca 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1755,6 +1755,8 @@ int kvm_cpu_exec(CPUState *cpu)
         return EXCP_HLT;
     }
 
+    qemu_mutex_unlock_iothread();
+
     do {
         MemTxAttrs attrs;
 
@@ -1773,11 +1775,9 @@ int kvm_cpu_exec(CPUState *cpu)
              */
             qemu_cpu_kick_self();
         }
-        qemu_mutex_unlock_iothread();
 
         run_ret = kvm_vcpu_ioctl(cpu, KVM_RUN, 0);
 
-        qemu_mutex_lock_iothread();
         attrs = kvm_arch_post_run(cpu, run);
 
         if (run_ret < 0) {
@@ -1804,20 +1804,24 @@ int kvm_cpu_exec(CPUState *cpu)
         switch (run->exit_reason) {
         case KVM_EXIT_IO:
             DPRINTF("handle_io\n");
+            qemu_mutex_lock_iothread();
             kvm_handle_io(run->io.port, attrs,
                           (uint8_t *)run + run->io.data_offset,
                           run->io.direction,
                           run->io.size,
                           run->io.count);
+            qemu_mutex_unlock_iothread();
             ret = 0;
             break;
         case KVM_EXIT_MMIO:
             DPRINTF("handle_mmio\n");
+            qemu_mutex_lock_iothread();
             address_space_rw(&address_space_memory,
                              run->mmio.phys_addr, attrs,
                              run->mmio.data,
                              run->mmio.len,
                              run->mmio.is_write);
+            qemu_mutex_unlock_iothread();
             ret = 0;
             break;
         case KVM_EXIT_IRQ_WINDOW_OPEN:
@@ -1860,6 +1864,8 @@ int kvm_cpu_exec(CPUState *cpu)
         }
     } while (ret == 0);
 
+    qemu_mutex_lock_iothread();
+
     if (ret < 0) {
         cpu_dump_state(cpu, stderr, fprintf, CPU_DUMP_CODE);
         vm_stop(RUN_STATE_INTERNAL_ERROR);
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index daced5c..6426600 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -2191,7 +2191,10 @@ void kvm_arch_pre_run(CPUState *cpu, struct kvm_run *run)
 
     /* Inject NMI */
     if (cpu->interrupt_request & CPU_INTERRUPT_NMI) {
+        qemu_mutex_lock_iothread();
         cpu->interrupt_request &= ~CPU_INTERRUPT_NMI;
+        qemu_mutex_unlock_iothread();
+
         DPRINTF("injected NMI\n");
         ret = kvm_vcpu_ioctl(cpu, KVM_NMI);
         if (ret < 0) {
@@ -2200,6 +2203,10 @@ void kvm_arch_pre_run(CPUState *cpu, struct kvm_run *run)
         }
     }
 
+    if (!kvm_irqchip_in_kernel()) {
+        qemu_mutex_lock_iothread();
+    }
+
     /* Force the VCPU out of its inner loop to process any INIT requests
      * or (for userspace APIC, but it is cheap to combine the checks here)
      * pending TPR access reports.
@@ -2243,6 +2250,8 @@ void kvm_arch_pre_run(CPUState *cpu, struct kvm_run *run)
 
         DPRINTF("setting tpr\n");
         run->cr8 = cpu_get_apic_tpr(x86_cpu->apic_state);
+
+        qemu_mutex_unlock_iothread();
     }
 }
 
@@ -2256,8 +2265,17 @@ MemTxAttrs kvm_arch_post_run(CPUState *cpu, struct kvm_run *run)
     } else {
         env->eflags &= ~IF_MASK;
     }
+
+    /* We need to protect the apic state against concurrent accesses from
+     * different threads in case the userspace irqchip is used. */
+    if (!kvm_irqchip_in_kernel()) {
+        qemu_mutex_lock_iothread();
+    }
     cpu_set_apic_tpr(x86_cpu->apic_state, run->cr8);
     cpu_set_apic_base(x86_cpu->apic_state, run->apic_base);
+    if (!kvm_irqchip_in_kernel()) {
+        qemu_mutex_unlock_iothread();
+    }
     return cpu_get_mem_attrs(env);
 }
 
@@ -2550,13 +2568,17 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
     switch (run->exit_reason) {
     case KVM_EXIT_HLT:
         DPRINTF("handle_hlt\n");
+        qemu_mutex_lock_iothread();
         ret = kvm_handle_halt(cpu);
+        qemu_mutex_unlock_iothread();
         break;
     case KVM_EXIT_SET_TPR:
         ret = 0;
         break;
     case KVM_EXIT_TPR_ACCESS:
+        qemu_mutex_lock_iothread();
         ret = kvm_handle_tpr_access(cpu);
+        qemu_mutex_unlock_iothread();
         break;
     case KVM_EXIT_FAIL_ENTRY:
         code = run->fail_entry.hardware_entry_failure_reason;
@@ -2582,7 +2604,9 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
         break;
     case KVM_EXIT_DEBUG:
         DPRINTF("kvm_exit_debug\n");
+        qemu_mutex_lock_iothread();
         ret = kvm_handle_debug(cpu, &run->debug.arch);
+        qemu_mutex_unlock_iothread();
         break;
     default:
         fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
diff --git a/target-mips/kvm.c b/target-mips/kvm.c
index 948619f..7d2293d 100644
--- a/target-mips/kvm.c
+++ b/target-mips/kvm.c
@@ -99,6 +99,8 @@ void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
     int r;
     struct kvm_mips_interrupt intr;
 
+    qemu_mutex_lock_iothread();
+
     if ((cs->interrupt_request & CPU_INTERRUPT_HARD) &&
             cpu_mips_io_interrupts_pending(cpu)) {
         intr.cpu = -1;
@@ -109,6 +111,8 @@ void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
                          __func__, cs->cpu_index, intr.irq);
         }
     }
+
+    qemu_mutex_unlock_iothread();
 }
 
 MemTxAttrs kvm_arch_post_run(CPUState *cs, struct kvm_run *run)
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index afb4696..ddf469f 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -1242,6 +1242,8 @@ void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
     int r;
     unsigned irq;
 
+    qemu_mutex_lock_iothread();
+
     /* PowerPC QEMU tracks the various core input pins (interrupt, critical
      * interrupt, reset, etc) in PPC-specific env->irq_input_state. */
     if (!cap_interrupt_level &&
@@ -1269,6 +1271,8 @@ void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
     /* We don't know if there are more interrupts pending after this. However,
      * the guest will return to userspace in the course of handling this one
      * anyways, so we will get a chance to deliver the rest. */
+
+    qemu_mutex_unlock_iothread();
 }
 
 MemTxAttrs kvm_arch_post_run(CPUState *cs, struct kvm_run *run)
@@ -1570,6 +1574,8 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
     CPUPPCState *env = &cpu->env;
     int ret;
 
+    qemu_mutex_lock_iothread();
+
     switch (run->exit_reason) {
     case KVM_EXIT_DCR:
         if (run->dcr.is_write) {
@@ -1620,6 +1626,7 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
         break;
     }
 
+    qemu_mutex_unlock_iothread();
     return ret;
 }
 
diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c
index 135111a..ae3a0af 100644
--- a/target-s390x/kvm.c
+++ b/target-s390x/kvm.c
@@ -2007,6 +2007,8 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
     S390CPU *cpu = S390_CPU(cs);
     int ret = 0;
 
+    qemu_mutex_lock_iothread();
+
     switch (run->exit_reason) {
         case KVM_EXIT_S390_SIEIC:
             ret = handle_intercept(cpu);
@@ -2027,6 +2029,7 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
             fprintf(stderr, "Unknown KVM exit: %d\n", run->exit_reason);
             break;
     }
+    qemu_mutex_unlock_iothread();
 
     if (ret == 0) {
         ret = EXCP_INTERRUPT;
-- 
2.4.3

* [Qemu-devel] [PATCH 7/9] kvm: Switch to unlocked PIO
  2015-07-02  8:20 [Qemu-devel] [PATCH for-2.4 0/9 v3] KVM: Do I/O outside BQL whenever possible Paolo Bonzini
                   ` (5 preceding siblings ...)
  2015-07-02  8:20 ` [Qemu-devel] [PATCH 6/9] kvm: First step to push iothread lock out of inner run loop Paolo Bonzini
@ 2015-07-02  8:20 ` Paolo Bonzini
  2015-07-02  8:20 ` [Qemu-devel] [PATCH 8/9] acpi: mark PMTIMER as unlocked Paolo Bonzini
  2015-07-02  8:20 ` [Qemu-devel] [PATCH 9/9] kvm: Switch to unlocked MMIO Paolo Bonzini
  8 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2015-07-02  8:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Kiszka, famz

From: Jan Kiszka <jan.kiszka@siemens.com>

Do not take the BQL before dispatching PIO requests of KVM VCPUs.
Instead, address_space_rw will do it if necessary. This enables
completely BQL-free PIO handling in KVM mode for upcoming devices with
fine-grained locking.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 kvm-all.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index ca428ca..ad5ac5e 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1804,13 +1804,12 @@ int kvm_cpu_exec(CPUState *cpu)
         switch (run->exit_reason) {
         case KVM_EXIT_IO:
             DPRINTF("handle_io\n");
-            qemu_mutex_lock_iothread();
+            /* Called outside BQL */
             kvm_handle_io(run->io.port, attrs,
                           (uint8_t *)run + run->io.data_offset,
                           run->io.direction,
                           run->io.size,
                           run->io.count);
-            qemu_mutex_unlock_iothread();
             ret = 0;
             break;
         case KVM_EXIT_MMIO:
-- 
2.4.3

* [Qemu-devel] [PATCH 8/9] acpi: mark PMTIMER as unlocked
  2015-07-02  8:20 [Qemu-devel] [PATCH for-2.4 0/9 v3] KVM: Do I/O outside BQL whenever possible Paolo Bonzini
                   ` (6 preceding siblings ...)
  2015-07-02  8:20 ` [Qemu-devel] [PATCH 7/9] kvm: Switch to unlocked PIO Paolo Bonzini
@ 2015-07-02  8:20 ` Paolo Bonzini
  2015-07-02  8:20 ` [Qemu-devel] [PATCH 9/9] kvm: Switch to unlocked MMIO Paolo Bonzini
  8 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2015-07-02  8:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: famz

Accessing QEMU_CLOCK_VIRTUAL is thread-safe.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/acpi/core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/acpi/core.c b/hw/acpi/core.c
index 0f201d8..fe6215a 100644
--- a/hw/acpi/core.c
+++ b/hw/acpi/core.c
@@ -528,6 +528,7 @@ void acpi_pm_tmr_init(ACPIREGS *ar, acpi_update_sci_fn update_sci,
     ar->tmr.timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, acpi_pm_tmr_timer, ar);
     memory_region_init_io(&ar->tmr.io, memory_region_owner(parent),
                           &acpi_pm_tmr_ops, ar, "acpi-tmr", 4);
+    memory_region_clear_global_locking(&ar->tmr.io);
     memory_region_add_subregion(parent, 8, &ar->tmr.io);
 }
 
-- 
2.4.3

* [Qemu-devel] [PATCH 9/9] kvm: Switch to unlocked MMIO
  2015-07-02  8:20 [Qemu-devel] [PATCH for-2.4 0/9 v3] KVM: Do I/O outside BQL whenever possible Paolo Bonzini
                   ` (7 preceding siblings ...)
  2015-07-02  8:20 ` [Qemu-devel] [PATCH 8/9] acpi: mark PMTIMER as unlocked Paolo Bonzini
@ 2015-07-02  8:20 ` Paolo Bonzini
  8 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2015-07-02  8:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: famz

Do not take the BQL before dispatching MMIO requests of KVM VCPUs.
Instead, address_space_rw will do it if necessary. This enables completely
BQL-free MMIO handling in KVM mode for upcoming devices with fine-grained
locking.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 kvm-all.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index ad5ac5e..df57da0 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1814,13 +1814,12 @@ int kvm_cpu_exec(CPUState *cpu)
             break;
         case KVM_EXIT_MMIO:
             DPRINTF("handle_mmio\n");
-            qemu_mutex_lock_iothread();
+            /* Called outside BQL */
             address_space_rw(&address_space_memory,
                              run->mmio.phys_addr, attrs,
                              run->mmio.data,
                              run->mmio.len,
                              run->mmio.is_write);
-            qemu_mutex_unlock_iothread();
             ret = 0;
             break;
         case KVM_EXIT_IRQ_WINDOW_OPEN:
-- 
2.4.3

* [Qemu-devel] [PATCH 6/9] kvm: First step to push iothread lock out of inner run loop
  2015-06-24 16:25 [Qemu-devel] [PATCH for-2.4 v2 0/9] KVM: Do I/O outside BQL whenever possible Paolo Bonzini
@ 2015-06-24 16:25 ` Paolo Bonzini
  0 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2015-06-24 16:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: borntraeger, famz, Jan Kiszka

From: Jan Kiszka <jan.kiszka@siemens.com>

This opens the path to get rid of the iothread lock on vmexits in KVM
mode. On x86, the in-kernel irqchip has to be used because we would
otherwise need to synchronize APIC and other per-cpu state accesses
that could change concurrently.

Regarding pre/post-run callbacks, s390x and ARM should be fine without
specific locking as the callbacks are empty. MIPS and POWER require
locking for the pre-run callback.

The handle_exit callback is non-empty on x86, POWER and s390.
Some POWER cases could do without the locking, but it is left in
place for now.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1434646046-27150-7-git-send-email-pbonzini@redhat.com>
---
 kvm-all.c          | 10 ++++++++--
 target-i386/kvm.c  | 24 ++++++++++++++++++++++++
 target-mips/kvm.c  |  4 ++++
 target-ppc/kvm.c   |  7 +++++++
 target-s390x/kvm.c |  3 +++
 5 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 53e01d4..af3d10b 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1752,6 +1752,8 @@ int kvm_cpu_exec(CPUState *cpu)
         return EXCP_HLT;
     }
 
+    qemu_mutex_unlock_iothread();
+
     do {
         MemTxAttrs attrs;
 
@@ -1770,11 +1772,9 @@ int kvm_cpu_exec(CPUState *cpu)
              */
             qemu_cpu_kick_self();
         }
-        qemu_mutex_unlock_iothread();
 
         run_ret = kvm_vcpu_ioctl(cpu, KVM_RUN, 0);
 
-        qemu_mutex_lock_iothread();
         attrs = kvm_arch_post_run(cpu, run);
 
         if (run_ret < 0) {
@@ -1801,20 +1801,24 @@ int kvm_cpu_exec(CPUState *cpu)
         switch (run->exit_reason) {
         case KVM_EXIT_IO:
             DPRINTF("handle_io\n");
+            qemu_mutex_lock_iothread();
             kvm_handle_io(run->io.port, attrs,
                           (uint8_t *)run + run->io.data_offset,
                           run->io.direction,
                           run->io.size,
                           run->io.count);
+            qemu_mutex_unlock_iothread();
             ret = 0;
             break;
         case KVM_EXIT_MMIO:
             DPRINTF("handle_mmio\n");
+            qemu_mutex_lock_iothread();
             address_space_rw(&address_space_memory,
                              run->mmio.phys_addr, attrs,
                              run->mmio.data,
                              run->mmio.len,
                              run->mmio.is_write);
+            qemu_mutex_unlock_iothread();
             ret = 0;
             break;
         case KVM_EXIT_IRQ_WINDOW_OPEN:
@@ -1857,6 +1861,8 @@ int kvm_cpu_exec(CPUState *cpu)
         }
     } while (ret == 0);
 
+    qemu_mutex_lock_iothread();
+
     if (ret < 0) {
         cpu_dump_state(cpu, stderr, fprintf, CPU_DUMP_CODE);
         vm_stop(RUN_STATE_INTERNAL_ERROR);
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 5a236e3..45fd3b0 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -2192,7 +2192,10 @@ void kvm_arch_pre_run(CPUState *cpu, struct kvm_run *run)
 
     /* Inject NMI */
     if (cpu->interrupt_request & CPU_INTERRUPT_NMI) {
+        qemu_mutex_lock_iothread();
         cpu->interrupt_request &= ~CPU_INTERRUPT_NMI;
+        qemu_mutex_unlock_iothread();
+
         DPRINTF("injected NMI\n");
         ret = kvm_vcpu_ioctl(cpu, KVM_NMI);
         if (ret < 0) {
@@ -2201,6 +2204,10 @@ void kvm_arch_pre_run(CPUState *cpu, struct kvm_run *run)
         }
     }
 
+    if (!kvm_irqchip_in_kernel()) {
+        qemu_mutex_lock_iothread();
+    }
+
     /* Force the VCPU out of its inner loop to process any INIT requests
      * or (for userspace APIC, but it is cheap to combine the checks here)
      * pending TPR access reports.
@@ -2244,6 +2251,8 @@ void kvm_arch_pre_run(CPUState *cpu, struct kvm_run *run)
 
         DPRINTF("setting tpr\n");
         run->cr8 = cpu_get_apic_tpr(x86_cpu->apic_state);
+
+        qemu_mutex_unlock_iothread();
     }
 }
 
@@ -2257,8 +2266,17 @@ MemTxAttrs kvm_arch_post_run(CPUState *cpu, struct kvm_run *run)
     } else {
         env->eflags &= ~IF_MASK;
     }
+
+    /* We need to protect the apic state against concurrent accesses from
+     * different threads in case the userspace irqchip is used. */
+    if (!kvm_irqchip_in_kernel()) {
+        qemu_mutex_lock_iothread();
+    }
     cpu_set_apic_tpr(x86_cpu->apic_state, run->cr8);
     cpu_set_apic_base(x86_cpu->apic_state, run->apic_base);
+    if (!kvm_irqchip_in_kernel()) {
+        qemu_mutex_unlock_iothread();
+    }
     return cpu_get_mem_attrs(env);
 }
 
@@ -2551,13 +2569,17 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
     switch (run->exit_reason) {
     case KVM_EXIT_HLT:
         DPRINTF("handle_hlt\n");
+        qemu_mutex_lock_iothread();
         ret = kvm_handle_halt(cpu);
+        qemu_mutex_unlock_iothread();
         break;
     case KVM_EXIT_SET_TPR:
         ret = 0;
         break;
     case KVM_EXIT_TPR_ACCESS:
+        qemu_mutex_lock_iothread();
         ret = kvm_handle_tpr_access(cpu);
+        qemu_mutex_unlock_iothread();
         break;
     case KVM_EXIT_FAIL_ENTRY:
         code = run->fail_entry.hardware_entry_failure_reason;
@@ -2583,7 +2605,9 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
         break;
     case KVM_EXIT_DEBUG:
         DPRINTF("kvm_exit_debug\n");
+        qemu_mutex_lock_iothread();
         ret = kvm_handle_debug(cpu, &run->debug.arch);
+        qemu_mutex_unlock_iothread();
         break;
     default:
         fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
diff --git a/target-mips/kvm.c b/target-mips/kvm.c
index 948619f..7d2293d 100644
--- a/target-mips/kvm.c
+++ b/target-mips/kvm.c
@@ -99,6 +99,8 @@ void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
     int r;
     struct kvm_mips_interrupt intr;
 
+    qemu_mutex_lock_iothread();
+
     if ((cs->interrupt_request & CPU_INTERRUPT_HARD) &&
             cpu_mips_io_interrupts_pending(cpu)) {
         intr.cpu = -1;
@@ -109,6 +111,8 @@ void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
                          __func__, cs->cpu_index, intr.irq);
         }
     }
+
+    qemu_mutex_unlock_iothread();
 }
 
 MemTxAttrs kvm_arch_post_run(CPUState *cs, struct kvm_run *run)
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index afb4696..ddf469f 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -1242,6 +1242,8 @@ void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
     int r;
     unsigned irq;
 
+    qemu_mutex_lock_iothread();
+
     /* PowerPC QEMU tracks the various core input pins (interrupt, critical
      * interrupt, reset, etc) in PPC-specific env->irq_input_state. */
     if (!cap_interrupt_level &&
@@ -1269,6 +1271,8 @@ void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
     /* We don't know if there are more interrupts pending after this. However,
      * the guest will return to userspace in the course of handling this one
      * anyways, so we will get a chance to deliver the rest. */
+
+    qemu_mutex_unlock_iothread();
 }
 
 MemTxAttrs kvm_arch_post_run(CPUState *cs, struct kvm_run *run)
@@ -1570,6 +1574,8 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
     CPUPPCState *env = &cpu->env;
     int ret;
 
+    qemu_mutex_lock_iothread();
+
     switch (run->exit_reason) {
     case KVM_EXIT_DCR:
         if (run->dcr.is_write) {
@@ -1620,6 +1626,7 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
         break;
     }
 
+    qemu_mutex_unlock_iothread();
     return ret;
 }
 
diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c
index b02ff8d..1ebcd18 100644
--- a/target-s390x/kvm.c
+++ b/target-s390x/kvm.c
@@ -2007,6 +2007,8 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
     S390CPU *cpu = S390_CPU(cs);
     int ret = 0;
 
+    qemu_mutex_lock_iothread();
+
     switch (run->exit_reason) {
         case KVM_EXIT_S390_SIEIC:
             ret = handle_intercept(cpu);
@@ -2027,6 +2029,7 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
             fprintf(stderr, "Unknown KVM exit: %d\n", run->exit_reason);
             break;
     }
+    qemu_mutex_unlock_iothread();
 
     if (ret == 0) {
         ret = EXCP_INTERRUPT;
-- 
1.8.3.1

* Re: [Qemu-devel] [PATCH 6/9] kvm: First step to push iothread lock out of inner run loop
  2015-06-23  9:45       ` Fam Zheng
@ 2015-06-23  9:49         ` Paolo Bonzini
  0 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2015-06-23  9:49 UTC (permalink / raw)
  To: Fam Zheng; +Cc: jan.kiszka, qemu-devel



On 23/06/2015 11:45, Fam Zheng wrote:
>>> > >            break;
>>> > >        case KVM_EXIT_SYSTEM_EVENT:
>>> > >            switch (run->system_event.type) {
>>> > >            case KVM_SYSTEM_EVENT_SHUTDOWN:
>>> > > *              qemu_system_shutdown_request();
>>> > >                ret = EXCP_INTERRUPT;
>>> > >                break;
>>> > >            case KVM_SYSTEM_EVENT_RESET:
>>> > > *              qemu_system_reset_request();
>> > 
>> > These two are thread-safe.
> But above you add lock/unlock around the qemu_system_reset_request() under
> KVM_EXIT_SHUTDOWN.  What's different?

That's unnecessary.

Also, just below this switch there's another call to
kvm_arch_handle_exit that should be wrapped by lock/unlock (it works
anyway because no one handles KVM_EXIT_SYSTEM_EVENT in
kvm_arch_handle_exit, but it's not clean).

Paolo

* Re: [Qemu-devel] [PATCH 6/9] kvm: First step to push iothread lock out of inner run loop
  2015-06-23  9:29     ` Paolo Bonzini
@ 2015-06-23  9:45       ` Fam Zheng
  2015-06-23  9:49         ` Paolo Bonzini
  0 siblings, 1 reply; 17+ messages in thread
From: Fam Zheng @ 2015-06-23  9:45 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: jan.kiszka, qemu-devel

On Tue, 06/23 11:29, Paolo Bonzini wrote:
> 
> 
> On 23/06/2015 11:26, Fam Zheng wrote:
> > On Thu, 06/18 18:47, Paolo Bonzini wrote:
> >> From: Jan Kiszka <jan.kiszka@siemens.com>
> >>
> >> This opens the path to get rid of the iothread lock on vmexits in KVM
> >> mode. On x86, the in-kernel irqchips has to be used because we otherwise
> >> need to synchronize APIC and other per-cpu state accesses that could be
> >> changed concurrently.
> >>
> >> s390x and ARM should be fine without specific locking as their
> >> pre/post-run callbacks are empty. MIPS and POWER require locking for
> >> the pre-run callback.
> >>
> >> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> >> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> >> ---
> >>  kvm-all.c         | 14 ++++++++++++--
> >>  target-i386/kvm.c | 18 ++++++++++++++++++
> >>  target-mips/kvm.c |  4 ++++
> >>  target-ppc/kvm.c  |  4 ++++
> >>  4 files changed, 38 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/kvm-all.c b/kvm-all.c
> >> index b2b1bc3..2bd8e9b 100644
> >> --- a/kvm-all.c
> >> +++ b/kvm-all.c
> >> @@ -1795,6 +1795,8 @@ int kvm_cpu_exec(CPUState *cpu)
> >>          return EXCP_HLT;
> >>      }
> >>  
> >> +    qemu_mutex_unlock_iothread();
> >> +
> >>      do {
> >>          MemTxAttrs attrs;
> >>  
> >> @@ -1813,11 +1815,9 @@ int kvm_cpu_exec(CPUState *cpu)
> >>               */
> >>              qemu_cpu_kick_self();
> >>          }
> >> -        qemu_mutex_unlock_iothread();
> >>  
> >>          run_ret = kvm_vcpu_ioctl(cpu, KVM_RUN, 0);
> >>  
> >> -        qemu_mutex_lock_iothread();
> >>          attrs = kvm_arch_post_run(cpu, run);
> >>  
> >>          if (run_ret < 0) {
> >> @@ -1836,20 +1836,24 @@ int kvm_cpu_exec(CPUState *cpu)
> >>          switch (run->exit_reason) {
> >>          case KVM_EXIT_IO:
> >>              DPRINTF("handle_io\n");
> >> +            qemu_mutex_lock_iothread();
> >>              kvm_handle_io(run->io.port, attrs,
> >>                            (uint8_t *)run + run->io.data_offset,
> >>                            run->io.direction,
> >>                            run->io.size,
> >>                            run->io.count);
> >> +            qemu_mutex_unlock_iothread();
> >>              ret = 0;
> >>              break;
> >>          case KVM_EXIT_MMIO:
> >>              DPRINTF("handle_mmio\n");
> >> +            qemu_mutex_lock_iothread();
> >>              address_space_rw(&address_space_memory,
> >>                               run->mmio.phys_addr, attrs,
> >>                               run->mmio.data,
> >>                               run->mmio.len,
> >>                               run->mmio.is_write);
> >> +            qemu_mutex_unlock_iothread();
> >>              ret = 0;
> >>              break;
> >>          case KVM_EXIT_IRQ_WINDOW_OPEN:
> >> @@ -1858,7 +1862,9 @@ int kvm_cpu_exec(CPUState *cpu)
> >>              break;
> >>          case KVM_EXIT_SHUTDOWN:
> >>              DPRINTF("shutdown\n");
> >> +            qemu_mutex_lock_iothread();
> >>              qemu_system_reset_request();
> >> +            qemu_mutex_unlock_iothread();
> >>              ret = EXCP_INTERRUPT;
> >>              break;
> >>          case KVM_EXIT_UNKNOWN:
> > 
> > More context:
> > 
> >            fprintf(stderr, "KVM: unknown exit, hardware reason %" PRIx64 "\n",
> >                    (uint64_t)run->hw.hardware_exit_reason);
> >            ret = -1;
> >            break;
> >        case KVM_EXIT_INTERNAL_ERROR:
> > *          ret = kvm_handle_internal_error(cpu, run);
> 
> This one only accesses data internal to the VCPU thread.
> 
> >            break;
> >        case KVM_EXIT_SYSTEM_EVENT:
> >            switch (run->system_event.type) {
> >            case KVM_SYSTEM_EVENT_SHUTDOWN:
> > *              qemu_system_shutdown_request();
> >                ret = EXCP_INTERRUPT;
> >                break;
> >            case KVM_SYSTEM_EVENT_RESET:
> > *              qemu_system_reset_request();
> 
> These two are thread-safe.

But above you add lock/unlock around the qemu_system_reset_request() under
KVM_EXIT_SHUTDOWN.  What's different?

Fam

> 
> Paolo
> 
> >                ret = EXCP_INTERRUPT;
> >>              break;
> >>          default:
> >>              DPRINTF("kvm_arch_handle_exit\n");
> >> +            qemu_mutex_lock_iothread();
> >>              ret = kvm_arch_handle_exit(cpu, run);
> >> +            qemu_mutex_unlock_iothread();
> >>              break;
> >>          }
> >>      } while (ret == 0);
> >>  
> >> +    qemu_mutex_lock_iothread();
> >> +
> >>      if (ret < 0) {
> >>          cpu_dump_state(cpu, stderr, fprintf, CPU_DUMP_CODE);
> >>          vm_stop(RUN_STATE_INTERNAL_ERROR);
> > 
> > Could you explain why above three "*" calls are safe?
> > 
> > Fam
> > 
> > 

* Re: [Qemu-devel] [PATCH 6/9] kvm: First step to push iothread lock out of inner run loop
  2015-06-23  9:26   ` Fam Zheng
@ 2015-06-23  9:29     ` Paolo Bonzini
  2015-06-23  9:45       ` Fam Zheng
  0 siblings, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2015-06-23  9:29 UTC (permalink / raw)
  To: Fam Zheng; +Cc: jan.kiszka, qemu-devel



On 23/06/2015 11:26, Fam Zheng wrote:
> On Thu, 06/18 18:47, Paolo Bonzini wrote:
>> From: Jan Kiszka <jan.kiszka@siemens.com>
>>
>> This opens the path to get rid of the iothread lock on vmexits in KVM
> >> mode. On x86, the in-kernel irqchip has to be used because we otherwise
>> need to synchronize APIC and other per-cpu state accesses that could be
>> changed concurrently.
>>
>> s390x and ARM should be fine without specific locking as their
>> pre/post-run callbacks are empty. MIPS and POWER require locking for
>> the pre-run callback.
>>
>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>  kvm-all.c         | 14 ++++++++++++--
>>  target-i386/kvm.c | 18 ++++++++++++++++++
>>  target-mips/kvm.c |  4 ++++
>>  target-ppc/kvm.c  |  4 ++++
>>  4 files changed, 38 insertions(+), 2 deletions(-)
>>
>> diff --git a/kvm-all.c b/kvm-all.c
>> index b2b1bc3..2bd8e9b 100644
>> --- a/kvm-all.c
>> +++ b/kvm-all.c
>> @@ -1795,6 +1795,8 @@ int kvm_cpu_exec(CPUState *cpu)
>>          return EXCP_HLT;
>>      }
>>  
>> +    qemu_mutex_unlock_iothread();
>> +
>>      do {
>>          MemTxAttrs attrs;
>>  
>> @@ -1813,11 +1815,9 @@ int kvm_cpu_exec(CPUState *cpu)
>>               */
>>              qemu_cpu_kick_self();
>>          }
>> -        qemu_mutex_unlock_iothread();
>>  
>>          run_ret = kvm_vcpu_ioctl(cpu, KVM_RUN, 0);
>>  
>> -        qemu_mutex_lock_iothread();
>>          attrs = kvm_arch_post_run(cpu, run);
>>  
>>          if (run_ret < 0) {
>> @@ -1836,20 +1836,24 @@ int kvm_cpu_exec(CPUState *cpu)
>>          switch (run->exit_reason) {
>>          case KVM_EXIT_IO:
>>              DPRINTF("handle_io\n");
>> +            qemu_mutex_lock_iothread();
>>              kvm_handle_io(run->io.port, attrs,
>>                            (uint8_t *)run + run->io.data_offset,
>>                            run->io.direction,
>>                            run->io.size,
>>                            run->io.count);
>> +            qemu_mutex_unlock_iothread();
>>              ret = 0;
>>              break;
>>          case KVM_EXIT_MMIO:
>>              DPRINTF("handle_mmio\n");
>> +            qemu_mutex_lock_iothread();
>>              address_space_rw(&address_space_memory,
>>                               run->mmio.phys_addr, attrs,
>>                               run->mmio.data,
>>                               run->mmio.len,
>>                               run->mmio.is_write);
>> +            qemu_mutex_unlock_iothread();
>>              ret = 0;
>>              break;
>>          case KVM_EXIT_IRQ_WINDOW_OPEN:
>> @@ -1858,7 +1862,9 @@ int kvm_cpu_exec(CPUState *cpu)
>>              break;
>>          case KVM_EXIT_SHUTDOWN:
>>              DPRINTF("shutdown\n");
>> +            qemu_mutex_lock_iothread();
>>              qemu_system_reset_request();
>> +            qemu_mutex_unlock_iothread();
>>              ret = EXCP_INTERRUPT;
>>              break;
>>          case KVM_EXIT_UNKNOWN:
> 
> More context:
> 
>            fprintf(stderr, "KVM: unknown exit, hardware reason %" PRIx64 "\n",
>                    (uint64_t)run->hw.hardware_exit_reason);
>            ret = -1;
>            break;
>        case KVM_EXIT_INTERNAL_ERROR:
> *          ret = kvm_handle_internal_error(cpu, run);

This one only accesses data internal to the VCPU thread.

>            break;
>        case KVM_EXIT_SYSTEM_EVENT:
>            switch (run->system_event.type) {
>            case KVM_SYSTEM_EVENT_SHUTDOWN:
> *              qemu_system_shutdown_request();
>                ret = EXCP_INTERRUPT;
>                break;
>            case KVM_SYSTEM_EVENT_RESET:
> *              qemu_system_reset_request();

These two are thread-safe.

Paolo

>                ret = EXCP_INTERRUPT;
>>              break;
>>          default:
>>              DPRINTF("kvm_arch_handle_exit\n");
>> +            qemu_mutex_lock_iothread();
>>              ret = kvm_arch_handle_exit(cpu, run);
>> +            qemu_mutex_unlock_iothread();
>>              break;
>>          }
>>      } while (ret == 0);
>>  
>> +    qemu_mutex_lock_iothread();
>> +
>>      if (ret < 0) {
>>          cpu_dump_state(cpu, stderr, fprintf, CPU_DUMP_CODE);
>>          vm_stop(RUN_STATE_INTERNAL_ERROR);
> 
> Could you explain why the three "*" calls above are safe?
> 
> Fam
> 
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread
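
To make Paolo's distinction concrete -- data internal to the VCPU thread
versus globally visible state -- here is a hedged C sketch.  The
ExampleVCPU type and example_* names are invented; only the two iothread
lock functions are real QEMU symbols.

    /* Real QEMU entry points, declared so the sketch stands alone: */
    extern void qemu_mutex_lock_iothread(void);
    extern void qemu_mutex_unlock_iothread(void);

    /* Invented for the example: state owned by a single VCPU thread. */
    typedef struct ExampleVCPU {
        int internal_error_count;
    } ExampleVCPU;

    /* Invented for the example: device state visible to all threads. */
    static int example_device_reg;

    /* No BQL needed: only the owning VCPU thread touches this data,
     * which is the situation of kvm_handle_internal_error() above. */
    void example_record_internal_error(ExampleVCPU *cpu)
    {
        cpu->internal_error_count++;
    }

    /* BQL needed: the main loop and other VCPUs can reach this state. */
    void example_mmio_write(int val)
    {
        qemu_mutex_lock_iothread();
        example_device_reg = val;
        qemu_mutex_unlock_iothread();
    }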

* Re: [Qemu-devel] [PATCH 6/9] kvm: First step to push iothread lock out of inner run loop
  2015-06-18 16:47 ` [Qemu-devel] [PATCH 6/9] kvm: First step to push iothread lock out of inner run loop Paolo Bonzini
  2015-06-18 18:19   ` Christian Borntraeger
@ 2015-06-23  9:26   ` Fam Zheng
  2015-06-23  9:29     ` Paolo Bonzini
  1 sibling, 1 reply; 17+ messages in thread
From: Fam Zheng @ 2015-06-23  9:26 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: jan.kiszka, qemu-devel

On Thu, 06/18 18:47, Paolo Bonzini wrote:
> From: Jan Kiszka <jan.kiszka@siemens.com>
> 
> This opens the path to get rid of the iothread lock on vmexits in KVM
> mode. On x86, the in-kernel irqchip has to be used because we otherwise
> need to synchronize APIC and other per-cpu state accesses that could be
> changed concurrently.
> 
> s390x and ARM should be fine without specific locking as their
> pre/post-run callbacks are empty. MIPS and POWER require locking for
> the pre-run callback.
> 
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  kvm-all.c         | 14 ++++++++++++--
>  target-i386/kvm.c | 18 ++++++++++++++++++
>  target-mips/kvm.c |  4 ++++
>  target-ppc/kvm.c  |  4 ++++
>  4 files changed, 38 insertions(+), 2 deletions(-)
> 
> diff --git a/kvm-all.c b/kvm-all.c
> index b2b1bc3..2bd8e9b 100644
> --- a/kvm-all.c
> +++ b/kvm-all.c
> @@ -1795,6 +1795,8 @@ int kvm_cpu_exec(CPUState *cpu)
>          return EXCP_HLT;
>      }
>  
> +    qemu_mutex_unlock_iothread();
> +
>      do {
>          MemTxAttrs attrs;
>  
> @@ -1813,11 +1815,9 @@ int kvm_cpu_exec(CPUState *cpu)
>               */
>              qemu_cpu_kick_self();
>          }
> -        qemu_mutex_unlock_iothread();
>  
>          run_ret = kvm_vcpu_ioctl(cpu, KVM_RUN, 0);
>  
> -        qemu_mutex_lock_iothread();
>          attrs = kvm_arch_post_run(cpu, run);
>  
>          if (run_ret < 0) {
> @@ -1836,20 +1836,24 @@ int kvm_cpu_exec(CPUState *cpu)
>          switch (run->exit_reason) {
>          case KVM_EXIT_IO:
>              DPRINTF("handle_io\n");
> +            qemu_mutex_lock_iothread();
>              kvm_handle_io(run->io.port, attrs,
>                            (uint8_t *)run + run->io.data_offset,
>                            run->io.direction,
>                            run->io.size,
>                            run->io.count);
> +            qemu_mutex_unlock_iothread();
>              ret = 0;
>              break;
>          case KVM_EXIT_MMIO:
>              DPRINTF("handle_mmio\n");
> +            qemu_mutex_lock_iothread();
>              address_space_rw(&address_space_memory,
>                               run->mmio.phys_addr, attrs,
>                               run->mmio.data,
>                               run->mmio.len,
>                               run->mmio.is_write);
> +            qemu_mutex_unlock_iothread();
>              ret = 0;
>              break;
>          case KVM_EXIT_IRQ_WINDOW_OPEN:
> @@ -1858,7 +1862,9 @@ int kvm_cpu_exec(CPUState *cpu)
>              break;
>          case KVM_EXIT_SHUTDOWN:
>              DPRINTF("shutdown\n");
> +            qemu_mutex_lock_iothread();
>              qemu_system_reset_request();
> +            qemu_mutex_unlock_iothread();
>              ret = EXCP_INTERRUPT;
>              break;
>          case KVM_EXIT_UNKNOWN:

More context:

           fprintf(stderr, "KVM: unknown exit, hardware reason %" PRIx64 "\n",
                   (uint64_t)run->hw.hardware_exit_reason);
           ret = -1;
           break;
       case KVM_EXIT_INTERNAL_ERROR:
*          ret = kvm_handle_internal_error(cpu, run);
           break;
       case KVM_EXIT_SYSTEM_EVENT:
           switch (run->system_event.type) {
           case KVM_SYSTEM_EVENT_SHUTDOWN:
*              qemu_system_shutdown_request();
               ret = EXCP_INTERRUPT;
               break;
           case KVM_SYSTEM_EVENT_RESET:
*              qemu_system_reset_request();
               ret = EXCP_INTERRUPT;
>              break;
>          default:
>              DPRINTF("kvm_arch_handle_exit\n");
> +            qemu_mutex_lock_iothread();
>              ret = kvm_arch_handle_exit(cpu, run);
> +            qemu_mutex_unlock_iothread();
>              break;
>          }
>      } while (ret == 0);
>  
> +    qemu_mutex_lock_iothread();
> +
>      if (ret < 0) {
>          cpu_dump_state(cpu, stderr, fprintf, CPU_DUMP_CODE);
>          vm_stop(RUN_STATE_INTERNAL_ERROR);

Could you explain why the three "*" calls above are safe?

Fam

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 6/9] kvm: First step to push iothread lock out of inner run loop
  2015-06-18 16:47 ` [Qemu-devel] [PATCH 6/9] kvm: First step to push iothread lock out of inner run loop Paolo Bonzini
@ 2015-06-18 18:19   ` Christian Borntraeger
  2015-06-23  9:26   ` Fam Zheng
  1 sibling, 0 replies; 17+ messages in thread
From: Christian Borntraeger @ 2015-06-18 18:19 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel; +Cc: jan.kiszka

On 18.06.2015 at 18:47, Paolo Bonzini wrote:

> @@ -1887,11 +1893,15 @@ int kvm_cpu_exec(CPUState *cpu)
>              break;
>          default:
>              DPRINTF("kvm_arch_handle_exit\n");
> +            qemu_mutex_lock_iothread();
>              ret = kvm_arch_handle_exit(cpu, run);
> +            qemu_mutex_unlock_iothread();
>              break;
>          }
>      } while (ret == 0);
> 


The next logical step would be to push the locking down: get rid of
these two new lines and take the lock inside every kvm_arch_handle_exit
function instead. That would let arch maintainers do their part of the
lock breaking (a sketch of the idea follows below). It can be an add-on
patch, though.

Christian

^ permalink raw reply	[flat|nested] 17+ messages in thread
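
A rough sketch of the push-down Christian suggests, assuming
hypothetical helper functions and using KVM_EXIT_DEBUG merely as a
stand-in for a VCPU-local exit; this illustrates the idea and is not a
tested patch.

    #include <linux/kvm.h>            /* struct kvm_run, KVM_EXIT_* */

    typedef struct CPUState CPUState; /* opaque for the sketch */

    extern void qemu_mutex_lock_iothread(void);
    extern void qemu_mutex_unlock_iothread(void);

    /* Hypothetical helpers, assumed to exist for the illustration: */
    extern int handle_vcpu_local_exit(CPUState *cpu, struct kvm_run *run);
    extern int handle_device_exit(CPUState *cpu, struct kvm_run *run);

    /* With the push-down, kvm_cpu_exec() calls this hook unlocked and
     * each architecture takes the BQL only where it really needs it. */
    int kvm_arch_handle_exit(CPUState *cpu, struct kvm_run *run)
    {
        int ret;

        switch (run->exit_reason) {
        case KVM_EXIT_DEBUG:          /* VCPU-local: stay unlocked */
            ret = handle_vcpu_local_exit(cpu, run);
            break;
        default:                      /* may touch global device state */
            qemu_mutex_lock_iothread();
            ret = handle_device_exit(cpu, run);
            qemu_mutex_unlock_iothread();
            break;
        }
        return ret;
    }

Each architecture could then shrink or split its locked region
independently, which is the per-arch lock breaking Christian has in
mind.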

* [Qemu-devel] [PATCH 6/9] kvm: First step to push iothread lock out of inner run loop
  2015-06-18 16:47 [Qemu-devel] [PATCH for-2.4 0/9] KVM: Do I/O outside BQL whenever possible Paolo Bonzini
@ 2015-06-18 16:47 ` Paolo Bonzini
  2015-06-18 18:19   ` Christian Borntraeger
  2015-06-23  9:26   ` Fam Zheng
  0 siblings, 2 replies; 17+ messages in thread
From: Paolo Bonzini @ 2015-06-18 16:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: jan.kiszka

From: Jan Kiszka <jan.kiszka@siemens.com>

This opens the path to get rid of the iothread lock on vmexits in KVM
mode. On x86, the in-kernel irqchip has to be used because we otherwise
need to synchronize APIC and other per-cpu state accesses that could be
changed concurrently.

s390x and ARM should be fine without specific locking as their
pre/post-run callbacks are empty. MIPS and POWER require locking for
the pre-run callback.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 kvm-all.c         | 14 ++++++++++++--
 target-i386/kvm.c | 18 ++++++++++++++++++
 target-mips/kvm.c |  4 ++++
 target-ppc/kvm.c  |  4 ++++
 4 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index b2b1bc3..2bd8e9b 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1795,6 +1795,8 @@ int kvm_cpu_exec(CPUState *cpu)
         return EXCP_HLT;
     }
 
+    qemu_mutex_unlock_iothread();
+
     do {
         MemTxAttrs attrs;
 
@@ -1813,11 +1815,9 @@ int kvm_cpu_exec(CPUState *cpu)
              */
             qemu_cpu_kick_self();
         }
-        qemu_mutex_unlock_iothread();
 
         run_ret = kvm_vcpu_ioctl(cpu, KVM_RUN, 0);
 
-        qemu_mutex_lock_iothread();
         attrs = kvm_arch_post_run(cpu, run);
 
         if (run_ret < 0) {
@@ -1836,20 +1836,24 @@ int kvm_cpu_exec(CPUState *cpu)
         switch (run->exit_reason) {
         case KVM_EXIT_IO:
             DPRINTF("handle_io\n");
+            qemu_mutex_lock_iothread();
             kvm_handle_io(run->io.port, attrs,
                           (uint8_t *)run + run->io.data_offset,
                           run->io.direction,
                           run->io.size,
                           run->io.count);
+            qemu_mutex_unlock_iothread();
             ret = 0;
             break;
         case KVM_EXIT_MMIO:
             DPRINTF("handle_mmio\n");
+            qemu_mutex_lock_iothread();
             address_space_rw(&address_space_memory,
                              run->mmio.phys_addr, attrs,
                              run->mmio.data,
                              run->mmio.len,
                              run->mmio.is_write);
+            qemu_mutex_unlock_iothread();
             ret = 0;
             break;
         case KVM_EXIT_IRQ_WINDOW_OPEN:
@@ -1858,7 +1862,9 @@ int kvm_cpu_exec(CPUState *cpu)
             break;
         case KVM_EXIT_SHUTDOWN:
             DPRINTF("shutdown\n");
+            qemu_mutex_lock_iothread();
             qemu_system_reset_request();
+            qemu_mutex_unlock_iothread();
             ret = EXCP_INTERRUPT;
             break;
         case KVM_EXIT_UNKNOWN:
@@ -1887,11 +1893,15 @@ int kvm_cpu_exec(CPUState *cpu)
             break;
         default:
             DPRINTF("kvm_arch_handle_exit\n");
+            qemu_mutex_lock_iothread();
             ret = kvm_arch_handle_exit(cpu, run);
+            qemu_mutex_unlock_iothread();
             break;
         }
     } while (ret == 0);
 
+    qemu_mutex_lock_iothread();
+
     if (ret < 0) {
         cpu_dump_state(cpu, stderr, fprintf, CPU_DUMP_CODE);
         vm_stop(RUN_STATE_INTERNAL_ERROR);
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index ca2da84..8c2a891 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -2192,7 +2192,10 @@ void kvm_arch_pre_run(CPUState *cpu, struct kvm_run *run)
 
     /* Inject NMI */
     if (cpu->interrupt_request & CPU_INTERRUPT_NMI) {
+        qemu_mutex_lock_iothread();
         cpu->interrupt_request &= ~CPU_INTERRUPT_NMI;
+        qemu_mutex_unlock_iothread();
+
         DPRINTF("injected NMI\n");
         ret = kvm_vcpu_ioctl(cpu, KVM_NMI);
         if (ret < 0) {
@@ -2201,6 +2204,10 @@ void kvm_arch_pre_run(CPUState *cpu, struct kvm_run *run)
         }
     }
 
+    if (!kvm_irqchip_in_kernel()) {
+        qemu_mutex_lock_iothread();
+    }
+
     /* Force the VCPU out of its inner loop to process any INIT requests
      * or (for userspace APIC, but it is cheap to combine the checks here)
      * pending TPR access reports.
@@ -2244,6 +2251,8 @@ void kvm_arch_pre_run(CPUState *cpu, struct kvm_run *run)
 
         DPRINTF("setting tpr\n");
         run->cr8 = cpu_get_apic_tpr(x86_cpu->apic_state);
+
+        qemu_mutex_unlock_iothread();
     }
 }
 
@@ -2257,8 +2266,17 @@ MemTxAttrs kvm_arch_post_run(CPUState *cpu, struct kvm_run *run)
     } else {
         env->eflags &= ~IF_MASK;
     }
+
+    /* We need to protect the apic state against concurrent accesses from
+     * different threads in case the userspace irqchip is used. */
+    if (!kvm_irqchip_in_kernel()) {
+        qemu_mutex_lock_iothread();
+    }
     cpu_set_apic_tpr(x86_cpu->apic_state, run->cr8);
     cpu_set_apic_base(x86_cpu->apic_state, run->apic_base);
+    if (!kvm_irqchip_in_kernel()) {
+        qemu_mutex_unlock_iothread();
+    }
     return MEMTXATTRS_UNSPECIFIED;
 }
 
diff --git a/target-mips/kvm.c b/target-mips/kvm.c
index 948619f..7d2293d 100644
--- a/target-mips/kvm.c
+++ b/target-mips/kvm.c
@@ -99,6 +99,8 @@ void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
     int r;
     struct kvm_mips_interrupt intr;
 
+    qemu_mutex_lock_iothread();
+
     if ((cs->interrupt_request & CPU_INTERRUPT_HARD) &&
             cpu_mips_io_interrupts_pending(cpu)) {
         intr.cpu = -1;
@@ -109,6 +111,8 @@ void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
                          __func__, cs->cpu_index, intr.irq);
         }
     }
+
+    qemu_mutex_unlock_iothread();
 }
 
 MemTxAttrs kvm_arch_post_run(CPUState *cs, struct kvm_run *run)
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index afb4696..1fa1529 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -1242,6 +1242,8 @@ void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
     int r;
     unsigned irq;
 
+    qemu_mutex_lock_iothread();
+
     /* PowerPC QEMU tracks the various core input pins (interrupt, critical
      * interrupt, reset, etc) in PPC-specific env->irq_input_state. */
     if (!cap_interrupt_level &&
@@ -1269,6 +1271,8 @@ void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
     /* We don't know if there are more interrupts pending after this. However,
      * the guest will return to userspace in the course of handling this one
      * anyways, so we will get a chance to deliver the rest. */
+
+    qemu_mutex_unlock_iothread();
 }
 
 MemTxAttrs kvm_arch_post_run(CPUState *cs, struct kvm_run *run)
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2015-07-02  8:21 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-02  8:20 [Qemu-devel] [PATCH for-2.4 0/9 v3] KVM: Do I/O outside BQL whenever possible Paolo Bonzini
2015-07-02  8:20 ` [Qemu-devel] [PATCH 1/9] main-loop: use qemu_mutex_lock_iothread consistently Paolo Bonzini
2015-07-02  8:20 ` [Qemu-devel] [PATCH 2/9] main-loop: introduce qemu_mutex_iothread_locked Paolo Bonzini
2015-07-02  8:20 ` [Qemu-devel] [PATCH 3/9] memory: Add global-locking property to memory regions Paolo Bonzini
2015-07-02  8:20 ` [Qemu-devel] [PATCH 4/9] exec: pull qemu_flush_coalesced_mmio_buffer() into address_space_rw/ld*/st* Paolo Bonzini
2015-07-02  8:20 ` [Qemu-devel] [PATCH 5/9] memory: let address_space_rw/ld*/st* run outside the BQL Paolo Bonzini
2015-07-02  8:20 ` [Qemu-devel] [PATCH 6/9] kvm: First step to push iothread lock out of inner run loop Paolo Bonzini
2015-07-02  8:20 ` [Qemu-devel] [PATCH 7/9] kvm: Switch to unlocked PIO Paolo Bonzini
2015-07-02  8:20 ` [Qemu-devel] [PATCH 8/9] acpi: mark PMTIMER as unlocked Paolo Bonzini
2015-07-02  8:20 ` [Qemu-devel] [PATCH 9/9] kvm: Switch to unlocked MMIO Paolo Bonzini
  -- strict thread matches above, loose matches on Subject: below --
2015-06-24 16:25 [Qemu-devel] [PATCH for-2.4 v2 0/9] KVM: Do I/O outside BQL whenever possible Paolo Bonzini
2015-06-24 16:25 ` [Qemu-devel] [PATCH 6/9] kvm: First step to push iothread lock out of inner run loop Paolo Bonzini
2015-06-18 16:47 [Qemu-devel] [PATCH for-2.4 0/9] KVM: Do I/O outside BQL whenever possible Paolo Bonzini
2015-06-18 16:47 ` [Qemu-devel] [PATCH 6/9] kvm: First step to push iothread lock out of inner run loop Paolo Bonzini
2015-06-18 18:19   ` Christian Borntraeger
2015-06-23  9:26   ` Fam Zheng
2015-06-23  9:29     ` Paolo Bonzini
2015-06-23  9:45       ` Fam Zheng
2015-06-23  9:49         ` Paolo Bonzini
