* [Qemu-devel] [PATCH for-2.4 v2 0/9] KVM: Do I/O outside BQL whenever possible
@ 2015-06-24 16:25 Paolo Bonzini
  2015-06-24 16:25 ` [Qemu-devel] [PATCH 1/9] main-loop: use qemu_mutex_lock_iothread consistently Paolo Bonzini
                   ` (8 more replies)
  0 siblings, 9 replies; 21+ messages in thread
From: Paolo Bonzini @ 2015-06-24 16:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: borntraeger, famz

This is the rebased and updated version of the patches I posted a
couple months ago (well before soft freeze :)).

This version introduces a qemu_mutex_iothread_locked() primitive
in order to avoid recursive locking of the BQL.  The previous
attempts, which used functions such as address_space_rw_unlocked,
required the introduction of a multitude of *_unlocked functions
(e.g. address_space_ldl_unlocked or dma_buf_write_unlocked).
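
As an aside, a minimal sketch of the pattern this primitive enables
(the helper name below is made up, not part of the series):

    /* Take the BQL only if the caller does not already hold it, so the
     * same function works for locked and unlocked callers alike. */
    static void do_locked_work(void)
    {
        bool release_lock = false;

        if (!qemu_mutex_iothread_locked()) {
            qemu_mutex_lock_iothread();
            release_lock = true;
        }

        /* ... code that needs the BQL ... */

        if (release_lock) {
            qemu_mutex_unlock_iothread();
        }
    }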

Note that adding unlocked access to TCG would require reverting
commit 3b64349 (memory: Replace io_mem_read/write with
memory_region_dispatch_read/write, 2015-04-26).

Paolo

v1->v2: fix botched merge in patch 4 [Fam],
	correct locking in patch 6 [Fam],
	push down to kvm_arch_handle_exit in patch 6 [Christian]

Jan Kiszka (4):
  memory: Add global-locking property to memory regions
  memory: let address_space_rw/ld*/st* run outside the BQL
  kvm: First step to push iothread lock out of inner run loop
  kvm: Switch to unlocked PIO

Paolo Bonzini (5):
  main-loop: use qemu_mutex_lock_iothread consistently
  main-loop: introduce qemu_mutex_iothread_locked
  exec: pull qemu_flush_coalesced_mmio_buffer() into
    address_space_rw/ld*/st*
  acpi: mark PMTIMER as unlocked
  kvm: Switch to unlocked MMIO

 cpus.c                   | 19 +++++++++++--
 exec.c                   | 72 ++++++++++++++++++++++++++++++++++++++++++++++++
 hw/acpi/core.c           |  1 +
 include/exec/memory.h    | 26 +++++++++++++++++
 include/qemu/main-loop.h | 10 +++++++
 kvm-all.c                |  8 ++++--
 memory.c                 | 23 ++++++++--------
 stubs/iothread-lock.c    |  5 ++++
 target-i386/kvm.c        | 24 ++++++++++++++++
 target-mips/kvm.c        |  4 +++
 target-ppc/kvm.c         |  7 +++++
 target-s390x/kvm.c       |  3 ++
 12 files changed, 185 insertions(+), 17 deletions(-)

-- 
1.8.3.1

* [Qemu-devel] [PATCH 1/9] main-loop: use qemu_mutex_lock_iothread consistently
  2015-06-24 16:25 [Qemu-devel] [PATCH for-2.4 v2 0/9] KVM: Do I/O outside BQL whenever possible Paolo Bonzini
@ 2015-06-24 16:25 ` Paolo Bonzini
  2015-06-25  3:39   ` Fam Zheng
  2015-06-24 16:25 ` [Qemu-devel] [PATCH 2/9] main-loop: introduce qemu_mutex_iothread_locked Paolo Bonzini
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2015-06-24 16:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: borntraeger, famz, Frederic Konrad

The next patch will require the BQL to be always taken with
qemu_mutex_lock_iothread(), while right now this isn't the case.

Outside TCG mode this is not a problem.  In TCG mode, we need to be
careful and avoid the "prod out of compiled code" step if already
in a VCPU thread.  This is easily done with a check on current_cpu,
i.e. qemu_in_vcpu_thread().

Hopefully, multithreaded TCG will get rid of the whole logic to kick
VCPUs whenever an I/O event occurs!

Cc: Frederic Konrad <fred.konrad@greensocs.com>
Message-Id: <1434646046-27150-2-git-send-email-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 cpus.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/cpus.c b/cpus.c
index b85fb5f..02cca5d 100644
--- a/cpus.c
+++ b/cpus.c
@@ -953,7 +953,7 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
     CPUState *cpu = arg;
     int r;
 
-    qemu_mutex_lock(&qemu_global_mutex);
+    qemu_mutex_lock_iothread();
     qemu_thread_get_self(cpu->thread);
     cpu->thread_id = qemu_get_thread_id();
     cpu->can_do_io = 1;
@@ -1033,10 +1033,10 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
 {
     CPUState *cpu = arg;
 
+    qemu_mutex_lock_iothread();
     qemu_tcg_init_cpu_signals();
     qemu_thread_get_self(cpu->thread);
 
-    qemu_mutex_lock(&qemu_global_mutex);
     CPU_FOREACH(cpu) {
         cpu->thread_id = qemu_get_thread_id();
         cpu->created = true;
@@ -1148,7 +1148,11 @@ bool qemu_in_vcpu_thread(void)
 void qemu_mutex_lock_iothread(void)
 {
     atomic_inc(&iothread_requesting_mutex);
-    if (!tcg_enabled() || !first_cpu || !first_cpu->thread) {
+    /* In the simple case there is no need to bump the VCPU thread out of
+     * TCG code execution.
+     */
+    if (!tcg_enabled() || qemu_in_vcpu_thread() ||
+        !first_cpu || !first_cpu->thread) {
         qemu_mutex_lock(&qemu_global_mutex);
         atomic_dec(&iothread_requesting_mutex);
     } else {
-- 
1.8.3.1

* [Qemu-devel] [PATCH 2/9] main-loop: introduce qemu_mutex_iothread_locked
  2015-06-24 16:25 [Qemu-devel] [PATCH for-2.4 v2 0/9] KVM: Do I/O outside BQL whenever possible Paolo Bonzini
  2015-06-24 16:25 ` [Qemu-devel] [PATCH 1/9] main-loop: use qemu_mutex_lock_iothread consistently Paolo Bonzini
@ 2015-06-24 16:25 ` Paolo Bonzini
  2015-06-24 16:25 ` [Qemu-devel] [PATCH 3/9] memory: Add global-locking property to memory regions Paolo Bonzini
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Paolo Bonzini @ 2015-06-24 16:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: borntraeger, famz, Frederic Konrad

This function will be used to avoid recursive locking of the iothread lock
whenever address_space_rw/ld*/st* are called with the BQL held, which is
almost always the case.

Tracking whether the iothread is owned is very cheap (just use a TLS
variable) but requires some care because now the lock must always be
taken with qemu_mutex_lock_iothread().  Previously this wasn't the case.
Outside TCG mode this is not a problem.  In TCG mode, we need to be
careful and avoid the "prod out of compiled code" step if already
in a VCPU thread.  This is easily done with a check on current_cpu,
i.e. qemu_in_vcpu_thread().

Hopefully, multithreaded TCG will get rid of the whole logic to kick
VCPUs whenever an I/O event occurs!

Cc: Frederic Konrad <fred.konrad@greensocs.com>
Message-Id: <1434646046-27150-3-git-send-email-pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 cpus.c                   |  9 +++++++++
 include/qemu/main-loop.h | 10 ++++++++++
 stubs/iothread-lock.c    |  5 +++++
 3 files changed, 24 insertions(+)

diff --git a/cpus.c b/cpus.c
index 02cca5d..29fc2dd 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1145,6 +1145,13 @@ bool qemu_in_vcpu_thread(void)
     return current_cpu && qemu_cpu_is_self(current_cpu);
 }
 
+static __thread bool iothread_locked = false;
+
+bool qemu_mutex_iothread_locked(void)
+{
+    return iothread_locked;
+}
+
 void qemu_mutex_lock_iothread(void)
 {
     atomic_inc(&iothread_requesting_mutex);
@@ -1163,10 +1170,12 @@ void qemu_mutex_lock_iothread(void)
         atomic_dec(&iothread_requesting_mutex);
         qemu_cond_broadcast(&qemu_io_proceeded_cond);
     }
+    iothread_locked = true;
 }
 
 void qemu_mutex_unlock_iothread(void)
 {
+    iothread_locked = false;
     qemu_mutex_unlock(&qemu_global_mutex);
 }
 
diff --git a/include/qemu/main-loop.h b/include/qemu/main-loop.h
index 0f4a0fd..bc18ca3 100644
--- a/include/qemu/main-loop.h
+++ b/include/qemu/main-loop.h
@@ -223,6 +223,16 @@ int qemu_add_child_watch(pid_t pid);
 #endif
 
 /**
+ * qemu_mutex_iothread_locked: Return lock status of the main loop mutex.
+ *
+ * The main loop mutex is the coarsest lock in QEMU, and as such it
+ * must always be taken outside other locks.  This function helps
+ * functions take different paths depending on whether the current
+ * thread is running within the main loop mutex.
+ */
+bool qemu_mutex_iothread_locked(void);
+
+/**
  * qemu_mutex_lock_iothread: Lock the main loop mutex.
  *
  * This function locks the main loop mutex.  The mutex is taken by
diff --git a/stubs/iothread-lock.c b/stubs/iothread-lock.c
index 5d8aca1..dda6f6b 100644
--- a/stubs/iothread-lock.c
+++ b/stubs/iothread-lock.c
@@ -1,6 +1,11 @@
 #include "qemu-common.h"
 #include "qemu/main-loop.h"
 
+bool qemu_mutex_iothread_locked(void)
+{
+    return true;
+}
+
 void qemu_mutex_lock_iothread(void)
 {
 }
-- 
1.8.3.1

* [Qemu-devel] [PATCH 3/9] memory: Add global-locking property to memory regions
  2015-06-24 16:25 [Qemu-devel] [PATCH for-2.4 v2 0/9] KVM: Do I/O outside BQL whenever possible Paolo Bonzini
  2015-06-24 16:25 ` [Qemu-devel] [PATCH 1/9] main-loop: use qemu_mutex_lock_iothread consistently Paolo Bonzini
  2015-06-24 16:25 ` [Qemu-devel] [PATCH 2/9] main-loop: introduce qemu_mutex_iothread_locked Paolo Bonzini
@ 2015-06-24 16:25 ` Paolo Bonzini
  2015-06-25  3:44   ` Fam Zheng
  2015-06-24 16:25 ` [Qemu-devel] [PATCH 4/9] exec: pull qemu_flush_coalesced_mmio_buffer() into address_space_rw/ld*/st* Paolo Bonzini
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2015-06-24 16:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: borntraeger, Frederic Konrad, famz, Jan Kiszka

From: Jan Kiszka <jan.kiszka@siemens.com>

This introduces the memory region property "global_locking". It is true
by default. By setting it to false, a device model can request BQL-free
dispatching of region accesses to its r/w handlers. The actual BQL
break-up will be provided in a separate patch.
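
As a usage sketch (the device state and ops names below are
hypothetical; patch 8 applies this for real to the ACPI PM timer):

    /* Hedged sketch: s, MyDevState and mydev_ops are made up. */
    memory_region_init_io(&s->iomem, OBJECT(s), &mydev_ops, s,
                          "mydev-mmio", 0x100);
    /* Dispatch accesses to this region without the BQL; the mydev_ops
     * handlers are then responsible for their own locking. */
    memory_region_clear_global_locking(&s->iomem);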

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Frederic Konrad <fred.konrad@greensocs.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1434646046-27150-4-git-send-email-pbonzini@redhat.com>
---
 include/exec/memory.h | 26 ++++++++++++++++++++++++++
 memory.c              | 11 +++++++++++
 2 files changed, 37 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 8ae004e..fc33348 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -180,6 +180,7 @@ struct MemoryRegion {
     bool rom_device;
     bool warning_printed; /* For reservations */
     bool flush_coalesced_mmio;
+    bool global_locking;
     MemoryRegion *alias;
     hwaddr alias_offset;
     int32_t priority;
@@ -825,6 +826,31 @@ void memory_region_set_flush_coalesced(MemoryRegion *mr);
 void memory_region_clear_flush_coalesced(MemoryRegion *mr);
 
 /**
+ * memory_region_set_global_locking: Declares the access processing requires
+ *                                   QEMU's global lock.
+ *
+ * When this is invoked, access to this memory region will be processed while
+ * holding the global lock of QEMU. This is the default behavior of memory
+ * regions.
+ *
+ * @mr: the memory region to be updated.
+ */
+void memory_region_set_global_locking(MemoryRegion *mr);
+
+/**
+ * memory_region_clear_global_locking: Declares that access processing does
+ *                                     not depend on the QEMU global lock.
+ *
+ * By clearing this property, accesses to the memory region will be processed
+ * outside of QEMU's global lock (unless the lock is held when issuing the
+ * access request). In this case, the device model implementing the access
+ * handlers is responsible for synchronization of concurrency.
+ *
+ * @mr: the memory region to be updated.
+ */
+void memory_region_clear_global_locking(MemoryRegion *mr);
+
+/**
  * memory_region_add_eventfd: Request an eventfd to be triggered when a word
  *                            is written to a location.
  *
diff --git a/memory.c b/memory.c
index 3ac0bd2..b0b8860 100644
--- a/memory.c
+++ b/memory.c
@@ -1012,6 +1012,7 @@ static void memory_region_initfn(Object *obj)
     mr->ram_addr = RAM_ADDR_INVALID;
     mr->enabled = true;
     mr->romd_mode = true;
+    mr->global_locking = true;
     mr->destructor = memory_region_destructor_none;
     QTAILQ_INIT(&mr->subregions);
     QTAILQ_INIT(&mr->coalesced);
@@ -1646,6 +1647,16 @@ void memory_region_clear_flush_coalesced(MemoryRegion *mr)
     }
 }
 
+void memory_region_set_global_locking(MemoryRegion *mr)
+{
+    mr->global_locking = true;
+}
+
+void memory_region_clear_global_locking(MemoryRegion *mr)
+{
+    mr->global_locking = false;
+}
+
 void memory_region_add_eventfd(MemoryRegion *mr,
                                hwaddr addr,
                                unsigned size,
-- 
1.8.3.1

* [Qemu-devel] [PATCH 4/9] exec: pull qemu_flush_coalesced_mmio_buffer() into address_space_rw/ld*/st*
  2015-06-24 16:25 [Qemu-devel] [PATCH for-2.4 v2 0/9] KVM: Do I/O outside BQL whenever possible Paolo Bonzini
                   ` (2 preceding siblings ...)
  2015-06-24 16:25 ` [Qemu-devel] [PATCH 3/9] memory: Add global-locking property to memory regions Paolo Bonzini
@ 2015-06-24 16:25 ` Paolo Bonzini
  2015-06-25  4:59   ` Fam Zheng
  2015-06-24 16:25 ` [Qemu-devel] [PATCH 5/9] memory: let address_space_rw/ld*/st* run outside the BQL Paolo Bonzini
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2015-06-24 16:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: borntraeger, famz, Frederic Konrad

As memory_region_read/write_accessor can now also run without the BQL held,
we need to move coalesced MMIO flushing earlier in the dispatch process.

Cc: Frederic Konrad <fred.konrad@greensocs.com>
Message-Id: <1434646046-27150-5-git-send-email-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c   | 21 +++++++++++++++++++++
 memory.c | 12 ------------
 2 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/exec.c b/exec.c
index f7883d2..f2e6603 100644
--- a/exec.c
+++ b/exec.c
@@ -2316,6 +2316,13 @@ static int memory_access_size(MemoryRegion *mr, unsigned l, hwaddr addr)
     return l;
 }
 
+static void prepare_mmio_access(MemoryRegion *mr)
+{
+    if (mr->flush_coalesced_mmio) {
+        qemu_flush_coalesced_mmio_buffer();
+    }
+}
+
 MemTxResult address_space_rw(AddressSpace *as, hwaddr addr, MemTxAttrs attrs,
                              uint8_t *buf, int len, bool is_write)
 {
@@ -2333,6 +2340,7 @@ MemTxResult address_space_rw(AddressSpace *as, hwaddr addr, MemTxAttrs attrs,
 
         if (is_write) {
             if (!memory_access_is_direct(mr, is_write)) {
+                prepare_mmio_access(mr);
                 l = memory_access_size(mr, l, addr1);
                 /* XXX: could force current_cpu to NULL to avoid
                    potential bugs */
@@ -2374,6 +2382,7 @@ MemTxResult address_space_rw(AddressSpace *as, hwaddr addr, MemTxAttrs attrs,
         } else {
             if (!memory_access_is_direct(mr, is_write)) {
                 /* I/O case */
+                prepare_mmio_access(mr);
                 l = memory_access_size(mr, l, addr1);
                 switch (l) {
                 case 8:
@@ -2739,6 +2748,8 @@ static inline uint32_t address_space_ldl_internal(AddressSpace *as, hwaddr addr,
     rcu_read_lock();
     mr = address_space_translate(as, addr, &addr1, &l, false);
     if (l < 4 || !memory_access_is_direct(mr, false)) {
+        prepare_mmio_access(mr);
+
         /* I/O case */
         r = memory_region_dispatch_read(mr, addr1, &val, 4, attrs);
 #if defined(TARGET_WORDS_BIGENDIAN)
@@ -2828,6 +2839,8 @@ static inline uint64_t address_space_ldq_internal(AddressSpace *as, hwaddr addr,
     mr = address_space_translate(as, addr, &addr1, &l,
                                  false);
     if (l < 8 || !memory_access_is_direct(mr, false)) {
+        prepare_mmio_access(mr);
+
         /* I/O case */
         r = memory_region_dispatch_read(mr, addr1, &val, 8, attrs);
 #if defined(TARGET_WORDS_BIGENDIAN)
@@ -2937,6 +2950,8 @@ static inline uint32_t address_space_lduw_internal(AddressSpace *as,
     mr = address_space_translate(as, addr, &addr1, &l,
                                  false);
     if (l < 2 || !memory_access_is_direct(mr, false)) {
+        prepare_mmio_access(mr);
+
         /* I/O case */
         r = memory_region_dispatch_read(mr, addr1, &val, 2, attrs);
 #if defined(TARGET_WORDS_BIGENDIAN)
@@ -3026,6 +3041,8 @@ void address_space_stl_notdirty(AddressSpace *as, hwaddr addr, uint32_t val,
     mr = address_space_translate(as, addr, &addr1, &l,
                                  true);
     if (l < 4 || !memory_access_is_direct(mr, true)) {
+        prepare_mmio_access(mr);
+
         r = memory_region_dispatch_write(mr, addr1, val, 4, attrs);
     } else {
         addr1 += memory_region_get_ram_addr(mr) & TARGET_PAGE_MASK;
@@ -3065,6 +3082,8 @@ static inline void address_space_stl_internal(AddressSpace *as,
     mr = address_space_translate(as, addr, &addr1, &l,
                                  true);
     if (l < 4 || !memory_access_is_direct(mr, true)) {
+        prepare_mmio_access(mr);
+
 #if defined(TARGET_WORDS_BIGENDIAN)
         if (endian == DEVICE_LITTLE_ENDIAN) {
             val = bswap32(val);
@@ -3169,6 +3188,8 @@ static inline void address_space_stw_internal(AddressSpace *as,
     rcu_read_lock();
     mr = address_space_translate(as, addr, &addr1, &l, true);
     if (l < 2 || !memory_access_is_direct(mr, true)) {
+        prepare_mmio_access(mr);
+
 #if defined(TARGET_WORDS_BIGENDIAN)
         if (endian == DEVICE_LITTLE_ENDIAN) {
             val = bswap16(val);
diff --git a/memory.c b/memory.c
index b0b8860..5a0cc66 100644
--- a/memory.c
+++ b/memory.c
@@ -396,9 +396,6 @@ static MemTxResult  memory_region_read_accessor(MemoryRegion *mr,
 {
     uint64_t tmp;
 
-    if (mr->flush_coalesced_mmio) {
-        qemu_flush_coalesced_mmio_buffer();
-    }
     tmp = mr->ops->read(mr->opaque, addr, size);
     trace_memory_region_ops_read(mr, addr, tmp, size);
     *value |= (tmp & mask) << shift;
@@ -416,9 +413,6 @@ static MemTxResult memory_region_read_with_attrs_accessor(MemoryRegion *mr,
     uint64_t tmp = 0;
     MemTxResult r;
 
-    if (mr->flush_coalesced_mmio) {
-        qemu_flush_coalesced_mmio_buffer();
-    }
     r = mr->ops->read_with_attrs(mr->opaque, addr, &tmp, size, attrs);
     trace_memory_region_ops_read(mr, addr, tmp, size);
     *value |= (tmp & mask) << shift;
@@ -451,9 +445,6 @@ static MemTxResult memory_region_write_accessor(MemoryRegion *mr,
 {
     uint64_t tmp;
 
-    if (mr->flush_coalesced_mmio) {
-        qemu_flush_coalesced_mmio_buffer();
-    }
     tmp = (*value >> shift) & mask;
     trace_memory_region_ops_write(mr, addr, tmp, size);
     mr->ops->write(mr->opaque, addr, tmp, size);
@@ -470,9 +461,6 @@ static MemTxResult memory_region_write_with_attrs_accessor(MemoryRegion *mr,
 {
     uint64_t tmp;
 
-    if (mr->flush_coalesced_mmio) {
-        qemu_flush_coalesced_mmio_buffer();
-    }
     tmp = (*value >> shift) & mask;
     trace_memory_region_ops_write(mr, addr, tmp, size);
     return mr->ops->write_with_attrs(mr->opaque, addr, tmp, size, attrs);
-- 
1.8.3.1

* [Qemu-devel] [PATCH 5/9] memory: let address_space_rw/ld*/st* run outside the BQL
  2015-06-24 16:25 [Qemu-devel] [PATCH for-2.4 v2 0/9] KVM: Do I/O outside BQL whenever possible Paolo Bonzini
                   ` (3 preceding siblings ...)
  2015-06-24 16:25 ` [Qemu-devel] [PATCH 4/9] exec: pull qemu_flush_coalesced_mmio_buffer() into address_space_rw/ld*/st* Paolo Bonzini
@ 2015-06-24 16:25 ` Paolo Bonzini
  2015-06-25  5:11   ` Fam Zheng
  2015-06-24 16:25 ` [Qemu-devel] [PATCH 6/9] kvm: First step to push iothread lock out of inner run loop Paolo Bonzini
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2015-06-24 16:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: borntraeger, Frederic Konrad, famz, Jan Kiszka

From: Jan Kiszka <jan.kiszka@siemens.com>

The MMIO case is further broken up into two cases: if the caller does not
hold the BQL on invocation, the unlocked one takes or avoids BQL depending
on the locking strategy of the target memory region and its coalesced
MMIO handling.  In this case, the caller should not hold _any_ lock
(a friendly suggestion which is disregarded by virtio-scsi-dataplane).
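
Informally, the lock states handled by prepare_mmio_access() below can
be summarized like this (a sketch of the logic, not new code):

    /* caller holds the BQL            -> nothing to take, nothing to
     *                                    release;
     * caller unlocked, region uses
     * global locking                  -> take the BQL here, caller
     *                                    releases it after the access;
     * caller unlocked, BQL-free
     * region                          -> stay unlocked, but bracket the
     *                                    coalesced MMIO flush with the
     *                                    BQL, which it still needs. */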

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Frederic Konrad <fred.konrad@greensocs.com>
Message-Id: <1434646046-27150-6-git-send-email-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 60 insertions(+), 9 deletions(-)

diff --git a/exec.c b/exec.c
index f2e6603..fd0401e 100644
--- a/exec.c
+++ b/exec.c
@@ -48,6 +48,7 @@
 #endif
 #include "exec/cpu-all.h"
 #include "qemu/rcu_queue.h"
+#include "qemu/main-loop.h"
 #include "exec/cputlb.h"
 #include "translate-all.h"
 
@@ -2316,11 +2317,27 @@ static int memory_access_size(MemoryRegion *mr, unsigned l, hwaddr addr)
     return l;
 }
 
-static void prepare_mmio_access(MemoryRegion *mr)
+static bool prepare_mmio_access(MemoryRegion *mr)
 {
+    bool unlocked = !qemu_mutex_iothread_locked();
+    bool release_lock = false;
+
+    if (unlocked && mr->global_locking) {
+        qemu_mutex_lock_iothread();
+        unlocked = false;
+        release_lock = true;
+    }
     if (mr->flush_coalesced_mmio) {
+        if (unlocked) {
+            qemu_mutex_lock_iothread();
+        }
         qemu_flush_coalesced_mmio_buffer();
+        if (unlocked) {
+            qemu_mutex_unlock_iothread();
+        }
     }
+
+    return release_lock;
 }
 
 MemTxResult address_space_rw(AddressSpace *as, hwaddr addr, MemTxAttrs attrs,
@@ -2332,6 +2349,7 @@ MemTxResult address_space_rw(AddressSpace *as, hwaddr addr, MemTxAttrs attrs,
     hwaddr addr1;
     MemoryRegion *mr;
     MemTxResult result = MEMTX_OK;
+    bool release_lock = false;
 
     rcu_read_lock();
     while (len > 0) {
@@ -2340,7 +2358,7 @@ MemTxResult address_space_rw(AddressSpace *as, hwaddr addr, MemTxAttrs attrs,
 
         if (is_write) {
             if (!memory_access_is_direct(mr, is_write)) {
-                prepare_mmio_access(mr);
+                release_lock |= prepare_mmio_access(mr);
                 l = memory_access_size(mr, l, addr1);
                 /* XXX: could force current_cpu to NULL to avoid
                    potential bugs */
@@ -2382,7 +2400,7 @@ MemTxResult address_space_rw(AddressSpace *as, hwaddr addr, MemTxAttrs attrs,
         } else {
             if (!memory_access_is_direct(mr, is_write)) {
                 /* I/O case */
-                prepare_mmio_access(mr);
+                release_lock |= prepare_mmio_access(mr);
                 l = memory_access_size(mr, l, addr1);
                 switch (l) {
                 case 8:
@@ -2418,6 +2436,12 @@ MemTxResult address_space_rw(AddressSpace *as, hwaddr addr, MemTxAttrs attrs,
                 memcpy(buf, ptr, l);
             }
         }
+
+        if (release_lock) {
+            qemu_mutex_unlock_iothread();
+            release_lock = false;
+        }
+
         len -= l;
         buf += l;
         addr += l;
@@ -2744,11 +2768,12 @@ static inline uint32_t address_space_ldl_internal(AddressSpace *as, hwaddr addr,
     hwaddr l = 4;
     hwaddr addr1;
     MemTxResult r;
+    bool release_lock = false;
 
     rcu_read_lock();
     mr = address_space_translate(as, addr, &addr1, &l, false);
     if (l < 4 || !memory_access_is_direct(mr, false)) {
-        prepare_mmio_access(mr);
+        release_lock |= prepare_mmio_access(mr);
 
         /* I/O case */
         r = memory_region_dispatch_read(mr, addr1, &val, 4, attrs);
@@ -2782,6 +2807,9 @@ static inline uint32_t address_space_ldl_internal(AddressSpace *as, hwaddr addr,
     if (result) {
         *result = r;
     }
+    if (release_lock) {
+        qemu_mutex_unlock_iothread();
+    }
     rcu_read_unlock();
     return val;
 }
@@ -2834,12 +2862,13 @@ static inline uint64_t address_space_ldq_internal(AddressSpace *as, hwaddr addr,
     hwaddr l = 8;
     hwaddr addr1;
     MemTxResult r;
+    bool release_lock = false;
 
     rcu_read_lock();
     mr = address_space_translate(as, addr, &addr1, &l,
                                  false);
     if (l < 8 || !memory_access_is_direct(mr, false)) {
-        prepare_mmio_access(mr);
+        release_lock |= prepare_mmio_access(mr);
 
         /* I/O case */
         r = memory_region_dispatch_read(mr, addr1, &val, 8, attrs);
@@ -2873,6 +2902,9 @@ static inline uint64_t address_space_ldq_internal(AddressSpace *as, hwaddr addr,
     if (result) {
         *result = r;
     }
+    if (release_lock) {
+        qemu_mutex_unlock_iothread();
+    }
     rcu_read_unlock();
     return val;
 }
@@ -2945,12 +2977,13 @@ static inline uint32_t address_space_lduw_internal(AddressSpace *as,
     hwaddr l = 2;
     hwaddr addr1;
     MemTxResult r;
+    bool release_lock = false;
 
     rcu_read_lock();
     mr = address_space_translate(as, addr, &addr1, &l,
                                  false);
     if (l < 2 || !memory_access_is_direct(mr, false)) {
-        prepare_mmio_access(mr);
+        release_lock |= prepare_mmio_access(mr);
 
         /* I/O case */
         r = memory_region_dispatch_read(mr, addr1, &val, 2, attrs);
@@ -2984,6 +3017,9 @@ static inline uint32_t address_space_lduw_internal(AddressSpace *as,
     if (result) {
         *result = r;
     }
+    if (release_lock) {
+        qemu_mutex_unlock_iothread();
+    }
     rcu_read_unlock();
     return val;
 }
@@ -3036,12 +3072,13 @@ void address_space_stl_notdirty(AddressSpace *as, hwaddr addr, uint32_t val,
     hwaddr addr1;
     MemTxResult r;
     uint8_t dirty_log_mask;
+    bool release_lock = false;
 
     rcu_read_lock();
     mr = address_space_translate(as, addr, &addr1, &l,
                                  true);
     if (l < 4 || !memory_access_is_direct(mr, true)) {
-        prepare_mmio_access(mr);
+        release_lock |= prepare_mmio_access(mr);
 
         r = memory_region_dispatch_write(mr, addr1, val, 4, attrs);
     } else {
@@ -3057,6 +3094,9 @@ void address_space_stl_notdirty(AddressSpace *as, hwaddr addr, uint32_t val,
     if (result) {
         *result = r;
     }
+    if (release_lock) {
+        qemu_mutex_unlock_iothread();
+    }
     rcu_read_unlock();
 }
 
@@ -3077,12 +3117,13 @@ static inline void address_space_stl_internal(AddressSpace *as,
     hwaddr l = 4;
     hwaddr addr1;
     MemTxResult r;
+    bool release_lock = false;
 
     rcu_read_lock();
     mr = address_space_translate(as, addr, &addr1, &l,
                                  true);
     if (l < 4 || !memory_access_is_direct(mr, true)) {
-        prepare_mmio_access(mr);
+        release_lock |= prepare_mmio_access(mr);
 
 #if defined(TARGET_WORDS_BIGENDIAN)
         if (endian == DEVICE_LITTLE_ENDIAN) {
@@ -3115,6 +3156,9 @@ static inline void address_space_stl_internal(AddressSpace *as,
     if (result) {
         *result = r;
     }
+    if (release_lock) {
+        qemu_mutex_unlock_iothread();
+    }
     rcu_read_unlock();
 }
 
@@ -3184,11 +3228,12 @@ static inline void address_space_stw_internal(AddressSpace *as,
     hwaddr l = 2;
     hwaddr addr1;
     MemTxResult r;
+    bool release_lock = false;
 
     rcu_read_lock();
     mr = address_space_translate(as, addr, &addr1, &l, true);
     if (l < 2 || !memory_access_is_direct(mr, true)) {
-        prepare_mmio_access(mr);
+        release_lock |= prepare_mmio_access(mr);
 
 #if defined(TARGET_WORDS_BIGENDIAN)
         if (endian == DEVICE_LITTLE_ENDIAN) {
@@ -3221,6 +3266,12 @@ static inline void address_space_stw_internal(AddressSpace *as,
     if (result) {
         *result = r;
     }
+    if (release_lock) {
+        qemu_mutex_unlock_iothread();
+    }
+    if (release_lock) {
+        qemu_mutex_unlock_iothread();
+    }
     rcu_read_unlock();
 }
 
-- 
1.8.3.1

* [Qemu-devel] [PATCH 6/9] kvm: First step to push iothread lock out of inner run loop
  2015-06-24 16:25 [Qemu-devel] [PATCH for-2.4 v2 0/9] KVM: Do I/O outside BQL whenever possible Paolo Bonzini
                   ` (4 preceding siblings ...)
  2015-06-24 16:25 ` [Qemu-devel] [PATCH 5/9] memory: let address_space_rw/ld*/st* run outside the BQL Paolo Bonzini
@ 2015-06-24 16:25 ` Paolo Bonzini
  2015-06-24 16:25 ` [Qemu-devel] [PATCH 7/9] kvm: Switch to unlocked PIO Paolo Bonzini
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Paolo Bonzini @ 2015-06-24 16:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: borntraeger, famz, Jan Kiszka

From: Jan Kiszka <jan.kiszka@siemens.com>

This opens the path to get rid of the iothread lock on vmexits in KVM
mode. On x86, the in-kernel irqchip has to be used, because we would
otherwise need to synchronize APIC and other per-cpu state accesses that
could be changed concurrently.

Regarding pre/post-run callbacks, s390x and ARM should be fine without
specific locking as the callbacks are empty. MIPS and POWER require
locking for the pre-run callback.

The handle_exit callback is non-empty on x86, POWER and s390.
Some POWER cases could do without the locking, but it is left in
place for now.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1434646046-27150-7-git-send-email-pbonzini@redhat.com>
---
 kvm-all.c          | 10 ++++++++--
 target-i386/kvm.c  | 24 ++++++++++++++++++++++++
 target-mips/kvm.c  |  4 ++++
 target-ppc/kvm.c   |  7 +++++++
 target-s390x/kvm.c |  3 +++
 5 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 53e01d4..af3d10b 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1752,6 +1752,8 @@ int kvm_cpu_exec(CPUState *cpu)
         return EXCP_HLT;
     }
 
+    qemu_mutex_unlock_iothread();
+
     do {
         MemTxAttrs attrs;
 
@@ -1770,11 +1772,9 @@ int kvm_cpu_exec(CPUState *cpu)
              */
             qemu_cpu_kick_self();
         }
-        qemu_mutex_unlock_iothread();
 
         run_ret = kvm_vcpu_ioctl(cpu, KVM_RUN, 0);
 
-        qemu_mutex_lock_iothread();
         attrs = kvm_arch_post_run(cpu, run);
 
         if (run_ret < 0) {
@@ -1801,20 +1801,24 @@ int kvm_cpu_exec(CPUState *cpu)
         switch (run->exit_reason) {
         case KVM_EXIT_IO:
             DPRINTF("handle_io\n");
+            qemu_mutex_lock_iothread();
             kvm_handle_io(run->io.port, attrs,
                           (uint8_t *)run + run->io.data_offset,
                           run->io.direction,
                           run->io.size,
                           run->io.count);
+            qemu_mutex_unlock_iothread();
             ret = 0;
             break;
         case KVM_EXIT_MMIO:
             DPRINTF("handle_mmio\n");
+            qemu_mutex_lock_iothread();
             address_space_rw(&address_space_memory,
                              run->mmio.phys_addr, attrs,
                              run->mmio.data,
                              run->mmio.len,
                              run->mmio.is_write);
+            qemu_mutex_unlock_iothread();
             ret = 0;
             break;
         case KVM_EXIT_IRQ_WINDOW_OPEN:
@@ -1857,6 +1861,8 @@ int kvm_cpu_exec(CPUState *cpu)
         }
     } while (ret == 0);
 
+    qemu_mutex_lock_iothread();
+
     if (ret < 0) {
         cpu_dump_state(cpu, stderr, fprintf, CPU_DUMP_CODE);
         vm_stop(RUN_STATE_INTERNAL_ERROR);
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 5a236e3..45fd3b0 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -2192,7 +2192,10 @@ void kvm_arch_pre_run(CPUState *cpu, struct kvm_run *run)
 
     /* Inject NMI */
     if (cpu->interrupt_request & CPU_INTERRUPT_NMI) {
+        qemu_mutex_lock_iothread();
         cpu->interrupt_request &= ~CPU_INTERRUPT_NMI;
+        qemu_mutex_unlock_iothread();
+
         DPRINTF("injected NMI\n");
         ret = kvm_vcpu_ioctl(cpu, KVM_NMI);
         if (ret < 0) {
@@ -2201,6 +2204,10 @@ void kvm_arch_pre_run(CPUState *cpu, struct kvm_run *run)
         }
     }
 
+    if (!kvm_irqchip_in_kernel()) {
+        qemu_mutex_lock_iothread();
+    }
+
     /* Force the VCPU out of its inner loop to process any INIT requests
      * or (for userspace APIC, but it is cheap to combine the checks here)
      * pending TPR access reports.
@@ -2244,6 +2251,8 @@ void kvm_arch_pre_run(CPUState *cpu, struct kvm_run *run)
 
         DPRINTF("setting tpr\n");
         run->cr8 = cpu_get_apic_tpr(x86_cpu->apic_state);
+
+        qemu_mutex_unlock_iothread();
     }
 }
 
@@ -2257,8 +2266,17 @@ MemTxAttrs kvm_arch_post_run(CPUState *cpu, struct kvm_run *run)
     } else {
         env->eflags &= ~IF_MASK;
     }
+
+    /* We need to protect the apic state against concurrent accesses from
+     * different threads in case the userspace irqchip is used. */
+    if (!kvm_irqchip_in_kernel()) {
+        qemu_mutex_lock_iothread();
+    }
     cpu_set_apic_tpr(x86_cpu->apic_state, run->cr8);
     cpu_set_apic_base(x86_cpu->apic_state, run->apic_base);
+    if (!kvm_irqchip_in_kernel()) {
+        qemu_mutex_unlock_iothread();
+    }
     return cpu_get_mem_attrs(env);
 }
 
@@ -2551,13 +2569,17 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
     switch (run->exit_reason) {
     case KVM_EXIT_HLT:
         DPRINTF("handle_hlt\n");
+        qemu_mutex_lock_iothread();
         ret = kvm_handle_halt(cpu);
+        qemu_mutex_unlock_iothread();
         break;
     case KVM_EXIT_SET_TPR:
         ret = 0;
         break;
     case KVM_EXIT_TPR_ACCESS:
+        qemu_mutex_lock_iothread();
         ret = kvm_handle_tpr_access(cpu);
+        qemu_mutex_unlock_iothread();
         break;
     case KVM_EXIT_FAIL_ENTRY:
         code = run->fail_entry.hardware_entry_failure_reason;
@@ -2583,7 +2605,9 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
         break;
     case KVM_EXIT_DEBUG:
         DPRINTF("kvm_exit_debug\n");
+        qemu_mutex_lock_iothread();
         ret = kvm_handle_debug(cpu, &run->debug.arch);
+        qemu_mutex_unlock_iothread();
         break;
     default:
         fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
diff --git a/target-mips/kvm.c b/target-mips/kvm.c
index 948619f..7d2293d 100644
--- a/target-mips/kvm.c
+++ b/target-mips/kvm.c
@@ -99,6 +99,8 @@ void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
     int r;
     struct kvm_mips_interrupt intr;
 
+    qemu_mutex_lock_iothread();
+
     if ((cs->interrupt_request & CPU_INTERRUPT_HARD) &&
             cpu_mips_io_interrupts_pending(cpu)) {
         intr.cpu = -1;
@@ -109,6 +111,8 @@ void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
                          __func__, cs->cpu_index, intr.irq);
         }
     }
+
+    qemu_mutex_unlock_iothread();
 }
 
 MemTxAttrs kvm_arch_post_run(CPUState *cs, struct kvm_run *run)
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index afb4696..ddf469f 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -1242,6 +1242,8 @@ void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
     int r;
     unsigned irq;
 
+    qemu_mutex_lock_iothread();
+
     /* PowerPC QEMU tracks the various core input pins (interrupt, critical
      * interrupt, reset, etc) in PPC-specific env->irq_input_state. */
     if (!cap_interrupt_level &&
@@ -1269,6 +1271,8 @@ void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
     /* We don't know if there are more interrupts pending after this. However,
      * the guest will return to userspace in the course of handling this one
      * anyways, so we will get a chance to deliver the rest. */
+
+    qemu_mutex_unlock_iothread();
 }
 
 MemTxAttrs kvm_arch_post_run(CPUState *cs, struct kvm_run *run)
@@ -1570,6 +1574,8 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
     CPUPPCState *env = &cpu->env;
     int ret;
 
+    qemu_mutex_lock_iothread();
+
     switch (run->exit_reason) {
     case KVM_EXIT_DCR:
         if (run->dcr.is_write) {
@@ -1620,6 +1626,7 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
         break;
     }
 
+    qemu_mutex_unlock_iothread();
     return ret;
 }
 
diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c
index b02ff8d..1ebcd18 100644
--- a/target-s390x/kvm.c
+++ b/target-s390x/kvm.c
@@ -2007,6 +2007,8 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
     S390CPU *cpu = S390_CPU(cs);
     int ret = 0;
 
+    qemu_mutex_lock_iothread();
+
     switch (run->exit_reason) {
         case KVM_EXIT_S390_SIEIC:
             ret = handle_intercept(cpu);
@@ -2027,6 +2029,7 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
             fprintf(stderr, "Unknown KVM exit: %d\n", run->exit_reason);
             break;
     }
+    qemu_mutex_unlock_iothread();
 
     if (ret == 0) {
         ret = EXCP_INTERRUPT;
-- 
1.8.3.1

* [Qemu-devel] [PATCH 7/9] kvm: Switch to unlocked PIO
  2015-06-24 16:25 [Qemu-devel] [PATCH for-2.4 v2 0/9] KVM: Do I/O outside BQL whenever possible Paolo Bonzini
                   ` (5 preceding siblings ...)
  2015-06-24 16:25 ` [Qemu-devel] [PATCH 6/9] kvm: First step to push iothread lock out of inner run loop Paolo Bonzini
@ 2015-06-24 16:25 ` Paolo Bonzini
  2015-06-24 16:25 ` [Qemu-devel] [PATCH 8/9] acpi: mark PMTIMER as unlocked Paolo Bonzini
  2015-06-24 16:25 ` [Qemu-devel] [PATCH 9/9] kvm: Switch to unlocked MMIO Paolo Bonzini
  8 siblings, 0 replies; 21+ messages in thread
From: Paolo Bonzini @ 2015-06-24 16:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: borntraeger, famz, Jan Kiszka

From: Jan Kiszka <jan.kiszka@siemens.com>

Do not take the BQL before dispatching PIO requests of KVM VCPUs.
Instead, address_space_rw will do it if necessary. This enables
completely BQL-free PIO handling in KVM mode for upcoming devices with
fine-grained locking.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1434646046-27150-8-git-send-email-pbonzini@redhat.com>
---
 kvm-all.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index af3d10b..07bdcfa 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1801,13 +1801,12 @@ int kvm_cpu_exec(CPUState *cpu)
         switch (run->exit_reason) {
         case KVM_EXIT_IO:
             DPRINTF("handle_io\n");
-            qemu_mutex_lock_iothread();
+            /* Called outside BQL */
             kvm_handle_io(run->io.port, attrs,
                           (uint8_t *)run + run->io.data_offset,
                           run->io.direction,
                           run->io.size,
                           run->io.count);
-            qemu_mutex_unlock_iothread();
             ret = 0;
             break;
         case KVM_EXIT_MMIO:
-- 
1.8.3.1

* [Qemu-devel] [PATCH 8/9] acpi: mark PMTIMER as unlocked
  2015-06-24 16:25 [Qemu-devel] [PATCH for-2.4 v2 0/9] KVM: Do I/O outside BQL whenever possible Paolo Bonzini
                   ` (6 preceding siblings ...)
  2015-06-24 16:25 ` [Qemu-devel] [PATCH 7/9] kvm: Switch to unlocked PIO Paolo Bonzini
@ 2015-06-24 16:25 ` Paolo Bonzini
  2015-06-24 16:25 ` [Qemu-devel] [PATCH 9/9] kvm: Switch to unlocked MMIO Paolo Bonzini
  8 siblings, 0 replies; 21+ messages in thread
From: Paolo Bonzini @ 2015-06-24 16:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: borntraeger, famz

Accessing QEMU_CLOCK_VIRTUAL is thread-safe.
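
For context, a hedged sketch of why this is safe (the real
acpi_pm_tmr_read() in hw/acpi/core.c differs in detail): the PM timer
value is derived purely from QEMU_CLOCK_VIRTUAL, with no other device
state touched, roughly:

    static uint64_t pm_tmr_read_sketch(void *opaque, hwaddr addr,
                                       unsigned size)
    {
        /* qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) is safe without the
         * BQL, so the handler needs no lock of its own. */
        int64_t ns = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);

        return muldiv64(ns, PM_TIMER_FREQUENCY, get_ticks_per_sec()) &
               0xffffff;
    }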

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1434646046-27150-9-git-send-email-pbonzini@redhat.com>
---
 hw/acpi/core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/acpi/core.c b/hw/acpi/core.c
index 0f201d8..fe6215a 100644
--- a/hw/acpi/core.c
+++ b/hw/acpi/core.c
@@ -528,6 +528,7 @@ void acpi_pm_tmr_init(ACPIREGS *ar, acpi_update_sci_fn update_sci,
     ar->tmr.timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, acpi_pm_tmr_timer, ar);
     memory_region_init_io(&ar->tmr.io, memory_region_owner(parent),
                           &acpi_pm_tmr_ops, ar, "acpi-tmr", 4);
+    memory_region_clear_global_locking(&ar->tmr.io);
     memory_region_add_subregion(parent, 8, &ar->tmr.io);
 }
 
-- 
1.8.3.1

* [Qemu-devel] [PATCH 9/9] kvm: Switch to unlocked MMIO
  2015-06-24 16:25 [Qemu-devel] [PATCH for-2.4 v2 0/9] KVM: Do I/O outside BQL whenever possible Paolo Bonzini
                   ` (7 preceding siblings ...)
  2015-06-24 16:25 ` [Qemu-devel] [PATCH 8/9] acpi: mark PMTIMER as unlocked Paolo Bonzini
@ 2015-06-24 16:25 ` Paolo Bonzini
  8 siblings, 0 replies; 21+ messages in thread
From: Paolo Bonzini @ 2015-06-24 16:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: borntraeger, famz

Do not take the BQL before dispatching MMIO requests of KVM VCPUs.
Instead, address_space_rw will do it if necessary. This enables completely
BQL-free MMIO handling in KVM mode for upcoming devices with fine-grained
locking.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1434646046-27150-10-git-send-email-pbonzini@redhat.com>
---
 kvm-all.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 07bdcfa..9053ef9 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1811,13 +1811,12 @@ int kvm_cpu_exec(CPUState *cpu)
             break;
         case KVM_EXIT_MMIO:
             DPRINTF("handle_mmio\n");
-            qemu_mutex_lock_iothread();
+            /* Called outside BQL */
             address_space_rw(&address_space_memory,
                              run->mmio.phys_addr, attrs,
                              run->mmio.data,
                              run->mmio.len,
                              run->mmio.is_write);
-            qemu_mutex_unlock_iothread();
             ret = 0;
             break;
         case KVM_EXIT_IRQ_WINDOW_OPEN:
-- 
1.8.3.1

* Re: [Qemu-devel] [PATCH 1/9] main-loop: use qemu_mutex_lock_iothread consistently
  2015-06-24 16:25 ` [Qemu-devel] [PATCH 1/9] main-loop: use qemu_mutex_lock_iothread consistently Paolo Bonzini
@ 2015-06-25  3:39   ` Fam Zheng
  2015-06-25  8:20     ` Paolo Bonzini
  0 siblings, 1 reply; 21+ messages in thread
From: Fam Zheng @ 2015-06-25  3:39 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: borntraeger, qemu-devel, Frederic Konrad

On Wed, 06/24 18:25, Paolo Bonzini wrote:
> The next patch will require the BQL to be always taken with
> qemu_mutex_lock_iothread(), while right now this isn't the case.
> 
> Outside TCG mode this is not a problem.  In TCG mode, we need to be
> careful and avoid the "prod out of compiled code" step if already
> in a VCPU thread.  This is easily done with a check on current_cpu,
> i.e. qemu_in_vcpu_thread().
> 
> Hopefully, multithreaded TCG will get rid of the whole logic to kick
> VCPUs whenever an I/O event occurs!
> 
> Cc: Frederic Konrad <fred.konrad@greensocs.com>
> Message-Id: <1434646046-27150-2-git-send-email-pbonzini@redhat.com>

Why is this "Message-Id:" included in the commit message if it's not final?

> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  cpus.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/cpus.c b/cpus.c
> index b85fb5f..02cca5d 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -953,7 +953,7 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
>      CPUState *cpu = arg;
>      int r;
>  
> -    qemu_mutex_lock(&qemu_global_mutex);
> +    qemu_mutex_lock_iothread();
>      qemu_thread_get_self(cpu->thread);
>      cpu->thread_id = qemu_get_thread_id();
>      cpu->can_do_io = 1;
> @@ -1033,10 +1033,10 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
>  {
>      CPUState *cpu = arg;
>  
> +    qemu_mutex_lock_iothread();
>      qemu_tcg_init_cpu_signals();
>      qemu_thread_get_self(cpu->thread);
>  
> -    qemu_mutex_lock(&qemu_global_mutex);
>      CPU_FOREACH(cpu) {
>          cpu->thread_id = qemu_get_thread_id();
>          cpu->created = true;
> @@ -1148,7 +1148,11 @@ bool qemu_in_vcpu_thread(void)
>  void qemu_mutex_lock_iothread(void)
>  {
>      atomic_inc(&iothread_requesting_mutex);
> -    if (!tcg_enabled() || !first_cpu || !first_cpu->thread) {
> +    /* In the simple case there is no need to bump the VCPU thread out of
> +     * TCG code execution.
> +     */
> +    if (!tcg_enabled() || qemu_in_vcpu_thread() ||
> +        !first_cpu || !first_cpu->thread) {

This looks like a separate change from the above
"qemu_mutex_lock(&qemu_global_mutex)" conversion. Why do they belong to the
same patch?

Fam

>          qemu_mutex_lock(&qemu_global_mutex);
>          atomic_dec(&iothread_requesting_mutex);
>      } else {
> -- 
> 1.8.3.1
> 
> 

* Re: [Qemu-devel] [PATCH 3/9] memory: Add global-locking property to memory regions
  2015-06-24 16:25 ` [Qemu-devel] [PATCH 3/9] memory: Add global-locking property to memory regions Paolo Bonzini
@ 2015-06-25  3:44   ` Fam Zheng
  2015-06-25  7:46     ` Paolo Bonzini
  0 siblings, 1 reply; 21+ messages in thread
From: Fam Zheng @ 2015-06-25  3:44 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: borntraeger, Frederic Konrad, qemu-devel, Jan Kiszka

On Wed, 06/24 18:25, Paolo Bonzini wrote:
> From: Jan Kiszka <jan.kiszka@siemens.com>
> 
> This introduces the memory region property "global_locking". It is true
> by default. By setting it to false, a device model can request BQL-free
> dispatching of region accesses to its r/w handlers. The actual BQL
> break-up will be provided in a separate patch.
> 
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> Cc: Frederic Konrad <fred.konrad@greensocs.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> Message-Id: <1434646046-27150-4-git-send-email-pbonzini@redhat.com>
> ---
>  include/exec/memory.h | 26 ++++++++++++++++++++++++++
>  memory.c              | 11 +++++++++++
>  2 files changed, 37 insertions(+)
> 
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index 8ae004e..fc33348 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -180,6 +180,7 @@ struct MemoryRegion {
>      bool rom_device;
>      bool warning_printed; /* For reservations */
>      bool flush_coalesced_mmio;
> +    bool global_locking;
>      MemoryRegion *alias;
>      hwaddr alias_offset;
>      int32_t priority;
> @@ -825,6 +826,31 @@ void memory_region_set_flush_coalesced(MemoryRegion *mr);
>  void memory_region_clear_flush_coalesced(MemoryRegion *mr);
>  
>  /**
> + * memory_region_set_global_locking: Declares the access processing requires
> + *                                   QEMU's global lock.
> + *
> + * When this is invoked, access to this memory region will be processed while
> + * holding the global lock of QEMU. This is the default behavior of memory
> + * regions.
> + *
> + * @mr: the memory region to be updated.
> + */
> +void memory_region_set_global_locking(MemoryRegion *mr);
> +
> +/**
> + * memory_region_clear_global_locking: Declares that access processing does
> + *                                     not depend on the QEMU global lock.
> + *
> + * By clearing this property, accesses to the memory region will be processed

Inconsistent with the singular form (access) in the previous one. Otherwise,

Reviewed-by: Fam Zheng <famz@redhat.com>

> + * outside of QEMU's global lock (unless the lock is held when issuing the
> + * access request). In this case, the device model implementing the access
> + * handlers is responsible for synchronization of concurrency.
> + *
> + * @mr: the memory region to be updated.
> + */
> +void memory_region_clear_global_locking(MemoryRegion *mr);
> +
> +/**
>   * memory_region_add_eventfd: Request an eventfd to be triggered when a word
>   *                            is written to a location.
>   *
> diff --git a/memory.c b/memory.c
> index 3ac0bd2..b0b8860 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -1012,6 +1012,7 @@ static void memory_region_initfn(Object *obj)
>      mr->ram_addr = RAM_ADDR_INVALID;
>      mr->enabled = true;
>      mr->romd_mode = true;
> +    mr->global_locking = true;
>      mr->destructor = memory_region_destructor_none;
>      QTAILQ_INIT(&mr->subregions);
>      QTAILQ_INIT(&mr->coalesced);
> @@ -1646,6 +1647,16 @@ void memory_region_clear_flush_coalesced(MemoryRegion *mr)
>      }
>  }
>  
> +void memory_region_set_global_locking(MemoryRegion *mr)
> +{
> +    mr->global_locking = true;
> +}
> +
> +void memory_region_clear_global_locking(MemoryRegion *mr)
> +{
> +    mr->global_locking = false;
> +}
> +
>  void memory_region_add_eventfd(MemoryRegion *mr,
>                                 hwaddr addr,
>                                 unsigned size,
> -- 
> 1.8.3.1
> 
> 

* Re: [Qemu-devel] [PATCH 4/9] exec: pull qemu_flush_coalesced_mmio_buffer() into address_space_rw/ld*/st*
  2015-06-24 16:25 ` [Qemu-devel] [PATCH 4/9] exec: pull qemu_flush_coalesced_mmio_buffer() into address_space_rw/ld*/st* Paolo Bonzini
@ 2015-06-25  4:59   ` Fam Zheng
  0 siblings, 0 replies; 21+ messages in thread
From: Fam Zheng @ 2015-06-25  4:59 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: borntraeger, qemu-devel, Frederic Konrad

On Wed, 06/24 18:25, Paolo Bonzini wrote:
> As memory_region_read/write_accessor can now also run without the BQL held,
> we need to move coalesced MMIO flushing earlier in the dispatch process.
> 
> Cc: Frederic Konrad <fred.konrad@greensocs.com>
> Message-Id: <1434646046-27150-5-git-send-email-pbonzini@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: Fam Zheng <famz@redhat.com>

* Re: [Qemu-devel] [PATCH 5/9] memory: let address_space_rw/ld*/st* run outside the BQL
  2015-06-24 16:25 ` [Qemu-devel] [PATCH 5/9] memory: let address_space_rw/ld*/st* run outside the BQL Paolo Bonzini
@ 2015-06-25  5:11   ` Fam Zheng
  2015-06-25  7:47     ` Paolo Bonzini
  0 siblings, 1 reply; 21+ messages in thread
From: Fam Zheng @ 2015-06-25  5:11 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: borntraeger, Frederic Konrad, qemu-devel, Jan Kiszka

On Wed, 06/24 18:25, Paolo Bonzini wrote:
> From: Jan Kiszka <jan.kiszka@siemens.com>
> 
> The MMIO case is further broken up into two cases: if the caller does not
> hold the BQL on invocation, the unlocked one takes or avoids BQL depending
> on the locking strategy of the target memory region and its coalesced
> MMIO handling.  In this case, the caller should not hold _any_ lock
> (a friendly suggestion which is disregarded by virtio-scsi-dataplane).
> 
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> Cc: Frederic Konrad <fred.konrad@greensocs.com>
> Message-Id: <1434646046-27150-6-git-send-email-pbonzini@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  exec.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------
>  1 file changed, 60 insertions(+), 9 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index f2e6603..fd0401e 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -48,6 +48,7 @@
>  #endif
>  #include "exec/cpu-all.h"
>  #include "qemu/rcu_queue.h"
> +#include "qemu/main-loop.h"
>  #include "exec/cputlb.h"
>  #include "translate-all.h"
>  
> @@ -2316,11 +2317,27 @@ static int memory_access_size(MemoryRegion *mr, unsigned l, hwaddr addr)
>      return l;
>  }
>  
> -static void prepare_mmio_access(MemoryRegion *mr)
> +static bool prepare_mmio_access(MemoryRegion *mr)
>  {
> +    bool unlocked = !qemu_mutex_iothread_locked();
> +    bool release_lock = false;
> +
> +    if (unlocked && mr->global_locking) {
> +        qemu_mutex_lock_iothread();
> +        unlocked = false;
> +        release_lock = true;
> +    }
>      if (mr->flush_coalesced_mmio) {
> +        if (unlocked) {
> +            qemu_mutex_lock_iothread();
> +        }
>          qemu_flush_coalesced_mmio_buffer();
> +        if (unlocked) {
> +            qemu_mutex_unlock_iothread();
> +        }
>      }
> +
> +    return release_lock;
>  }
>  
>  MemTxResult address_space_rw(AddressSpace *as, hwaddr addr, MemTxAttrs attrs,
> @@ -2332,6 +2349,7 @@ MemTxResult address_space_rw(AddressSpace *as, hwaddr addr, MemTxAttrs attrs,
>      hwaddr addr1;
>      MemoryRegion *mr;
>      MemTxResult result = MEMTX_OK;
> +    bool release_lock = false;
>  
>      rcu_read_lock();
>      while (len > 0) {
> @@ -2340,7 +2358,7 @@ MemTxResult address_space_rw(AddressSpace *as, hwaddr addr, MemTxAttrs attrs,
>  
>          if (is_write) {
>              if (!memory_access_is_direct(mr, is_write)) {
> -                prepare_mmio_access(mr);
> +                release_lock |= prepare_mmio_access(mr);
>                  l = memory_access_size(mr, l, addr1);
>                  /* XXX: could force current_cpu to NULL to avoid
>                     potential bugs */
> @@ -2382,7 +2400,7 @@ MemTxResult address_space_rw(AddressSpace *as, hwaddr addr, MemTxAttrs attrs,
>          } else {
>              if (!memory_access_is_direct(mr, is_write)) {
>                  /* I/O case */
> -                prepare_mmio_access(mr);
> +                release_lock |= prepare_mmio_access(mr);
>                  l = memory_access_size(mr, l, addr1);
>                  switch (l) {
>                  case 8:
> @@ -2418,6 +2436,12 @@ MemTxResult address_space_rw(AddressSpace *as, hwaddr addr, MemTxAttrs attrs,
>                  memcpy(buf, ptr, l);
>              }
>          }
> +
> +        if (release_lock) {
> +            qemu_mutex_unlock_iothread();
> +            release_lock = false;
> +        }
> +
>          len -= l;
>          buf += l;
>          addr += l;
> @@ -2744,11 +2768,12 @@ static inline uint32_t address_space_ldl_internal(AddressSpace *as, hwaddr addr,
>      hwaddr l = 4;
>      hwaddr addr1;
>      MemTxResult r;
> +    bool release_lock = false;
>  
>      rcu_read_lock();
>      mr = address_space_translate(as, addr, &addr1, &l, false);
>      if (l < 4 || !memory_access_is_direct(mr, false)) {
> -        prepare_mmio_access(mr);
> +        release_lock |= prepare_mmio_access(mr);
>  
>          /* I/O case */
>          r = memory_region_dispatch_read(mr, addr1, &val, 4, attrs);
> @@ -2782,6 +2807,9 @@ static inline uint32_t address_space_ldl_internal(AddressSpace *as, hwaddr addr,
>      if (result) {
>          *result = r;
>      }
> +    if (release_lock) {
> +        qemu_mutex_unlock_iothread();
> +    }
>      rcu_read_unlock();
>      return val;
>  }
> @@ -2834,12 +2862,13 @@ static inline uint64_t address_space_ldq_internal(AddressSpace *as, hwaddr addr,
>      hwaddr l = 8;
>      hwaddr addr1;
>      MemTxResult r;
> +    bool release_lock = false;
>  
>      rcu_read_lock();
>      mr = address_space_translate(as, addr, &addr1, &l,
>                                   false);
>      if (l < 8 || !memory_access_is_direct(mr, false)) {
> -        prepare_mmio_access(mr);
> +        release_lock |= prepare_mmio_access(mr);
>  
>          /* I/O case */
>          r = memory_region_dispatch_read(mr, addr1, &val, 8, attrs);
> @@ -2873,6 +2902,9 @@ static inline uint64_t address_space_ldq_internal(AddressSpace *as, hwaddr addr,
>      if (result) {
>          *result = r;
>      }
> +    if (release_lock) {
> +        qemu_mutex_unlock_iothread();
> +    }
>      rcu_read_unlock();
>      return val;
>  }
> @@ -2945,12 +2977,13 @@ static inline uint32_t address_space_lduw_internal(AddressSpace *as,
>      hwaddr l = 2;
>      hwaddr addr1;
>      MemTxResult r;
> +    bool release_lock = false;
>  
>      rcu_read_lock();
>      mr = address_space_translate(as, addr, &addr1, &l,
>                                   false);
>      if (l < 2 || !memory_access_is_direct(mr, false)) {
> -        prepare_mmio_access(mr);
> +        release_lock |= prepare_mmio_access(mr);
>  
>          /* I/O case */
>          r = memory_region_dispatch_read(mr, addr1, &val, 2, attrs);
> @@ -2984,6 +3017,9 @@ static inline uint32_t address_space_lduw_internal(AddressSpace *as,
>      if (result) {
>          *result = r;
>      }
> +    if (release_lock) {
> +        qemu_mutex_unlock_iothread();
> +    }
>      rcu_read_unlock();
>      return val;
>  }
> @@ -3036,12 +3072,13 @@ void address_space_stl_notdirty(AddressSpace *as, hwaddr addr, uint32_t val,
>      hwaddr addr1;
>      MemTxResult r;
>      uint8_t dirty_log_mask;
> +    bool release_lock = false;
>  
>      rcu_read_lock();
>      mr = address_space_translate(as, addr, &addr1, &l,
>                                   true);
>      if (l < 4 || !memory_access_is_direct(mr, true)) {
> -        prepare_mmio_access(mr);
> +        release_lock |= prepare_mmio_access(mr);
>  
>          r = memory_region_dispatch_write(mr, addr1, val, 4, attrs);
>      } else {
> @@ -3057,6 +3094,9 @@ void address_space_stl_notdirty(AddressSpace *as, hwaddr addr, uint32_t val,
>      if (result) {
>          *result = r;
>      }
> +    if (release_lock) {
> +        qemu_mutex_unlock_iothread();
> +    }
>      rcu_read_unlock();
>  }
>  
> @@ -3077,12 +3117,13 @@ static inline void address_space_stl_internal(AddressSpace *as,
>      hwaddr l = 4;
>      hwaddr addr1;
>      MemTxResult r;
> +    bool release_lock = false;
>  
>      rcu_read_lock();
>      mr = address_space_translate(as, addr, &addr1, &l,
>                                   true);
>      if (l < 4 || !memory_access_is_direct(mr, true)) {
> -        prepare_mmio_access(mr);
> +        release_lock |= prepare_mmio_access(mr);
>  
>  #if defined(TARGET_WORDS_BIGENDIAN)
>          if (endian == DEVICE_LITTLE_ENDIAN) {
> @@ -3115,6 +3156,9 @@ static inline void address_space_stl_internal(AddressSpace *as,
>      if (result) {
>          *result = r;
>      }
> +    if (release_lock) {
> +        qemu_mutex_unlock_iothread();
> +    }
>      rcu_read_unlock();
>  }
>  
> @@ -3184,11 +3228,12 @@ static inline void address_space_stw_internal(AddressSpace *as,
>      hwaddr l = 2;
>      hwaddr addr1;
>      MemTxResult r;
> +    bool release_lock = false;
>  
>      rcu_read_lock();
>      mr = address_space_translate(as, addr, &addr1, &l, true);
>      if (l < 2 || !memory_access_is_direct(mr, true)) {
> -        prepare_mmio_access(mr);
> +        release_lock |= prepare_mmio_access(mr);
>  
>  #if defined(TARGET_WORDS_BIGENDIAN)
>          if (endian == DEVICE_LITTLE_ENDIAN) {
> @@ -3221,6 +3266,12 @@ static inline void address_space_stw_internal(AddressSpace *as,
>      if (result) {
>          *result = r;
>      }
> +    if (release_lock) {
> +        qemu_mutex_unlock_iothread();
> +    }
> +    if (release_lock) {
> +        qemu_mutex_unlock_iothread();
> +    }

Bad rebase?

Fam

>      rcu_read_unlock();
>  }
>  
> -- 
> 1.8.3.1
> 
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 3/9] memory: Add global-locking property to memory regions
  2015-06-25  3:44   ` Fam Zheng
@ 2015-06-25  7:46     ` Paolo Bonzini
  2015-06-25 10:59       ` Fam Zheng
  0 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2015-06-25  7:46 UTC (permalink / raw)
  To: Fam Zheng
  Cc: borntraeger, Jan Kiszka, qemu-devel, Dr. David Alan Gilbert,
	Frederic Konrad



On 25/06/2015 05:44, Fam Zheng wrote:
>> > + * memory_region_clear_global_locking: Declares that access processing does
>> > + *                                     not depend on the QEMU global lock.
>> > + *
>> > + * By clearing this property, accesses to the memory region will be processed
> Inconsistent with singular form (access) in the previous one. Otherwise,

I think this is correct English actually.  "Access processing" in the
first line cannot use a plural because it behaves as a single word.
Let's ask a native speaker. :)

Paolo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 5/9] memory: let address_space_rw/ld*/st* run outside the BQL
  2015-06-25  5:11   ` Fam Zheng
@ 2015-06-25  7:47     ` Paolo Bonzini
  0 siblings, 0 replies; 21+ messages in thread
From: Paolo Bonzini @ 2015-06-25  7:47 UTC (permalink / raw)
  To: Fam Zheng; +Cc: borntraeger, Jan Kiszka, qemu-devel, Frederic Konrad



On 25/06/2015 07:11, Fam Zheng wrote:
> On Wed, 06/24 18:25, Paolo Bonzini wrote:
>> From: Jan Kiszka <jan.kiszka@siemens.com>
>>
>> The MMIO case is further broken up in two cases: if the caller does not
>> hold the BQL on invocation, the unlocked one takes or avoids BQL depending
>> on the locking strategy of the target memory region and its coalesced
>> MMIO handling.  In this case, the caller should not hold _any_ lock
>> (a friendly suggestion which is disregarded by virtio-scsi-dataplane).
>>
>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>> Cc: Frederic Konrad <fred.konrad@greensocs.com>
>> Message-Id: <1434646046-27150-6-git-send-email-pbonzini@redhat.com>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>  exec.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------
>>  1 file changed, 60 insertions(+), 9 deletions(-)
>>
>> diff --git a/exec.c b/exec.c
>> index f2e6603..fd0401e 100644
>> --- a/exec.c
>> +++ b/exec.c
>> @@ -48,6 +48,7 @@
>>  #endif
>>  #include "exec/cpu-all.h"
>>  #include "qemu/rcu_queue.h"
>> +#include "qemu/main-loop.h"
>>  #include "exec/cputlb.h"
>>  #include "translate-all.h"
>>  
>> @@ -2316,11 +2317,27 @@ static int memory_access_size(MemoryRegion *mr, unsigned l, hwaddr addr)
>>      return l;
>>  }
>>  
>> -static void prepare_mmio_access(MemoryRegion *mr)
>> +static bool prepare_mmio_access(MemoryRegion *mr)
>>  {
>> +    bool unlocked = !qemu_mutex_iothread_locked();
>> +    bool release_lock = false;
>> +
>> +    if (unlocked && mr->global_locking) {
>> +        qemu_mutex_lock_iothread();
>> +        unlocked = false;
>> +        release_lock = true;
>> +    }
>>      if (mr->flush_coalesced_mmio) {
>> +        if (unlocked) {
>> +            qemu_mutex_lock_iothread();
>> +        }
>>          qemu_flush_coalesced_mmio_buffer();
>> +        if (unlocked) {
>> +            qemu_mutex_unlock_iothread();
>> +        }
>>      }
>> +
>> +    return release_lock;
>>  }
>>  
>>  MemTxResult address_space_rw(AddressSpace *as, hwaddr addr, MemTxAttrs attrs,
>> @@ -2332,6 +2349,7 @@ MemTxResult address_space_rw(AddressSpace *as, hwaddr addr, MemTxAttrs attrs,
>>      hwaddr addr1;
>>      MemoryRegion *mr;
>>      MemTxResult result = MEMTX_OK;
>> +    bool release_lock = false;
>>  
>>      rcu_read_lock();
>>      while (len > 0) {
>> @@ -2340,7 +2358,7 @@ MemTxResult address_space_rw(AddressSpace *as, hwaddr addr, MemTxAttrs attrs,
>>  
>>          if (is_write) {
>>              if (!memory_access_is_direct(mr, is_write)) {
>> -                prepare_mmio_access(mr);
>> +                release_lock |= prepare_mmio_access(mr);
>>                  l = memory_access_size(mr, l, addr1);
>>                  /* XXX: could force current_cpu to NULL to avoid
>>                     potential bugs */
>> @@ -2382,7 +2400,7 @@ MemTxResult address_space_rw(AddressSpace *as, hwaddr addr, MemTxAttrs attrs,
>>          } else {
>>              if (!memory_access_is_direct(mr, is_write)) {
>>                  /* I/O case */
>> -                prepare_mmio_access(mr);
>> +                release_lock |= prepare_mmio_access(mr);
>>                  l = memory_access_size(mr, l, addr1);
>>                  switch (l) {
>>                  case 8:
>> @@ -2418,6 +2436,12 @@ MemTxResult address_space_rw(AddressSpace *as, hwaddr addr, MemTxAttrs attrs,
>>                  memcpy(buf, ptr, l);
>>              }
>>          }
>> +
>> +        if (release_lock) {
>> +            qemu_mutex_unlock_iothread();
>> +            release_lock = false;
>> +        }
>> +
>>          len -= l;
>>          buf += l;
>>          addr += l;
>> @@ -2744,11 +2768,12 @@ static inline uint32_t address_space_ldl_internal(AddressSpace *as, hwaddr addr,
>>      hwaddr l = 4;
>>      hwaddr addr1;
>>      MemTxResult r;
>> +    bool release_lock = false;
>>  
>>      rcu_read_lock();
>>      mr = address_space_translate(as, addr, &addr1, &l, false);
>>      if (l < 4 || !memory_access_is_direct(mr, false)) {
>> -        prepare_mmio_access(mr);
>> +        release_lock |= prepare_mmio_access(mr);
>>  
>>          /* I/O case */
>>          r = memory_region_dispatch_read(mr, addr1, &val, 4, attrs);
>> @@ -2782,6 +2807,9 @@ static inline uint32_t address_space_ldl_internal(AddressSpace *as, hwaddr addr,
>>      if (result) {
>>          *result = r;
>>      }
>> +    if (release_lock) {
>> +        qemu_mutex_unlock_iothread();
>> +    }
>>      rcu_read_unlock();
>>      return val;
>>  }
>> @@ -2834,12 +2862,13 @@ static inline uint64_t address_space_ldq_internal(AddressSpace *as, hwaddr addr,
>>      hwaddr l = 8;
>>      hwaddr addr1;
>>      MemTxResult r;
>> +    bool release_lock = false;
>>  
>>      rcu_read_lock();
>>      mr = address_space_translate(as, addr, &addr1, &l,
>>                                   false);
>>      if (l < 8 || !memory_access_is_direct(mr, false)) {
>> -        prepare_mmio_access(mr);
>> +        release_lock |= prepare_mmio_access(mr);
>>  
>>          /* I/O case */
>>          r = memory_region_dispatch_read(mr, addr1, &val, 8, attrs);
>> @@ -2873,6 +2902,9 @@ static inline uint64_t address_space_ldq_internal(AddressSpace *as, hwaddr addr,
>>      if (result) {
>>          *result = r;
>>      }
>> +    if (release_lock) {
>> +        qemu_mutex_unlock_iothread();
>> +    }
>>      rcu_read_unlock();
>>      return val;
>>  }
>> @@ -2945,12 +2977,13 @@ static inline uint32_t address_space_lduw_internal(AddressSpace *as,
>>      hwaddr l = 2;
>>      hwaddr addr1;
>>      MemTxResult r;
>> +    bool release_lock = false;
>>  
>>      rcu_read_lock();
>>      mr = address_space_translate(as, addr, &addr1, &l,
>>                                   false);
>>      if (l < 2 || !memory_access_is_direct(mr, false)) {
>> -        prepare_mmio_access(mr);
>> +        release_lock |= prepare_mmio_access(mr);
>>  
>>          /* I/O case */
>>          r = memory_region_dispatch_read(mr, addr1, &val, 2, attrs);
>> @@ -2984,6 +3017,9 @@ static inline uint32_t address_space_lduw_internal(AddressSpace *as,
>>      if (result) {
>>          *result = r;
>>      }
>> +    if (release_lock) {
>> +        qemu_mutex_unlock_iothread();
>> +    }
>>      rcu_read_unlock();
>>      return val;
>>  }
>> @@ -3036,12 +3072,13 @@ void address_space_stl_notdirty(AddressSpace *as, hwaddr addr, uint32_t val,
>>      hwaddr addr1;
>>      MemTxResult r;
>>      uint8_t dirty_log_mask;
>> +    bool release_lock = false;
>>  
>>      rcu_read_lock();
>>      mr = address_space_translate(as, addr, &addr1, &l,
>>                                   true);
>>      if (l < 4 || !memory_access_is_direct(mr, true)) {
>> -        prepare_mmio_access(mr);
>> +        release_lock |= prepare_mmio_access(mr);
>>  
>>          r = memory_region_dispatch_write(mr, addr1, val, 4, attrs);
>>      } else {
>> @@ -3057,6 +3094,9 @@ void address_space_stl_notdirty(AddressSpace *as, hwaddr addr, uint32_t val,
>>      if (result) {
>>          *result = r;
>>      }
>> +    if (release_lock) {
>> +        qemu_mutex_unlock_iothread();
>> +    }
>>      rcu_read_unlock();
>>  }
>>  
>> @@ -3077,12 +3117,13 @@ static inline void address_space_stl_internal(AddressSpace *as,
>>      hwaddr l = 4;
>>      hwaddr addr1;
>>      MemTxResult r;
>> +    bool release_lock = false;
>>  
>>      rcu_read_lock();
>>      mr = address_space_translate(as, addr, &addr1, &l,
>>                                   true);
>>      if (l < 4 || !memory_access_is_direct(mr, true)) {
>> -        prepare_mmio_access(mr);
>> +        release_lock |= prepare_mmio_access(mr);
>>  
>>  #if defined(TARGET_WORDS_BIGENDIAN)
>>          if (endian == DEVICE_LITTLE_ENDIAN) {
>> @@ -3115,6 +3156,9 @@ static inline void address_space_stl_internal(AddressSpace *as,
>>      if (result) {
>>          *result = r;
>>      }
>> +    if (release_lock) {
>> +        qemu_mutex_unlock_iothread();
>> +    }
>>      rcu_read_unlock();
>>  }
>>  
>> @@ -3184,11 +3228,12 @@ static inline void address_space_stw_internal(AddressSpace *as,
>>      hwaddr l = 2;
>>      hwaddr addr1;
>>      MemTxResult r;
>> +    bool release_lock = false;
>>  
>>      rcu_read_lock();
>>      mr = address_space_translate(as, addr, &addr1, &l, true);
>>      if (l < 2 || !memory_access_is_direct(mr, true)) {
>> -        prepare_mmio_access(mr);
>> +        release_lock |= prepare_mmio_access(mr);
>>  
>>  #if defined(TARGET_WORDS_BIGENDIAN)
>>          if (endian == DEVICE_LITTLE_ENDIAN) {
>> @@ -3221,6 +3266,12 @@ static inline void address_space_stw_internal(AddressSpace *as,
>>      if (result) {
>>          *result = r;
>>      }
>> +    if (release_lock) {
>> +        qemu_mutex_unlock_iothread();
>> +    }
>> +    if (release_lock) {
>> +        qemu_mutex_unlock_iothread();
>> +    }
> 
> Bad rebase?

Yes.
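
The duplicated block will simply be dropped in the respin; the tail of
address_space_stw_internal should then end with a single conditional
unlock, along these lines (a sketch, with the surrounding context
elided):

    if (result) {
        *result = r;
    }
    /* release the BQL only if prepare_mmio_access() took it for us */
    if (release_lock) {
        qemu_mutex_unlock_iothread();
    }
    rcu_read_unlock();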

Paolo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 1/9] main-loop: use qemu_mutex_lock_iothread consistently
  2015-06-25  3:39   ` Fam Zheng
@ 2015-06-25  8:20     ` Paolo Bonzini
  0 siblings, 0 replies; 21+ messages in thread
From: Paolo Bonzini @ 2015-06-25  8:20 UTC (permalink / raw)
  To: Fam Zheng; +Cc: borntraeger, qemu-devel, Frederic Konrad



On 25/06/2015 05:39, Fam Zheng wrote:
> On Wed, 06/24 18:25, Paolo Bonzini wrote:
>> The next patch will require the BQL to be always taken with
>> qemu_mutex_lock_iothread(), while right now this isn't the case.
>>
>> Outside TCG mode this is not a problem.  In TCG mode, we need to be
>> careful and avoid the "prod out of compiled code" step if already
>> in a VCPU thread.  This is easily done with a check on current_cpu,
>> i.e. qemu_in_vcpu_thread().
>>
>> Hopefully, multithreaded TCG will get rid of the whole logic to kick
>> VCPUs whenever an I/O event occurs!
>>
>> Cc: Frederic Konrad <fred.konrad@greensocs.com>
>> Message-Id: <1434646046-27150-2-git-send-email-pbonzini@redhat.com>
> 
> Why is this "Message-Id:" included in the commit message if it's not final?
> 
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>  cpus.c | 10 +++++++---
>>  1 file changed, 7 insertions(+), 3 deletions(-)
>>
>> diff --git a/cpus.c b/cpus.c
>> index b85fb5f..02cca5d 100644
>> --- a/cpus.c
>> +++ b/cpus.c
>> @@ -953,7 +953,7 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
>>      CPUState *cpu = arg;
>>      int r;
>>  
>> -    qemu_mutex_lock(&qemu_global_mutex);
>> +    qemu_mutex_lock_iothread();
>>      qemu_thread_get_self(cpu->thread);
>>      cpu->thread_id = qemu_get_thread_id();
>>      cpu->can_do_io = 1;
>> @@ -1033,10 +1033,10 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
>>  {
>>      CPUState *cpu = arg;
>>  
>> +    qemu_mutex_lock_iothread();
>>      qemu_tcg_init_cpu_signals();
>>      qemu_thread_get_self(cpu->thread);
>>  
>> -    qemu_mutex_lock(&qemu_global_mutex);
>>      CPU_FOREACH(cpu) {
>>          cpu->thread_id = qemu_get_thread_id();
>>          cpu->created = true;
>> @@ -1148,7 +1148,11 @@ bool qemu_in_vcpu_thread(void)
>>  void qemu_mutex_lock_iothread(void)
>>  {
>>      atomic_inc(&iothread_requesting_mutex);
>> -    if (!tcg_enabled() || !first_cpu || !first_cpu->thread) {
>> +    /* In the simple case there is no need to bump the VCPU thread out of
>> +     * TCG code execution.
>> +     */
>> +    if (!tcg_enabled() || qemu_in_vcpu_thread() ||
>> +        !first_cpu || !first_cpu->thread) {
> 
> This looks like a separate change from the above
> "qemu_mutex_lock(&qemu_global_mutex)" conversion. Why do they belong to the
> same patch?

Previously, a VCPU thread would never call qemu_mutex_lock_iothread().
Now, it can.

While the change is only necessary once a call is introduced with
current_cpu != NULL (that's patch 5), it should go into this patch or an
earlier one, because this is the patch that changes the invariant.  I
put it in the same patch because the patch is already very small.
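
To see the new invariant in action, here is a sketch (not an actual
backtrace) of the call path that appears once patch 5 is in, assuming a
global-locking memory region accessed by a VCPU thread that does not
hold the BQL:

    /* VCPU thread, BQL not held */
    address_space_rw()
        -> prepare_mmio_access(mr)       /* mr->global_locking is set */
            -> qemu_mutex_lock_iothread()

Without the qemu_in_vcpu_thread() check, the TCG branch of
qemu_mutex_lock_iothread() would try to prod the VCPU threads out of
compiled code, including the very thread that is making the call.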

Paolo

> Fam
> 
>>          qemu_mutex_lock(&qemu_global_mutex);
>>          atomic_dec(&iothread_requesting_mutex);
>>      } else {
>> -- 
>> 1.8.3.1
>>
>>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 3/9] memory: Add global-locking property to memory regions
  2015-06-25  7:46     ` Paolo Bonzini
@ 2015-06-25 10:59       ` Fam Zheng
  2015-06-25 11:10         ` Paolo Bonzini
  0 siblings, 1 reply; 21+ messages in thread
From: Fam Zheng @ 2015-06-25 10:59 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: borntraeger, Jan Kiszka, qemu-devel, Dr. David Alan Gilbert,
	Frederic Konrad

On Thu, 06/25 09:46, Paolo Bonzini wrote:
> 
> 
> On 25/06/2015 05:44, Fam Zheng wrote:
> >> > + * memory_region_clear_global_locking: Declares that access processing does
> >> > + *                                     not depend on the QEMU global lock.
> >> > + *
> >> > + * By clearing this property, accesses to the memory region will be processed
> > Inconsistent with singular form (access) in the previous one. Otherwise,
> 
> I think this is correct English actually.  "Access processing" in the
> first line cannot use a plural because it behaves as a single word.
> Let's ask a native speaker. :)

I meant the above "When this is invoked, access to this memory regions will be
processed..." compared to this "accesses to the memory region".

Fam

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 3/9] memory: Add global-locking property to memory regions
  2015-06-25 10:59       ` Fam Zheng
@ 2015-06-25 11:10         ` Paolo Bonzini
  0 siblings, 0 replies; 21+ messages in thread
From: Paolo Bonzini @ 2015-06-25 11:10 UTC (permalink / raw)
  To: Fam Zheng
  Cc: borntraeger, Frederic Konrad, qemu-devel, Dr. David Alan Gilbert,
	Jan Kiszka



On 25/06/2015 12:59, Fam Zheng wrote:
> I meant the above "When this is invoked, access to this memory regions will be
> processed..." compared to this "accesses to the memory region".

Ah, I see what you mean now.  Fixed.

Paolo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH 9/9] kvm: Switch to unlocked MMIO
  2015-07-02  8:20 [Qemu-devel] [PATCH for-2.4 0/9 v3] KVM: Do I/O outside BQL whenever possible Paolo Bonzini
@ 2015-07-02  8:20 ` Paolo Bonzini
  0 siblings, 0 replies; 21+ messages in thread
From: Paolo Bonzini @ 2015-07-02  8:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: famz

Do not take the BQL before dispatching MMIO requests of KVM VCPUs.
Instead, address_space_rw will do it if necessary. This enables completely
BQL-free MMIO handling in KVM mode for upcoming devices with fine-grained
locking.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 kvm-all.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index ad5ac5e..df57da0 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1814,13 +1814,12 @@ int kvm_cpu_exec(CPUState *cpu)
             break;
         case KVM_EXIT_MMIO:
             DPRINTF("handle_mmio\n");
-            qemu_mutex_lock_iothread();
+            /* Called outside BQL */
             address_space_rw(&address_space_memory,
                              run->mmio.phys_addr, attrs,
                              run->mmio.data,
                              run->mmio.len,
                              run->mmio.is_write);
-            qemu_mutex_unlock_iothread();
             ret = 0;
             break;
         case KVM_EXIT_IRQ_WINDOW_OPEN:
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH 9/9] kvm: Switch to unlocked MMIO
  2015-06-18 16:47 [Qemu-devel] [PATCH for-2.4 0/9] KVM: Do I/O outside BQL whenever possible Paolo Bonzini
@ 2015-06-18 16:47 ` Paolo Bonzini
  0 siblings, 0 replies; 21+ messages in thread
From: Paolo Bonzini @ 2015-06-18 16:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: jan.kiszka

Do not take the BQL before dispatching MMIO requests of KVM VCPUs.
Instead, address_space_rw will do it if necessary. This enables completely
BQL-free MMIO handling in KVM mode for upcoming devices with fine-grained
locking.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 kvm-all.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index d3831c4..87b00b8 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1845,13 +1845,12 @@ int kvm_cpu_exec(CPUState *cpu)
             break;
         case KVM_EXIT_MMIO:
             DPRINTF("handle_mmio\n");
-            qemu_mutex_lock_iothread();
+            /* Called outside BQL */
             address_space_rw(&address_space_memory,
                              run->mmio.phys_addr, attrs,
                              run->mmio.data,
                              run->mmio.len,
                              run->mmio.is_write);
-            qemu_mutex_unlock_iothread();
             ret = 0;
             break;
         case KVM_EXIT_IRQ_WINDOW_OPEN:
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2015-07-02  8:21 UTC | newest]

Thread overview: 21+ messages
-- links below jump to the message on this page --
2015-06-24 16:25 [Qemu-devel] [PATCH for-2.4 v2 0/9] KVM: Do I/O outside BQL whenever possible Paolo Bonzini
2015-06-24 16:25 ` [Qemu-devel] [PATCH 1/9] main-loop: use qemu_mutex_lock_iothread consistently Paolo Bonzini
2015-06-25  3:39   ` Fam Zheng
2015-06-25  8:20     ` Paolo Bonzini
2015-06-24 16:25 ` [Qemu-devel] [PATCH 2/9] main-loop: introduce qemu_mutex_iothread_locked Paolo Bonzini
2015-06-24 16:25 ` [Qemu-devel] [PATCH 3/9] memory: Add global-locking property to memory regions Paolo Bonzini
2015-06-25  3:44   ` Fam Zheng
2015-06-25  7:46     ` Paolo Bonzini
2015-06-25 10:59       ` Fam Zheng
2015-06-25 11:10         ` Paolo Bonzini
2015-06-24 16:25 ` [Qemu-devel] [PATCH 4/9] exec: pull qemu_flush_coalesced_mmio_buffer() into address_space_rw/ld*/st* Paolo Bonzini
2015-06-25  4:59   ` Fam Zheng
2015-06-24 16:25 ` [Qemu-devel] [PATCH 5/9] memory: let address_space_rw/ld*/st* run outside the BQL Paolo Bonzini
2015-06-25  5:11   ` Fam Zheng
2015-06-25  7:47     ` Paolo Bonzini
2015-06-24 16:25 ` [Qemu-devel] [PATCH 6/9] kvm: First step to push iothread lock out of inner run loop Paolo Bonzini
2015-06-24 16:25 ` [Qemu-devel] [PATCH 7/9] kvm: Switch to unlocked PIO Paolo Bonzini
2015-06-24 16:25 ` [Qemu-devel] [PATCH 8/9] acpi: mark PMTIMER as unlocked Paolo Bonzini
2015-06-24 16:25 ` [Qemu-devel] [PATCH 9/9] kvm: Switch to unlocked MMIO Paolo Bonzini
  -- strict thread matches above, loose matches on Subject: below --
2015-07-02  8:20 [Qemu-devel] [PATCH for-2.4 0/9 v3] KVM: Do I/O outside BQL whenever possible Paolo Bonzini
2015-07-02  8:20 ` [Qemu-devel] [PATCH 9/9] kvm: Switch to unlocked MMIO Paolo Bonzini
2015-06-18 16:47 [Qemu-devel] [PATCH for-2.4 0/9] KVM: Do I/O outside BQL whenever possible Paolo Bonzini
2015-06-18 16:47 ` [Qemu-devel] [PATCH 9/9] kvm: Switch to unlocked MMIO Paolo Bonzini
