[RFC PATCH 0/3] KVM: allow listener to stop all vcpus before

All of lore.kernel.org
 help / color / mirror / Atom feed

* [RFC PATCH 0/3] KVM: allow listener to stop all vcpus before
@ 2022-11-04 15:14 Emanuele Giuseppe Esposito
  2022-11-04 15:14 ` [RFC PATCH 1/3] KVM: keep track of running ioctls Emanuele Giuseppe Esposito
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Emanuele Giuseppe Esposito @ 2022-11-04 15:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Yanan Wang, kvm, Emanuele Giuseppe Esposito

QEMU needs to perform memslots operations like merging and splitting,
and each operation requires more than a single ioctl.
Therefore if a vcpu is concurrently reading the same memslots,
it could end up reading something that was temporarly deleted.
For example, merging two memslots into one would imply:
DELETE(m1)
DELETE(m2)
CREATE(m1+m2)

And a vcpu could attempt to read m2 right after it is deleted, but
before the new one is created.

This approach is 100% QEMU-based. No KVM API modification is involved,
but implies that QEMU must make sure no new ioctl is running and all
vcpus are stopped.

The logic and code are basically taken from David Hildenbrand
proposal given a while ago while reviewing a previous attempt where
I suggested to solve the above problem directly in KVM by extending
its API.

This is the original code:
https://github.com/davidhildenbrand/qemu/commit/86b1bf546a8d00908e33f7362b0b61e2be8dbb7a

I just split the patch in three smaller patches, and used a
QemuLockCnt instead of counter + mutex.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1979276

Emanuele

David Hildenbrand (1):
  kvm: Atomic memslot updates

Emanuele Giuseppe Esposito (2):
  KVM: keep track of running ioctls
  KVM: keep track of running vcpu ioctls

 accel/kvm/kvm-all.c      | 175 ++++++++++++++++++++++++++++++++++++---
 hw/core/cpu-common.c     |   2 +
 include/hw/core/cpu.h    |   3 +
 include/sysemu/kvm_int.h |   8 ++
 4 files changed, 177 insertions(+), 11 deletions(-)

-- 
2.31.1

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [RFC PATCH 1/3] KVM: keep track of running ioctls
  2022-11-04 15:14 [RFC PATCH 0/3] KVM: allow listener to stop all vcpus before Emanuele Giuseppe Esposito
@ 2022-11-04 15:14 ` Emanuele Giuseppe Esposito
  2022-11-04 15:14 ` [RFC PATCH 2/3] KVM: keep track of running vcpu ioctls Emanuele Giuseppe Esposito
  2022-11-04 15:14 ` [RFC PATCH 3/3] kvm: Atomic memslot updates Emanuele Giuseppe Esposito
  2 siblings, 0 replies; 5+ messages in thread
From: Emanuele Giuseppe Esposito @ 2022-11-04 15:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Yanan Wang, kvm, Emanuele Giuseppe Esposito, David Hildenbrand

kvm_in_ioctl_lock uses a QemuLockCnt to keep track of the running
ioctls. It will be used by the memory listener to make sure no
new ioctl runs when it has to perform memslots modifications.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 accel/kvm/kvm-all.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index f99b0becd8..099d7bda56 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -105,6 +105,8 @@ static int kvm_sstep_flags;
 static bool kvm_immediate_exit;
 static hwaddr kvm_max_slot_size = ~0;
 
+QemuLockCnt kvm_in_ioctl_lock;
+
 static const KVMCapabilityInfo kvm_required_capabilites[] = {
     KVM_CAP_INFO(USER_MEMORY),
     KVM_CAP_INFO(DESTROY_MEMORY_REGION_WORKS),
@@ -2310,6 +2312,7 @@ static int kvm_init(MachineState *ms)
     assert(TARGET_PAGE_SIZE <= qemu_real_host_page_size());
 
     s->sigmask_len = 8;
+    qemu_lockcnt_init(&kvm_in_ioctl_lock);
 
 #ifdef KVM_CAP_SET_GUEST_DEBUG
     QTAILQ_INIT(&s->kvm_sw_breakpoints);
@@ -2808,6 +2811,18 @@ static void kvm_eat_signals(CPUState *cpu)
     } while (sigismember(&chkset, SIG_IPI));
 }
 
+static void kvm_set_in_ioctl(bool in_ioctl)
+{
+    if (likely(qemu_mutex_iothread_locked())) {
+        return;
+    }
+    if (in_ioctl) {
+        qemu_lockcnt_inc(&kvm_in_ioctl_lock);
+    } else {
+        qemu_lockcnt_dec(&kvm_in_ioctl_lock);
+    }
+}
+
 int kvm_cpu_exec(CPUState *cpu)
 {
     struct kvm_run *run = cpu->kvm_run;
@@ -3014,7 +3029,9 @@ int kvm_vm_ioctl(KVMState *s, int type, ...)
     va_end(ap);
 
     trace_kvm_vm_ioctl(type, arg);
+    kvm_set_in_ioctl(true);
     ret = ioctl(s->vmfd, type, arg);
+    kvm_set_in_ioctl(false);
     if (ret == -1) {
         ret = -errno;
     }
@@ -3050,7 +3067,9 @@ int kvm_device_ioctl(int fd, int type, ...)
     va_end(ap);
 
     trace_kvm_device_ioctl(fd, type, arg);
+    kvm_set_in_ioctl(true);
     ret = ioctl(fd, type, arg);
+    kvm_set_in_ioctl(false);
     if (ret == -1) {
         ret = -errno;
     }
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [RFC PATCH 2/3] KVM: keep track of running vcpu ioctls
  2022-11-04 15:14 [RFC PATCH 0/3] KVM: allow listener to stop all vcpus before Emanuele Giuseppe Esposito
  2022-11-04 15:14 ` [RFC PATCH 1/3] KVM: keep track of running ioctls Emanuele Giuseppe Esposito
@ 2022-11-04 15:14 ` Emanuele Giuseppe Esposito
  2022-11-04 15:14 ` [RFC PATCH 3/3] kvm: Atomic memslot updates Emanuele Giuseppe Esposito
  2 siblings, 0 replies; 5+ messages in thread
From: Emanuele Giuseppe Esposito @ 2022-11-04 15:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Yanan Wang, kvm, Emanuele Giuseppe Esposito, David Hildenbrand

Just as in kvm-all.c, introduce per-vcpu QemuLockCnt to keep
track when vcpus are executing an ioctl.
Keep a per-vcpu lock to minimize cache line bouncing.
The kvm listener will then use the lock to prevent new ioctls
from running when modifying memslots.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 accel/kvm/kvm-all.c   | 14 ++++++++++++++
 hw/core/cpu-common.c  |  2 ++
 include/hw/core/cpu.h |  3 +++
 3 files changed, 19 insertions(+)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 099d7bda56..48cdf1fecd 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2811,6 +2811,18 @@ static void kvm_eat_signals(CPUState *cpu)
     } while (sigismember(&chkset, SIG_IPI));
 }
 
+static void kvm_cpu_set_in_ioctl(CPUState *cpu, bool in_ioctl)
+{
+    if (unlikely(qemu_mutex_iothread_locked())) {
+        return;
+    }
+    if (in_ioctl) {
+        qemu_lockcnt_inc(&cpu->in_ioctl_lock);
+    } else {
+        qemu_lockcnt_dec(&cpu->in_ioctl_lock);
+    }
+}
+
 static void kvm_set_in_ioctl(bool in_ioctl)
 {
     if (likely(qemu_mutex_iothread_locked())) {
@@ -3049,7 +3061,9 @@ int kvm_vcpu_ioctl(CPUState *cpu, int type, ...)
     va_end(ap);
 
     trace_kvm_vcpu_ioctl(cpu->cpu_index, type, arg);
+    kvm_cpu_set_in_ioctl(cpu, true);
     ret = ioctl(cpu->kvm_fd, type, arg);
+    kvm_cpu_set_in_ioctl(cpu, false);
     if (ret == -1) {
         ret = -errno;
     }
diff --git a/hw/core/cpu-common.c b/hw/core/cpu-common.c
index f9fdd46b9d..8d6a4b1b65 100644
--- a/hw/core/cpu-common.c
+++ b/hw/core/cpu-common.c
@@ -237,6 +237,7 @@ static void cpu_common_initfn(Object *obj)
     cpu->nr_threads = 1;
 
     qemu_mutex_init(&cpu->work_mutex);
+    qemu_lockcnt_init(&cpu->in_ioctl_lock);
     QSIMPLEQ_INIT(&cpu->work_list);
     QTAILQ_INIT(&cpu->breakpoints);
     QTAILQ_INIT(&cpu->watchpoints);
@@ -248,6 +249,7 @@ static void cpu_common_finalize(Object *obj)
 {
     CPUState *cpu = CPU(obj);
 
+    qemu_lockcnt_destroy(&cpu->in_ioctl_lock);
     qemu_mutex_destroy(&cpu->work_mutex);
 }
 
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index f9b58773f7..07852b2a3c 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -397,6 +397,9 @@ struct CPUState {
     uint32_t kvm_fetch_index;
     uint64_t dirty_pages;
 
+    /* kvm only for now: CPU is in kvm_vcpu_ioctl() (esp. KVM_RUN) */
+    QemuLockCnt in_ioctl_lock;
+
     /* Used for events with 'vcpu' and *without* the 'disabled' properties */
     DECLARE_BITMAP(trace_dstate_delayed, CPU_TRACE_DSTATE_MAX_EVENTS);
     DECLARE_BITMAP(trace_dstate, CPU_TRACE_DSTATE_MAX_EVENTS);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [RFC PATCH 3/3]  kvm: Atomic memslot updates
  2022-11-04 15:14 [RFC PATCH 0/3] KVM: allow listener to stop all vcpus before Emanuele Giuseppe Esposito
  2022-11-04 15:14 ` [RFC PATCH 1/3] KVM: keep track of running ioctls Emanuele Giuseppe Esposito
  2022-11-04 15:14 ` [RFC PATCH 2/3] KVM: keep track of running vcpu ioctls Emanuele Giuseppe Esposito
@ 2022-11-04 15:14 ` Emanuele Giuseppe Esposito
  2022-11-08 16:25   ` Paolo Bonzini
  2 siblings, 1 reply; 5+ messages in thread
From: Emanuele Giuseppe Esposito @ 2022-11-04 15:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Yanan Wang, kvm, David Hildenbrand, Emanuele Giuseppe Esposito

From: David Hildenbrand <david@redhat.com>

If we update an existing memslot (e.g., resize, split), we temporarily
remove the memslot to re-add it immediately afterwards. These updates
are not atomic, especially not for KVM VCPU threads, such that we can
get spurious faults.

Let's inhibit most KVM ioctls while performing relevant updates, such
that we can perform the update just as if it would happen atomically
without additional kernel support.

We capture the add/del changes and apply them in the notifier commit
stage instead. There, we can check for overlaps and perform the ioctl
inhibiting only if really required (-> overlap).

To keep things simple we don't perform additional checks that wouldn't
actually result in an overlap -- such as !RAM memory regions in some
cases (see kvm_set_phys_mem()).

To minimize cache-line bouncing, use a separate indicator
(in_ioctl_lock) per CPU.  Also, make sure to hold the kvm_slots_lock
while performing both actions (removing+re-adding).

We have to wait until all IOCTLs were exited and block new ones from
getting executed. Kick all CPUs, so they will exit the KVM_RUN ioctl.

This approach cannot result in a deadlock as long as the inhibitor does
not hold any locks that might hinder an IOCTL from getting finished and
exited - something fairly unusual. The inhibitor will always hold the BQL.

AFAIKs, one possible candidate would be userfaultfd. If a page cannot be
placed (e.g., during postcopy), because we're waiting for a lock, or if the
userfaultfd thread cannot process a fault, because it is waiting for a
lock, there could be a deadlock. However, the BQL is not applicable here,
because any other guest memory access while holding the BQL would already
result in a deadlock.

Nothing else in the kernel should block forever and wait for userspace
intervention.

Note: pause_all_vcpus()/resume_all_vcpus() or
start_exclusive()/end_exclusive() cannot be used, as they either drop
the BQL or require to be called without the BQL - something inhibitors
cannot handle. We need a low-level locking mechanism that is
deadlock-free even when not releasing the BQL.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Tested-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 accel/kvm/kvm-all.c      | 142 ++++++++++++++++++++++++++++++++++++---
 include/sysemu/kvm_int.h |   8 +++
 2 files changed, 139 insertions(+), 11 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 48cdf1fecd..05b7aa5403 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -46,6 +46,7 @@
 #include "sysemu/hw_accel.h"
 #include "kvm-cpus.h"
 #include "sysemu/dirtylimit.h"
+#include "qemu/range.h"
 
 #include "hw/boards.h"
 #include "monitor/stats.h"
@@ -1294,6 +1295,7 @@ void kvm_set_max_memslot_size(hwaddr max_slot_size)
     kvm_max_slot_size = max_slot_size;
 }
 
+/* Called with KVMMemoryListener.slots_lock held */
 static void kvm_set_phys_mem(KVMMemoryListener *kml,
                              MemoryRegionSection *section, bool add)
 {
@@ -1328,14 +1330,12 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
     ram = memory_region_get_ram_ptr(mr) + mr_offset;
     ram_start_offset = memory_region_get_ram_addr(mr) + mr_offset;
 
-    kvm_slots_lock();
-
     if (!add) {
         do {
             slot_size = MIN(kvm_max_slot_size, size);
             mem = kvm_lookup_matching_slot(kml, start_addr, slot_size);
             if (!mem) {
-                goto out;
+                return;
             }
             if (mem->flags & KVM_MEM_LOG_DIRTY_PAGES) {
                 /*
@@ -1373,7 +1373,7 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
             start_addr += slot_size;
             size -= slot_size;
         } while (size);
-        goto out;
+        return;
     }
 
     /* register the new slot */
@@ -1398,9 +1398,6 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
         ram += slot_size;
         size -= slot_size;
     } while (size);
-
-out:
-    kvm_slots_unlock();
 }
 
 static void *kvm_dirty_ring_reaper_thread(void *data)
@@ -1457,18 +1454,137 @@ static void kvm_region_add(MemoryListener *listener,
                            MemoryRegionSection *section)
 {
     KVMMemoryListener *kml = container_of(listener, KVMMemoryListener, listener);
+    KVMMemoryUpdate *update;
+
+    update = g_new0(KVMMemoryUpdate, 1);
+    update->section = memory_region_section_new_copy(section);
 
-    memory_region_ref(section->mr);
-    kvm_set_phys_mem(kml, section, true);
+    QSIMPLEQ_INSERT_TAIL(&kml->transaction_add, update, next);
 }
 
 static void kvm_region_del(MemoryListener *listener,
                            MemoryRegionSection *section)
 {
     KVMMemoryListener *kml = container_of(listener, KVMMemoryListener, listener);
+    KVMMemoryUpdate *update;
+
+    update = g_new0(KVMMemoryUpdate, 1);
+    update->section = memory_region_section_new_copy(section);
+
+    QSIMPLEQ_INSERT_TAIL(&kml->transaction_del, update, next);
+}
+
+static void kvm_ioctl_inhibit_begin(void)
+{
+    CPUState *cpu;
+
+    /*
+     * We allow to inhibit only when holding the BQL, so we can identify
+     * when an inhibitor wants to issue an ioctl easily.
+     */
+    g_assert(qemu_mutex_iothread_locked());
+
+    CPU_FOREACH(cpu) {
+        qemu_lockcnt_lock(&cpu->in_ioctl_lock);
+    }
+    qemu_lockcnt_lock(&kvm_in_ioctl_lock);
 
-    kvm_set_phys_mem(kml, section, false);
-    memory_region_unref(section->mr);
+    /* Inhibiting happens rarely, we can keep things simple and spin here. */
+    while (true) {
+        bool any_cpu_in_ioctl = false;
+
+        CPU_FOREACH(cpu) {
+            if (qemu_lockcnt_count(&cpu->in_ioctl_lock)) {
+                any_cpu_in_ioctl = true;
+                qemu_cpu_kick(cpu);
+            }
+        }
+        if (!any_cpu_in_ioctl &&
+            !qemu_lockcnt_count(&kvm_in_ioctl_lock)) {
+            break;
+        }
+        g_usleep(100);
+    }
+}
+
+static void kvm_ioctl_inhibit_end(void)
+{
+    CPUState *cpu;
+
+    qemu_lockcnt_unlock(&kvm_in_ioctl_lock);
+    CPU_FOREACH(cpu) {
+        qemu_lockcnt_unlock(&cpu->in_ioctl_lock);
+    }
+}
+
+static void kvm_region_commit(MemoryListener *listener)
+{
+    KVMMemoryListener *kml = container_of(listener, KVMMemoryListener,
+                                          listener);
+    KVMMemoryUpdate *u1, *u2;
+    bool need_inhibit = false;
+
+    if (QSIMPLEQ_EMPTY(&kml->transaction_add) &&
+        QSIMPLEQ_EMPTY(&kml->transaction_del)) {
+        return;
+    }
+
+    /*
+     * We have to be careful when regions to add overlap with ranges to remove.
+     * We have to simulate atomic KVM memslot updates by making sure no ioctl()
+     * is currently active.
+     *
+     * The lists are order by addresses, so it's easy to find overlaps.
+     */
+    u1 = QSIMPLEQ_FIRST(&kml->transaction_del);
+    u2 = QSIMPLEQ_FIRST(&kml->transaction_add);
+    while (u1 && u2) {
+        Range r1, r2;
+
+        range_init_nofail(&r1, u1->section->offset_within_address_space,
+                          int128_get64(u1->section->size));
+        range_init_nofail(&r2, u2->section->offset_within_address_space,
+                          int128_get64(u2->section->size));
+
+        if (range_overlaps_range(&r1, &r2)) {
+            need_inhibit = true;
+            break;
+        }
+        if (range_lob(&r1) < range_lob(&r2)) {
+            u1 = QSIMPLEQ_NEXT(u1, next);
+        } else {
+            u2 = QSIMPLEQ_NEXT(u2, next);
+        }
+    }
+
+
+    kvm_slots_lock();
+    if (need_inhibit) {
+        kvm_ioctl_inhibit_begin();
+    }
+
+    /* Remove all memslots before adding the new ones. */
+    QSIMPLEQ_FOREACH_SAFE(u1, &kml->transaction_del, next, u2) {
+        kvm_set_phys_mem(kml, u1->section, false);
+        memory_region_unref(u1->section->mr);
+
+        QSIMPLEQ_REMOVE(&kml->transaction_del, u1, KVMMemoryUpdate, next);
+        memory_region_section_free_copy(u1->section);
+        g_free(u1);
+    }
+    QSIMPLEQ_FOREACH_SAFE(u1, &kml->transaction_add, next, u2) {
+        memory_region_ref(u1->section->mr);
+        kvm_set_phys_mem(kml, u1->section, true);
+
+        QSIMPLEQ_REMOVE(&kml->transaction_add, u1, KVMMemoryUpdate, next);
+        memory_region_section_free_copy(u1->section);
+        g_free(u1);
+    }
+
+    if (need_inhibit) {
+        kvm_ioctl_inhibit_end();
+    }
+    kvm_slots_unlock();
 }
 
 static void kvm_log_sync(MemoryListener *listener,
@@ -1612,8 +1728,12 @@ void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml,
         kml->slots[i].slot = i;
     }
 
+    QSIMPLEQ_INIT(&kml->transaction_add);
+    QSIMPLEQ_INIT(&kml->transaction_del);
+
     kml->listener.region_add = kvm_region_add;
     kml->listener.region_del = kvm_region_del;
+    kml->listener.commit = kvm_region_commit;
     kml->listener.log_start = kvm_log_start;
     kml->listener.log_stop = kvm_log_stop;
     kml->listener.priority = 10;
diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
index 3b4adcdc10..5ea2d7924b 100644
--- a/include/sysemu/kvm_int.h
+++ b/include/sysemu/kvm_int.h
@@ -12,6 +12,7 @@
 #include "exec/memory.h"
 #include "qapi/qapi-types-common.h"
 #include "qemu/accel.h"
+#include "qemu/queue.h"
 #include "sysemu/kvm.h"
 
 typedef struct KVMSlot
@@ -31,10 +32,17 @@ typedef struct KVMSlot
     ram_addr_t ram_start_offset;
 } KVMSlot;
 
+typedef struct KVMMemoryUpdate {
+    QSIMPLEQ_ENTRY(KVMMemoryUpdate) next;
+    MemoryRegionSection *section;
+} KVMMemoryUpdate;
+
 typedef struct KVMMemoryListener {
     MemoryListener listener;
     KVMSlot *slots;
     int as_id;
+    QSIMPLEQ_HEAD(, KVMMemoryUpdate) transaction_add;
+    QSIMPLEQ_HEAD(, KVMMemoryUpdate) transaction_del;
 } KVMMemoryListener;
 
 #define KVM_MSI_HASHTAB_SIZE    256
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH 3/3] kvm: Atomic memslot updates
  2022-11-04 15:14 ` [RFC PATCH 3/3] kvm: Atomic memslot updates Emanuele Giuseppe Esposito
@ 2022-11-08 16:25   ` Paolo Bonzini
  0 siblings, 0 replies; 5+ messages in thread
From: Paolo Bonzini @ 2022-11-08 16:25 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, qemu-devel
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, kvm, David Hildenbrand

On 11/4/22 16:14, Emanuele Giuseppe Esposito wrote:
> +    g_assert(qemu_mutex_iothread_locked());

Please add a comment here:

     /* Block further invocations of the ioctls outside the BQL.  */

> +    CPU_FOREACH(cpu) {
> +        qemu_lockcnt_lock(&cpu->in_ioctl_lock);
> +    }
> +    qemu_lockcnt_lock(&kvm_in_ioctl_lock);
>   
> -    kvm_set_phys_mem(kml, section, false);
> -    memory_region_unref(section->mr);
> +    /* Inhibiting happens rarely, we can keep things simple and spin here. */

Not making it spin is pretty easy.  You can add a qemu_event_set to 
kvm_set_in_ioctl() and kvm_cpu_set_in_ioctl(), and here something like:

     if (in_kvm_ioctls()) {
         qemu_event_reset(&kvm_in_ioctl_event);
         if (in_kvm_ioctls()) {
             qemu_event_wait(&kvm_in_ioctl_event);
         }
     }

where in_kvm_ioctls() returns true if any (vCPU or KVM) lockcnt has a 
nonzero count.

Also please create a new header sysemu/accel-blocker.h and 
accel/blocker.c or something like that with all the functions, because 
this code can potentially be used by all KVM-like accelerators.

Paolo


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-11-08 16:26 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-04 15:14 [RFC PATCH 0/3] KVM: allow listener to stop all vcpus before Emanuele Giuseppe Esposito
2022-11-04 15:14 ` [RFC PATCH 1/3] KVM: keep track of running ioctls Emanuele Giuseppe Esposito
2022-11-04 15:14 ` [RFC PATCH 2/3] KVM: keep track of running vcpu ioctls Emanuele Giuseppe Esposito
2022-11-04 15:14 ` [RFC PATCH 3/3] kvm: Atomic memslot updates Emanuele Giuseppe Esposito
2022-11-08 16:25   ` Paolo Bonzini

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.