* [PATCH 0/2] KVM: stop all vcpus before modifying memslots
@ 2022-10-22 15:48 Emanuele Giuseppe Esposito
  2022-10-22 15:48 ` [PATCH 1/2] linux-headers/linux/kvm.h: introduce kvm_userspace_memory_region_list ioctl Emanuele Giuseppe Esposito
  2022-10-22 15:48 ` [PATCH 2/2] accel/kvm: introduce begin/commit listener callbacks Emanuele Giuseppe Esposito
  0 siblings, 2 replies; 3+ messages in thread
From: Emanuele Giuseppe Esposito @ 2022-10-22 15:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, Maxim Levitsky, Michael S. Tsirkin, Cornelia Huck,
	David Hildenbrand, kvm, Emanuele Giuseppe Esposito

QEMU needs to perform memslot operations such as merging and splitting,
and each operation requires more than a single ioctl.
Therefore, if a vcpu is concurrently reading the same memslots,
it could end up reading something that was temporarily deleted.
For example, merging two memslots into one would imply:
DELETE(m1)
DELETE(m2)
CREATE(m1+m2)

And a vcpu could attempt to read m2 right after it is deleted, but
before the new one is created.
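
The window described above can be reproduced with a small single-threaded
model. This is only a sketch: struct memslot, set_memslot(),
gpa_is_mapped() and merge_and_probe() are hypothetical names, not KVM or
QEMU code, and the two slots are assumed to be adjacent in guest physical
address space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SLOTS 8

/* Hypothetical, simplified stand-in for a KVM memslot. */
struct memslot {
    uint64_t gpa;   /* guest physical start address */
    uint64_t size;  /* length in bytes; 0 means the slot is unused */
};

struct memslot_table {
    struct memslot slots[MAX_SLOTS];
};

/* Models one memslot ioctl: size == 0 deletes the slot, else creates it. */
static void set_memslot(struct memslot_table *t, size_t id,
                        uint64_t gpa, uint64_t size)
{
    assert(id < MAX_SLOTS);
    t->slots[id].gpa = gpa;
    t->slots[id].size = size;
}

/* Models a vcpu access: true if some slot currently covers the address. */
static bool gpa_is_mapped(const struct memslot_table *t, uint64_t gpa)
{
    for (size_t i = 0; i < MAX_SLOTS; i++) {
        const struct memslot *s = &t->slots[i];

        if (s->size != 0 && gpa >= s->gpa && gpa - s->gpa < s->size) {
            return true;
        }
    }
    return false;
}

/*
 * Merges slots 0 and 1 into slot 0 with three separate updates and records
 * whether `probe` was mapped after each one. Assumes slot 1 starts where
 * slot 0 ends.
 */
static void merge_and_probe(struct memslot_table *t, uint64_t probe,
                            bool mapped_after[3])
{
    uint64_t gpa = t->slots[0].gpa;
    uint64_t size = t->slots[0].size + t->slots[1].size;

    set_memslot(t, 0, 0, 0);        /* DELETE(m1) */
    mapped_after[0] = gpa_is_mapped(t, probe);
    set_memslot(t, 1, 0, 0);        /* DELETE(m2) */
    mapped_after[1] = gpa_is_mapped(t, probe);
    set_memslot(t, 0, gpa, size);   /* CREATE(m1+m2) */
    mapped_after[2] = gpa_is_mapped(t, probe);
}
```

With m1 at [0x0000, 0x1000) and m2 at [0x1000, 0x2000), probing 0x1800
finds the address mapped after the first and third steps but not after
the second, which is the gap a concurrently running vcpu could hit.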

To solve this problem, use the newly introduced KVM API:
KVM_KICK_ALL_RUNNING_VCPUS and KVM_RESUME_ALL_KICKED_VCPUS.
This new API allows userspace to stop and resume, respectively, all
running vcpus. A "running" vcpu is a vcpu that is executing
the KVM_RUN ioctl.
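
From plain userspace, the intended call order could look roughly like the
sketch below. The request numbers are copied from patch 1/2 and only work
on a kernel carrying the companion KVM series; KVMIO is defined locally
with the value used by linux/kvm.h. update_memslots_paused(),
memslot_update_fn and count_update() are hypothetical names introduced
for illustration, not QEMU functions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ioctl.h>

/* Values copied from patch 1/2; they are not in mainline <linux/kvm.h>. */
#define KVMIO 0xAE
#define KVM_KICK_ALL_RUNNING_VCPUS  _IO(KVMIO, 0xd2)
#define KVM_RESUME_ALL_KICKED_VCPUS _IO(KVMIO, 0xd3)

/* Hypothetical callback type: performs the actual memslot ioctls. */
typedef int (*memslot_update_fn)(int vm_fd, void *opaque);

/*
 * Pause all running vcpus, apply the update, then resume them.
 * Returns 0 on success or a negative errno value. The update is skipped
 * if the vcpus could not be paused.
 */
static int update_memslots_paused(int vm_fd, memslot_update_fn update,
                                  void *opaque)
{
    int ret;

    assert(update != NULL);

    if (ioctl(vm_fd, KVM_KICK_ALL_RUNNING_VCPUS) < 0) {
        return -errno;
    }

    ret = update(vm_fd, opaque);

    /* Always try to resume; keep the first error if there was one. */
    if (ioctl(vm_fd, KVM_RESUME_ALL_KICKED_VCPUS) < 0 && ret == 0) {
        ret = -errno;
    }
    return ret;
}

/* Example callback that only counts invocations (illustration only). */
static int count_update(int vm_fd, void *opaque)
{
    (void)vm_fd;
    ++*(int *)opaque;
    return 0;
}
```

Unlike the listener callbacks in patch 2/2, this sketch checks the ioctl
return values, so a kernel without the new API leaves the memslots
untouched instead of modifying them while vcpus are still running.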

KVM already handles the case of KVM_RUN being called after
KVM_KICK_ALL_RUNNING_VCPUS but before KVM_RESUME_ALL_KICKED_VCPUS
by simply returning immediately; QEMU additionally avoids that case
by using the event API.
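
The event-based gate can be pictured with a minimal manual-reset event.
The sketch below is an assumption-laden stand-in built on pthreads, not
QEMU's QemuEvent implementation; struct gate and the gate_*() helpers are
hypothetical names that mirror the roles of qemu_event_init(),
qemu_event_reset(), qemu_event_set() and qemu_event_wait().

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for QemuEvent: a manual-reset event. */
struct gate {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool open;
};

static void gate_init(struct gate *g, bool open)
{
    int rc;

    rc = pthread_mutex_init(&g->lock, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&g->cond, NULL);
    assert(rc == 0);
    (void)rc;
    g->open = open;
}

/* Role of qemu_event_reset(): callers of gate_wait() block from now on. */
static void gate_reset(struct gate *g)
{
    pthread_mutex_lock(&g->lock);
    g->open = false;
    pthread_mutex_unlock(&g->lock);
}

/* Role of qemu_event_set(): wake all waiters and let later ones through. */
static void gate_set(struct gate *g)
{
    pthread_mutex_lock(&g->lock);
    g->open = true;
    pthread_cond_broadcast(&g->cond);
    pthread_mutex_unlock(&g->lock);
}

/* Role of qemu_event_wait(): return at once if open, else sleep. */
static void gate_wait(struct gate *g)
{
    pthread_mutex_lock(&g->lock);
    while (!g->open) {
        pthread_cond_wait(&g->cond, &g->lock);
    }
    pthread_mutex_unlock(&g->lock);
}

/* Helper for inspection in tests; not part of the QemuEvent API. */
static bool gate_is_open(struct gate *g)
{
    bool open;

    pthread_mutex_lock(&g->lock);
    open = g->open;
    pthread_mutex_unlock(&g->lock);
    return open;
}
```

In this model, the thread changing memslots would call gate_reset() before
the first update and gate_set() after the last one, while each vcpu thread
would call gate_wait() right before entering KVM_RUN.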

This is the simplest solution, pausing all vcpus on the KVM
side, so that:
- QEMU just needs to call the new API before making memslot
  changes, keeping modifications to a minimum
- dirty page updates are also performed while vcpus are blocked, so
  there is no time window between the dirty page ioctl and memslot
  modifications, since all vcpus are stopped
- there is no need to modify the existing memslots API

This series requires the KVM series "KVM: API to block and resume all
running vcpus in a vm".

Emanuele Giuseppe Esposito (2):
  linux-headers/linux/kvm.h: introduce kvm_userspace_memory_region_list
    ioctl
  accel/kvm: introduce begin/commit listener callbacks

 accel/kvm/kvm-all.c       | 50 +++++++++++++++++++++++++++++++++++++++
 linux-headers/linux/kvm.h |  3 +++
 2 files changed, 53 insertions(+)

-- 
2.31.1



* [PATCH 1/2] linux-headers/linux/kvm.h: introduce kvm_userspace_memory_region_list ioctl
  2022-10-22 15:48 [PATCH 0/2] KVM: stop all vcpus before modifying memslots Emanuele Giuseppe Esposito
@ 2022-10-22 15:48 ` Emanuele Giuseppe Esposito
  2022-10-22 15:48 ` [PATCH 2/2] accel/kvm: introduce begin/commit listener callbacks Emanuele Giuseppe Esposito
  1 sibling, 0 replies; 3+ messages in thread
From: Emanuele Giuseppe Esposito @ 2022-10-22 15:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, Maxim Levitsky, Michael S. Tsirkin, Cornelia Huck,
	David Hildenbrand, kvm, Emanuele Giuseppe Esposito

Introduce the new KVM_KICK_ALL_RUNNING_VCPUS and KVM_RESUME_ALL_KICKED_VCPUS
ioctls, which will be used respectively to pause and then resume all vcpus
currently executing KVM_RUN in KVM.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 linux-headers/linux/kvm.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index f089349149..1fcf69f903 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -2067,4 +2067,7 @@ struct kvm_stats_desc {
 /* Available with KVM_CAP_XSAVE2 */
 #define KVM_GET_XSAVE2		  _IOR(KVMIO,  0xcf, struct kvm_xsave)
 
+#define KVM_KICK_ALL_RUNNING_VCPUS		_IO(KVMIO,  0xd2)
+#define KVM_RESUME_ALL_KICKED_VCPUS		_IO(KVMIO,  0xd3)
+
 #endif /* __LINUX_KVM_H */
-- 
2.31.1



* [PATCH 2/2] accel/kvm: introduce begin/commit listener callbacks
  2022-10-22 15:48 [PATCH 0/2] KVM: stop all vcpus before modifying memslots Emanuele Giuseppe Esposito
  2022-10-22 15:48 ` [PATCH 1/2] linux-headers/linux/kvm.h: introduce kvm_userspace_memory_region_list ioctl Emanuele Giuseppe Esposito
@ 2022-10-22 15:48 ` Emanuele Giuseppe Esposito
  1 sibling, 0 replies; 3+ messages in thread
From: Emanuele Giuseppe Esposito @ 2022-10-22 15:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, Maxim Levitsky, Michael S. Tsirkin, Cornelia Huck,
	David Hildenbrand, kvm, Emanuele Giuseppe Esposito

These callbacks make sure that all vcpus are blocked before
performing memslot updates, and resumed once we are finished.

They rely on kvm support for KVM_KICK_ALL_RUNNING_VCPUS and
KVM_RESUME_ALL_KICKED_VCPUS ioctls to respectively pause and
resume all vcpus that are in KVM_RUN state.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 accel/kvm/kvm-all.c | 50 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 645f0a249a..bd0dfa8613 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -178,6 +178,8 @@ bool kvm_has_guest_debug;
 int kvm_sstep_flags;
 static bool kvm_immediate_exit;
 static hwaddr kvm_max_slot_size = ~0;
+static QemuEvent mem_transaction_proceed;
+
 
 static const KVMCapabilityInfo kvm_required_capabilites[] = {
     KVM_CAP_INFO(USER_MEMORY),
@@ -1523,6 +1525,38 @@ static void kvm_region_del(MemoryListener *listener,
     memory_region_unref(section->mr);
 }
 
+static void kvm_begin(MemoryListener *listener)
+{
+    KVMState *s = kvm_state;
+
+    /*
+     * Make sure BQL is taken so cpus in kvm_cpu_exec that just exited from
+     * KVM_RUN do not continue, since many run->exit_reason take it anyways.
+     */
+    assert(qemu_mutex_iothread_locked());
+
+    /*
+     * Stop incoming cpus that want to execute KVM_RUN from running.
+     * Makes cpus calling qemu_event_wait() in kvm_cpu_exec() block.
+     */
+    qemu_event_reset(&mem_transaction_proceed);
+
+    /* Ask KVM to stop all vcpus that are currently running KVM_RUN */
+    kvm_vm_ioctl(s, KVM_KICK_ALL_RUNNING_VCPUS);
+}
+
+static void kvm_commit(MemoryListener *listener)
+{
+    KVMState *s = kvm_state;
+    assert(qemu_mutex_iothread_locked());
+
+    /* Ask KVM to resume all vcpus that are currently blocked in KVM_RUN */
+    kvm_vm_ioctl(s, KVM_RESUME_ALL_KICKED_VCPUS);
+
+    /* Resume cpus waiting in qemu_event_wait() in kvm_cpu_exec() */
+    qemu_event_set(&mem_transaction_proceed);
+}
+
 static void kvm_log_sync(MemoryListener *listener,
                          MemoryRegionSection *section)
 {
@@ -1668,6 +1702,8 @@ void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml,
     kml->listener.region_del = kvm_region_del;
     kml->listener.log_start = kvm_log_start;
     kml->listener.log_stop = kvm_log_stop;
+    kml->listener.begin = kvm_begin;
+    kml->listener.commit = kvm_commit;
     kml->listener.priority = 10;
     kml->listener.name = name;
 
@@ -2611,6 +2647,7 @@ static int kvm_init(MachineState *ms)
     }
 
     kvm_state = s;
+    qemu_event_init(&mem_transaction_proceed, false);
 
     ret = kvm_arch_init(ms, s);
     if (ret < 0) {
@@ -2875,6 +2912,19 @@ int kvm_cpu_exec(CPUState *cpu)
     }
 
     qemu_mutex_unlock_iothread();
+
+    /*
+     * Wait until a running memory transaction (memslot update) concludes.
+     *
+     * If the event state is EV_SET, it means kvm_commit() has already finished
+     * and called qemu_event_set(), therefore cpu can execute.
+     *
+     * If it's EV_FREE, it means kvm_begin() has already called
+     * qemu_event_reset(), therefore a memory transaction is happening and the
+     * cpu must wait.
+     */
+    qemu_event_wait(&mem_transaction_proceed);
+
     cpu_exec_start(cpu);
 
     do {
-- 
2.31.1

