All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/6] Handle hypercall code overlay page in userspace
@ 2021-05-24 19:54 Siddharth Chandrasekaran
  2021-05-24 19:54 ` [PATCH 1/6] hyper-v: Overlay abstraction for synic event and msg pages Siddharth Chandrasekaran
                   ` (6 more replies)
  0 siblings, 7 replies; 17+ messages in thread
From: Siddharth Chandrasekaran @ 2021-05-24 19:54 UTC (permalink / raw)
  To: Paolo Bonzini, Marcelo Tosatti
  Cc: Siddharth Chandrasekaran, Siddharth Chandrasekaran,
	Alexander Graf, Evgeny Iakovlev, Liran Alon, Ioannis Aslanidis,
	qemu-devel, kvm

Hyprcall code page is specified in the Hyper-V TLFS to be an overlay
page, ie., guest chooses a GPA and the host _places_ a page at that
location, making it visible to the guest and the existing page becomes
inaccessible. Similarly when disabled, the host should _remove_ the
overlay and the old page should become visible to the guest.

Now, KVM directly patches the instructions into the guest chosen GPA for
the hypercall code page. Strictly speaking this is guest memory
corruption as the hyper-v TLFS specifies that the underlying page should
be preserved. Since the guest seldom moves the hypercall code page
around, it didn't see any problems till now. When trying to implement
VSM API, we are seeing some exotic use of overlay pages which start
expecting the underlying page to be intact. To handle those cases, we
need a more generic approach handling these primitives.

This patchset tries build an infrastructure for handling overlay pages
in general by using the new user space MSR filtering feature of KVM to
filter out writes to overlay MSRs, handle them in user space and then
forward those writes back to KVM so it gets an opportunity to write
contents into the page that was overlaid here. Additionally it does some
housekeeping here and there.

P.S. This is a follow up to the my initial approach of handling this in
kernel, see [1] for discussions.

~ Sid.

[1]: https://lore.kernel.org/kvm/20210423090333.21910-1-sidcha@amazon.de/

Siddharth Chandrasekaran (6):
  hyper-v: Overlay abstraction for synic event and msg pages
  hyper-v: Use -1 as invalid overlay address
  kvm/i386: Stop using cpu->kvm_msr_buf in kvm_put_one_msr()
  kvm/i386: Avoid multiple calls to check_extension(KVM_CAP_HYPERV)
  kvm/i386: Add support for user space MSR filtering
  hyper-v: Handle hypercall code page as an overlay page

 hw/hyperv/hyperv.c         | 116 +++++++++++++++++++++----------------
 include/hw/hyperv/hyperv.h |  15 +++++
 target/i386/kvm/hyperv.c   |  94 ++++++++++++++++++++++++++++--
 target/i386/kvm/hyperv.h   |   4 ++
 target/i386/kvm/kvm.c      | 113 ++++++++++++++++++++++++++++++++++--
 target/i386/kvm/kvm_i386.h |   1 +
 6 files changed, 282 insertions(+), 61 deletions(-)

-- 
2.17.1



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/6] hyper-v: Overlay abstraction for synic event and msg pages
  2021-05-24 19:54 [PATCH 0/6] Handle hypercall code overlay page in userspace Siddharth Chandrasekaran
@ 2021-05-24 19:54 ` Siddharth Chandrasekaran
  2021-06-08  8:27   ` Alexander Graf
  2021-05-24 19:54 ` [PATCH 2/6] hyper-v: Use -1 as invalid overlay address Siddharth Chandrasekaran
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Siddharth Chandrasekaran @ 2021-05-24 19:54 UTC (permalink / raw)
  To: Paolo Bonzini, Marcelo Tosatti
  Cc: Siddharth Chandrasekaran, Siddharth Chandrasekaran,
	Alexander Graf, Evgeny Iakovlev, Liran Alon, Ioannis Aslanidis,
	qemu-devel, kvm

Capture overlay page semantic variables into 'struct overlay_page' and
add methods that operate over it. Adapt existing synic event and mesage
pages to use these methods to setup and manage overlays.

Since all overlay pages use bit 0 of the GPA to indicate if the overlay
is enabled, the checks for this bit is moved into the unified overlaying
method hyperv_overlay_update() so the caller does not need to care about
it.

Signed-off-by: Siddharth Chandrasekaran <sidcha@amazon.de>
---
 hw/hyperv/hyperv.c         | 103 ++++++++++++++++++++-----------------
 include/hw/hyperv/hyperv.h |   9 ++++
 target/i386/kvm/hyperv.c   |  10 ++--
 3 files changed, 68 insertions(+), 54 deletions(-)

diff --git a/hw/hyperv/hyperv.c b/hw/hyperv/hyperv.c
index cb1074f234..8d09206702 100644
--- a/hw/hyperv/hyperv.c
+++ b/hw/hyperv/hyperv.c
@@ -28,12 +28,8 @@ struct SynICState {
     CPUState *cs;
 
     bool enabled;
-    hwaddr msg_page_addr;
-    hwaddr event_page_addr;
-    MemoryRegion msg_page_mr;
-    MemoryRegion event_page_mr;
-    struct hyperv_message_page *msg_page;
-    struct hyperv_event_flags_page *event_page;
+    struct hyperv_overlay_page msg_page;
+    struct hyperv_overlay_page event_page;
 };
 
 #define TYPE_SYNIC "hyperv-synic"
@@ -41,43 +37,52 @@ OBJECT_DECLARE_SIMPLE_TYPE(SynICState, SYNIC)
 
 static bool synic_enabled;
 
-bool hyperv_is_synic_enabled(void)
+static void alloc_overlay_page(struct hyperv_overlay_page *overlay,
+                               Object *owner, const char *name)
 {
-    return synic_enabled;
+    memory_region_init_ram(&overlay->mr, owner, name,
+                           qemu_real_host_page_size, &error_abort);
+    overlay->ram_ptr = memory_region_get_ram_ptr(&overlay->mr);
+    overlay->addr = 0;
 }
 
-static SynICState *get_synic(CPUState *cs)
+/**
+ * This method must be called with iothread lock taken as it modifies
+ * the memory hierarchy.
+ */
+static void hyperv_overlay_update(struct hyperv_overlay_page *overlay, hwaddr addr)
 {
-    return SYNIC(object_resolve_path_component(OBJECT(cs), "synic"));
+    /* check if overlay page is enabled */
+    addr = (addr & HYPERV_OVERLAY_ENABLED) ? (addr & TARGET_PAGE_MASK) : 0;
+
+    if (overlay->addr != addr) {
+        if (overlay->addr) {
+            memory_region_del_subregion(get_system_memory(), &overlay->mr);
+        }
+        if (addr) {
+            memory_region_add_subregion(get_system_memory(), addr, &overlay->mr);
+            overlay->ram_ptr = memory_region_get_ram_ptr(&overlay->mr);
+        }
+        overlay->addr = addr;
+    }
 }
 
 static void synic_update(SynICState *synic, bool enable,
                          hwaddr msg_page_addr, hwaddr event_page_addr)
 {
-
     synic->enabled = enable;
-    if (synic->msg_page_addr != msg_page_addr) {
-        if (synic->msg_page_addr) {
-            memory_region_del_subregion(get_system_memory(),
-                                        &synic->msg_page_mr);
-        }
-        if (msg_page_addr) {
-            memory_region_add_subregion(get_system_memory(), msg_page_addr,
-                                        &synic->msg_page_mr);
-        }
-        synic->msg_page_addr = msg_page_addr;
-    }
-    if (synic->event_page_addr != event_page_addr) {
-        if (synic->event_page_addr) {
-            memory_region_del_subregion(get_system_memory(),
-                                        &synic->event_page_mr);
-        }
-        if (event_page_addr) {
-            memory_region_add_subregion(get_system_memory(), event_page_addr,
-                                        &synic->event_page_mr);
-        }
-        synic->event_page_addr = event_page_addr;
-    }
+    hyperv_overlay_update(&synic->msg_page, msg_page_addr);
+    hyperv_overlay_update(&synic->event_page, event_page_addr);
+}
+
+bool hyperv_is_synic_enabled(void)
+{
+    return synic_enabled;
+}
+
+static SynICState *get_synic(CPUState *cs)
+{
+    return SYNIC(object_resolve_path_component(OBJECT(cs), "synic"));
 }
 
 void hyperv_synic_update(CPUState *cs, bool enable,
@@ -104,21 +109,18 @@ static void synic_realize(DeviceState *dev, Error **errp)
     msgp_name = g_strdup_printf("synic-%u-msg-page", vp_index);
     eventp_name = g_strdup_printf("synic-%u-event-page", vp_index);
 
-    memory_region_init_ram(&synic->msg_page_mr, obj, msgp_name,
-                           sizeof(*synic->msg_page), &error_abort);
-    memory_region_init_ram(&synic->event_page_mr, obj, eventp_name,
-                           sizeof(*synic->event_page), &error_abort);
-    synic->msg_page = memory_region_get_ram_ptr(&synic->msg_page_mr);
-    synic->event_page = memory_region_get_ram_ptr(&synic->event_page_mr);
+    alloc_overlay_page(&synic->msg_page, obj, msgp_name);
+    alloc_overlay_page(&synic->event_page, obj, eventp_name);
 
     g_free(msgp_name);
     g_free(eventp_name);
 }
+
 static void synic_reset(DeviceState *dev)
 {
     SynICState *synic = SYNIC(dev);
-    memset(synic->msg_page, 0, sizeof(*synic->msg_page));
-    memset(synic->event_page, 0, sizeof(*synic->event_page));
+    memset(synic->msg_page.ram_ptr, 0, sizeof(struct hyperv_message_page));
+    memset(synic->event_page.ram_ptr, 0, sizeof(struct hyperv_event_flags_page));
     synic_update(synic, false, 0, 0);
 }
 
@@ -254,17 +256,19 @@ static void cpu_post_msg(CPUState *cs, run_on_cpu_data data)
     HvSintRoute *sint_route = data.host_ptr;
     HvSintStagedMessage *staged_msg = sint_route->staged_msg;
     SynICState *synic = sint_route->synic;
+    struct hyperv_message_page *msg_page;
     struct hyperv_message *dst_msg;
     bool wait_for_sint_ack = false;
 
     assert(staged_msg->state == HV_STAGED_MSG_BUSY);
 
-    if (!synic->enabled || !synic->msg_page_addr) {
+    if (!synic->enabled || !synic->msg_page.addr) {
         staged_msg->status = -ENXIO;
         goto posted;
     }
 
-    dst_msg = &synic->msg_page->slot[sint_route->sint];
+    msg_page = synic->msg_page.ram_ptr;
+    dst_msg = &msg_page->slot[sint_route->sint];
 
     if (dst_msg->header.message_type != HV_MESSAGE_NONE) {
         dst_msg->header.message_flags |= HV_MESSAGE_FLAG_PENDING;
@@ -275,7 +279,8 @@ static void cpu_post_msg(CPUState *cs, run_on_cpu_data data)
         staged_msg->status = hyperv_sint_route_set_sint(sint_route);
     }
 
-    memory_region_set_dirty(&synic->msg_page_mr, 0, sizeof(*synic->msg_page));
+    memory_region_set_dirty(&synic->msg_page.mr, 0,
+                            sizeof(struct hyperv_message_page));
 
 posted:
     qatomic_set(&staged_msg->state, HV_STAGED_MSG_POSTED);
@@ -338,22 +343,24 @@ int hyperv_set_event_flag(HvSintRoute *sint_route, unsigned eventno)
     int ret;
     SynICState *synic = sint_route->synic;
     unsigned long *flags, set_mask;
+    struct hyperv_event_flags_page *event_page;
     unsigned set_idx;
 
     if (eventno > HV_EVENT_FLAGS_COUNT) {
         return -EINVAL;
     }
-    if (!synic->enabled || !synic->event_page_addr) {
+    if (!synic->enabled || !synic->event_page.addr) {
         return -ENXIO;
     }
 
     set_idx = BIT_WORD(eventno);
     set_mask = BIT_MASK(eventno);
-    flags = synic->event_page->slot[sint_route->sint].flags;
+    event_page = synic->event_page.ram_ptr;
+    flags = event_page->slot[sint_route->sint].flags;
 
     if ((qatomic_fetch_or(&flags[set_idx], set_mask) & set_mask) != set_mask) {
-        memory_region_set_dirty(&synic->event_page_mr, 0,
-                                sizeof(*synic->event_page));
+        memory_region_set_dirty(&synic->event_page.mr, 0,
+                                sizeof(struct hyperv_event_flags_page));
         ret = hyperv_sint_route_set_sint(sint_route);
     } else {
         ret = 0;
diff --git a/include/hw/hyperv/hyperv.h b/include/hw/hyperv/hyperv.h
index a63ee0003c..3b2e0093b5 100644
--- a/include/hw/hyperv/hyperv.h
+++ b/include/hw/hyperv/hyperv.h
@@ -12,6 +12,15 @@
 
 #include "cpu-qom.h"
 #include "hw/hyperv/hyperv-proto.h"
+#include "exec/memory.h"
+
+#define HYPERV_OVERLAY_ENABLED     (1u << 0)
+
+struct hyperv_overlay_page {
+    hwaddr addr;
+    MemoryRegion mr;
+    void *ram_ptr;
+};
 
 typedef struct HvSintRoute HvSintRoute;
 
diff --git a/target/i386/kvm/hyperv.c b/target/i386/kvm/hyperv.c
index 26efc1e0e6..f49ed2621d 100644
--- a/target/i386/kvm/hyperv.c
+++ b/target/i386/kvm/hyperv.c
@@ -31,12 +31,10 @@ void hyperv_x86_synic_reset(X86CPU *cpu)
 void hyperv_x86_synic_update(X86CPU *cpu)
 {
     CPUX86State *env = &cpu->env;
-    bool enable = env->msr_hv_synic_control & HV_SYNIC_ENABLE;
-    hwaddr msg_page_addr = (env->msr_hv_synic_msg_page & HV_SIMP_ENABLE) ?
-        (env->msr_hv_synic_msg_page & TARGET_PAGE_MASK) : 0;
-    hwaddr event_page_addr = (env->msr_hv_synic_evt_page & HV_SIEFP_ENABLE) ?
-        (env->msr_hv_synic_evt_page & TARGET_PAGE_MASK) : 0;
-    hyperv_synic_update(CPU(cpu), enable, msg_page_addr, event_page_addr);
+
+    hyperv_synic_update(CPU(cpu), env->msr_hv_synic_control & HV_SYNIC_ENABLE,
+                        env->msr_hv_synic_msg_page,
+                        env->msr_hv_synic_evt_page);
 }
 
 static void async_synic_update(CPUState *cs, run_on_cpu_data data)
-- 
2.17.1




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/6] hyper-v: Use -1 as invalid overlay address
  2021-05-24 19:54 [PATCH 0/6] Handle hypercall code overlay page in userspace Siddharth Chandrasekaran
  2021-05-24 19:54 ` [PATCH 1/6] hyper-v: Overlay abstraction for synic event and msg pages Siddharth Chandrasekaran
@ 2021-05-24 19:54 ` Siddharth Chandrasekaran
  2021-06-08  8:27   ` Alexander Graf
  2021-05-24 19:54 ` [PATCH 3/6] kvm/i386: Stop using cpu->kvm_msr_buf in kvm_put_one_msr() Siddharth Chandrasekaran
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Siddharth Chandrasekaran @ 2021-05-24 19:54 UTC (permalink / raw)
  To: Paolo Bonzini, Marcelo Tosatti
  Cc: Siddharth Chandrasekaran, Siddharth Chandrasekaran,
	Alexander Graf, Evgeny Iakovlev, Liran Alon, Ioannis Aslanidis,
	qemu-devel, kvm

When managing overlay pages, we used hwaddr 0 to signal an invalid
address (to disable a page). Although unlikely, 0 _could_ be a valid
overlay offset as Hyper-V TLFS does not specify anything about it.

Use -1 as the invalid address indicator as it can never be a valid
address.

Signed-off-by: Siddharth Chandrasekaran <sidcha@amazon.de>
---
 hw/hyperv/hyperv.c         | 15 +++++++++------
 include/hw/hyperv/hyperv.h |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/hw/hyperv/hyperv.c b/hw/hyperv/hyperv.c
index 8d09206702..ac45e8e139 100644
--- a/hw/hyperv/hyperv.c
+++ b/hw/hyperv/hyperv.c
@@ -43,7 +43,7 @@ static void alloc_overlay_page(struct hyperv_overlay_page *overlay,
     memory_region_init_ram(&overlay->mr, owner, name,
                            qemu_real_host_page_size, &error_abort);
     overlay->ram_ptr = memory_region_get_ram_ptr(&overlay->mr);
-    overlay->addr = 0;
+    overlay->addr = HYPERV_INVALID_OVERLAY_GPA;
 }
 
 /**
@@ -52,14 +52,17 @@ static void alloc_overlay_page(struct hyperv_overlay_page *overlay,
  */
 static void hyperv_overlay_update(struct hyperv_overlay_page *overlay, hwaddr addr)
 {
-    /* check if overlay page is enabled */
-    addr = (addr & HYPERV_OVERLAY_ENABLED) ? (addr & TARGET_PAGE_MASK) : 0;
+    if (addr != HYPERV_INVALID_OVERLAY_GPA) {
+        /* check if overlay page is enabled */
+        addr = (addr & HYPERV_OVERLAY_ENABLED) ?
+                (addr & TARGET_PAGE_MASK) : HYPERV_INVALID_OVERLAY_GPA;
+    }
 
     if (overlay->addr != addr) {
-        if (overlay->addr) {
+        if (overlay->addr != HYPERV_INVALID_OVERLAY_GPA) {
             memory_region_del_subregion(get_system_memory(), &overlay->mr);
         }
-        if (addr) {
+        if (addr != HYPERV_INVALID_OVERLAY_GPA) {
             memory_region_add_subregion(get_system_memory(), addr, &overlay->mr);
             overlay->ram_ptr = memory_region_get_ram_ptr(&overlay->mr);
         }
@@ -121,7 +124,7 @@ static void synic_reset(DeviceState *dev)
     SynICState *synic = SYNIC(dev);
     memset(synic->msg_page.ram_ptr, 0, sizeof(struct hyperv_message_page));
     memset(synic->event_page.ram_ptr, 0, sizeof(struct hyperv_event_flags_page));
-    synic_update(synic, false, 0, 0);
+    synic_update(synic, false, HYPERV_INVALID_OVERLAY_GPA, HYPERV_INVALID_OVERLAY_GPA);
 }
 
 static void synic_class_init(ObjectClass *klass, void *data)
diff --git a/include/hw/hyperv/hyperv.h b/include/hw/hyperv/hyperv.h
index 3b2e0093b5..d989193e84 100644
--- a/include/hw/hyperv/hyperv.h
+++ b/include/hw/hyperv/hyperv.h
@@ -15,6 +15,7 @@
 #include "exec/memory.h"
 
 #define HYPERV_OVERLAY_ENABLED     (1u << 0)
+#define HYPERV_INVALID_OVERLAY_GPA ((hwaddr)-1)
 
 struct hyperv_overlay_page {
     hwaddr addr;
-- 
2.17.1




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/6] kvm/i386: Stop using cpu->kvm_msr_buf in kvm_put_one_msr()
  2021-05-24 19:54 [PATCH 0/6] Handle hypercall code overlay page in userspace Siddharth Chandrasekaran
  2021-05-24 19:54 ` [PATCH 1/6] hyper-v: Overlay abstraction for synic event and msg pages Siddharth Chandrasekaran
  2021-05-24 19:54 ` [PATCH 2/6] hyper-v: Use -1 as invalid overlay address Siddharth Chandrasekaran
@ 2021-05-24 19:54 ` Siddharth Chandrasekaran
  2021-06-08  8:27   ` Alexander Graf
  2021-05-24 19:54 ` [PATCH 4/6] kvm/i386: Avoid multiple calls to check_extension(KVM_CAP_HYPERV) Siddharth Chandrasekaran
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Siddharth Chandrasekaran @ 2021-05-24 19:54 UTC (permalink / raw)
  To: Paolo Bonzini, Marcelo Tosatti
  Cc: Siddharth Chandrasekaran, Siddharth Chandrasekaran,
	Alexander Graf, Evgeny Iakovlev, Liran Alon, Ioannis Aslanidis,
	qemu-devel, kvm

kvm_put_one_msr() zeros cpu->kvm_msr_buf and uses it to set one MSR to
KVM. It is pretty wasteful as cpu->kvm_msr_buf is 4096 bytes long;
instead use a local buffer to avoid memset.

Also, expose this method from kvm_i386.h as hyperv.c needs to set MSRs
in a subsequent patch.

Signed-off-by: Siddharth Chandrasekaran <sidcha@amazon.de>
---
 target/i386/kvm/kvm.c      | 12 ++++++++----
 target/i386/kvm/kvm_i386.h |  1 +
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index d972eb4705..d19a2913fd 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -2534,12 +2534,16 @@ static void kvm_msr_entry_add(X86CPU *cpu, uint32_t index, uint64_t value)
     msrs->nmsrs++;
 }
 
-static int kvm_put_one_msr(X86CPU *cpu, int index, uint64_t value)
+int kvm_put_one_msr(X86CPU *cpu, int index, uint64_t value)
 {
-    kvm_msr_buf_reset(cpu);
-    kvm_msr_entry_add(cpu, index, value);
+    uint8_t msr_buf[sizeof(struct kvm_msrs) + sizeof(struct kvm_msr_entry)] = { 0 };
+    struct kvm_msrs *msr = (struct kvm_msrs *)msr_buf;
+
+    msr->nmsrs = 1;
+    msr->entries[0].index = index;
+    msr->entries[0].data = value;
 
-    return kvm_vcpu_ioctl(CPU(cpu), KVM_SET_MSRS, cpu->kvm_msr_buf);
+    return kvm_vcpu_ioctl(CPU(cpu), KVM_SET_MSRS, msr);
 }
 
 void kvm_put_apicbase(X86CPU *cpu, uint64_t value)
diff --git a/target/i386/kvm/kvm_i386.h b/target/i386/kvm/kvm_i386.h
index dc72508389..0c4cd08071 100644
--- a/target/i386/kvm/kvm_i386.h
+++ b/target/i386/kvm/kvm_i386.h
@@ -40,6 +40,7 @@ void kvm_synchronize_all_tsc(void);
 void kvm_arch_reset_vcpu(X86CPU *cs);
 void kvm_arch_do_init_vcpu(X86CPU *cs);
 
+int kvm_put_one_msr(X86CPU *cpu, int index, uint64_t value);
 void kvm_put_apicbase(X86CPU *cpu, uint64_t value);
 
 bool kvm_enable_x2apic(void);
-- 
2.17.1




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 4/6] kvm/i386: Avoid multiple calls to check_extension(KVM_CAP_HYPERV)
  2021-05-24 19:54 [PATCH 0/6] Handle hypercall code overlay page in userspace Siddharth Chandrasekaran
                   ` (2 preceding siblings ...)
  2021-05-24 19:54 ` [PATCH 3/6] kvm/i386: Stop using cpu->kvm_msr_buf in kvm_put_one_msr() Siddharth Chandrasekaran
@ 2021-05-24 19:54 ` Siddharth Chandrasekaran
  2021-06-08  8:28   ` Alexander Graf
  2021-05-24 20:01 ` [PATCH 5/6] kvm/i386: Add support for user space MSR filtering Siddharth Chandrasekaran
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Siddharth Chandrasekaran @ 2021-05-24 19:54 UTC (permalink / raw)
  To: Paolo Bonzini, Marcelo Tosatti
  Cc: Siddharth Chandrasekaran, Siddharth Chandrasekaran,
	Alexander Graf, Evgeny Iakovlev, Liran Alon, Ioannis Aslanidis,
	qemu-devel, kvm

KVM_CAP_HYPERV is a VM ioctl and can be cached at kvm_arch_init()
instead of performing an ioctl each time in hyperv_enabled() which is
called foreach vCPU. Apart from that, this variable will come in handy
in a subsequent patch.

Signed-off-by: Siddharth Chandrasekaran <sidcha@amazon.de>
---
 target/i386/kvm/kvm.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index d19a2913fd..362f04ab3f 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -93,6 +93,7 @@ static bool has_msr_misc_enable;
 static bool has_msr_smbase;
 static bool has_msr_bndcfgs;
 static int lm_capable_kernel;
+static bool has_hyperv;
 static bool has_msr_hv_hypercall;
 static bool has_msr_hv_crash;
 static bool has_msr_hv_reset;
@@ -715,8 +716,7 @@ unsigned long kvm_arch_vcpu_id(CPUState *cs)
 
 static bool hyperv_enabled(X86CPU *cpu)
 {
-    CPUState *cs = CPU(cpu);
-    return kvm_check_extension(cs->kvm_state, KVM_CAP_HYPERV) > 0 &&
+    return has_hyperv &&
         ((cpu->hyperv_spinlock_attempts != HYPERV_SPINLOCK_NEVER_NOTIFY) ||
          cpu->hyperv_features || cpu->hyperv_passthrough);
 }
@@ -2172,6 +2172,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
     has_xsave = kvm_check_extension(s, KVM_CAP_XSAVE);
     has_xcrs = kvm_check_extension(s, KVM_CAP_XCRS);
     has_pit_state2 = kvm_check_extension(s, KVM_CAP_PIT_STATE2);
+    has_hyperv = kvm_check_extension(s, KVM_CAP_HYPERV);
 
     hv_vpindex_settable = kvm_check_extension(s, KVM_CAP_HYPERV_VP_INDEX);
 
-- 
2.17.1




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 5/6] kvm/i386: Add support for user space MSR filtering
  2021-05-24 19:54 [PATCH 0/6] Handle hypercall code overlay page in userspace Siddharth Chandrasekaran
                   ` (3 preceding siblings ...)
  2021-05-24 19:54 ` [PATCH 4/6] kvm/i386: Avoid multiple calls to check_extension(KVM_CAP_HYPERV) Siddharth Chandrasekaran
@ 2021-05-24 20:01 ` Siddharth Chandrasekaran
  2021-06-08  8:48   ` Alexander Graf
  2021-05-24 20:02 ` [PATCH 6/6] hyper-v: Handle hypercall code page as an overlay page Siddharth Chandrasekaran
  2021-06-07 19:36 ` [PATCH 0/6] Handle hypercall code overlay page in userspace Siddharth Chandrasekaran
  6 siblings, 1 reply; 17+ messages in thread
From: Siddharth Chandrasekaran @ 2021-05-24 20:01 UTC (permalink / raw)
  To: Paolo Bonzini, Marcelo Tosatti
  Cc: Siddharth Chandrasekaran, Siddharth Chandrasekaran,
	Alexander Graf, Evgeny Iakovlev, Liran Alon, Ioannis Aslanidis,
	qemu-devel, kvm

Check and enable user space MSR filtering capability and handle new exit
reason KVM_EXIT_X86_WRMSR. This will be used in a follow up patch to
implement hyper-v overlay pages.

Signed-off-by: Siddharth Chandrasekaran <sidcha@amazon.de>
---
 target/i386/kvm/kvm.c | 72 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 72 insertions(+)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 362f04ab3f..3591f8cecc 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -117,6 +117,8 @@ static bool has_msr_ucode_rev;
 static bool has_msr_vmx_procbased_ctls2;
 static bool has_msr_perf_capabs;
 static bool has_msr_pkrs;
+static bool has_msr_filtering;
+static bool msr_filters_active;
 
 static uint32_t has_architectural_pmu_version;
 static uint32_t num_architectural_pmu_gp_counters;
@@ -2138,6 +2140,57 @@ static void register_smram_listener(Notifier *n, void *unused)
                                  &smram_address_space, 1);
 }
 
+static void kvm_set_msr_filter_range(struct kvm_msr_filter_range *range, uint32_t flags,
+                                     uint32_t base, uint32_t nmsrs, ...)
+{
+    int i, filter_to_userspace;
+    va_list ap;
+
+    range->flags = flags;
+    range->nmsrs = nmsrs;
+    range->base = base;
+
+    va_start(ap, nmsrs);
+    for (i = 0; i < nmsrs; i++) {
+        filter_to_userspace = va_arg(ap, int);
+        if (!filter_to_userspace) {
+            range->bitmap[i / 8] = 1 << (i % 8);
+        }
+    }
+    va_end(ap);
+}
+
+static int kvm_set_msr_filters(KVMState *s)
+{
+    int r, nmsrs, nfilt = 0, bitmap_pos = 0;
+    struct kvm_msr_filter filter = { };
+    struct kvm_msr_filter_range *range;
+    uint8_t bitmap_buf[KVM_MSR_FILTER_MAX_RANGES * 8] = {0};
+
+    filter.flags = KVM_MSR_FILTER_DEFAULT_ALLOW;
+
+    if (has_hyperv) {
+        /* Hyper-V overlay page MSRs */
+        nmsrs = 2;
+        range = &filter.ranges[nfilt++];
+        range->bitmap = &bitmap_buf[bitmap_pos];
+        kvm_set_msr_filter_range(range, KVM_MSR_FILTER_WRITE,
+                                 HV_X64_MSR_GUEST_OS_ID, nmsrs,
+                                 true, /* HV_X64_MSR_GUEST_OS_ID */
+                                 true  /* HV_X64_MSR_HYPERCALL */);
+        bitmap_pos += ROUND_UP(nmsrs, 8) / 8;
+        assert(bitmap_pos < sizeof(bitmap_buf));
+    }
+
+    r = kvm_vm_ioctl(s, KVM_X86_SET_MSR_FILTER, &filter);
+    if (r != 0) {
+        error_report("kvm: failed to set MSR filters");
+        return -1;
+    }
+
+    return 0;
+}
+
 int kvm_arch_init(MachineState *ms, KVMState *s)
 {
     uint64_t identity_base = 0xfffbc000;
@@ -2269,6 +2322,17 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
         }
     }
 
+    has_msr_filtering = kvm_check_extension(s, KVM_CAP_X86_USER_SPACE_MSR) &&
+                        kvm_check_extension(s, KVM_CAP_X86_MSR_FILTER);
+    if (has_msr_filtering) {
+        ret = kvm_vm_enable_cap(s, KVM_CAP_X86_USER_SPACE_MSR, 0,
+                                KVM_MSR_EXIT_REASON_FILTER);
+        if (ret == 0) {
+            ret = kvm_set_msr_filters(s);
+            msr_filters_active = (ret == 0);
+        }
+    }
+
     return 0;
 }
 
@@ -4542,6 +4606,11 @@ static bool host_supports_vmx(void)
     return ecx & CPUID_EXT_VMX;
 }
 
+static int kvm_handle_wrmsr(X86CPU *cpu, struct kvm_run *run)
+{
+    return 0;
+}
+
 #define VMX_INVALID_GUEST_STATE 0x80000021
 
 int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
@@ -4600,6 +4669,9 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
         ioapic_eoi_broadcast(run->eoi.vector);
         ret = 0;
         break;
+    case KVM_EXIT_X86_WRMSR:
+        ret = kvm_handle_wrmsr(cpu, run);
+        break;
     default:
         fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
         ret = -1;
-- 
2.17.1




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 6/6] hyper-v: Handle hypercall code page as an overlay page
  2021-05-24 19:54 [PATCH 0/6] Handle hypercall code overlay page in userspace Siddharth Chandrasekaran
                   ` (4 preceding siblings ...)
  2021-05-24 20:01 ` [PATCH 5/6] kvm/i386: Add support for user space MSR filtering Siddharth Chandrasekaran
@ 2021-05-24 20:02 ` Siddharth Chandrasekaran
  2021-06-08  9:02   ` Alexander Graf
  2021-06-07 19:36 ` [PATCH 0/6] Handle hypercall code overlay page in userspace Siddharth Chandrasekaran
  6 siblings, 1 reply; 17+ messages in thread
From: Siddharth Chandrasekaran @ 2021-05-24 20:02 UTC (permalink / raw)
  To: Paolo Bonzini, Marcelo Tosatti
  Cc: Siddharth Chandrasekaran, Siddharth Chandrasekaran,
	Alexander Graf, Evgeny Iakovlev, Liran Alon, Ioannis Aslanidis,
	qemu-devel, kvm

Hypercall code page is specified in the Hyper-V TLFS to be an overlay
page, ie., guest chooses a GPA and the host _places_ a page at that
location, making it visible to the guest and the existing page becomes
inaccessible. Similarly when disabled, the host should _remove_ the
overlay and the old page should become visible to the guest.

Until now, KVM patched the hypercall code directly into the guest
chosen GPA which is incorrect; instead, use the new user space MSR
filtering feature to trap hypercall page MSR writes, overlay it as
requested and then invoke a KVM_SET_MSR from user space to bounce back
control KVM. This bounce back is needed as KVM may have to write data
into the newly overlaid page.

Signed-off-by: Siddharth Chandrasekaran <sidcha@amazon.de>
---
 hw/hyperv/hyperv.c         | 10 ++++-
 include/hw/hyperv/hyperv.h |  5 +++
 target/i386/kvm/hyperv.c   | 84 ++++++++++++++++++++++++++++++++++++++
 target/i386/kvm/hyperv.h   |  4 ++
 target/i386/kvm/kvm.c      | 26 +++++++++++-
 5 files changed, 127 insertions(+), 2 deletions(-)

diff --git a/hw/hyperv/hyperv.c b/hw/hyperv/hyperv.c
index ac45e8e139..aa5ac5226e 100644
--- a/hw/hyperv/hyperv.c
+++ b/hw/hyperv/hyperv.c
@@ -36,6 +36,7 @@ struct SynICState {
 OBJECT_DECLARE_SIMPLE_TYPE(SynICState, SYNIC)
 
 static bool synic_enabled;
+struct hyperv_overlay_page hcall_page;
 
 static void alloc_overlay_page(struct hyperv_overlay_page *overlay,
                                Object *owner, const char *name)
@@ -50,7 +51,7 @@ static void alloc_overlay_page(struct hyperv_overlay_page *overlay,
  * This method must be called with iothread lock taken as it modifies
  * the memory hierarchy.
  */
-static void hyperv_overlay_update(struct hyperv_overlay_page *overlay, hwaddr addr)
+void hyperv_overlay_update(struct hyperv_overlay_page *overlay, hwaddr addr)
 {
     if (addr != HYPERV_INVALID_OVERLAY_GPA) {
         /* check if overlay page is enabled */
@@ -70,6 +71,13 @@ static void hyperv_overlay_update(struct hyperv_overlay_page *overlay, hwaddr ad
     }
 }
 
+void hyperv_overlay_init(void)
+{
+    memory_region_init_ram(&hcall_page.mr, NULL, "hyperv.hcall_page",
+                           qemu_real_host_page_size, &error_abort);
+    hcall_page.addr = HYPERV_INVALID_OVERLAY_GPA;
+}
+
 static void synic_update(SynICState *synic, bool enable,
                          hwaddr msg_page_addr, hwaddr event_page_addr)
 {
diff --git a/include/hw/hyperv/hyperv.h b/include/hw/hyperv/hyperv.h
index d989193e84..f444431a81 100644
--- a/include/hw/hyperv/hyperv.h
+++ b/include/hw/hyperv/hyperv.h
@@ -85,6 +85,11 @@ static inline uint32_t hyperv_vp_index(CPUState *cs)
     return cs->cpu_index;
 }
 
+extern struct hyperv_overlay_page hcall_page;
+
+void hyperv_overlay_init(void);
+void hyperv_overlay_update(struct hyperv_overlay_page *page, hwaddr addr);
+
 void hyperv_synic_add(CPUState *cs);
 void hyperv_synic_reset(CPUState *cs);
 void hyperv_synic_update(CPUState *cs, bool enable,
diff --git a/target/i386/kvm/hyperv.c b/target/i386/kvm/hyperv.c
index f49ed2621d..01c9c2468c 100644
--- a/target/i386/kvm/hyperv.c
+++ b/target/i386/kvm/hyperv.c
@@ -16,6 +16,76 @@
 #include "hyperv.h"
 #include "hw/hyperv/hyperv.h"
 #include "hyperv-proto.h"
+#include "kvm_i386.h"
+
+struct x86_hv_overlay {
+    struct hyperv_overlay_page *page;
+    uint32_t msr;
+    hwaddr gpa;
+};
+
+static void async_overlay_update(CPUState *cs, run_on_cpu_data data)
+{
+    X86CPU *cpu = X86_CPU(cs);
+    struct x86_hv_overlay *overlay = data.host_ptr;
+
+    qemu_mutex_lock_iothread();
+    hyperv_overlay_update(overlay->page, overlay->gpa);
+    qemu_mutex_unlock_iothread();
+
+    /**
+     * Call KVM so it can keep a copy of the MSR data and do other post-overlay
+     * actions such as filling the overlay page contents before returning to
+     * guest. This works because MSR filtering is inactive for KVM_SET_MSRS
+     */
+    kvm_put_one_msr(cpu, overlay->msr, overlay->gpa);
+
+    g_free(overlay);
+}
+
+static void do_overlay_update(X86CPU *cpu, struct hyperv_overlay_page *page,
+                              uint32_t msr, uint64_t data)
+{
+    struct x86_hv_overlay *overlay = g_malloc(sizeof(struct x86_hv_overlay));
+
+    *overlay = (struct x86_hv_overlay) {
+        .page = page,
+        .msr = msr,
+        .gpa = data
+    };
+
+    /**
+     * This will run in this cpu thread before it returns to KVM, but in a
+     * safe environment (i.e. when all cpus are quiescent) -- this is
+     * necessary because memory hierarchy is being changed
+     */
+    async_safe_run_on_cpu(CPU(cpu), async_overlay_update,
+                          RUN_ON_CPU_HOST_PTR(overlay));
+}
+
+static void overlay_update(X86CPU *cpu, uint32_t msr, uint64_t data)
+{
+    switch (msr) {
+    case HV_X64_MSR_GUEST_OS_ID:
+        /**
+         * When GUEST_OS_ID is cleared, hypercall overlay should be removed;
+         * otherwise it is a NOP. We still need to do a SET_MSR here as the
+         * kernel need to keep a copy of data.
+         */
+        if (data != 0) {
+            kvm_put_one_msr(cpu, msr, data);
+            return;
+        }
+        /* Fake a zero write to the overlay page hcall to invalidate the mapping */
+        do_overlay_update(cpu, &hcall_page, msr, 0);
+        break;
+    case HV_X64_MSR_HYPERCALL:
+        do_overlay_update(cpu, &hcall_page, msr, data);
+        break;
+    default:
+        return;
+    }
+}
 
 int hyperv_x86_synic_add(X86CPU *cpu)
 {
@@ -44,6 +114,20 @@ static void async_synic_update(CPUState *cs, run_on_cpu_data data)
     qemu_mutex_unlock_iothread();
 }
 
+int kvm_hv_handle_wrmsr(X86CPU *cpu, uint32_t msr, uint64_t data)
+{
+    switch (msr) {
+    case HV_X64_MSR_GUEST_OS_ID:
+    case HV_X64_MSR_HYPERCALL:
+        overlay_update(cpu, msr, data);
+        break;
+    default:
+        return -1;
+    }
+
+    return 0;
+}
+
 int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit)
 {
     CPUX86State *env = &cpu->env;
diff --git a/target/i386/kvm/hyperv.h b/target/i386/kvm/hyperv.h
index 67543296c3..8e90fa949f 100644
--- a/target/i386/kvm/hyperv.h
+++ b/target/i386/kvm/hyperv.h
@@ -20,8 +20,12 @@
 
 #ifdef CONFIG_KVM
 int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit);
+int kvm_hv_handle_wrmsr(X86CPU *cpu, uint32_t msr, uint64_t data);
+
 #endif
 
+void hyperv_x86_hcall_page_update(X86CPU *cpu);
+
 int hyperv_x86_synic_add(X86CPU *cpu);
 void hyperv_x86_synic_reset(X86CPU *cpu);
 void hyperv_x86_synic_update(X86CPU *cpu);
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 3591f8cecc..bfb9eff440 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -2333,6 +2333,10 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
         }
     }
 
+    if (has_hyperv && msr_filters_active) {
+        hyperv_overlay_init();
+    }
+
     return 0;
 }
 
@@ -4608,7 +4612,27 @@ static bool host_supports_vmx(void)
 
 static int kvm_handle_wrmsr(X86CPU *cpu, struct kvm_run *run)
 {
-    return 0;
+    int r = -1;
+    uint32_t msr;
+    uint64_t data;
+
+    if (run->msr.reason != KVM_MSR_EXIT_REASON_FILTER) {
+        return -1;
+    }
+
+    msr = run->msr.index;
+    data = run->msr.data;
+
+    switch (msr) {
+    case HV_X64_MSR_GUEST_OS_ID:
+    case HV_X64_MSR_HYPERCALL:
+        r = kvm_hv_handle_wrmsr(cpu, msr, data);
+        break;
+    default:
+        error_report("Unknown MSR exit");
+    }
+
+    return r;
 }
 
 #define VMX_INVALID_GUEST_STATE 0x80000021
-- 
2.17.1




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/6] Handle hypercall code overlay page in userspace
  2021-05-24 19:54 [PATCH 0/6] Handle hypercall code overlay page in userspace Siddharth Chandrasekaran
                   ` (5 preceding siblings ...)
  2021-05-24 20:02 ` [PATCH 6/6] hyper-v: Handle hypercall code page as an overlay page Siddharth Chandrasekaran
@ 2021-06-07 19:36 ` Siddharth Chandrasekaran
  6 siblings, 0 replies; 17+ messages in thread
From: Siddharth Chandrasekaran @ 2021-06-07 19:36 UTC (permalink / raw)
  To: Paolo Bonzini, Marcelo Tosatti
  Cc: Siddharth Chandrasekaran, Alexander Graf, Evgeny Iakovlev,
	Liran Alon, Ioannis Aslanidis, qemu-devel, kvm

A reminder email to bring these up on your inboxes :). Would love
to hear your thoughts on them.

~ Sid.



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/6] hyper-v: Overlay abstraction for synic event and msg pages
  2021-05-24 19:54 ` [PATCH 1/6] hyper-v: Overlay abstraction for synic event and msg pages Siddharth Chandrasekaran
@ 2021-06-08  8:27   ` Alexander Graf
  0 siblings, 0 replies; 17+ messages in thread
From: Alexander Graf @ 2021-06-08  8:27 UTC (permalink / raw)
  To: Siddharth Chandrasekaran, Paolo Bonzini, Marcelo Tosatti
  Cc: Siddharth Chandrasekaran, Evgeny Iakovlev, Liran Alon,
	Ioannis Aslanidis, qemu-devel, kvm



On 24.05.21 21:54, Siddharth Chandrasekaran wrote:
> Capture overlay page semantic variables into 'struct overlay_page' and
> add methods that operate over it. Adapt existing synic event and mesage
> pages to use these methods to setup and manage overlays.
> 
> Since all overlay pages use bit 0 of the GPA to indicate if the overlay
> is enabled, the checks for this bit is moved into the unified overlaying
> method hyperv_overlay_update() so the caller does not need to care about
> it.
> 
> Signed-off-by: Siddharth Chandrasekaran <sidcha@amazon.de>

Reviewed-by: Alexander Graf <graf@amazon.com>


Alex



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/6] hyper-v: Use -1 as invalid overlay address
  2021-05-24 19:54 ` [PATCH 2/6] hyper-v: Use -1 as invalid overlay address Siddharth Chandrasekaran
@ 2021-06-08  8:27   ` Alexander Graf
  0 siblings, 0 replies; 17+ messages in thread
From: Alexander Graf @ 2021-06-08  8:27 UTC (permalink / raw)
  To: Siddharth Chandrasekaran, Paolo Bonzini, Marcelo Tosatti
  Cc: Siddharth Chandrasekaran, Evgeny Iakovlev, Liran Alon,
	Ioannis Aslanidis, qemu-devel, kvm



On 24.05.21 21:54, Siddharth Chandrasekaran wrote:
> When managing overlay pages, we used hwaddr 0 to signal an invalid
> address (to disable a page). Although unlikely, 0 _could_ be a valid
> overlay offset as Hyper-V TLFS does not specify anything about it.
> 
> Use -1 as the invalid address indicator as it can never be a valid
> address.
> 
> Signed-off-by: Siddharth Chandrasekaran <sidcha@amazon.de>

Reviewed-by: Alexander Graf <graf@amazon.com>


Alex



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/6] kvm/i386: Stop using cpu->kvm_msr_buf in kvm_put_one_msr()
  2021-05-24 19:54 ` [PATCH 3/6] kvm/i386: Stop using cpu->kvm_msr_buf in kvm_put_one_msr() Siddharth Chandrasekaran
@ 2021-06-08  8:27   ` Alexander Graf
  0 siblings, 0 replies; 17+ messages in thread
From: Alexander Graf @ 2021-06-08  8:27 UTC (permalink / raw)
  To: Siddharth Chandrasekaran, Paolo Bonzini, Marcelo Tosatti
  Cc: Siddharth Chandrasekaran, Evgeny Iakovlev, Liran Alon,
	Ioannis Aslanidis, qemu-devel, kvm



On 24.05.21 21:54, Siddharth Chandrasekaran wrote:
> kvm_put_one_msr() zeros cpu->kvm_msr_buf and uses it to set one MSR to
> KVM. It is pretty wasteful as cpu->kvm_msr_buf is 4096 bytes long;
> instead use a local buffer to avoid memset.
> 
> Also, expose this method from kvm_i386.h as hyperv.c needs to set MSRs
> in a subsequent patch.
> 
> Signed-off-by: Siddharth Chandrasekaran <sidcha@amazon.de>

Reviewed-by: Alexander Graf <graf@amazon.com>


Alex



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/6] kvm/i386: Avoid multiple calls to check_extension(KVM_CAP_HYPERV)
  2021-05-24 19:54 ` [PATCH 4/6] kvm/i386: Avoid multiple calls to check_extension(KVM_CAP_HYPERV) Siddharth Chandrasekaran
@ 2021-06-08  8:28   ` Alexander Graf
  0 siblings, 0 replies; 17+ messages in thread
From: Alexander Graf @ 2021-06-08  8:28 UTC (permalink / raw)
  To: Siddharth Chandrasekaran, Paolo Bonzini, Marcelo Tosatti
  Cc: Siddharth Chandrasekaran, Evgeny Iakovlev, Liran Alon,
	Ioannis Aslanidis, qemu-devel, kvm



On 24.05.21 21:54, Siddharth Chandrasekaran wrote:
> KVM_CAP_HYPERV is a VM ioctl and can be cached at kvm_arch_init()
> instead of performing an ioctl each time in hyperv_enabled() which is
> called foreach vCPU. Apart from that, this variable will come in handy
> in a subsequent patch.
> 
> Signed-off-by: Siddharth Chandrasekaran <sidcha@amazon.de>

Reviewed-by: Alexander Graf <graf@amazon.com>


Alex



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 5/6] kvm/i386: Add support for user space MSR filtering
  2021-05-24 20:01 ` [PATCH 5/6] kvm/i386: Add support for user space MSR filtering Siddharth Chandrasekaran
@ 2021-06-08  8:48   ` Alexander Graf
  2021-06-08 10:53     ` Siddharth Chandrasekaran
  0 siblings, 1 reply; 17+ messages in thread
From: Alexander Graf @ 2021-06-08  8:48 UTC (permalink / raw)
  To: Siddharth Chandrasekaran, Paolo Bonzini, Marcelo Tosatti
  Cc: Siddharth Chandrasekaran, Evgeny Iakovlev, Liran Alon,
	Ioannis Aslanidis, qemu-devel, kvm



On 24.05.21 22:01, Siddharth Chandrasekaran wrote:
> Check and enable user space MSR filtering capability and handle new exit
> reason KVM_EXIT_X86_WRMSR. This will be used in a follow up patch to
> implement hyper-v overlay pages.
> 
> Signed-off-by: Siddharth Chandrasekaran <sidcha@amazon.de>

This patch will break bisection, because we're no longer handling the 
writes in kernel space after this, but we also don't have user space 
handling available yet, right? It might be better to move all logic in 
this patch that sets up the filter for Hyper-V MSRs into the next one.

> ---
>   target/i386/kvm/kvm.c | 72 +++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 72 insertions(+)
> 
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index 362f04ab3f..3591f8cecc 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -117,6 +117,8 @@ static bool has_msr_ucode_rev;
>   static bool has_msr_vmx_procbased_ctls2;
>   static bool has_msr_perf_capabs;
>   static bool has_msr_pkrs;
> +static bool has_msr_filtering;
> +static bool msr_filters_active;
>   
>   static uint32_t has_architectural_pmu_version;
>   static uint32_t num_architectural_pmu_gp_counters;
> @@ -2138,6 +2140,57 @@ static void register_smram_listener(Notifier *n, void *unused)
>                                    &smram_address_space, 1);
>   }
>   
> +static void kvm_set_msr_filter_range(struct kvm_msr_filter_range *range, uint32_t flags,
> +                                     uint32_t base, uint32_t nmsrs, ...)
> +{
> +    int i, filter_to_userspace;
> +    va_list ap;
> +
> +    range->flags = flags;
> +    range->nmsrs = nmsrs;
> +    range->base = base;
> +
> +    va_start(ap, nmsrs);
> +    for (i = 0; i < nmsrs; i++) {
> +        filter_to_userspace = va_arg(ap, int);
> +        if (!filter_to_userspace) {
> +            range->bitmap[i / 8] = 1 << (i % 8);
> +        }
> +    }
> +    va_end(ap);
> +}
> +
> +static int kvm_set_msr_filters(KVMState *s)
> +{
> +    int r, nmsrs, nfilt = 0, bitmap_pos = 0;
> +    struct kvm_msr_filter filter = { };
> +    struct kvm_msr_filter_range *range;
> +    uint8_t bitmap_buf[KVM_MSR_FILTER_MAX_RANGES * 8] = {0};
> +
> +    filter.flags = KVM_MSR_FILTER_DEFAULT_ALLOW;
> +
> +    if (has_hyperv) {
> +        /* Hyper-V overlay page MSRs */

I think you want to extend this comment and indicate in a human readable 
form that you set the filter on WRMSR to trap HV_X64_MSR_GUEST_OS_ID and 
HV_X64_MSR_HYPERCALL into user space here.

> +        nmsrs = 2;
> +        range = &filter.ranges[nfilt++];
> +        range->bitmap = &bitmap_buf[bitmap_pos];
> +        kvm_set_msr_filter_range(range, KVM_MSR_FILTER_WRITE,
> +                                 HV_X64_MSR_GUEST_OS_ID, nmsrs,
> +                                 true, /* HV_X64_MSR_GUEST_OS_ID */
> +                                 true  /* HV_X64_MSR_HYPERCALL */);
> +        bitmap_pos += ROUND_UP(nmsrs, 8) / 8;
> +        assert(bitmap_pos < sizeof(bitmap_buf));
> +    }
> +
> +    r = kvm_vm_ioctl(s, KVM_X86_SET_MSR_FILTER, &filter);
> +    if (r != 0) {
> +        error_report("kvm: failed to set MSR filters");
> +        return -1;
> +    }
> +
> +    return 0;
> +}
> +
>   int kvm_arch_init(MachineState *ms, KVMState *s)
>   {
>       uint64_t identity_base = 0xfffbc000;
> @@ -2269,6 +2322,17 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>           }
>       }
>   
> +    has_msr_filtering = kvm_check_extension(s, KVM_CAP_X86_USER_SPACE_MSR) &&
> +                        kvm_check_extension(s, KVM_CAP_X86_MSR_FILTER);
> +    if (has_msr_filtering) {
> +        ret = kvm_vm_enable_cap(s, KVM_CAP_X86_USER_SPACE_MSR, 0,
> +                                KVM_MSR_EXIT_REASON_FILTER);
> +        if (ret == 0) {
> +            ret = kvm_set_msr_filters(s);
> +            msr_filters_active = (ret == 0);
> +        }
> +    }
> +
>       return 0;
>   }
>   
> @@ -4542,6 +4606,11 @@ static bool host_supports_vmx(void)
>       return ecx & CPUID_EXT_VMX;
>   }
>   
> +static int kvm_handle_wrmsr(X86CPU *cpu, struct kvm_run *run)
> +{
> +    return 0;

The default handler should always set run->msr.error = 1 to mimic the 
existing behavior.

> +}
> +
>   #define VMX_INVALID_GUEST_STATE 0x80000021
>   
>   int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
> @@ -4600,6 +4669,9 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
>           ioapic_eoi_broadcast(run->eoi.vector);
>           ret = 0;
>           break;
> +    case KVM_EXIT_X86_WRMSR:
> +        ret = kvm_handle_wrmsr(cpu, run);

Please provide a default RDMSR handler as well here.


Alex

> +        break;
>       default:
>           fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
>           ret = -1;
> 



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 6/6] hyper-v: Handle hypercall code page as an overlay page
  2021-05-24 20:02 ` [PATCH 6/6] hyper-v: Handle hypercall code page as an overlay page Siddharth Chandrasekaran
@ 2021-06-08  9:02   ` Alexander Graf
  2021-06-08 10:55     ` Siddharth Chandrasekaran
  0 siblings, 1 reply; 17+ messages in thread
From: Alexander Graf @ 2021-06-08  9:02 UTC (permalink / raw)
  To: Siddharth Chandrasekaran, Paolo Bonzini, Marcelo Tosatti
  Cc: Siddharth Chandrasekaran, Evgeny Iakovlev, Liran Alon,
	Ioannis Aslanidis, qemu-devel, kvm



On 24.05.21 22:02, Siddharth Chandrasekaran wrote:
> Hypercall code page is specified in the Hyper-V TLFS to be an overlay
> page, ie., guest chooses a GPA and the host _places_ a page at that
> location, making it visible to the guest and the existing page becomes
> inaccessible. Similarly when disabled, the host should _remove_ the
> overlay and the old page should become visible to the guest.
> 
> Until now, KVM patched the hypercall code directly into the guest
> chosen GPA which is incorrect; instead, use the new user space MSR
> filtering feature to trap hypercall page MSR writes, overlay it as
> requested and then invoke a KVM_SET_MSR from user space to bounce back
> control KVM. This bounce back is needed as KVM may have to write data
> into the newly overlaid page.
> 
> Signed-off-by: Siddharth Chandrasekaran <sidcha@amazon.de>
> ---
>   hw/hyperv/hyperv.c         | 10 ++++-
>   include/hw/hyperv/hyperv.h |  5 +++
>   target/i386/kvm/hyperv.c   | 84 ++++++++++++++++++++++++++++++++++++++
>   target/i386/kvm/hyperv.h   |  4 ++
>   target/i386/kvm/kvm.c      | 26 +++++++++++-
>   5 files changed, 127 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/hyperv/hyperv.c b/hw/hyperv/hyperv.c
> index ac45e8e139..aa5ac5226e 100644
> --- a/hw/hyperv/hyperv.c
> +++ b/hw/hyperv/hyperv.c
> @@ -36,6 +36,7 @@ struct SynICState {
>   OBJECT_DECLARE_SIMPLE_TYPE(SynICState, SYNIC)
>   
>   static bool synic_enabled;
> +struct hyperv_overlay_page hcall_page;
>   
>   static void alloc_overlay_page(struct hyperv_overlay_page *overlay,
>                                  Object *owner, const char *name)
> @@ -50,7 +51,7 @@ static void alloc_overlay_page(struct hyperv_overlay_page *overlay,
>    * This method must be called with iothread lock taken as it modifies
>    * the memory hierarchy.
>    */
> -static void hyperv_overlay_update(struct hyperv_overlay_page *overlay, hwaddr addr)
> +void hyperv_overlay_update(struct hyperv_overlay_page *overlay, hwaddr addr)
>   {
>       if (addr != HYPERV_INVALID_OVERLAY_GPA) {
>           /* check if overlay page is enabled */
> @@ -70,6 +71,13 @@ static void hyperv_overlay_update(struct hyperv_overlay_page *overlay, hwaddr ad
>       }
>   }
>   
> +void hyperv_overlay_init(void)
> +{
> +    memory_region_init_ram(&hcall_page.mr, NULL, "hyperv.hcall_page",
> +                           qemu_real_host_page_size, &error_abort);
> +    hcall_page.addr = HYPERV_INVALID_OVERLAY_GPA;
> +}
> +
>   static void synic_update(SynICState *synic, bool enable,
>                            hwaddr msg_page_addr, hwaddr event_page_addr)
>   {
> diff --git a/include/hw/hyperv/hyperv.h b/include/hw/hyperv/hyperv.h
> index d989193e84..f444431a81 100644
> --- a/include/hw/hyperv/hyperv.h
> +++ b/include/hw/hyperv/hyperv.h
> @@ -85,6 +85,11 @@ static inline uint32_t hyperv_vp_index(CPUState *cs)
>       return cs->cpu_index;
>   }
>   
> +extern struct hyperv_overlay_page hcall_page;
> +
> +void hyperv_overlay_init(void);
> +void hyperv_overlay_update(struct hyperv_overlay_page *page, hwaddr addr);
> +
>   void hyperv_synic_add(CPUState *cs);
>   void hyperv_synic_reset(CPUState *cs);
>   void hyperv_synic_update(CPUState *cs, bool enable,
> diff --git a/target/i386/kvm/hyperv.c b/target/i386/kvm/hyperv.c
> index f49ed2621d..01c9c2468c 100644
> --- a/target/i386/kvm/hyperv.c
> +++ b/target/i386/kvm/hyperv.c
> @@ -16,6 +16,76 @@
>   #include "hyperv.h"
>   #include "hw/hyperv/hyperv.h"
>   #include "hyperv-proto.h"
> +#include "kvm_i386.h"
> +
> +struct x86_hv_overlay {
> +    struct hyperv_overlay_page *page;
> +    uint32_t msr;
> +    hwaddr gpa;
> +};
> +
> +static void async_overlay_update(CPUState *cs, run_on_cpu_data data)
> +{
> +    X86CPU *cpu = X86_CPU(cs);
> +    struct x86_hv_overlay *overlay = data.host_ptr;
> +
> +    qemu_mutex_lock_iothread();
> +    hyperv_overlay_update(overlay->page, overlay->gpa);
> +    qemu_mutex_unlock_iothread();
> +
> +    /**
> +     * Call KVM so it can keep a copy of the MSR data and do other post-overlay
> +     * actions such as filling the overlay page contents before returning to
> +     * guest. This works because MSR filtering is inactive for KVM_SET_MSRS
> +     */
> +    kvm_put_one_msr(cpu, overlay->msr, overlay->gpa);
> +
> +    g_free(overlay);
> +}
> +
> +static void do_overlay_update(X86CPU *cpu, struct hyperv_overlay_page *page,
> +                              uint32_t msr, uint64_t data)
> +{
> +    struct x86_hv_overlay *overlay = g_malloc(sizeof(struct x86_hv_overlay));
> +
> +    *overlay = (struct x86_hv_overlay) {
> +        .page = page,
> +        .msr = msr,
> +        .gpa = data
> +    };
> +
> +    /**
> +     * This will run in this cpu thread before it returns to KVM, but in a
> +     * safe environment (i.e. when all cpus are quiescent) -- this is
> +     * necessary because memory hierarchy is being changed
> +     */
> +    async_safe_run_on_cpu(CPU(cpu), async_overlay_update,
> +                          RUN_ON_CPU_HOST_PTR(overlay));
> +}
> +
> +static void overlay_update(X86CPU *cpu, uint32_t msr, uint64_t data)
> +{
> +    switch (msr) {
> +    case HV_X64_MSR_GUEST_OS_ID:
> +        /**
> +         * When GUEST_OS_ID is cleared, hypercall overlay should be removed;
> +         * otherwise it is a NOP. We still need to do a SET_MSR here as the
> +         * kernel need to keep a copy of data.
> +         */
> +        if (data != 0) {
> +            kvm_put_one_msr(cpu, msr, data);
> +            return;
> +        }
> +        /* Fake a zero write to the overlay page hcall to invalidate the mapping */
> +        do_overlay_update(cpu, &hcall_page, msr, 0);
> +        break;
> +    case HV_X64_MSR_HYPERCALL:
> +        do_overlay_update(cpu, &hcall_page, msr, data);
> +        break;
> +    default:
> +        return;
> +    }
> +}
>   
>   int hyperv_x86_synic_add(X86CPU *cpu)
>   {
> @@ -44,6 +114,20 @@ static void async_synic_update(CPUState *cs, run_on_cpu_data data)
>       qemu_mutex_unlock_iothread();
>   }
>   
> +int kvm_hv_handle_wrmsr(X86CPU *cpu, uint32_t msr, uint64_t data)
> +{
> +    switch (msr) {
> +    case HV_X64_MSR_GUEST_OS_ID:
> +    case HV_X64_MSR_HYPERCALL:
> +        overlay_update(cpu, msr, data);
> +        break;
> +    default:
> +        return -1;
> +    }
> +
> +    return 0;
> +}
> +
>   int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit)
>   {
>       CPUX86State *env = &cpu->env;
> diff --git a/target/i386/kvm/hyperv.h b/target/i386/kvm/hyperv.h
> index 67543296c3..8e90fa949f 100644
> --- a/target/i386/kvm/hyperv.h
> +++ b/target/i386/kvm/hyperv.h
> @@ -20,8 +20,12 @@
>   
>   #ifdef CONFIG_KVM
>   int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit);
> +int kvm_hv_handle_wrmsr(X86CPU *cpu, uint32_t msr, uint64_t data);
> +
>   #endif
>   
> +void hyperv_x86_hcall_page_update(X86CPU *cpu);
> +
>   int hyperv_x86_synic_add(X86CPU *cpu);
>   void hyperv_x86_synic_reset(X86CPU *cpu);
>   void hyperv_x86_synic_update(X86CPU *cpu);
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index 3591f8cecc..bfb9eff440 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -2333,6 +2333,10 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>           }
>       }
>   
> +    if (has_hyperv && msr_filters_active) {
> +        hyperv_overlay_init();
> +    }
> +
>       return 0;
>   }
>   
> @@ -4608,7 +4612,27 @@ static bool host_supports_vmx(void)
>   
>   static int kvm_handle_wrmsr(X86CPU *cpu, struct kvm_run *run)
>   {
> -    return 0;
> +    int r = -1;
> +    uint32_t msr;
> +    uint64_t data;
> +
> +    if (run->msr.reason != KVM_MSR_EXIT_REASON_FILTER) {
> +        return -1;
> +    }
> +
> +    msr = run->msr.index;
> +    data = run->msr.data;
> +
> +    switch (msr) {
> +    case HV_X64_MSR_GUEST_OS_ID:
> +    case HV_X64_MSR_HYPERCALL:
> +        r = kvm_hv_handle_wrmsr(cpu, msr, data);
> +        break;
> +    default:
> +        error_report("Unknown MSR exit");
> +    }
> +
> +    return r;

I think you can always return 0 here, as long as you set msr.error=1.

Alex



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 5/6] kvm/i386: Add support for user space MSR filtering
  2021-06-08  8:48   ` Alexander Graf
@ 2021-06-08 10:53     ` Siddharth Chandrasekaran
  2021-06-25 10:35       ` Siddharth Chandrasekaran
  0 siblings, 1 reply; 17+ messages in thread
From: Siddharth Chandrasekaran @ 2021-06-08 10:53 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Paolo Bonzini, Marcelo Tosatti, Siddharth Chandrasekaran,
	Evgeny Iakovlev, Liran Alon, Ioannis Aslanidis, qemu-devel, kvm

On Tue, Jun 08, 2021 at 10:48:53AM +0200, Alexander Graf wrote:
> On 24.05.21 22:01, Siddharth Chandrasekaran wrote:
> > Check and enable user space MSR filtering capability and handle new exit
> > reason KVM_EXIT_X86_WRMSR. This will be used in a follow up patch to
> > implement hyper-v overlay pages.
> > 
> > Signed-off-by: Siddharth Chandrasekaran <sidcha@amazon.de>
> 
> This patch will break bisection, because we're no longer handling the writes
> in kernel space after this, but we also don't have user space handling
> available yet, right? It might be better to move all logic in this patch
> that sets up the filter for Hyper-V MSRs into the next one.

Yes, that's correct. I'll just bounce back all reads/writes to KVM. That
should maintain the existing behaviour.

> > ---
> >   target/i386/kvm/kvm.c | 72 +++++++++++++++++++++++++++++++++++++++++++
> >   1 file changed, 72 insertions(+)
> > 
> > diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> > index 362f04ab3f..3591f8cecc 100644
> > --- a/target/i386/kvm/kvm.c
> > +++ b/target/i386/kvm/kvm.c
> > @@ -117,6 +117,8 @@ static bool has_msr_ucode_rev;
> >   static bool has_msr_vmx_procbased_ctls2;
> >   static bool has_msr_perf_capabs;
> >   static bool has_msr_pkrs;
> > +static bool has_msr_filtering;
> > +static bool msr_filters_active;
> >   static uint32_t has_architectural_pmu_version;
> >   static uint32_t num_architectural_pmu_gp_counters;
> > @@ -2138,6 +2140,57 @@ static void register_smram_listener(Notifier *n, void *unused)
> >                                    &smram_address_space, 1);
> >   }
> > +static void kvm_set_msr_filter_range(struct kvm_msr_filter_range *range, uint32_t flags,
> > +                                     uint32_t base, uint32_t nmsrs, ...)
> > +{
> > +    int i, filter_to_userspace;
> > +    va_list ap;
> > +
> > +    range->flags = flags;
> > +    range->nmsrs = nmsrs;
> > +    range->base = base;
> > +
> > +    va_start(ap, nmsrs);
> > +    for (i = 0; i < nmsrs; i++) {
> > +        filter_to_userspace = va_arg(ap, int);
> > +        if (!filter_to_userspace) {
> > +            range->bitmap[i / 8] = 1 << (i % 8);
> > +        }
> > +    }
> > +    va_end(ap);
> > +}
> > +
> > +static int kvm_set_msr_filters(KVMState *s)
> > +{
> > +    int r, nmsrs, nfilt = 0, bitmap_pos = 0;
> > +    struct kvm_msr_filter filter = { };
> > +    struct kvm_msr_filter_range *range;
> > +    uint8_t bitmap_buf[KVM_MSR_FILTER_MAX_RANGES * 8] = {0};
> > +
> > +    filter.flags = KVM_MSR_FILTER_DEFAULT_ALLOW;
> > +
> > +    if (has_hyperv) {
> > +        /* Hyper-V overlay page MSRs */
> 
> I think you want to extend this comment and indicate in a human readable
> form that you set the filter on WRMSR to trap HV_X64_MSR_GUEST_OS_ID and
> HV_X64_MSR_HYPERCALL into user space here.

Sure.

> > +        nmsrs = 2;
> > +        range = &filter.ranges[nfilt++];
> > +        range->bitmap = &bitmap_buf[bitmap_pos];
> > +        kvm_set_msr_filter_range(range, KVM_MSR_FILTER_WRITE,
> > +                                 HV_X64_MSR_GUEST_OS_ID, nmsrs,
> > +                                 true, /* HV_X64_MSR_GUEST_OS_ID */
> > +                                 true  /* HV_X64_MSR_HYPERCALL */);
> > +        bitmap_pos += ROUND_UP(nmsrs, 8) / 8;
> > +        assert(bitmap_pos < sizeof(bitmap_buf));
> > +    }
> > +
> > +    r = kvm_vm_ioctl(s, KVM_X86_SET_MSR_FILTER, &filter);
> > +    if (r != 0) {
> > +        error_report("kvm: failed to set MSR filters");
> > +        return -1;
> > +    }
> > +
> > +    return 0;
> > +}
> > +
> >   int kvm_arch_init(MachineState *ms, KVMState *s)
> >   {
> >       uint64_t identity_base = 0xfffbc000;
> > @@ -2269,6 +2322,17 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
> >           }
> >       }
> > +    has_msr_filtering = kvm_check_extension(s, KVM_CAP_X86_USER_SPACE_MSR) &&
> > +                        kvm_check_extension(s, KVM_CAP_X86_MSR_FILTER);
> > +    if (has_msr_filtering) {
> > +        ret = kvm_vm_enable_cap(s, KVM_CAP_X86_USER_SPACE_MSR, 0,
> > +                                KVM_MSR_EXIT_REASON_FILTER);
> > +        if (ret == 0) {
> > +            ret = kvm_set_msr_filters(s);
> > +            msr_filters_active = (ret == 0);
> > +        }
> > +    }
> > +
> >       return 0;
> >   }
> > @@ -4542,6 +4606,11 @@ static bool host_supports_vmx(void)
> >       return ecx & CPUID_EXT_VMX;
> >   }
> > +static int kvm_handle_wrmsr(X86CPU *cpu, struct kvm_run *run)
> > +{
> > +    return 0;
> 
> The default handler should always set run->msr.error = 1 to mimic the
> existing behavior.

Will do, thanks.

> > +}
> > +
> >   #define VMX_INVALID_GUEST_STATE 0x80000021
> >   int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
> > @@ -4600,6 +4669,9 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
> >           ioapic_eoi_broadcast(run->eoi.vector);
> >           ret = 0;
> >           break;
> > +    case KVM_EXIT_X86_WRMSR:
> > +        ret = kvm_handle_wrmsr(cpu, run);
> 
> Please provide a default RDMSR handler as well here.

Ack.

~ Sid.



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 6/6] hyper-v: Handle hypercall code page as an overlay page
  2021-06-08  9:02   ` Alexander Graf
@ 2021-06-08 10:55     ` Siddharth Chandrasekaran
  0 siblings, 0 replies; 17+ messages in thread
From: Siddharth Chandrasekaran @ 2021-06-08 10:55 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Paolo Bonzini, Marcelo Tosatti, Siddharth Chandrasekaran,
	Evgeny Iakovlev, Liran Alon, Ioannis Aslanidis, qemu-devel, kvm

On Tue, Jun 08, 2021 at 11:02:45AM +0200, Alexander Graf wrote:
> On 24.05.21 22:02, Siddharth Chandrasekaran wrote:
> > Hypercall code page is specified in the Hyper-V TLFS to be an overlay
> > page, ie., guest chooses a GPA and the host _places_ a page at that
> > location, making it visible to the guest and the existing page becomes
> > inaccessible. Similarly when disabled, the host should _remove_ the
> > overlay and the old page should become visible to the guest.
> > 
> > Until now, KVM patched the hypercall code directly into the guest
> > chosen GPA which is incorrect; instead, use the new user space MSR
> > filtering feature to trap hypercall page MSR writes, overlay it as
> > requested and then invoke a KVM_SET_MSR from user space to bounce back
> > control KVM. This bounce back is needed as KVM may have to write data
> > into the newly overlaid page.
> > 
> > Signed-off-by: Siddharth Chandrasekaran <sidcha@amazon.de>
> > ---
> >   hw/hyperv/hyperv.c         | 10 ++++-
> >   include/hw/hyperv/hyperv.h |  5 +++
> >   target/i386/kvm/hyperv.c   | 84 ++++++++++++++++++++++++++++++++++++++
> >   target/i386/kvm/hyperv.h   |  4 ++
> >   target/i386/kvm/kvm.c      | 26 +++++++++++-
> >   5 files changed, 127 insertions(+), 2 deletions(-)
> > 
> > diff --git a/hw/hyperv/hyperv.c b/hw/hyperv/hyperv.c
> > index ac45e8e139..aa5ac5226e 100644
> > --- a/hw/hyperv/hyperv.c
> > +++ b/hw/hyperv/hyperv.c
> > @@ -36,6 +36,7 @@ struct SynICState {
> >   OBJECT_DECLARE_SIMPLE_TYPE(SynICState, SYNIC)
> >   static bool synic_enabled;
> > +struct hyperv_overlay_page hcall_page;
> >   static void alloc_overlay_page(struct hyperv_overlay_page *overlay,
> >                                  Object *owner, const char *name)
> > @@ -50,7 +51,7 @@ static void alloc_overlay_page(struct hyperv_overlay_page *overlay,
> >    * This method must be called with iothread lock taken as it modifies
> >    * the memory hierarchy.
> >    */
> > -static void hyperv_overlay_update(struct hyperv_overlay_page *overlay, hwaddr addr)
> > +void hyperv_overlay_update(struct hyperv_overlay_page *overlay, hwaddr addr)
> >   {
> >       if (addr != HYPERV_INVALID_OVERLAY_GPA) {
> >           /* check if overlay page is enabled */
> > @@ -70,6 +71,13 @@ static void hyperv_overlay_update(struct hyperv_overlay_page *overlay, hwaddr ad
> >       }
> >   }
> > +void hyperv_overlay_init(void)
> > +{
> > +    memory_region_init_ram(&hcall_page.mr, NULL, "hyperv.hcall_page",
> > +                           qemu_real_host_page_size, &error_abort);
> > +    hcall_page.addr = HYPERV_INVALID_OVERLAY_GPA;
> > +}
> > +
> >   static void synic_update(SynICState *synic, bool enable,
> >                            hwaddr msg_page_addr, hwaddr event_page_addr)
> >   {
> > diff --git a/include/hw/hyperv/hyperv.h b/include/hw/hyperv/hyperv.h
> > index d989193e84..f444431a81 100644
> > --- a/include/hw/hyperv/hyperv.h
> > +++ b/include/hw/hyperv/hyperv.h
> > @@ -85,6 +85,11 @@ static inline uint32_t hyperv_vp_index(CPUState *cs)
> >       return cs->cpu_index;
> >   }
> > +extern struct hyperv_overlay_page hcall_page;
> > +
> > +void hyperv_overlay_init(void);
> > +void hyperv_overlay_update(struct hyperv_overlay_page *page, hwaddr addr);
> > +
> >   void hyperv_synic_add(CPUState *cs);
> >   void hyperv_synic_reset(CPUState *cs);
> >   void hyperv_synic_update(CPUState *cs, bool enable,
> > diff --git a/target/i386/kvm/hyperv.c b/target/i386/kvm/hyperv.c
> > index f49ed2621d..01c9c2468c 100644
> > --- a/target/i386/kvm/hyperv.c
> > +++ b/target/i386/kvm/hyperv.c
> > @@ -16,6 +16,76 @@
> >   #include "hyperv.h"
> >   #include "hw/hyperv/hyperv.h"
> >   #include "hyperv-proto.h"
> > +#include "kvm_i386.h"
> > +
> > +struct x86_hv_overlay {
> > +    struct hyperv_overlay_page *page;
> > +    uint32_t msr;
> > +    hwaddr gpa;
> > +};
> > +
> > +static void async_overlay_update(CPUState *cs, run_on_cpu_data data)
> > +{
> > +    X86CPU *cpu = X86_CPU(cs);
> > +    struct x86_hv_overlay *overlay = data.host_ptr;
> > +
> > +    qemu_mutex_lock_iothread();
> > +    hyperv_overlay_update(overlay->page, overlay->gpa);
> > +    qemu_mutex_unlock_iothread();
> > +
> > +    /**
> > +     * Call KVM so it can keep a copy of the MSR data and do other post-overlay
> > +     * actions such as filling the overlay page contents before returning to
> > +     * guest. This works because MSR filtering is inactive for KVM_SET_MSRS
> > +     */
> > +    kvm_put_one_msr(cpu, overlay->msr, overlay->gpa);
> > +
> > +    g_free(overlay);
> > +}
> > +
> > +static void do_overlay_update(X86CPU *cpu, struct hyperv_overlay_page *page,
> > +                              uint32_t msr, uint64_t data)
> > +{
> > +    struct x86_hv_overlay *overlay = g_malloc(sizeof(struct x86_hv_overlay));
> > +
> > +    *overlay = (struct x86_hv_overlay) {
> > +        .page = page,
> > +        .msr = msr,
> > +        .gpa = data
> > +    };
> > +
> > +    /**
> > +     * This will run in this cpu thread before it returns to KVM, but in a
> > +     * safe environment (i.e. when all cpus are quiescent) -- this is
> > +     * necessary because memory hierarchy is being changed
> > +     */
> > +    async_safe_run_on_cpu(CPU(cpu), async_overlay_update,
> > +                          RUN_ON_CPU_HOST_PTR(overlay));
> > +}
> > +
> > +static void overlay_update(X86CPU *cpu, uint32_t msr, uint64_t data)
> > +{
> > +    switch (msr) {
> > +    case HV_X64_MSR_GUEST_OS_ID:
> > +        /**
> > +         * When GUEST_OS_ID is cleared, hypercall overlay should be removed;
> > +         * otherwise it is a NOP. We still need to do a SET_MSR here as the
> > +         * kernel need to keep a copy of data.
> > +         */
> > +        if (data != 0) {
> > +            kvm_put_one_msr(cpu, msr, data);
> > +            return;
> > +        }
> > +        /* Fake a zero write to the overlay page hcall to invalidate the mapping */
> > +        do_overlay_update(cpu, &hcall_page, msr, 0);
> > +        break;
> > +    case HV_X64_MSR_HYPERCALL:
> > +        do_overlay_update(cpu, &hcall_page, msr, data);
> > +        break;
> > +    default:
> > +        return;
> > +    }
> > +}
> >   int hyperv_x86_synic_add(X86CPU *cpu)
> >   {
> > @@ -44,6 +114,20 @@ static void async_synic_update(CPUState *cs, run_on_cpu_data data)
> >       qemu_mutex_unlock_iothread();
> >   }
> > +int kvm_hv_handle_wrmsr(X86CPU *cpu, uint32_t msr, uint64_t data)
> > +{
> > +    switch (msr) {
> > +    case HV_X64_MSR_GUEST_OS_ID:
> > +    case HV_X64_MSR_HYPERCALL:
> > +        overlay_update(cpu, msr, data);
> > +        break;
> > +    default:
> > +        return -1;
> > +    }
> > +
> > +    return 0;
> > +}
> > +
> >   int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit)
> >   {
> >       CPUX86State *env = &cpu->env;
> > diff --git a/target/i386/kvm/hyperv.h b/target/i386/kvm/hyperv.h
> > index 67543296c3..8e90fa949f 100644
> > --- a/target/i386/kvm/hyperv.h
> > +++ b/target/i386/kvm/hyperv.h
> > @@ -20,8 +20,12 @@
> >   #ifdef CONFIG_KVM
> >   int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit);
> > +int kvm_hv_handle_wrmsr(X86CPU *cpu, uint32_t msr, uint64_t data);
> > +
> >   #endif
> > +void hyperv_x86_hcall_page_update(X86CPU *cpu);
> > +
> >   int hyperv_x86_synic_add(X86CPU *cpu);
> >   void hyperv_x86_synic_reset(X86CPU *cpu);
> >   void hyperv_x86_synic_update(X86CPU *cpu);
> > diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> > index 3591f8cecc..bfb9eff440 100644
> > --- a/target/i386/kvm/kvm.c
> > +++ b/target/i386/kvm/kvm.c
> > @@ -2333,6 +2333,10 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
> >           }
> >       }
> > +    if (has_hyperv && msr_filters_active) {
> > +        hyperv_overlay_init();
> > +    }
> > +
> >       return 0;
> >   }
> > @@ -4608,7 +4612,27 @@ static bool host_supports_vmx(void)
> >   static int kvm_handle_wrmsr(X86CPU *cpu, struct kvm_run *run)
> >   {
> > -    return 0;
> > +    int r = -1;
> > +    uint32_t msr;
> > +    uint64_t data;
> > +
> > +    if (run->msr.reason != KVM_MSR_EXIT_REASON_FILTER) {
> > +        return -1;
> > +    }
> > +
> > +    msr = run->msr.index;
> > +    data = run->msr.data;
> > +
> > +    switch (msr) {
> > +    case HV_X64_MSR_GUEST_OS_ID:
> > +    case HV_X64_MSR_HYPERCALL:
> > +        r = kvm_hv_handle_wrmsr(cpu, msr, data);
> > +        break;
> > +    default:
> > +        error_report("Unknown MSR exit");
> > +    }
> > +
> > +    return r;
> 
> I think you can always return 0 here, as long as you set msr.error=1.

Ack, will do that in v2. Thanks for your review.

~ Sid.



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 5/6] kvm/i386: Add support for user space MSR filtering
  2021-06-08 10:53     ` Siddharth Chandrasekaran
@ 2021-06-25 10:35       ` Siddharth Chandrasekaran
  0 siblings, 0 replies; 17+ messages in thread
From: Siddharth Chandrasekaran @ 2021-06-25 10:35 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Paolo Bonzini, Marcelo Tosatti, Siddharth Chandrasekaran,
	Evgeny Iakovlev, Liran Alon, Ioannis Aslanidis, qemu-devel, kvm

On Tue, Jun 08, 2021 at 12:53:17PM +0200, Siddharth Chandrasekaran wrote:
> On Tue, Jun 08, 2021 at 10:48:53AM +0200, Alexander Graf wrote:
> > On 24.05.21 22:01, Siddharth Chandrasekaran wrote:
> > > Check and enable user space MSR filtering capability and handle new exit
> > > reason KVM_EXIT_X86_WRMSR. This will be used in a follow up patch to
> > > implement hyper-v overlay pages.
> > > 
> > > Signed-off-by: Siddharth Chandrasekaran <sidcha@amazon.de>
> > 
> > This patch will break bisection, because we're no longer handling the writes
> > in kernel space after this, but we also don't have user space handling
> > available yet, right? It might be better to move all logic in this patch
> > that sets up the filter for Hyper-V MSRs into the next one.
> 
> Yes, that's correct. I'll just bounce back all reads/writes to KVM. That
> should maintain the existing behaviour.

Okay, bouncing back to KVM was a bad idea :). Moved filters to next
patch as suggested.

~ Sid.



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2021-06-25 10:36 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-24 19:54 [PATCH 0/6] Handle hypercall code overlay page in userspace Siddharth Chandrasekaran
2021-05-24 19:54 ` [PATCH 1/6] hyper-v: Overlay abstraction for synic event and msg pages Siddharth Chandrasekaran
2021-06-08  8:27   ` Alexander Graf
2021-05-24 19:54 ` [PATCH 2/6] hyper-v: Use -1 as invalid overlay address Siddharth Chandrasekaran
2021-06-08  8:27   ` Alexander Graf
2021-05-24 19:54 ` [PATCH 3/6] kvm/i386: Stop using cpu->kvm_msr_buf in kvm_put_one_msr() Siddharth Chandrasekaran
2021-06-08  8:27   ` Alexander Graf
2021-05-24 19:54 ` [PATCH 4/6] kvm/i386: Avoid multiple calls to check_extension(KVM_CAP_HYPERV) Siddharth Chandrasekaran
2021-06-08  8:28   ` Alexander Graf
2021-05-24 20:01 ` [PATCH 5/6] kvm/i386: Add support for user space MSR filtering Siddharth Chandrasekaran
2021-06-08  8:48   ` Alexander Graf
2021-06-08 10:53     ` Siddharth Chandrasekaran
2021-06-25 10:35       ` Siddharth Chandrasekaran
2021-05-24 20:02 ` [PATCH 6/6] hyper-v: Handle hypercall code page as an overlay page Siddharth Chandrasekaran
2021-06-08  9:02   ` Alexander Graf
2021-06-08 10:55     ` Siddharth Chandrasekaran
2021-06-07 19:36 ` [PATCH 0/6] Handle hypercall code overlay page in userspace Siddharth Chandrasekaran

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.