All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC 0/1] Hypercall UCALL for guest/userspace communication
@ 2020-05-01 18:51 Forrest Yuan Yu
  2020-05-01 18:51 ` [PATCH RFC 1/1] KVM: x86: add KVM_HC_UCALL hypercall Forrest Yuan Yu
  2020-05-01 20:23 ` [PATCH RFC 0/1] Hypercall UCALL for guest/userspace communication Sean Christopherson
  0 siblings, 2 replies; 8+ messages in thread
From: Forrest Yuan Yu @ 2020-05-01 18:51 UTC (permalink / raw)
  To: kvm; +Cc: Forrest Yuan Yu

This RFC is to add a hypercall UCALL so that guest can communicate with
the userspace hypervisor. A guest may want to do this in different
scenarios. For example, a guest can ask the userspace hypervisor to
harden security by setting restricted access permissions on guest SLAT
entries.

Please note this hypercall provides the infrastructure for this type of
communication but does not enforce the protocol. A proposal for the
guest/userspace communication protocol will follow if this RFC is
accepted.

Forrest Yuan Yu (1):
  KVM: x86: add KVM_HC_UCALL hypercall

 Documentation/virt/kvm/api.rst                |  15 +-
 Documentation/virt/kvm/cpuid.rst              |   3 +
 Documentation/virt/kvm/hypercalls.rst         |  14 ++
 arch/x86/include/asm/kvm_host.h               |   1 +
 arch/x86/include/uapi/asm/kvm_para.h          |   1 +
 arch/x86/kvm/x86.c                            |  39 +++-
 include/uapi/linux/kvm.h                      |   1 +
 include/uapi/linux/kvm_para.h                 |   1 +
 tools/testing/selftests/kvm/.gitignore        |   1 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/x86_64/hypercall_ucall.c    | 195 ++++++++++++++++++
 11 files changed, 264 insertions(+), 8 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/x86_64/hypercall_ucall.c

-- 
2.26.2.526.g744177e7f7-goog


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH RFC 1/1] KVM: x86: add KVM_HC_UCALL hypercall
  2020-05-01 18:51 [PATCH RFC 0/1] Hypercall UCALL for guest/userspace communication Forrest Yuan Yu
@ 2020-05-01 18:51 ` Forrest Yuan Yu
  2020-05-01 20:45   ` Sean Christopherson
  2020-05-01 20:23 ` [PATCH RFC 0/1] Hypercall UCALL for guest/userspace communication Sean Christopherson
  1 sibling, 1 reply; 8+ messages in thread
From: Forrest Yuan Yu @ 2020-05-01 18:51 UTC (permalink / raw)
  To: kvm; +Cc: Forrest Yuan Yu

The purpose of this new hypercall is to exchange message between
guest and hypervisor. For example, a guest may want to ask hypervisor
to harden security by setting restricted access permission on guest
SLAT entry. In this case, the guest can use this hypercall to send
a message to the hypervisor which will do its job and send back
anything it wants the guest to know.

Signed-off-by: Forrest Yuan Yu <yuanyu@google.com>
---
 Documentation/virt/kvm/api.rst                |  15 +-
 Documentation/virt/kvm/cpuid.rst              |   3 +
 Documentation/virt/kvm/hypercalls.rst         |  14 ++
 arch/x86/include/asm/kvm_host.h               |   1 +
 arch/x86/include/uapi/asm/kvm_para.h          |   1 +
 arch/x86/kvm/x86.c                            |  39 +++-
 include/uapi/linux/kvm.h                      |   1 +
 include/uapi/linux/kvm_para.h                 |   1 +
 tools/testing/selftests/kvm/.gitignore        |   1 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/x86_64/hypercall_ucall.c    | 195 ++++++++++++++++++
 11 files changed, 264 insertions(+), 8 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/x86_64/hypercall_ucall.c

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index efbbe570aa9b..ae8958a7ad15 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -4854,8 +4854,8 @@ to the byte array.
 
 .. note::
 
-      For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_PAPR and
-      KVM_EXIT_EPR the corresponding
+      For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_HYPERCALL, KVM_EXIT_OSI,
+      KVM_EXIT_PAPR and KVM_EXIT_EPR the corresponding
 
 operations are complete (and guest state is consistent) only after userspace
 has re-entered the kernel with KVM_RUN.  The kernel side will first finish
@@ -5802,6 +5802,17 @@ If present, this capability can be enabled for a VM, meaning that KVM
 will allow the transition to secure guest mode.  Otherwise KVM will
 veto the transition.
 
+7.20 KVM_CAP_UCALL
+------------------------------
+
+:Architectures: x86
+
+This capability indicates that KVM supports hypercall ucall. It being enabled
+means userspace is ready to receive a message sent by a guest using hypercall
+ucall. When it is not enabled, a hypercall ucall made by a guest will not cause
+control to be handed to userspace. Instead, kvm will return -KVM_ENOSYS without
+userspace participation.
+
 8. Other capabilities.
 ======================
 
diff --git a/Documentation/virt/kvm/cpuid.rst b/Documentation/virt/kvm/cpuid.rst
index 01b081f6e7ea..ff313f6827bf 100644
--- a/Documentation/virt/kvm/cpuid.rst
+++ b/Documentation/virt/kvm/cpuid.rst
@@ -86,6 +86,9 @@ KVM_FEATURE_PV_SCHED_YIELD        13          guest checks this feature bit
                                               before using paravirtualized
                                               sched yield.
 
+KVM_FEATURE_UCALL                 14          guest checks this feature bit
+                                              before calling hypercall ucall.
+
 KVM_FEATURE_CLOCSOURCE_STABLE_BIT 24          host will warn if no guest-side
                                               per-cpu warps are expeced in
                                               kvmclock
diff --git a/Documentation/virt/kvm/hypercalls.rst b/Documentation/virt/kvm/hypercalls.rst
index dbaf207e560d..ce3f30d5b2ee 100644
--- a/Documentation/virt/kvm/hypercalls.rst
+++ b/Documentation/virt/kvm/hypercalls.rst
@@ -169,3 +169,17 @@ a0: destination APIC ID
 
 :Usage example: When sending a call-function IPI-many to vCPUs, yield if
 	        any of the IPI target vCPUs was preempted.
+
+8. KVM_HC_UCALL
+---------------------
+
+:Architecture: x86
+:Status: active
+:Purpose: Hypercall used to exchange message between VM and hypervisor.
+
+a0: message type
+
+a1, a2, a3: dependent on message type
+
+:Usage example: A guest asks hypervisor to harden security by setting
+restricted access permission on guest SLAT entry.
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 42a2d0d3984a..433e96c126a5 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -979,6 +979,7 @@ struct kvm_arch {
 
 	bool guest_can_read_msr_platform_info;
 	bool exception_payload_enabled;
+	bool hypercall_ucall_enabled;
 
 	struct kvm_pmu_event_filter *pmu_event_filter;
 	struct task_struct *nx_lpage_recovery_thread;
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 2a8e0b6b9805..9524434463f2 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -31,6 +31,7 @@
 #define KVM_FEATURE_PV_SEND_IPI	11
 #define KVM_FEATURE_POLL_CONTROL	12
 #define KVM_FEATURE_PV_SCHED_YIELD	13
+#define KVM_FEATURE_UCALL		14
 
 #define KVM_HINTS_REALTIME      0
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c5835f9cb9ad..388a4f89464d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3385,6 +3385,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_GET_MSR_FEATURES:
 	case KVM_CAP_MSR_PLATFORM_INFO:
 	case KVM_CAP_EXCEPTION_PAYLOAD:
+	case KVM_CAP_UCALL:
 		r = 1;
 		break;
 	case KVM_CAP_SYNC_REGS:
@@ -4895,6 +4896,10 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		kvm->arch.exception_payload_enabled = cap->args[0];
 		r = 0;
 		break;
+	case KVM_CAP_UCALL:
+		kvm->arch.hypercall_ucall_enabled = cap->args[0];
+		r = 0;
+		break;
 	default:
 		r = -EINVAL;
 		break;
@@ -7554,6 +7559,19 @@ static void kvm_sched_yield(struct kvm *kvm, unsigned long dest_id)
 		kvm_vcpu_yield_to(target);
 }
 
+static int complete_hypercall(struct kvm_vcpu *vcpu)
+{
+	u64 ret = vcpu->run->hypercall.ret;
+
+	if (!is_64_bit_mode(vcpu))
+		ret = (u32)ret;
+	kvm_rax_write(vcpu, ret);
+
+	++vcpu->stat.hypercalls;
+
+	return kvm_skip_emulated_instruction(vcpu);
+}
+
 int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
 {
 	unsigned long nr, a0, a1, a2, a3, ret;
@@ -7605,17 +7623,26 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
 		kvm_sched_yield(vcpu->kvm, a0);
 		ret = 0;
 		break;
+	case KVM_HC_UCALL:
+		if (vcpu->kvm->arch.hypercall_ucall_enabled) {
+			vcpu->run->hypercall.nr = nr;
+			vcpu->run->hypercall.args[0] = a0;
+			vcpu->run->hypercall.args[1] = a1;
+			vcpu->run->hypercall.args[2] = a2;
+			vcpu->run->hypercall.args[3] = a3;
+			vcpu->run->exit_reason = KVM_EXIT_HYPERCALL;
+			vcpu->arch.complete_userspace_io = complete_hypercall;
+			return 0; // message is going to userspace
+		}
+		ret = -KVM_ENOSYS;
+		break;
 	default:
 		ret = -KVM_ENOSYS;
 		break;
 	}
 out:
-	if (!op_64_bit)
-		ret = (u32)ret;
-	kvm_rax_write(vcpu, ret);
-
-	++vcpu->stat.hypercalls;
-	return kvm_skip_emulated_instruction(vcpu);
+	vcpu->run->hypercall.ret = ret;
+	return complete_hypercall(vcpu);
 }
 EXPORT_SYMBOL_GPL(kvm_emulate_hypercall);
 
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 428c7dde6b4b..c1fcac311c76 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1017,6 +1017,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_S390_VCPU_RESETS 179
 #define KVM_CAP_S390_PROTECTED 180
 #define KVM_CAP_PPC_SECURE_GUEST 181
+#define KVM_CAP_UCALL 182
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h
index 8b86609849b9..4e5ad8dec801 100644
--- a/include/uapi/linux/kvm_para.h
+++ b/include/uapi/linux/kvm_para.h
@@ -29,6 +29,7 @@
 #define KVM_HC_CLOCK_PAIRING		9
 #define KVM_HC_SEND_IPI		10
 #define KVM_HC_SCHED_YIELD		11
+#define KVM_HC_UCALL			12
 
 /*
  * hypercalls use architecture specific
diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore
index a9b2b48947ff..c796c8efaa23 100644
--- a/tools/testing/selftests/kvm/.gitignore
+++ b/tools/testing/selftests/kvm/.gitignore
@@ -18,6 +18,7 @@
 /x86_64/vmx_set_nested_state_test
 /x86_64/vmx_tsc_adjust_test
 /x86_64/xss_msr_test
+/x86_64/hypercall_ucall
 /clear_dirty_log_test
 /demand_paging_test
 /dirty_log_test
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 712a2ddd2a27..b3aeec375644 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -33,6 +33,7 @@ TEST_GEN_PROGS_x86_64 += demand_paging_test
 TEST_GEN_PROGS_x86_64 += dirty_log_test
 TEST_GEN_PROGS_x86_64 += kvm_create_max_vcpus
 TEST_GEN_PROGS_x86_64 += steal_time
+TEST_GEN_PROGS_x86_64 += x86_64/hypercall_ucall
 
 TEST_GEN_PROGS_aarch64 += clear_dirty_log_test
 TEST_GEN_PROGS_aarch64 += demand_paging_test
diff --git a/tools/testing/selftests/kvm/x86_64/hypercall_ucall.c b/tools/testing/selftests/kvm/x86_64/hypercall_ucall.c
new file mode 100644
index 000000000000..132b6d1c98e2
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/hypercall_ucall.c
@@ -0,0 +1,195 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Hypercall KVM_HC_UCALL test
+ *
+ * Copyright (C) 2020, Google LLC.
+ *
+ * Author:
+ *   Forrest Yuan Yu <yuanyu@google.com>
+ */
+
+#include <stdio.h>
+#include "linux/kernel.h"
+#include "linux/kvm_para.h"
+#include "linux/overflow.h"
+#include "test_util.h"
+#include "kvm_util.h"
+#include "processor.h"
+
+#define VCPU_ID		5
+#define HYPERCALL_RET	0xBEEF
+#define HYPERCALL_ARG0	1000
+#define HYPERCALL_ARG1	1001
+#define HYPERCALL_ARG2	1002
+#define HYPERCALL_ARG3	1003
+
+static inline bool is_feature_ucall_enabled(void)
+{
+	u32 eax, ebx, ecx, edx;
+
+	asm volatile("cpuid"
+		     : "=a"(eax), "=b"(ebx), "=c"(ecx), "=d"(edx)
+		     : "a"(KVM_CPUID_FEATURES), "c"(0));
+
+	return eax & (1 << KVM_FEATURE_UCALL);
+}
+
+static inline void guest_info_to_host(u16 info)
+{
+	asm volatile("in %%dx, %%ax" : : "d" (info));
+}
+
+static inline long hypercall_ucall(void)
+{
+	long ret;
+
+	asm volatile("vmcall"
+		     : "=a"(ret)
+		     : "a"(KVM_HC_UCALL), "b"(HYPERCALL_ARG0),
+		     "c"(HYPERCALL_ARG1), "d"(HYPERCALL_ARG2),
+		     "S"(HYPERCALL_ARG3)
+		     : "memory");
+
+	return ret;
+}
+
+void guest_code(void)
+{
+	long ret;
+
+	/* invoke ucall without it having been enabled */
+	ret = hypercall_ucall();
+	guest_info_to_host(ret);
+
+	/* check feature ucall the first time, host will see 0 */
+	guest_info_to_host(is_feature_ucall_enabled() ? 1 : 0);
+
+	/*
+	 * check feature ucall the second time, host will see 1 because
+	 * by now userspace should have enabled feature ucall
+	 */
+	guest_info_to_host(is_feature_ucall_enabled() ? 1 : 0);
+
+	/*
+	 * the following demonstrate the right way to make a hypercall ucall:
+	 * check the existence of the feature then do ucall
+	 */
+	if (is_feature_ucall_enabled()) {
+		/*
+		 * now that ucall is enabled, kvm will hand control to userspace
+		 * which will set the return value and finish it
+		 */
+		ret = hypercall_ucall();
+		/*
+		 * now we have received the return value set by userspace, send
+		 * it back to userspace for double check
+		 */
+		guest_info_to_host(ret);
+	} else {
+		/* this should not happen */
+		guest_info_to_host(-1);
+	}
+}
+
+void assert_guest_info(struct kvm_vm *vm, u16 expected, char *interpretation)
+{
+	struct kvm_run *state = vcpu_state(vm, VCPU_ID);
+
+	TEST_ASSERT(
+		state->exit_reason == KVM_EXIT_IO,
+		"Got exit_reason other than KVM_EXIT_IO: %u (%s).\n",
+		state->exit_reason, exit_reason_str(state->exit_reason));
+
+	TEST_ASSERT(
+		state->io.port == expected,
+		"Test failed: expecting %s 0x%x but got 0x%x.\n",
+		interpretation, expected, state->io.port);
+}
+
+static void set_ucall_enabled(struct kvm_vm *vm, bool enable)
+{
+	struct kvm_enable_cap cap = {};
+
+	cap.cap = KVM_CAP_UCALL;
+	cap.flags = 0;
+	cap.args[0] = (int)enable;
+	vm_enable_cap(vm, &cap);
+}
+
+int main(int argc, char *argv[])
+{
+	struct kvm_vm *vm;
+	struct kvm_run *state;
+	int hypercall_args[] = {
+		HYPERCALL_ARG0, HYPERCALL_ARG1, HYPERCALL_ARG2, HYPERCALL_ARG3
+	};
+	int arg_nr = ARRAY_SIZE(hypercall_args);
+	int i;
+	int expected_ucall_bit;
+	struct kvm_cpuid_entry2 *entry;
+
+	/* Create VM */
+	vm = vm_create_default(VCPU_ID, 0, guest_code);
+	vcpu_set_cpuid(vm, VCPU_ID, kvm_get_supported_cpuid());
+
+	/*
+	 * run VM which will do hypercall ucall, however, since the feature
+	 * has not been enabled, the hypercall will do nothing and return
+	 * -KVM_EOPNOTSUPP, which means an innocent userspace won't break even
+	 * if the guest tries to invoke this hypercall
+	 */
+	vcpu_run(vm, VCPU_ID);
+	assert_guest_info(vm, -KVM_EOPNOTSUPP, "hypercall return value");
+
+	/*
+	 * continue to run VM which will check feature ucall, which of course
+	 * hasn't been enabled yet
+	 */
+	expected_ucall_bit = 0;
+	vcpu_run(vm, VCPU_ID);
+	assert_guest_info(vm, expected_ucall_bit, "ucall bit");
+
+	TEST_ASSERT(kvm_check_cap(KVM_CAP_UCALL), "CAP UCALL exists.");
+
+	set_ucall_enabled(vm, true);
+
+	/* enable feature ucall and let VM run to check it again */
+	entry = kvm_get_supported_cpuid_index(KVM_CPUID_FEATURES, 0);
+	entry->eax |= 1 << KVM_FEATURE_UCALL;
+	vcpu_set_cpuid(vm, VCPU_ID, kvm_get_supported_cpuid());
+	expected_ucall_bit = 1;
+	vcpu_run(vm, VCPU_ID);
+	assert_guest_info(vm, expected_ucall_bit, "ucall bit");
+
+	/*
+	 * continue to run VM which will do hypercall ucall, which this time
+	 * will give control to userspace because userspace is supposed to
+	 * finish it
+	 */
+	vcpu_run(vm, VCPU_ID);
+	state = vcpu_state(vm, VCPU_ID);
+	TEST_ASSERT(
+		state->exit_reason == KVM_EXIT_HYPERCALL,
+		"Got exit_reason other than KVM_EXIT_HYPERCALL: %u (%s).\n",
+		state->exit_reason, exit_reason_str(state->exit_reason));
+	for (i = 0; i < arg_nr; i++) {
+		TEST_ASSERT(
+			state->hypercall.args[i] == hypercall_args[i],
+			"Got unexpected hypercall argument [%d]: %lld.\n",
+			i, state->hypercall.args[i]);
+	}
+
+	/* userspace finishes it with HYPERCALL_RET */
+	state->hypercall.ret = (u16)HYPERCALL_RET;
+
+	/*
+	 * continue VM which will see the finished ucall with a return value.
+	 * verify the value guest sees is the one we set from userspace just now
+	 */
+	vcpu_run(vm, VCPU_ID);
+	assert_guest_info(vm, HYPERCALL_RET, "hypercall return value");
+
+	kvm_vm_free(vm);
+
+	return 0;
+}
-- 
2.26.2.526.g744177e7f7-goog


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH RFC 0/1] Hypercall UCALL for guest/userspace communication
  2020-05-01 18:51 [PATCH RFC 0/1] Hypercall UCALL for guest/userspace communication Forrest Yuan Yu
  2020-05-01 18:51 ` [PATCH RFC 1/1] KVM: x86: add KVM_HC_UCALL hypercall Forrest Yuan Yu
@ 2020-05-01 20:23 ` Sean Christopherson
  1 sibling, 0 replies; 8+ messages in thread
From: Sean Christopherson @ 2020-05-01 20:23 UTC (permalink / raw)
  To: Forrest Yuan Yu; +Cc: kvm

On Fri, May 01, 2020 at 11:51:46AM -0700, Forrest Yuan Yu wrote:
> This RFC is to add a hypercall UCALL so that guest can communicate with
> the userspace hypervisor. A guest may want to do this in different
> scenarios. For example, a guest can ask the userspace hypervisor to
> harden security by setting restricted access permissions on guest SLAT
> entries.
> 
> Please note this hypercall provides the infrastructure for this type of
> communication but does not enforce the protocol. A proposal for the
> guest/userspace communication protocol will follow if this RFC is
> accepted.

It's hard to review certain aspects of the proposal without seeing the
usage of it.  Can you post the whole thing as an RFC?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH RFC 1/1] KVM: x86: add KVM_HC_UCALL hypercall
  2020-05-01 18:51 ` [PATCH RFC 1/1] KVM: x86: add KVM_HC_UCALL hypercall Forrest Yuan Yu
@ 2020-05-01 20:45   ` Sean Christopherson
  2020-05-02  1:05     ` Liran Alon
  2020-05-05 22:50     ` Forrest Yuan Yu
  0 siblings, 2 replies; 8+ messages in thread
From: Sean Christopherson @ 2020-05-01 20:45 UTC (permalink / raw)
  To: Forrest Yuan Yu; +Cc: kvm

On Fri, May 01, 2020 at 11:51:47AM -0700, Forrest Yuan Yu wrote:
> The purpose of this new hypercall is to exchange message between
> guest and hypervisor. For example, a guest may want to ask hypervisor
> to harden security by setting restricted access permission on guest
> SLAT entry. In this case, the guest can use this hypercall to send

Please do s/SLAT/TDP everywhere.  IMO, TDP is a less than stellar acronym,
e.g. what will we do if we go to three dimensions!?!?  But, we're stuck
with it :-)

> a message to the hypervisor which will do its job and send back
> anything it wants the guest to know.

Hrm, so this reintroduces KVM_EXIT_HYPERCALL without justifying _why_ it
needs to be reintroduced.  I'm not familiar with the history, but the
comments in the documentation advise "use KVM_EXIT_IO or KVM_EXIT_MMIO".

Off the top of my head, IO and/or MMIO has a few advantages:

  - Allows the guest kernel to delegate permissions to guest userspace,
    whereas KVM restrict hypercalls to CPL0.
  - Allows "pass-through", whereas VMCALL is unconditionally forwarded to
    L1.
  - Is vendor agnostic, e.g. VMX and SVM recognized different opcodes for
    VMCALL vs VMMCALL.
 
> Signed-off-by: Forrest Yuan Yu <yuanyu@google.com>
> ---
> diff --git a/Documentation/virt/kvm/cpuid.rst b/Documentation/virt/kvm/cpuid.rst
> index 01b081f6e7ea..ff313f6827bf 100644
> --- a/Documentation/virt/kvm/cpuid.rst
> +++ b/Documentation/virt/kvm/cpuid.rst
> @@ -86,6 +86,9 @@ KVM_FEATURE_PV_SCHED_YIELD        13          guest checks this feature bit
>                                                before using paravirtualized
>                                                sched yield.
>  
> +KVM_FEATURE_UCALL                 14          guest checks this feature bit
> +                                              before calling hypercall ucall.

Why make the UCALL a KVM CPUID feature?  I can understand wanting to query
KVM support from host userspace, but presumably the guest will care about
capabiliteis built on top of the UCALL, not the UCALL itself.

> +
>  KVM_FEATURE_CLOCSOURCE_STABLE_BIT 24          host will warn if no guest-side
>                                                per-cpu warps are expeced in
>                                                kvmclock
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index c5835f9cb9ad..388a4f89464d 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3385,6 +3385,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  	case KVM_CAP_GET_MSR_FEATURES:
>  	case KVM_CAP_MSR_PLATFORM_INFO:
>  	case KVM_CAP_EXCEPTION_PAYLOAD:
> +	case KVM_CAP_UCALL:

For whatever reason I have a metnal block with UCALL, can we call this
KVM_CAP_USERSPACE_HYPERCALL

>  		r = 1;
>  		break;
>  	case KVM_CAP_SYNC_REGS:
> @@ -4895,6 +4896,10 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>  		kvm->arch.exception_payload_enabled = cap->args[0];
>  		r = 0;
>  		break;
> +	case KVM_CAP_UCALL:
> +		kvm->arch.hypercall_ucall_enabled = cap->args[0];
> +		r = 0;
> +		break;
>  	default:
>  		r = -EINVAL;
>  		break;
> @@ -7554,6 +7559,19 @@ static void kvm_sched_yield(struct kvm *kvm, unsigned long dest_id)
>  		kvm_vcpu_yield_to(target);
>  }
>  
> +static int complete_hypercall(struct kvm_vcpu *vcpu)
> +{
> +	u64 ret = vcpu->run->hypercall.ret;
> +
> +	if (!is_64_bit_mode(vcpu))

Do we really anticipate adding support in 32-bit guests?  Honest question.

> +		ret = (u32)ret;
> +	kvm_rax_write(vcpu, ret);
> +
> +	++vcpu->stat.hypercalls;
> +
> +	return kvm_skip_emulated_instruction(vcpu);
> +}
> +
>  int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
>  {
>  	unsigned long nr, a0, a1, a2, a3, ret;
> @@ -7605,17 +7623,26 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
>  		kvm_sched_yield(vcpu->kvm, a0);
>  		ret = 0;
>  		break;
> +	case KVM_HC_UCALL:
> +		if (vcpu->kvm->arch.hypercall_ucall_enabled) {
> +			vcpu->run->hypercall.nr = nr;
> +			vcpu->run->hypercall.args[0] = a0;
> +			vcpu->run->hypercall.args[1] = a1;
> +			vcpu->run->hypercall.args[2] = a2;
> +			vcpu->run->hypercall.args[3] = a3;

If performance is a justification for a more direct userspace exit, why
limit it to just four parameters?  E.g. why not copy all registers to
kvm_sync_regs and reverse the process on the way back in?

> +			vcpu->run->exit_reason = KVM_EXIT_HYPERCALL;
> +			vcpu->arch.complete_userspace_io = complete_hypercall;
> +			return 0; // message is going to userspace
> +		}
> +		ret = -KVM_ENOSYS;
> +		break;
>  	default:
>  		ret = -KVM_ENOSYS;
>  		break;
>  	}
>  out:
> -	if (!op_64_bit)
> -		ret = (u32)ret;
> -	kvm_rax_write(vcpu, ret);
> -
> -	++vcpu->stat.hypercalls;
> -	return kvm_skip_emulated_instruction(vcpu);
> +	vcpu->run->hypercall.ret = ret;
> +	return complete_hypercall(vcpu);
>  }
>  EXPORT_SYMBOL_GPL(kvm_emulate_hypercall);

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH RFC 1/1] KVM: x86: add KVM_HC_UCALL hypercall
  2020-05-01 20:45   ` Sean Christopherson
@ 2020-05-02  1:05     ` Liran Alon
  2020-05-05 18:49       ` Jim Mattson
  2020-05-05 23:53       ` Forrest Yuan Yu
  2020-05-05 22:50     ` Forrest Yuan Yu
  1 sibling, 2 replies; 8+ messages in thread
From: Liran Alon @ 2020-05-02  1:05 UTC (permalink / raw)
  To: Sean Christopherson, Forrest Yuan Yu; +Cc: kvm


On 01/05/2020 23:45, Sean Christopherson wrote:
> On Fri, May 01, 2020 at 11:51:47AM -0700, Forrest Yuan Yu wrote:
>> The purpose of this new hypercall is to exchange message between
>> guest and hypervisor. For example, a guest may want to ask hypervisor
>> to harden security by setting restricted access permission on guest
>> SLAT entry. In this case, the guest can use this hypercall to send
>>
>> a message to the hypervisor which will do its job and send back
>> anything it wants the guest to know.
> Hrm, so this reintroduces KVM_EXIT_HYPERCALL without justifying _why_ it
> needs to be reintroduced.  I'm not familiar with the history, but the
> comments in the documentation advise "use KVM_EXIT_IO or KVM_EXIT_MMIO".
Both of these options have the disadvantage of requiring instruction 
emulation (Although a trivial one for PIO). Which enlarge host attack 
surface.
I think this is one of the reasons why Hyper-V defined their PV devices 
(E.g. NetVSC/StorVSC) doorbell kick with hypercall instead of PIO/MMIO.
(This is currently not much relevant as KVM's instruction emulator is 
not optional and is not even offloaded to host userspace. But relevant 
for the future...)
>
> Off the top of my head, IO and/or MMIO has a few advantages:
>
>    - Allows the guest kernel to delegate permissions to guest userspace,
>      whereas KVM restrict hypercalls to CPL0.
>    - Allows "pass-through", whereas VMCALL is unconditionally forwarded to
>      L1.
>    - Is vendor agnostic, e.g. VMX and SVM recognized different opcodes for
>      VMCALL vs VMMCALL.
I agree with all the above (I believe similar rational had led VMware to 
design their Backdoor PIO interface).

Also note that recently AWS introduced Nitro Enclave PV device which is 
also de-facto a PV control-plane interface between guest and host userspace.
Why similar approach couldn't have been used here?
(Capability is exposed on a per-VM basis by attaching PV device to VM, 
communication interface is device specific and no KVM changes, only host 
userspace changes).
>   
>> Signed-off-by: Forrest Yuan Yu <yuanyu@google.com>
>> ---
>> diff --git a/Documentation/virt/kvm/cpuid.rst b/Documentation/virt/kvm/cpuid.rst
>> index 01b081f6e7ea..ff313f6827bf 100644
>> --- a/Documentation/virt/kvm/cpuid.rst
>> +++ b/Documentation/virt/kvm/cpuid.rst
>> @@ -86,6 +86,9 @@ KVM_FEATURE_PV_SCHED_YIELD        13          guest checks this feature bit
>>                                                 before using paravirtualized
>>                                                 sched yield.
>>   
>> +KVM_FEATURE_UCALL                 14          guest checks this feature bit
>> +                                              before calling hypercall ucall.
> Why make the UCALL a KVM CPUID feature?  I can understand wanting to query
> KVM support from host userspace, but presumably the guest will care about
> capabiliteis built on top of the UCALL, not the UCALL itself.
I agree with this.
In case of PV device approach, device detection by guest will be the 
capability discovery.
>
>> +
>>   KVM_FEATURE_CLOCSOURCE_STABLE_BIT 24          host will warn if no guest-side
>>                                                 per-cpu warps are expeced in
>>                                                 kvmclock
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index c5835f9cb9ad..388a4f89464d 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -3385,6 +3385,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>>   	case KVM_CAP_GET_MSR_FEATURES:
>>   	case KVM_CAP_MSR_PLATFORM_INFO:
>>   	case KVM_CAP_EXCEPTION_PAYLOAD:
>> +	case KVM_CAP_UCALL:
> For whatever reason I have a metnal block with UCALL, can we call this
> KVM_CAP_USERSPACE_HYPERCALL
+1
>
>>   		r = 1;
>>   		break;
>>   	case KVM_CAP_SYNC_REGS:
>> @@ -4895,6 +4896,10 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>>   		kvm->arch.exception_payload_enabled = cap->args[0];
>>   		r = 0;
>>   		break;
>> +	case KVM_CAP_UCALL:
>> +		kvm->arch.hypercall_ucall_enabled = cap->args[0];
>> +		r = 0;
>> +		break;
>>   	default:
>>   		r = -EINVAL;
>>   		break;
>> @@ -7554,6 +7559,19 @@ static void kvm_sched_yield(struct kvm *kvm, unsigned long dest_id)
>>   		kvm_vcpu_yield_to(target);
>>   }
>>   
>> +static int complete_hypercall(struct kvm_vcpu *vcpu)
>> +{
>> +	u64 ret = vcpu->run->hypercall.ret;
>> +
>> +	if (!is_64_bit_mode(vcpu))
> Do we really anticipate adding support in 32-bit guests?  Honest question.
>
>> +		ret = (u32)ret;
>> +	kvm_rax_write(vcpu, ret);
>> +
>> +	++vcpu->stat.hypercalls;
>> +
>> +	return kvm_skip_emulated_instruction(vcpu);
>> +}
>> +
>>   int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
>>   {
>>   	unsigned long nr, a0, a1, a2, a3, ret;
>> @@ -7605,17 +7623,26 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
>>   		kvm_sched_yield(vcpu->kvm, a0);
>>   		ret = 0;
>>   		break;
>> +	case KVM_HC_UCALL:
>> +		if (vcpu->kvm->arch.hypercall_ucall_enabled) {
>> +			vcpu->run->hypercall.nr = nr;
>> +			vcpu->run->hypercall.args[0] = a0;
>> +			vcpu->run->hypercall.args[1] = a1;
>> +			vcpu->run->hypercall.args[2] = a2;
>> +			vcpu->run->hypercall.args[3] = a3;
> If performance is a justification for a more direct userspace exit, why
> limit it to just four parameters?  E.g. why not copy all registers to
> kvm_sync_regs and reverse the process on the way back in?
I don't think performance should be relevant for a hypercall interface. 
It's control-plane path.
If a fast-path is required, guest should use this interface to 
coordinate a separate fast-path (e.g. via ring-buffer on some guest 
memory page).

Anyway, these kind of questions is another reason why I agree with Sean 
it seems using a PV device is preferred.
Instead of forcing a general userspace hypercall interface standard, one 
could just implement whatever PV device it wants in host userspace which 
is device specific.

In QEMU's VMPort implementation BTW, userspace calls 
cpu_synchronize_state() which de-facto syncs tons of vCPU state from KVM 
to userspace. Not just the GP registers.
Because it's a slow-path, it's considered fine. :P

-Liran

>
>> +			vcpu->run->exit_reason = KVM_EXIT_HYPERCALL;
>> +			vcpu->arch.complete_userspace_io = complete_hypercall;
>> +			return 0; // message is going to userspace
>> +		}
>> +		ret = -KVM_ENOSYS;
>> +		break;
>>   	default:
>>   		ret = -KVM_ENOSYS;
>>   		break;
>>   	}
>>   out:
>> -	if (!op_64_bit)
>> -		ret = (u32)ret;
>> -	kvm_rax_write(vcpu, ret);
>> -
>> -	++vcpu->stat.hypercalls;
>> -	return kvm_skip_emulated_instruction(vcpu);
>> +	vcpu->run->hypercall.ret = ret;
>> +	return complete_hypercall(vcpu);
>>   }
>>   EXPORT_SYMBOL_GPL(kvm_emulate_hypercall);

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH RFC 1/1] KVM: x86: add KVM_HC_UCALL hypercall
  2020-05-02  1:05     ` Liran Alon
@ 2020-05-05 18:49       ` Jim Mattson
  2020-05-05 23:53       ` Forrest Yuan Yu
  1 sibling, 0 replies; 8+ messages in thread
From: Jim Mattson @ 2020-05-05 18:49 UTC (permalink / raw)
  To: Liran Alon; +Cc: Sean Christopherson, Forrest Yuan Yu, kvm list

On Fri, May 1, 2020 at 6:05 PM Liran Alon <liran.alon@oracle.com> wrote:
>
>
> On 01/05/2020 23:45, Sean Christopherson wrote:
> > Off the top of my head, IO and/or MMIO has a few advantages:
> >
> >    - Allows the guest kernel to delegate permissions to guest userspace,
> >      whereas KVM restrict hypercalls to CPL0.
> >    - Allows "pass-through", whereas VMCALL is unconditionally forwarded to
> >      L1.
> >    - Is vendor agnostic, e.g. VMX and SVM recognized different opcodes for
> >      VMCALL vs VMMCALL.
> I agree with all the above (I believe similar rational had led VMware to
> design their Backdoor PIO interface).

Just to set the record straight...

VMware's backdoor PIO interface predates both VMX and SVM, so VMCALL
and VMMCALL played no role whatsoever in its design. Moreover,
VMware's backdoor PIO interface actually does not allow the guest
kernel to delegate permissions to guest userspace. VMware ignores the
I/O permission bitmap in the TSS for the backdoor ports, so userspace
always has access to them. It's the VMware hypervisor that decides
whether or not to accept certain hypercalls at CPL>0.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH RFC 1/1] KVM: x86: add KVM_HC_UCALL hypercall
  2020-05-01 20:45   ` Sean Christopherson
  2020-05-02  1:05     ` Liran Alon
@ 2020-05-05 22:50     ` Forrest Yuan Yu
  1 sibling, 0 replies; 8+ messages in thread
From: Forrest Yuan Yu @ 2020-05-05 22:50 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: kvm, alaingef, ningyang

On Fri, May 01, 2020 at 01:45:52PM -0700, Sean Christopherson wrote:
> On Fri, May 01, 2020 at 11:51:47AM -0700, Forrest Yuan Yu wrote:
> > The purpose of this new hypercall is to exchange message between
> > guest and hypervisor. For example, a guest may want to ask hypervisor
> > to harden security by setting restricted access permission on guest
> > SLAT entry. In this case, the guest can use this hypercall to send
> 
> Please do s/SLAT/TDP everywhere.  IMO, TDP is a less than stellar acronym,
> e.g. what will we do if we go to three dimensions!?!?  But, we're stuck
> with it :-)

Will do.

> 
> > a message to the hypervisor which will do its job and send back
> > anything it wants the guest to know.
> 
> Hrm, so this reintroduces KVM_EXIT_HYPERCALL without justifying _why_ it
> needs to be reintroduced.  I'm not familiar with the history, but the
> comments in the documentation advise "use KVM_EXIT_IO or KVM_EXIT_MMIO".
> 
> Off the top of my head, IO and/or MMIO has a few advantages:
> 
>   - Allows the guest kernel to delegate permissions to guest userspace,
>     whereas KVM restrict hypercalls to CPL0.
>   - Allows "pass-through", whereas VMCALL is unconditionally forwarded to
>     L1.
>   - Is vendor agnostic, e.g. VMX and SVM recognized different opcodes for
>     VMCALL vs VMMCALL.

Yes I'm aware of the document advice but still feel hypercall is the
right thing to do. First of all, hypercalls, by definition, exist for
exactly these kind of jobs. When a guest wants to talk to the
hypervisor, it makes a hypercall. It just feels natural.

Of course IO/MMIO can do the same thing, but for simple jobs like asking
hypervisor to set restricted access permission, IO/MMIO feels heavy (and
IMHO feels a little hacky).

Also I'm not reintroducing KVM_EXIT_HYPERCALL the same way as before, in
that we don't pass all unkown hypercalls to userspace. Instead, we just
pass one specific call to userspace, one that userspace has opted into.

When userspace does nothing -- not opting in -- the behavior is the same
as the current code base.

There are many ways to allow certain hypercalls to be delegated from the
guest kernel to guest userspace, and we would probably want a mechanism
that allows finer control than I/O permission bitmaps (for PIO) or
mapping MMIO regions as writable in a userspace context. Something like
this is not tied to a particular communication channel.

Regarding the nested scenario, I would argue there should not be a
hypercall interface from a guest to any hypervisor other than its
immediate parent.

VMCALL and VMMCALL having different opcodes shouldn't be a problem since
kvm already supports hypercalls using these opcodes.

>  
> > Signed-off-by: Forrest Yuan Yu <yuanyu@google.com>
> > ---
> > diff --git a/Documentation/virt/kvm/cpuid.rst b/Documentation/virt/kvm/cpuid.rst
> > index 01b081f6e7ea..ff313f6827bf 100644
> > --- a/Documentation/virt/kvm/cpuid.rst
> > +++ b/Documentation/virt/kvm/cpuid.rst
> > @@ -86,6 +86,9 @@ KVM_FEATURE_PV_SCHED_YIELD        13          guest checks this feature bit
> >                                                before using paravirtualized
> >                                                sched yield.
> >  
> > +KVM_FEATURE_UCALL                 14          guest checks this feature bit
> > +                                              before calling hypercall ucall.
> 
> Why make the UCALL a KVM CPUID feature?  I can understand wanting to query
> KVM support from host userspace, but presumably the guest will care about
> capabiliteis built on top of the UCALL, not the UCALL itself.

This is to keep the behavior the same when UCALL is not recognized by
the current userspaces which are not aware of this hypercall. Without
this, userspace may break if a guest invokes UCALL.

> 
> > +
> >  KVM_FEATURE_CLOCSOURCE_STABLE_BIT 24          host will warn if no guest-side
> >                                                per-cpu warps are expeced in
> >                                                kvmclock
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index c5835f9cb9ad..388a4f89464d 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -3385,6 +3385,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> >  	case KVM_CAP_GET_MSR_FEATURES:
> >  	case KVM_CAP_MSR_PLATFORM_INFO:
> >  	case KVM_CAP_EXCEPTION_PAYLOAD:
> > +	case KVM_CAP_UCALL:
> 
> For whatever reason I have a metnal block with UCALL, can we call this
> KVM_CAP_USERSPACE_HYPERCALL

I was trying to follow the convention. The name UCALL was used in places
like tools/testing/selftests/kvm/include/kvm_util.h for similar
purposes. I can rename it for better readability.

> 
> >  		r = 1;
> >  		break;
> >  	case KVM_CAP_SYNC_REGS:
> > @@ -4895,6 +4896,10 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
> >  		kvm->arch.exception_payload_enabled = cap->args[0];
> >  		r = 0;
> >  		break;
> > +	case KVM_CAP_UCALL:
> > +		kvm->arch.hypercall_ucall_enabled = cap->args[0];
> > +		r = 0;
> > +		break;
> >  	default:
> >  		r = -EINVAL;
> >  		break;
> > @@ -7554,6 +7559,19 @@ static void kvm_sched_yield(struct kvm *kvm, unsigned long dest_id)
> >  		kvm_vcpu_yield_to(target);
> >  }
> >  
> > +static int complete_hypercall(struct kvm_vcpu *vcpu)
> > +{
> > +	u64 ret = vcpu->run->hypercall.ret;
> > +
> > +	if (!is_64_bit_mode(vcpu))
> 
> Do we really anticipate adding support in 32-bit guests?  Honest question.

Oh I was trying to keep the behavior exactly the same. It was there
under kvm_emulate_hypercall() so I pasted this block here.

> 
> > +		ret = (u32)ret;
> > +	kvm_rax_write(vcpu, ret);
> > +
> > +	++vcpu->stat.hypercalls;
> > +
> > +	return kvm_skip_emulated_instruction(vcpu);
> > +}
> > +
> >  int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
> >  {
> >  	unsigned long nr, a0, a1, a2, a3, ret;
> > @@ -7605,17 +7623,26 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
> >  		kvm_sched_yield(vcpu->kvm, a0);
> >  		ret = 0;
> >  		break;
> > +	case KVM_HC_UCALL:
> > +		if (vcpu->kvm->arch.hypercall_ucall_enabled) {
> > +			vcpu->run->hypercall.nr = nr;
> > +			vcpu->run->hypercall.args[0] = a0;
> > +			vcpu->run->hypercall.args[1] = a1;
> > +			vcpu->run->hypercall.args[2] = a2;
> > +			vcpu->run->hypercall.args[3] = a3;
> 
> If performance is a justification for a more direct userspace exit, why
> limit it to just four parameters?  E.g. why not copy all registers to
> kvm_sync_regs and reverse the process on the way back in?
> 
> > +			vcpu->run->exit_reason = KVM_EXIT_HYPERCALL;
> > +			vcpu->arch.complete_userspace_io = complete_hypercall;
> > +			return 0; // message is going to userspace
> > +		}
> > +		ret = -KVM_ENOSYS;
> > +		break;
> >  	default:
> >  		ret = -KVM_ENOSYS;
> >  		break;
> >  	}
> >  out:
> > -	if (!op_64_bit)
> > -		ret = (u32)ret;
> > -	kvm_rax_write(vcpu, ret);
> > -
> > -	++vcpu->stat.hypercalls;
> > -	return kvm_skip_emulated_instruction(vcpu);
> > +	vcpu->run->hypercall.ret = ret;
> > +	return complete_hypercall(vcpu);
> >  }
> >  EXPORT_SYMBOL_GPL(kvm_emulate_hypercall);

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH RFC 1/1] KVM: x86: add KVM_HC_UCALL hypercall
  2020-05-02  1:05     ` Liran Alon
  2020-05-05 18:49       ` Jim Mattson
@ 2020-05-05 23:53       ` Forrest Yuan Yu
  1 sibling, 0 replies; 8+ messages in thread
From: Forrest Yuan Yu @ 2020-05-05 23:53 UTC (permalink / raw)
  To: Liran Alon; +Cc: Sean Christopherson, kvm, ningyang, alaingef

On Sat, May 02, 2020 at 04:05:06AM +0300, Liran Alon wrote:
> 
> On 01/05/2020 23:45, Sean Christopherson wrote:
> > On Fri, May 01, 2020 at 11:51:47AM -0700, Forrest Yuan Yu wrote:
> > > The purpose of this new hypercall is to exchange message between
> > > guest and hypervisor. For example, a guest may want to ask hypervisor
> > > to harden security by setting restricted access permission on guest
> > > SLAT entry. In this case, the guest can use this hypercall to send
> > > 
> > > a message to the hypervisor which will do its job and send back
> > > anything it wants the guest to know.
> > Hrm, so this reintroduces KVM_EXIT_HYPERCALL without justifying _why_ it
> > needs to be reintroduced.  I'm not familiar with the history, but the
> > comments in the documentation advise "use KVM_EXIT_IO or KVM_EXIT_MMIO".
> Both of these options have the disadvantage of requiring instruction
> emulation (Although a trivial one for PIO). Which enlarge host attack
> surface.
> I think this is one of the reasons why Hyper-V defined their PV devices
> (E.g. NetVSC/StorVSC) doorbell kick with hypercall instead of PIO/MMIO.
> (This is currently not much relevant as KVM's instruction emulator is not
> optional and is not even offloaded to host userspace. But relevant for the
> future...)

Good point. Again to me the simplicity of the hypercall interface really
makes it an ideal mechanism for message passing.

> > 
> > Off the top of my head, IO and/or MMIO has a few advantages:
> > 
> >    - Allows the guest kernel to delegate permissions to guest userspace,
> >      whereas KVM restrict hypercalls to CPL0.
> >    - Allows "pass-through", whereas VMCALL is unconditionally forwarded to
> >      L1.
> >    - Is vendor agnostic, e.g. VMX and SVM recognized different opcodes for
> >      VMCALL vs VMMCALL.
> I agree with all the above (I believe similar rational had led VMware to
> design their Backdoor PIO interface).
> 
> Also note that recently AWS introduced Nitro Enclave PV device which is also
> de-facto a PV control-plane interface between guest and host userspace.
> Why similar approach couldn't have been used here?
> (Capability is exposed on a per-VM basis by attaching PV device to VM,
> communication interface is device specific and no KVM changes, only host
> userspace changes).
> > > Signed-off-by: Forrest Yuan Yu <yuanyu@google.com>
> > > ---
> > > diff --git a/Documentation/virt/kvm/cpuid.rst b/Documentation/virt/kvm/cpuid.rst
> > > index 01b081f6e7ea..ff313f6827bf 100644
> > > --- a/Documentation/virt/kvm/cpuid.rst
> > > +++ b/Documentation/virt/kvm/cpuid.rst
> > > @@ -86,6 +86,9 @@ KVM_FEATURE_PV_SCHED_YIELD        13          guest checks this feature bit
> > >                                                 before using paravirtualized
> > >                                                 sched yield.
> > > +KVM_FEATURE_UCALL                 14          guest checks this feature bit
> > > +                                              before calling hypercall ucall.
> > Why make the UCALL a KVM CPUID feature?  I can understand wanting to query
> > KVM support from host userspace, but presumably the guest will care about
> > capabiliteis built on top of the UCALL, not the UCALL itself.
> I agree with this.
> In case of PV device approach, device detection by guest will be the
> capability discovery.
> > 
> > > +
> > >   KVM_FEATURE_CLOCSOURCE_STABLE_BIT 24          host will warn if no guest-side
> > >                                                 per-cpu warps are expeced in
> > >                                                 kvmclock
> > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > index c5835f9cb9ad..388a4f89464d 100644
> > > --- a/arch/x86/kvm/x86.c
> > > +++ b/arch/x86/kvm/x86.c
> > > @@ -3385,6 +3385,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> > >   	case KVM_CAP_GET_MSR_FEATURES:
> > >   	case KVM_CAP_MSR_PLATFORM_INFO:
> > >   	case KVM_CAP_EXCEPTION_PAYLOAD:
> > > +	case KVM_CAP_UCALL:
> > For whatever reason I have a metnal block with UCALL, can we call this
> > KVM_CAP_USERSPACE_HYPERCALL
> +1
> > 
> > >   		r = 1;
> > >   		break;
> > >   	case KVM_CAP_SYNC_REGS:
> > > @@ -4895,6 +4896,10 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
> > >   		kvm->arch.exception_payload_enabled = cap->args[0];
> > >   		r = 0;
> > >   		break;
> > > +	case KVM_CAP_UCALL:
> > > +		kvm->arch.hypercall_ucall_enabled = cap->args[0];
> > > +		r = 0;
> > > +		break;
> > >   	default:
> > >   		r = -EINVAL;
> > >   		break;
> > > @@ -7554,6 +7559,19 @@ static void kvm_sched_yield(struct kvm *kvm, unsigned long dest_id)
> > >   		kvm_vcpu_yield_to(target);
> > >   }
> > > +static int complete_hypercall(struct kvm_vcpu *vcpu)
> > > +{
> > > +	u64 ret = vcpu->run->hypercall.ret;
> > > +
> > > +	if (!is_64_bit_mode(vcpu))
> > Do we really anticipate adding support in 32-bit guests?  Honest question.
> > 
> > > +		ret = (u32)ret;
> > > +	kvm_rax_write(vcpu, ret);
> > > +
> > > +	++vcpu->stat.hypercalls;
> > > +
> > > +	return kvm_skip_emulated_instruction(vcpu);
> > > +}
> > > +
> > >   int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
> > >   {
> > >   	unsigned long nr, a0, a1, a2, a3, ret;
> > > @@ -7605,17 +7623,26 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
> > >   		kvm_sched_yield(vcpu->kvm, a0);
> > >   		ret = 0;
> > >   		break;
> > > +	case KVM_HC_UCALL:
> > > +		if (vcpu->kvm->arch.hypercall_ucall_enabled) {
> > > +			vcpu->run->hypercall.nr = nr;
> > > +			vcpu->run->hypercall.args[0] = a0;
> > > +			vcpu->run->hypercall.args[1] = a1;
> > > +			vcpu->run->hypercall.args[2] = a2;
> > > +			vcpu->run->hypercall.args[3] = a3;
> > If performance is a justification for a more direct userspace exit, why
> > limit it to just four parameters?  E.g. why not copy all registers to
> > kvm_sync_regs and reverse the process on the way back in?
> I don't think performance should be relevant for a hypercall interface. It's
> control-plane path.
> If a fast-path is required, guest should use this interface to coordinate a
> separate fast-path (e.g. via ring-buffer on some guest memory page).
> 
> Anyway, these kind of questions is another reason why I agree with Sean it
> seems using a PV device is preferred.

I agree. Performance is not an issue here. However, I also don't think
this would make a PV device very suitable for a task like message
passing. When I try to imagine using a device to pass several bytes of
messages, it doesn't feel right ...

> Instead of forcing a general userspace hypercall interface standard, one
> could just implement whatever PV device it wants in host userspace which is
> device specific.
> 
> In QEMU's VMPort implementation BTW, userspace calls cpu_synchronize_state()
> which de-facto syncs tons of vCPU state from KVM to userspace. Not just the
> GP registers.
> Because it's a slow-path, it's considered fine. :P
> 
> -Liran
> 
> > 
> > > +			vcpu->run->exit_reason = KVM_EXIT_HYPERCALL;
> > > +			vcpu->arch.complete_userspace_io = complete_hypercall;
> > > +			return 0; // message is going to userspace
> > > +		}
> > > +		ret = -KVM_ENOSYS;
> > > +		break;
> > >   	default:
> > >   		ret = -KVM_ENOSYS;
> > >   		break;
> > >   	}
> > >   out:
> > > -	if (!op_64_bit)
> > > -		ret = (u32)ret;
> > > -	kvm_rax_write(vcpu, ret);
> > > -
> > > -	++vcpu->stat.hypercalls;
> > > -	return kvm_skip_emulated_instruction(vcpu);
> > > +	vcpu->run->hypercall.ret = ret;
> > > +	return complete_hypercall(vcpu);
> > >   }
> > >   EXPORT_SYMBOL_GPL(kvm_emulate_hypercall);

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-05-05 23:54 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-01 18:51 [PATCH RFC 0/1] Hypercall UCALL for guest/userspace communication Forrest Yuan Yu
2020-05-01 18:51 ` [PATCH RFC 1/1] KVM: x86: add KVM_HC_UCALL hypercall Forrest Yuan Yu
2020-05-01 20:45   ` Sean Christopherson
2020-05-02  1:05     ` Liran Alon
2020-05-05 18:49       ` Jim Mattson
2020-05-05 23:53       ` Forrest Yuan Yu
2020-05-05 22:50     ` Forrest Yuan Yu
2020-05-01 20:23 ` [PATCH RFC 0/1] Hypercall UCALL for guest/userspace communication Sean Christopherson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.