All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 1/2] kvm: x86: Allow userspace to handle emulation errors
@ 2021-04-21 12:28 Aaron Lewis
  2021-04-21 12:28 ` [PATCH v2 2/2] selftests: kvm: Allows " Aaron Lewis
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Aaron Lewis @ 2021-04-21 12:28 UTC (permalink / raw)
  To: david.edmondson, seanjc; +Cc: jmattson, kvm, Aaron Lewis

Add a fallback mechanism to the in-kernel instruction emulator that
allows userspace the opportunity to process an instruction the emulator
was unable to.  When the in-kernel instruction emulator fails to process
an instruction it will either inject a #UD into the guest or exit to
userspace with exit reason KVM_INTERNAL_ERROR.  This is because it does
not know how to proceed in an appropriate manner.  This feature lets
userspace get involved to see if it can figure out a better path
forward.

Signed-off-by: Aaron Lewis <aaronlewis@google.com>
---
 Documentation/virt/kvm/api.rst  | 19 ++++++++++++++++++
 arch/x86/include/asm/kvm_host.h |  6 ++++++
 arch/x86/kvm/x86.c              | 35 +++++++++++++++++++++++++++++----
 include/uapi/linux/kvm.h        | 23 ++++++++++++++++++++++
 tools/include/uapi/linux/kvm.h  | 23 ++++++++++++++++++++++
 5 files changed, 102 insertions(+), 4 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 307f2fcf1b02..fe6c3f1cae7e 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6233,6 +6233,25 @@ KVM_RUN_BUS_LOCK flag is used to distinguish between them.
 This capability can be used to check / enable 2nd DAWR feature provided
 by POWER10 processor.
 
+7.24 KVM_CAP_EXIT_ON_EMULATION_FAILURE
+--------------------------------------
+
+:Architectures: x86
+:Parameters: args[0] whether the feature should be enabled or not
+
+With this capability enabled, the in-kernel instruction emulator packs the exit
+struct of KVM_INTERNAL_ERROR with the instruction length and instruction bytes
+when an error occurs while emulating an instruction.  This allows userspace to
+then take a look at the instruction and see if it is able to handle it more
+gracefully than the in-kernel emulator.
+
+When this capability is enabled use the emulation_failure struct instead of the
+internal struct for the exit struct.  They have the same layout, but the
+emulation_failure struct matches the content better.
+
+It is noted that KVM still exits on certain types (skip) even if this capability
+is not enabled, and KVM will never exit if VMware #GP emulation fails.
+
 8. Other capabilities.
 ======================
 
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3768819693e5..07235d08e976 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1049,6 +1049,12 @@ struct kvm_arch {
 	bool exception_payload_enabled;
 
 	bool bus_lock_detection_enabled;
+	/*
+	 * If exit_on_emulation_error is set, and the in-kernel instruction
+	 * emulator fails to emulate an instruction, allow userspace
+	 * the opportunity to look at it.
+	 */
+	bool exit_on_emulation_error;
 
 	/* Deflect RDMSR and WRMSR to user space when they trigger a #GP */
 	u32 user_space_msr_mask;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index eca63625aee4..9cdfb7fbead5 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3771,6 +3771,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_X86_USER_SPACE_MSR:
 	case KVM_CAP_X86_MSR_FILTER:
 	case KVM_CAP_ENFORCE_PV_FEATURE_CPUID:
+	case KVM_CAP_EXIT_ON_EMULATION_FAILURE:
 		r = 1;
 		break;
 #ifdef CONFIG_KVM_XEN
@@ -5357,6 +5358,10 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 			kvm->arch.bus_lock_detection_enabled = true;
 		r = 0;
 		break;
+	case KVM_CAP_EXIT_ON_EMULATION_FAILURE:
+		kvm->arch.exit_on_emulation_error = cap->args[0];
+		r = 0;
+		break;
 	default:
 		r = -EINVAL;
 		break;
@@ -7119,8 +7124,31 @@ void kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip)
 }
 EXPORT_SYMBOL_GPL(kvm_inject_realmode_interrupt);
 
+static void prepare_emulation_failure_exit(struct kvm_vcpu *vcpu)
+{
+	struct x86_emulate_ctxt *ctxt = vcpu->arch.emulate_ctxt;
+	u64 insn_size = ctxt->fetch.end - ctxt->fetch.data;
+	struct kvm_run *run = vcpu->run;
+
+
+	run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
+	run->emulation_failure.suberror = KVM_INTERNAL_ERROR_EMULATION;
+	run->emulation_failure.ndata = 0;
+	run->emulation_failure.flags = 0;
+	if (insn_size) {
+		run->emulation_failure.ndata = 3;
+		run->emulation_failure.flags |=
+			KVM_INTERNAL_ERROR_EMULATION_FLAG_INSTRUCTION_BYTES;
+		run->emulation_failure.insn_size = insn_size;
+		memcpy(run->emulation_failure.insn_bytes,
+		       ctxt->fetch.data, sizeof(ctxt->fetch.data));
+	}
+}
+
 static int handle_emulation_failure(struct kvm_vcpu *vcpu, int emulation_type)
 {
+	struct kvm *kvm = vcpu->kvm;
+
 	++vcpu->stat.insn_emulation_fail;
 	trace_kvm_emulate_insn_failed(vcpu);
 
@@ -7129,10 +7157,9 @@ static int handle_emulation_failure(struct kvm_vcpu *vcpu, int emulation_type)
 		return 1;
 	}
 
-	if (emulation_type & EMULTYPE_SKIP) {
-		vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
-		vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
-		vcpu->run->internal.ndata = 0;
+	if (kvm->arch.exit_on_emulation_error ||
+	    (emulation_type & EMULTYPE_SKIP)) {
+		prepare_emulation_failure_exit(vcpu);
 		return 0;
 	}
 
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index f6afee209620..d7d109e6998f 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -279,6 +279,9 @@ struct kvm_xen_exit {
 /* Encounter unexpected vm-exit reason */
 #define KVM_INTERNAL_ERROR_UNEXPECTED_EXIT_REASON	4
 
+/* Flags that describe what fields in emulation_failure hold valid data  */
+#define KVM_INTERNAL_ERROR_EMULATION_FLAG_INSTRUCTION_BYTES (1ULL << 0)
+
 /* for KVM_RUN, returned by mmap(vcpu_fd, offset=0) */
 struct kvm_run {
 	/* in */
@@ -382,6 +385,25 @@ struct kvm_run {
 			__u32 ndata;
 			__u64 data[16];
 		} internal;
+		/*
+		 * KVM_INTERNAL_ERROR_EMULATION
+		 *
+		 * "struct emulation_failure" is an overlay of "struct internal"
+		 * that is used for the KVM_INTERNAL_ERROR_EMULATION sub-type of
+		 * KVM_EXIT_INTERNAL_ERROR.  Note, unlike other internal error
+		 * sub-types, this struct is ABI!  It also needs to be backwards
+		 * compabile with "struct internal".  Take special care that
+		 * "ndata" is correct, that new fields are enumerated in "flags",
+		 * and that each flag enumerates fields that are 64-bit aligned
+		 * and sized (so that ndata+internal.data[] is valid/accurate).
+		 */
+		struct {
+			__u32 suberror;
+			__u32 ndata;
+			__u64 flags;
+			__u8  insn_size;
+			__u8  insn_bytes[15];
+		} emulation_failure;
 		/* KVM_EXIT_OSI */
 		struct {
 			__u64 gprs[32];
@@ -1078,6 +1100,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_DIRTY_LOG_RING 192
 #define KVM_CAP_X86_BUS_LOCK_EXIT 193
 #define KVM_CAP_PPC_DAWR1 194
+#define KVM_CAP_EXIT_ON_EMULATION_FAILURE 195
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/tools/include/uapi/linux/kvm.h b/tools/include/uapi/linux/kvm.h
index f6afee209620..d7d109e6998f 100644
--- a/tools/include/uapi/linux/kvm.h
+++ b/tools/include/uapi/linux/kvm.h
@@ -279,6 +279,9 @@ struct kvm_xen_exit {
 /* Encounter unexpected vm-exit reason */
 #define KVM_INTERNAL_ERROR_UNEXPECTED_EXIT_REASON	4
 
+/* Flags that describe what fields in emulation_failure hold valid data  */
+#define KVM_INTERNAL_ERROR_EMULATION_FLAG_INSTRUCTION_BYTES (1ULL << 0)
+
 /* for KVM_RUN, returned by mmap(vcpu_fd, offset=0) */
 struct kvm_run {
 	/* in */
@@ -382,6 +385,25 @@ struct kvm_run {
 			__u32 ndata;
 			__u64 data[16];
 		} internal;
+		/*
+		 * KVM_INTERNAL_ERROR_EMULATION
+		 *
+		 * "struct emulation_failure" is an overlay of "struct internal"
+		 * that is used for the KVM_INTERNAL_ERROR_EMULATION sub-type of
+		 * KVM_EXIT_INTERNAL_ERROR.  Note, unlike other internal error
+		 * sub-types, this struct is ABI!  It also needs to be backwards
+		 * compabile with "struct internal".  Take special care that
+		 * "ndata" is correct, that new fields are enumerated in "flags",
+		 * and that each flag enumerates fields that are 64-bit aligned
+		 * and sized (so that ndata+internal.data[] is valid/accurate).
+		 */
+		struct {
+			__u32 suberror;
+			__u32 ndata;
+			__u64 flags;
+			__u8  insn_size;
+			__u8  insn_bytes[15];
+		} emulation_failure;
 		/* KVM_EXIT_OSI */
 		struct {
 			__u64 gprs[32];
@@ -1078,6 +1100,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_DIRTY_LOG_RING 192
 #define KVM_CAP_X86_BUS_LOCK_EXIT 193
 #define KVM_CAP_PPC_DAWR1 194
+#define KVM_CAP_EXIT_ON_EMULATION_FAILURE 195
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.31.1.498.g6c1eba8ee3d-goog


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 2/2] selftests: kvm: Allows userspace to handle emulation errors.
  2021-04-21 12:28 [PATCH v2 1/2] kvm: x86: Allow userspace to handle emulation errors Aaron Lewis
@ 2021-04-21 12:28 ` Aaron Lewis
  2021-04-21 14:03 ` [PATCH v2 1/2] kvm: x86: Allow " David Edmondson
  2021-04-22 12:57 ` Jim Mattson
  2 siblings, 0 replies; 17+ messages in thread
From: Aaron Lewis @ 2021-04-21 12:28 UTC (permalink / raw)
  To: david.edmondson, seanjc; +Cc: jmattson, kvm, Aaron Lewis

This test exercises the feature KVM_CAP_EXIT_ON_EMULATION_FAILURE.  When
enabled, errors in the in-kernel instruction emulator are forwarded to
userspace with the instruction bytes stored in the exit struct for
KVM_EXIT_INTERNAL_ERROR.  So, when the guest attempts to emulate an
'flds' instruction, which isn't able to be emulated in KVM, instead
of failing, KVM sends the instruction to userspace to handle.

For this test to work properly the module parameter
'allow_smaller_maxphyaddr' has to be set.

Signed-off-by: Aaron Lewis <aaronlewis@google.com>
---
 tools/testing/selftests/kvm/.gitignore        |   1 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/include/x86_64/processor.h  |   3 +
 .../selftests/kvm/lib/x86_64/processor.c      |  58 +++++
 .../kvm/x86_64/emulator_error_test.c          | 209 ++++++++++++++++++
 5 files changed, 272 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/x86_64/emulator_error_test.c

diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore
index 7bd7e776c266..ec9e20a2f752 100644
--- a/tools/testing/selftests/kvm/.gitignore
+++ b/tools/testing/selftests/kvm/.gitignore
@@ -7,6 +7,7 @@
 /x86_64/cr4_cpuid_sync_test
 /x86_64/debug_regs
 /x86_64/evmcs_test
+/x86_64/emulator_error_test
 /x86_64/get_cpuid_test
 /x86_64/get_msr_index_features
 /x86_64/kvm_pv_test
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 67eebb53235f..5ff705d92d02 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -41,6 +41,7 @@ LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c lib/s390x/diag318_test_ha
 TEST_GEN_PROGS_x86_64 = x86_64/cr4_cpuid_sync_test
 TEST_GEN_PROGS_x86_64 += x86_64/get_msr_index_features
 TEST_GEN_PROGS_x86_64 += x86_64/evmcs_test
+TEST_GEN_PROGS_x86_64 += x86_64/emulator_error_test
 TEST_GEN_PROGS_x86_64 += x86_64/get_cpuid_test
 TEST_GEN_PROGS_x86_64 += x86_64/hyperv_clock
 TEST_GEN_PROGS_x86_64 += x86_64/hyperv_cpuid
diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index 0b30b4e15c38..40f70f91e6a2 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -394,6 +394,9 @@ void vcpu_init_descriptor_tables(struct kvm_vm *vm, uint32_t vcpuid);
 void vm_handle_exception(struct kvm_vm *vm, int vector,
 			void (*handler)(struct ex_regs *));
 
+uint64_t vm_get_page_table_entry(struct kvm_vm *vm, uint64_t vaddr);
+void vm_set_page_table_entry(struct kvm_vm *vm, uint64_t vaddr, uint64_t mask);
+
 /*
  * set_cpuid() - overwrites a matching cpuid entry with the provided value.
  *		 matches based on ent->function && ent->index. returns true
diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
index a8906e60a108..1aa4fc5e84c6 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
@@ -292,6 +292,64 @@ void virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
 	pte[index[0]].present = 1;
 }
 
+static struct pageTableEntry *_vm_get_page_table_entry(struct kvm_vm *vm,
+						       uint64_t vaddr)
+{
+	uint16_t index[4];
+	struct pageMapL4Entry *pml4e;
+	struct pageDirectoryPointerEntry *pdpe;
+	struct pageDirectoryEntry *pde;
+	struct pageTableEntry *pte;
+
+	TEST_ASSERT(vm->mode == VM_MODE_PXXV48_4K, "Attempt to use "
+		"unknown or unsupported guest mode, mode: 0x%x", vm->mode);
+	TEST_ASSERT((vaddr % vm->page_size) == 0,
+		"Virtual address not on page boundary,\n"
+		"  vaddr: 0x%lx vm->page_size: 0x%x",
+		vaddr, vm->page_size);
+	TEST_ASSERT(sparsebit_is_set(vm->vpages_valid,
+		(vaddr >> vm->page_shift)),
+		"Invalid virtual address, vaddr: 0x%lx",
+		vaddr);
+
+	index[0] = (vaddr >> 12) & 0x1ffu;
+	index[1] = (vaddr >> 21) & 0x1ffu;
+	index[2] = (vaddr >> 30) & 0x1ffu;
+	index[3] = (vaddr >> 39) & 0x1ffu;
+
+	pml4e = addr_gpa2hva(vm, vm->pgd);
+	TEST_ASSERT(pml4e[index[3]].present,
+		"Expected pml4e to be present for gva: 0x%08lx", vaddr);
+
+	pdpe = addr_gpa2hva(vm, pml4e[index[3]].address * vm->page_size);
+	TEST_ASSERT(pdpe[index[2]].present,
+		"Expected pdpe to be present for gva: 0x%08lx", vaddr);
+
+	pde = addr_gpa2hva(vm, pdpe[index[2]].address * vm->page_size);
+	TEST_ASSERT(pde[index[1]].present,
+		"Expected pde to be present for gva: 0x%08lx", vaddr);
+
+	pte = addr_gpa2hva(vm, pde[index[1]].address * vm->page_size);
+	TEST_ASSERT(pte[index[0]].present,
+		"Expected pte to be present for gva: 0x%08lx", vaddr);
+
+	return &pte[index[0]];
+}
+
+uint64_t vm_get_page_table_entry(struct kvm_vm *vm, uint64_t vaddr)
+{
+	struct pageTableEntry *pte = _vm_get_page_table_entry(vm, vaddr);
+
+	return *(uint64_t *)pte;
+}
+
+void vm_set_page_table_entry(struct kvm_vm *vm, uint64_t vaddr, uint64_t mask)
+{
+	struct pageTableEntry *pte = _vm_get_page_table_entry(vm, vaddr);
+
+	*(uint64_t *)pte |= mask;
+}
+
 void virt_dump(FILE *stream, struct kvm_vm *vm, uint8_t indent)
 {
 	struct pageMapL4Entry *pml4e, *pml4e_start;
diff --git a/tools/testing/selftests/kvm/x86_64/emulator_error_test.c b/tools/testing/selftests/kvm/x86_64/emulator_error_test.c
new file mode 100644
index 000000000000..825da57ad791
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/emulator_error_test.c
@@ -0,0 +1,209 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020, Google LLC.
+ *
+ * Tests for KVM_CAP_EXIT_ON_EMULATION_FAILURE capability.
+ */
+
+#define _GNU_SOURCE /* for program_invocation_short_name */
+
+#include "test_util.h"
+#include "kvm_util.h"
+#include "vmx.h"
+
+#define VCPU_ID	   1
+#define PAGE_SIZE  4096
+#define MAXPHYADDR 36
+
+#define MEM_REGION_GVA	0x0000123456789000
+#define MEM_REGION_GPA	0x0000000700000000
+#define MEM_REGION_SLOT	10
+#define MEM_REGION_SIZE PAGE_SIZE
+
+extern char fld_start, fld_end;
+
+static void guest_code(void)
+{
+	__asm__ __volatile__("fld_start: flds (%[addr]); fld_end:"
+			     :: [addr]"r"(MEM_REGION_GVA));
+
+	GUEST_DONE();
+}
+
+static void run_guest(struct kvm_vm *vm)
+{
+	int rc;
+
+	rc = _vcpu_run(vm, VCPU_ID);
+	TEST_ASSERT(rc == 0, "vcpu_run failed: %d\n", rc);
+}
+
+/*
+ * Accessors to get R/M, REG, and Mod bits described in the SDM vol 2,
+ * figure 2-2 "Table Interpretation of ModR/M Byte (C8H)".
+ */
+#define GET_RM(insn_byte) (insn_byte & 0x7)
+#define GET_REG(insn_byte) ((insn_byte & 0x38) >> 3)
+#define GET_MOD(insn_byte) ((insn_byte & 0xc) >> 6)
+
+bool is_flds(uint8_t *insn_bytes, uint8_t insn_size)
+{
+	return insn_size >= 2 &&
+	       insn_bytes[0] == 0xd9 &&
+	       GET_REG(insn_bytes[1]) == 0x0;
+}
+
+static void process_exit_on_emulation_error(struct kvm_vm *vm)
+{
+	struct kvm_run *run = vcpu_state(vm, VCPU_ID);
+	struct kvm_regs regs;
+	uint8_t *insn_bytes;
+	uint8_t insn_size;
+	uint64_t flags;
+
+	TEST_ASSERT(run->exit_reason == KVM_EXIT_INTERNAL_ERROR,
+		    "Unexpected exit reason: %u (%s)",
+		    run->exit_reason,
+		    exit_reason_str(run->exit_reason));
+
+	TEST_ASSERT(run->emulation_failure.suberror == KVM_INTERNAL_ERROR_EMULATION,
+		    "Unexpected suberror: %u",
+		    run->emulation_failure.suberror);
+
+	if (run->emulation_failure.ndata >= 1) {
+		flags = run->emulation_failure.flags;
+		if ((flags & KVM_INTERNAL_ERROR_EMULATION_FLAG_INSTRUCTION_BYTES) &&
+		    run->emulation_failure.ndata >= 3) {
+			insn_size = run->emulation_failure.insn_size;
+			insn_bytes = run->emulation_failure.insn_bytes;
+
+			TEST_ASSERT(insn_size <= 15 && insn_size > 0,
+				    "Unexpected instruction size: %u",
+				    insn_size);
+
+			TEST_ASSERT(is_flds(insn_bytes, insn_size),
+				    "Unexpected instruction.  Expected 'flds' (0xd9 /0), encountered (0x%x /%u)",
+				    insn_bytes[0], (insn_size >= 2) ? GET_REG(insn_bytes[1]) : 0);
+
+			vcpu_regs_get(vm, VCPU_ID, &regs);
+			regs.rip += (uintptr_t)(&fld_end) - (uintptr_t)(&fld_start);
+			vcpu_regs_set(vm, VCPU_ID, &regs);
+		}
+	}
+}
+
+static void do_guest_assert(struct kvm_vm *vm, struct ucall *uc)
+{
+	TEST_FAIL("%s at %s:%ld", (const char *)uc->args[0], __FILE__,
+		  uc->args[1]);
+}
+
+static void check_for_guest_assert(struct kvm_vm *vm)
+{
+	struct kvm_run *run = vcpu_state(vm, VCPU_ID);
+	struct ucall uc;
+
+	if (run->exit_reason == KVM_EXIT_IO &&
+	    get_ucall(vm, VCPU_ID, &uc) == UCALL_ABORT) {
+		do_guest_assert(vm, &uc);
+	}
+}
+
+static void process_ucall_done(struct kvm_vm *vm)
+{
+	struct kvm_run *run = vcpu_state(vm, VCPU_ID);
+	struct ucall uc;
+
+	check_for_guest_assert(vm);
+
+	TEST_ASSERT(run->exit_reason == KVM_EXIT_IO,
+		    "Unexpected exit reason: %u (%s)",
+		    run->exit_reason,
+		    exit_reason_str(run->exit_reason));
+
+	TEST_ASSERT(get_ucall(vm, VCPU_ID, &uc) == UCALL_DONE,
+		    "Unexpected ucall command: %lu, expected UCALL_DONE (%d)",
+		    uc.cmd, UCALL_DONE);
+}
+
+static uint64_t process_ucall(struct kvm_vm *vm)
+{
+	struct kvm_run *run = vcpu_state(vm, VCPU_ID);
+	struct ucall uc;
+
+	TEST_ASSERT(run->exit_reason == KVM_EXIT_IO,
+		    "Unexpected exit reason: %u (%s)",
+		    run->exit_reason,
+		    exit_reason_str(run->exit_reason));
+
+	switch (get_ucall(vm, VCPU_ID, &uc)) {
+	case UCALL_SYNC:
+		break;
+	case UCALL_ABORT:
+		do_guest_assert(vm, &uc);
+		break;
+	case UCALL_DONE:
+		process_ucall_done(vm);
+		break;
+	default:
+		TEST_ASSERT(false, "Unexpected ucall");
+	}
+
+	return uc.cmd;
+}
+
+int main(int argc, char *argv[])
+{
+	struct kvm_enable_cap emul_failure_cap = {
+		.cap = KVM_CAP_EXIT_ON_EMULATION_FAILURE,
+		.args[0] = 1,
+	};
+	struct kvm_cpuid_entry2 *entry;
+	struct kvm_cpuid2 *cpuid;
+	struct kvm_vm *vm;
+	uint64_t *hva;
+	uint64_t gpa;
+	int rc;
+
+	/* Tell stdout not to buffer its content */
+	setbuf(stdout, NULL);
+
+	vm = vm_create_default(VCPU_ID, 0, guest_code);
+
+	cpuid = kvm_get_supported_cpuid();
+
+	entry = kvm_get_supported_cpuid_index(0x80000008, 0);
+	if (entry) {
+		entry->eax = (entry->eax & 0xffffff00) | MAXPHYADDR;
+		set_cpuid(cpuid, entry);
+	}
+
+	vcpu_set_cpuid(vm, VCPU_ID, cpuid);
+
+	entry = kvm_get_supported_cpuid_index(0x80000008, 0);
+
+	rc = kvm_check_cap(KVM_CAP_EXIT_ON_EMULATION_FAILURE);
+	TEST_ASSERT(rc, "KVM_CAP_EXIT_ON_EMULATION_FAILURE is unavailable");
+	vm_enable_cap(vm, &emul_failure_cap);
+
+	vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS,
+				    MEM_REGION_GPA, MEM_REGION_SLOT,
+				    MEM_REGION_SIZE / PAGE_SIZE, 0);
+	gpa = vm_phy_pages_alloc(vm, MEM_REGION_SIZE / PAGE_SIZE,
+				 MEM_REGION_GPA, MEM_REGION_SLOT);
+	TEST_ASSERT(gpa == MEM_REGION_GPA, "Failed vm_phy_pages_alloc\n");
+	virt_map(vm, MEM_REGION_GVA, MEM_REGION_GPA, 1, 0);
+	hva = addr_gpa2hva(vm, MEM_REGION_GPA);
+	memset(hva, 0, PAGE_SIZE);
+	vm_set_page_table_entry(vm, MEM_REGION_GVA, 1ull << 36);
+
+	run_guest(vm);
+	process_exit_on_emulation_error(vm);
+	run_guest(vm);
+
+	TEST_ASSERT(process_ucall(vm) == UCALL_DONE, "Expected UCALL_DONE");
+
+	kvm_vm_free(vm);
+
+	return 0;
+}
-- 
2.31.1.498.g6c1eba8ee3d-goog


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/2] kvm: x86: Allow userspace to handle emulation errors
  2021-04-21 12:28 [PATCH v2 1/2] kvm: x86: Allow userspace to handle emulation errors Aaron Lewis
  2021-04-21 12:28 ` [PATCH v2 2/2] selftests: kvm: Allows " Aaron Lewis
@ 2021-04-21 14:03 ` David Edmondson
  2021-04-21 16:24   ` Aaron Lewis
  2021-04-22 12:57 ` Jim Mattson
  2 siblings, 1 reply; 17+ messages in thread
From: David Edmondson @ 2021-04-21 14:03 UTC (permalink / raw)
  To: Aaron Lewis, seanjc; +Cc: jmattson, kvm, Aaron Lewis

On Wednesday, 2021-04-21 at 05:28:32 -07, Aaron Lewis wrote:

> Add a fallback mechanism to the in-kernel instruction emulator that
> allows userspace the opportunity to process an instruction the emulator
> was unable to.  When the in-kernel instruction emulator fails to process
> an instruction it will either inject a #UD into the guest or exit to
> userspace with exit reason KVM_INTERNAL_ERROR.  This is because it does
> not know how to proceed in an appropriate manner.  This feature lets
> userspace get involved to see if it can figure out a better path
> forward.
>
> Signed-off-by: Aaron Lewis <aaronlewis@google.com>
> ---
>  Documentation/virt/kvm/api.rst  | 19 ++++++++++++++++++
>  arch/x86/include/asm/kvm_host.h |  6 ++++++
>  arch/x86/kvm/x86.c              | 35 +++++++++++++++++++++++++++++----
>  include/uapi/linux/kvm.h        | 23 ++++++++++++++++++++++
>  tools/include/uapi/linux/kvm.h  | 23 ++++++++++++++++++++++
>  5 files changed, 102 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 307f2fcf1b02..fe6c3f1cae7e 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -6233,6 +6233,25 @@ KVM_RUN_BUS_LOCK flag is used to distinguish between them.
>  This capability can be used to check / enable 2nd DAWR feature provided
>  by POWER10 processor.
>  
> +7.24 KVM_CAP_EXIT_ON_EMULATION_FAILURE
> +--------------------------------------
> +
> +:Architectures: x86
> +:Parameters: args[0] whether the feature should be enabled or not
> +
> +With this capability enabled, the in-kernel instruction emulator packs the exit
> +struct of KVM_INTERNAL_ERROR with the instruction length and instruction bytes
> +when an error occurs while emulating an instruction.  This allows userspace to
> +then take a look at the instruction and see if it is able to handle it more
> +gracefully than the in-kernel emulator.

It will do so without KVM_CAP_EXIT_ON_EMULATION_FAILURE enabled in this
set of changes.

That is, the payload can be present irrespective of the capability being
enabled.

> +When this capability is enabled use the emulation_failure struct instead of the
> +internal struct for the exit struct.  They have the same layout, but the
> +emulation_failure struct matches the content better.
> +
> +It is noted that KVM still exits on certain types (skip) even if this capability
> +is not enabled, and KVM will never exit if VMware #GP emulation fails.
> +
>  8. Other capabilities.
>  ======================
>  
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 3768819693e5..07235d08e976 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1049,6 +1049,12 @@ struct kvm_arch {
>  	bool exception_payload_enabled;
>  
>  	bool bus_lock_detection_enabled;
> +	/*
> +	 * If exit_on_emulation_error is set, and the in-kernel instruction
> +	 * emulator fails to emulate an instruction, allow userspace
> +	 * the opportunity to look at it.
> +	 */
> +	bool exit_on_emulation_error;
>  
>  	/* Deflect RDMSR and WRMSR to user space when they trigger a #GP */
>  	u32 user_space_msr_mask;
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index eca63625aee4..9cdfb7fbead5 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3771,6 +3771,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  	case KVM_CAP_X86_USER_SPACE_MSR:
>  	case KVM_CAP_X86_MSR_FILTER:
>  	case KVM_CAP_ENFORCE_PV_FEATURE_CPUID:
> +	case KVM_CAP_EXIT_ON_EMULATION_FAILURE:
>  		r = 1;
>  		break;
>  #ifdef CONFIG_KVM_XEN
> @@ -5357,6 +5358,10 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>  			kvm->arch.bus_lock_detection_enabled = true;
>  		r = 0;
>  		break;
> +	case KVM_CAP_EXIT_ON_EMULATION_FAILURE:
> +		kvm->arch.exit_on_emulation_error = cap->args[0];
> +		r = 0;
> +		break;
>  	default:
>  		r = -EINVAL;
>  		break;
> @@ -7119,8 +7124,31 @@ void kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip)
>  }
>  EXPORT_SYMBOL_GPL(kvm_inject_realmode_interrupt);
>  
> +static void prepare_emulation_failure_exit(struct kvm_vcpu *vcpu)
> +{
> +	struct x86_emulate_ctxt *ctxt = vcpu->arch.emulate_ctxt;
> +	u64 insn_size = ctxt->fetch.end - ctxt->fetch.data;
> +	struct kvm_run *run = vcpu->run;
> +
> +

Unnecessary extra blank line here...

> +	run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
> +	run->emulation_failure.suberror = KVM_INTERNAL_ERROR_EMULATION;
> +	run->emulation_failure.ndata = 0;
> +	run->emulation_failure.flags = 0;

...that I would move here, but your call :-)

> +	if (insn_size) {
> +		run->emulation_failure.ndata = 3;
> +		run->emulation_failure.flags |=
> +			KVM_INTERNAL_ERROR_EMULATION_FLAG_INSTRUCTION_BYTES;
> +		run->emulation_failure.insn_size = insn_size;
> +		memcpy(run->emulation_failure.insn_bytes,
> +		       ctxt->fetch.data, sizeof(ctxt->fetch.data));

We're relying on the fact that insn_bytes is at least as large as
fetch.data, which is fine, but worth an assertion?

"Leaking" irrelevant bytes here also seems bad, but I can't immediately
see a problem as a result.

> +	}
> +}
> +
>  static int handle_emulation_failure(struct kvm_vcpu *vcpu, int emulation_type)
>  {
> +	struct kvm *kvm = vcpu->kvm;
> +
>  	++vcpu->stat.insn_emulation_fail;
>  	trace_kvm_emulate_insn_failed(vcpu);
>  
> @@ -7129,10 +7157,9 @@ static int handle_emulation_failure(struct kvm_vcpu *vcpu, int emulation_type)
>  		return 1;
>  	}
>  
> -	if (emulation_type & EMULTYPE_SKIP) {
> -		vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
> -		vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
> -		vcpu->run->internal.ndata = 0;
> +	if (kvm->arch.exit_on_emulation_error ||
> +	    (emulation_type & EMULTYPE_SKIP)) {
> +		prepare_emulation_failure_exit(vcpu);
>  		return 0;
>  	}
>  
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index f6afee209620..d7d109e6998f 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -279,6 +279,9 @@ struct kvm_xen_exit {
>  /* Encounter unexpected vm-exit reason */
>  #define KVM_INTERNAL_ERROR_UNEXPECTED_EXIT_REASON	4
>  
> +/* Flags that describe what fields in emulation_failure hold valid data  */
> +#define KVM_INTERNAL_ERROR_EMULATION_FLAG_INSTRUCTION_BYTES (1ULL << 0)
> +
>  /* for KVM_RUN, returned by mmap(vcpu_fd, offset=0) */
>  struct kvm_run {
>  	/* in */
> @@ -382,6 +385,25 @@ struct kvm_run {
>  			__u32 ndata;
>  			__u64 data[16];
>  		} internal;
> +		/*
> +		 * KVM_INTERNAL_ERROR_EMULATION
> +		 *
> +		 * "struct emulation_failure" is an overlay of "struct internal"
> +		 * that is used for the KVM_INTERNAL_ERROR_EMULATION sub-type of
> +		 * KVM_EXIT_INTERNAL_ERROR.  Note, unlike other internal error
> +		 * sub-types, this struct is ABI!  It also needs to be backwards
> +		 * compabile with "struct internal".  Take special care that
> +		 * "ndata" is correct, that new fields are enumerated in "flags",
> +		 * and that each flag enumerates fields that are 64-bit aligned
> +		 * and sized (so that ndata+internal.data[] is valid/accurate).
> +		 */
> +		struct {
> +			__u32 suberror;
> +			__u32 ndata;
> +			__u64 flags;
> +			__u8  insn_size;
> +			__u8  insn_bytes[15];
> +		} emulation_failure;
>  		/* KVM_EXIT_OSI */
>  		struct {
>  			__u64 gprs[32];
> @@ -1078,6 +1100,7 @@ struct kvm_ppc_resize_hpt {
>  #define KVM_CAP_DIRTY_LOG_RING 192
>  #define KVM_CAP_X86_BUS_LOCK_EXIT 193
>  #define KVM_CAP_PPC_DAWR1 194
> +#define KVM_CAP_EXIT_ON_EMULATION_FAILURE 195
>  
>  #ifdef KVM_CAP_IRQ_ROUTING
>  
> diff --git a/tools/include/uapi/linux/kvm.h b/tools/include/uapi/linux/kvm.h
> index f6afee209620..d7d109e6998f 100644
> --- a/tools/include/uapi/linux/kvm.h
> +++ b/tools/include/uapi/linux/kvm.h
> @@ -279,6 +279,9 @@ struct kvm_xen_exit {
>  /* Encounter unexpected vm-exit reason */
>  #define KVM_INTERNAL_ERROR_UNEXPECTED_EXIT_REASON	4
>  
> +/* Flags that describe what fields in emulation_failure hold valid data  */

Extra space at the end.

> +#define KVM_INTERNAL_ERROR_EMULATION_FLAG_INSTRUCTION_BYTES (1ULL << 0)
> +
>  /* for KVM_RUN, returned by mmap(vcpu_fd, offset=0) */
>  struct kvm_run {
>  	/* in */
> @@ -382,6 +385,25 @@ struct kvm_run {
>  			__u32 ndata;
>  			__u64 data[16];
>  		} internal;
> +		/*
> +		 * KVM_INTERNAL_ERROR_EMULATION
> +		 *
> +		 * "struct emulation_failure" is an overlay of "struct internal"
> +		 * that is used for the KVM_INTERNAL_ERROR_EMULATION sub-type of
> +		 * KVM_EXIT_INTERNAL_ERROR.  Note, unlike other internal error
> +		 * sub-types, this struct is ABI!  It also needs to be backwards
> +		 * compabile with "struct internal".  Take special care that
> +		 * "ndata" is correct, that new fields are enumerated in "flags",
> +		 * and that each flag enumerates fields that are 64-bit aligned
> +		 * and sized (so that ndata+internal.data[] is valid/accurate).
> +		 */
> +		struct {
> +			__u32 suberror;
> +			__u32 ndata;
> +			__u64 flags;
> +			__u8  insn_size;
> +			__u8  insn_bytes[15];
> +		} emulation_failure;
>  		/* KVM_EXIT_OSI */
>  		struct {
>  			__u64 gprs[32];
> @@ -1078,6 +1100,7 @@ struct kvm_ppc_resize_hpt {
>  #define KVM_CAP_DIRTY_LOG_RING 192
>  #define KVM_CAP_X86_BUS_LOCK_EXIT 193
>  #define KVM_CAP_PPC_DAWR1 194
> +#define KVM_CAP_EXIT_ON_EMULATION_FAILURE 195
>  
>  #ifdef KVM_CAP_IRQ_ROUTING
>  
> -- 
> 2.31.1.498.g6c1eba8ee3d-goog

dme.
-- 
Everyone I know, goes away in the end.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/2] kvm: x86: Allow userspace to handle emulation errors
  2021-04-21 14:03 ` [PATCH v2 1/2] kvm: x86: Allow " David Edmondson
@ 2021-04-21 16:24   ` Aaron Lewis
  2021-04-21 17:10     ` David Edmondson
  0 siblings, 1 reply; 17+ messages in thread
From: Aaron Lewis @ 2021-04-21 16:24 UTC (permalink / raw)
  To: David Edmondson; +Cc: Sean Christopherson, Jim Mattson, kvm list

> > +     if (insn_size) {
> > +             run->emulation_failure.ndata = 3;
> > +             run->emulation_failure.flags |=
> > +                     KVM_INTERNAL_ERROR_EMULATION_FLAG_INSTRUCTION_BYTES;
> > +             run->emulation_failure.insn_size = insn_size;
> > +             memcpy(run->emulation_failure.insn_bytes,
> > +                    ctxt->fetch.data, sizeof(ctxt->fetch.data));
>
> We're relying on the fact that insn_bytes is at least as large as
> fetch.data, which is fine, but worth an assertion?
>
> "Leaking" irrelevant bytes here also seems bad, but I can't immediately
> see a problem as a result.
>

I don't think this is a problem because the instruction bytes stream
has irrelevant bytes in it anyway.  In the test attached I verify that
it receives an flds instruction in userspace that was emulated in the
guest.  In the stream that comes through insn_size is set to 15 and
the instruction is only 2 bytes long, so the stream has irrelevant
bytes in it as far as this instruction is concerned.

> > +     }
> > +}
> > +

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/2] kvm: x86: Allow userspace to handle emulation errors
  2021-04-21 16:24   ` Aaron Lewis
@ 2021-04-21 17:10     ` David Edmondson
  2021-04-21 19:01       ` Aaron Lewis
  0 siblings, 1 reply; 17+ messages in thread
From: David Edmondson @ 2021-04-21 17:10 UTC (permalink / raw)
  To: Aaron Lewis; +Cc: Sean Christopherson, Jim Mattson, kvm list

On Wednesday, 2021-04-21 at 09:24:13 -07, Aaron Lewis wrote:

>> > +     if (insn_size) {
>> > +             run->emulation_failure.ndata = 3;
>> > +             run->emulation_failure.flags |=
>> > +                     KVM_INTERNAL_ERROR_EMULATION_FLAG_INSTRUCTION_BYTES;
>> > +             run->emulation_failure.insn_size = insn_size;
>> > +             memcpy(run->emulation_failure.insn_bytes,
>> > +                    ctxt->fetch.data, sizeof(ctxt->fetch.data));
>>
>> We're relying on the fact that insn_bytes is at least as large as
>> fetch.data, which is fine, but worth an assertion?
>>
>> "Leaking" irrelevant bytes here also seems bad, but I can't immediately
>> see a problem as a result.
>>
>
> I don't think this is a problem because the instruction bytes stream
> has irrelevant bytes in it anyway.  In the test attached I verify that
> it receives an flds instruction in userspace that was emulated in the
> guest.  In the stream that comes through insn_size is set to 15 and
> the instruction is only 2 bytes long, so the stream has irrelevant
> bytes in it as far as this instruction is concerned.

As an experiment I added[1] reporting of the exit reason using flag 2. On
emulation failure (without the instruction bytes flag enabled), one run
of QEMU reported:

> KVM internal error. Suberror: 1
> extra data[0]: 2
> extra data[1]: 4
> extra data[2]: 0
> extra data[3]: 31
> emulation failure

data[1] and data[2] are not indicated as valid, but it seems unfortunate
that I got (not really random) garbage there.

Admittedly, with only your patches applied ndata will never skip past
any bytes, as there is only one flag. As soon as I add another, is it my
job to zero out those unused bytes? Maybe we should be clearing all of
the payload at the top of prepare_emulation_failure_exit().

Footnotes:
[1]  https://disaster-area.hh.sledj.net/tmp/dme-581090/

dme.
-- 
Music has magic, it's good clear syncopation.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/2] kvm: x86: Allow userspace to handle emulation errors
  2021-04-21 17:10     ` David Edmondson
@ 2021-04-21 19:01       ` Aaron Lewis
  2021-04-22  8:07         ` David Edmondson
  0 siblings, 1 reply; 17+ messages in thread
From: Aaron Lewis @ 2021-04-21 19:01 UTC (permalink / raw)
  To: David Edmondson; +Cc: Sean Christopherson, Jim Mattson, kvm list

> >
> > I don't think this is a problem because the instruction bytes stream
> > has irrelevant bytes in it anyway.  In the test attached I verify that
> > it receives an flds instruction in userspace that was emulated in the
> > guest.  In the stream that comes through insn_size is set to 15 and
> > the instruction is only 2 bytes long, so the stream has irrelevant
> > bytes in it as far as this instruction is concerned.
>
> As an experiment I added[1] reporting of the exit reason using flag 2. On
> emulation failure (without the instruction bytes flag enabled), one run
> of QEMU reported:
>
> > KVM internal error. Suberror: 1
> > extra data[0]: 2
> > extra data[1]: 4
> > extra data[2]: 0
> > extra data[3]: 31
> > emulation failure
>
> data[1] and data[2] are not indicated as valid, but it seems unfortunate
> that I got (not really random) garbage there.
>
> Admittedly, with only your patches applied ndata will never skip past
> any bytes, as there is only one flag. As soon as I add another, is it my
> job to zero out those unused bytes? Maybe we should be clearing all of
> the payload at the top of prepare_emulation_failure_exit().
>

Clearing the bytes at the top of prepare_emulation_failure_exit()
sounds good to me.  That will keep the data more deterministic.
Though, I will say that I don't think that is required.  If the first
flag isn't set the data shouldn't be read, no?

> Footnotes:
> [1]  https://disaster-area.hh.sledj.net/tmp/dme-581090/
>
> dme.
> --
> Music has magic, it's good clear syncopation.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/2] kvm: x86: Allow userspace to handle emulation errors
  2021-04-21 19:01       ` Aaron Lewis
@ 2021-04-22  8:07         ` David Edmondson
  2021-04-23 15:33           ` Sean Christopherson
  0 siblings, 1 reply; 17+ messages in thread
From: David Edmondson @ 2021-04-22  8:07 UTC (permalink / raw)
  To: Aaron Lewis; +Cc: Sean Christopherson, Jim Mattson, kvm list

On Wednesday, 2021-04-21 at 12:01:21 -07, Aaron Lewis wrote:

>> >
>> > I don't think this is a problem because the instruction bytes stream
>> > has irrelevant bytes in it anyway.  In the test attached I verify that
>> > it receives an flds instruction in userspace that was emulated in the
>> > guest.  In the stream that comes through insn_size is set to 15 and
>> > the instruction is only 2 bytes long, so the stream has irrelevant
>> > bytes in it as far as this instruction is concerned.
>>
>> As an experiment I added[1] reporting of the exit reason using flag 2. On
>> emulation failure (without the instruction bytes flag enabled), one run
>> of QEMU reported:
>>
>> > KVM internal error. Suberror: 1
>> > extra data[0]: 2
>> > extra data[1]: 4
>> > extra data[2]: 0
>> > extra data[3]: 31
>> > emulation failure
>>
>> data[1] and data[2] are not indicated as valid, but it seems unfortunate
>> that I got (not really random) garbage there.
>>
>> Admittedly, with only your patches applied ndata will never skip past
>> any bytes, as there is only one flag. As soon as I add another, is it my
>> job to zero out those unused bytes? Maybe we should be clearing all of
>> the payload at the top of prepare_emulation_failure_exit().
>>
>
> Clearing the bytes at the top of prepare_emulation_failure_exit()
> sounds good to me.  That will keep the data more deterministic.
> Though, I will say that I don't think that is required.  If the first
> flag isn't set the data shouldn't be read, no?

Agreed. As Jim indicated in his other reply, there should be no new data
leaked by not zeroing the bytes.

For now at least, this is not a performance critical path, so clearing
the payload doesn't seem too onerous.

>> Footnotes:
>> [1]  https://disaster-area.hh.sledj.net/tmp/dme-581090/
>>
>> dme.
>> --
>> Music has magic, it's good clear syncopation.

dme.
-- 
Don't you know you're never going to get to France.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/2] kvm: x86: Allow userspace to handle emulation errors
  2021-04-21 12:28 [PATCH v2 1/2] kvm: x86: Allow userspace to handle emulation errors Aaron Lewis
  2021-04-21 12:28 ` [PATCH v2 2/2] selftests: kvm: Allows " Aaron Lewis
  2021-04-21 14:03 ` [PATCH v2 1/2] kvm: x86: Allow " David Edmondson
@ 2021-04-22 12:57 ` Jim Mattson
  2021-04-23  4:14   ` Aaron Lewis
  2 siblings, 1 reply; 17+ messages in thread
From: Jim Mattson @ 2021-04-22 12:57 UTC (permalink / raw)
  To: Aaron Lewis; +Cc: David Edmondson, Sean Christopherson, kvm list

On Wed, Apr 21, 2021 at 5:28 AM Aaron Lewis <aaronlewis@google.com> wrote:
>
> Add a fallback mechanism to the in-kernel instruction emulator that
> allows userspace the opportunity to process an instruction the emulator
> was unable to.  When the in-kernel instruction emulator fails to process
> an instruction it will either inject a #UD into the guest or exit to
> userspace with exit reason KVM_INTERNAL_ERROR.  This is because it does
> not know how to proceed in an appropriate manner.  This feature lets
> userspace get involved to see if it can figure out a better path
> forward.
>
> Signed-off-by: Aaron Lewis <aaronlewis@google.com>

The instruction bytes are a good start, but in many cases, they aren't
sufficient context to decode the next instruction. Should we eagerly
provide that information in this exit, or should we just let userspace
gather it via subsequent ioctls if necessary?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/2] kvm: x86: Allow userspace to handle emulation errors
  2021-04-22 12:57 ` Jim Mattson
@ 2021-04-23  4:14   ` Aaron Lewis
  2021-04-23 16:43     ` Jim Mattson
  0 siblings, 1 reply; 17+ messages in thread
From: Aaron Lewis @ 2021-04-23  4:14 UTC (permalink / raw)
  To: Jim Mattson; +Cc: David Edmondson, Sean Christopherson, kvm list

On Thu, Apr 22, 2021 at 5:57 AM Jim Mattson <jmattson@google.com> wrote:
>
> On Wed, Apr 21, 2021 at 5:28 AM Aaron Lewis <aaronlewis@google.com> wrote:
> >
> > Add a fallback mechanism to the in-kernel instruction emulator that
> > allows userspace the opportunity to process an instruction the emulator
> > was unable to.  When the in-kernel instruction emulator fails to process
> > an instruction it will either inject a #UD into the guest or exit to
> > userspace with exit reason KVM_INTERNAL_ERROR.  This is because it does
> > not know how to proceed in an appropriate manner.  This feature lets
> > userspace get involved to see if it can figure out a better path
> > forward.
> >
> > Signed-off-by: Aaron Lewis <aaronlewis@google.com>
>
> The instruction bytes are a good start, but in many cases, they aren't
> sufficient context to decode the next instruction. Should we eagerly
> provide that information in this exit, or should we just let userspace
> gather it via subsequent ioctls if necessary?

Why is there a concern for the next instruction?  This patch is about
userspace helping with the current instruction being emulated.  I'm
not sure why we would be concerned about the next one.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/2] kvm: x86: Allow userspace to handle emulation errors
  2021-04-22  8:07         ` David Edmondson
@ 2021-04-23 15:33           ` Sean Christopherson
  2021-04-23 17:23             ` David Edmondson
  0 siblings, 1 reply; 17+ messages in thread
From: Sean Christopherson @ 2021-04-23 15:33 UTC (permalink / raw)
  To: David Edmondson; +Cc: Aaron Lewis, Jim Mattson, kvm list

On Thu, Apr 22, 2021, David Edmondson wrote:
> On Wednesday, 2021-04-21 at 12:01:21 -07, Aaron Lewis wrote:
> 
> >> >
> >> > I don't think this is a problem because the instruction bytes stream
> >> > has irrelevant bytes in it anyway.  In the test attached I verify that
> >> > it receives an flds instruction in userspace that was emulated in the
> >> > guest.  In the stream that comes through insn_size is set to 15 and
> >> > the instruction is only 2 bytes long, so the stream has irrelevant
> >> > bytes in it as far as this instruction is concerned.
> >>
> >> As an experiment I added[1] reporting of the exit reason using flag 2. On
> >> emulation failure (without the instruction bytes flag enabled), one run
> >> of QEMU reported:
> >>
> >> > KVM internal error. Suberror: 1
> >> > extra data[0]: 2
> >> > extra data[1]: 4
> >> > extra data[2]: 0
> >> > extra data[3]: 31
> >> > emulation failure
> >>
> >> data[1] and data[2] are not indicated as valid, but it seems unfortunate
> >> that I got (not really random) garbage there.
> >>
> >> Admittedly, with only your patches applied ndata will never skip past
> >> any bytes, as there is only one flag. As soon as I add another, is it my
> >> job to zero out those unused bytes? Maybe we should be clearing all of
> >> the payload at the top of prepare_emulation_failure_exit().
> >>
> >
> > Clearing the bytes at the top of prepare_emulation_failure_exit()
> > sounds good to me.  That will keep the data more deterministic.
> > Though, I will say that I don't think that is required.  If the first
> > flag isn't set the data shouldn't be read, no?
> 
> Agreed. As Jim indicated in his other reply, there should be no new data
> leaked by not zeroing the bytes.
> 
> For now at least, this is not a performance critical path, so clearing
> the payload doesn't seem too onerous.

I feel quite strongly that KVM should _not_ touch the unused bytes.  As Jim
pointed out, a stream of 0x0 0x0 0x0 ... is not benign, it will decode to one or
more ADD instructions.  Arguably 0x90, 0xcc, or an undending stream of prefixes
would be more appropriate so that it's less likely for userspace to decode a
bogus instruction.

I don't see any reason why unused insn bytes should be treated any differently
than unused mmio.data[], or unused internal.data[], etc... 

IMO, the better option is to do nothing and let userspace initialize vcpu->run
before KVM_RUN if they want to avoid consuming stale data.  

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/2] kvm: x86: Allow userspace to handle emulation errors
  2021-04-23  4:14   ` Aaron Lewis
@ 2021-04-23 16:43     ` Jim Mattson
  0 siblings, 0 replies; 17+ messages in thread
From: Jim Mattson @ 2021-04-23 16:43 UTC (permalink / raw)
  To: Aaron Lewis; +Cc: David Edmondson, Sean Christopherson, kvm list

On Thu, Apr 22, 2021 at 9:14 PM Aaron Lewis <aaronlewis@google.com> wrote:
>
> On Thu, Apr 22, 2021 at 5:57 AM Jim Mattson <jmattson@google.com> wrote:
> >
> > On Wed, Apr 21, 2021 at 5:28 AM Aaron Lewis <aaronlewis@google.com> wrote:
> > >
> > > Add a fallback mechanism to the in-kernel instruction emulator that
> > > allows userspace the opportunity to process an instruction the emulator
> > > was unable to.  When the in-kernel instruction emulator fails to process
> > > an instruction it will either inject a #UD into the guest or exit to
> > > userspace with exit reason KVM_INTERNAL_ERROR.  This is because it does
> > > not know how to proceed in an appropriate manner.  This feature lets
> > > userspace get involved to see if it can figure out a better path
> > > forward.
> > >
> > > Signed-off-by: Aaron Lewis <aaronlewis@google.com>
> >
> > The instruction bytes are a good start, but in many cases, they aren't
> > sufficient context to decode the next instruction. Should we eagerly
> > provide that information in this exit, or should we just let userspace
> > gather it via subsequent ioctls if necessary?
>
> Why is there a concern for the next instruction?  This patch is about
> userspace helping with the current instruction being emulated.  I'm
> not sure why we would be concerned about the next one.

Sorry; I should have said, "The instruction bytes are a good start,
but in many cases, they aren't sufficient context to decode *the*
instruction." For example, suppose the first byte is 0x48. Without
more context, we don't know if this is a DEC or a REX prefix.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/2] kvm: x86: Allow userspace to handle emulation errors
  2021-04-23 15:33           ` Sean Christopherson
@ 2021-04-23 17:23             ` David Edmondson
  2021-04-23 17:37               ` Sean Christopherson
  0 siblings, 1 reply; 17+ messages in thread
From: David Edmondson @ 2021-04-23 17:23 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Aaron Lewis, Jim Mattson, kvm list

On Friday, 2021-04-23 at 15:33:47 GMT, Sean Christopherson wrote:

> On Thu, Apr 22, 2021, David Edmondson wrote:
>> On Wednesday, 2021-04-21 at 12:01:21 -07, Aaron Lewis wrote:
>> 
>> >> >
>> >> > I don't think this is a problem because the instruction bytes stream
>> >> > has irrelevant bytes in it anyway.  In the test attached I verify that
>> >> > it receives an flds instruction in userspace that was emulated in the
>> >> > guest.  In the stream that comes through insn_size is set to 15 and
>> >> > the instruction is only 2 bytes long, so the stream has irrelevant
>> >> > bytes in it as far as this instruction is concerned.
>> >>
>> >> As an experiment I added[1] reporting of the exit reason using flag 2. On
>> >> emulation failure (without the instruction bytes flag enabled), one run
>> >> of QEMU reported:
>> >>
>> >> > KVM internal error. Suberror: 1
>> >> > extra data[0]: 2
>> >> > extra data[1]: 4
>> >> > extra data[2]: 0
>> >> > extra data[3]: 31
>> >> > emulation failure
>> >>
>> >> data[1] and data[2] are not indicated as valid, but it seems unfortunate
>> >> that I got (not really random) garbage there.
>> >>
>> >> Admittedly, with only your patches applied ndata will never skip past
>> >> any bytes, as there is only one flag. As soon as I add another, is it my
>> >> job to zero out those unused bytes? Maybe we should be clearing all of
>> >> the payload at the top of prepare_emulation_failure_exit().
>> >>
>> >
>> > Clearing the bytes at the top of prepare_emulation_failure_exit()
>> > sounds good to me.  That will keep the data more deterministic.
>> > Though, I will say that I don't think that is required.  If the first
>> > flag isn't set the data shouldn't be read, no?
>> 
>> Agreed. As Jim indicated in his other reply, there should be no new data
>> leaked by not zeroing the bytes.
>> 
>> For now at least, this is not a performance critical path, so clearing
>> the payload doesn't seem too onerous.
>
> I feel quite strongly that KVM should _not_ touch the unused bytes.

I'm fine with that, but...

> As Jim pointed out, a stream of 0x0 0x0 0x0 ... is not benign, it will
> decode to one or more ADD instructions.  Arguably 0x90, 0xcc, or an
> undending stream of prefixes would be more appropriate so that it's
> less likely for userspace to decode a bogus instruction.

...I don't understand this position. If the user-level instruction
decoder starts interpreting bytes that the kernel did *not* indicate as
valid (by setting insn_size to include them), it's broken.

> I don't see any reason why unused insn bytes should be treated any differently
> than unused mmio.data[], or unused internal.data[], etc... 
>
> IMO, the better option is to do nothing and let userspace initialize vcpu->run
> before KVM_RUN if they want to avoid consuming stale data.  

dme.
-- 
I've still got sand in my shoes.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/2] kvm: x86: Allow userspace to handle emulation errors
  2021-04-23 17:23             ` David Edmondson
@ 2021-04-23 17:37               ` Sean Christopherson
  2021-04-23 17:55                 ` David Edmondson
  0 siblings, 1 reply; 17+ messages in thread
From: Sean Christopherson @ 2021-04-23 17:37 UTC (permalink / raw)
  To: David Edmondson; +Cc: Aaron Lewis, Jim Mattson, kvm list

On Fri, Apr 23, 2021, David Edmondson wrote:
> On Friday, 2021-04-23 at 15:33:47 GMT, Sean Christopherson wrote:
> 
> > On Thu, Apr 22, 2021, David Edmondson wrote:
> >> Agreed. As Jim indicated in his other reply, there should be no new data
> >> leaked by not zeroing the bytes.
> >> 
> >> For now at least, this is not a performance critical path, so clearing
> >> the payload doesn't seem too onerous.
> >
> > I feel quite strongly that KVM should _not_ touch the unused bytes.
> 
> I'm fine with that, but...
> 
> > As Jim pointed out, a stream of 0x0 0x0 0x0 ... is not benign, it will
> > decode to one or more ADD instructions.  Arguably 0x90, 0xcc, or an
> > undending stream of prefixes would be more appropriate so that it's
> > less likely for userspace to decode a bogus instruction.
> 
> ...I don't understand this position. If the user-level instruction
> decoder starts interpreting bytes that the kernel did *not* indicate as
> valid (by setting insn_size to include them), it's broken.

Yes, so what's the point of clearing the unused bytes?  Doing so won't magically
fix a broken userspace.  That's why I argue that 0x90 or 0xcc would be more
appropriate; there's at least a non-zero chance that it will help userspace avoid
doing something completely broken.

On the other hand, userspace can guard against a broken _KVM_ by initializing
vcpu->run with a known pattern and logging if KVM exits to userspace with
seemingly bogus data.  Crushing the unused bytes to zero defeats userspace's
sanity check, e.g. if the actual memcpy() of the instruction bytes copies the
wrong number of bytes, then userspace's magic pattern will be lost and debugging
the KVM bug will be that much harder.

This is very much not a theoretical problem, I have debugged two separate KVM
bugs in the last few months where KVM completely failed to set
vcpu->run->exit_reason before exiting to userspace.  The exit_reason is a bit of
a special case because it's disturbingly easy for KVM to get confused over return
values and unintentionally exit to userspace, but it's not a big stretch to
imagine a bug where KVM provides incomplete data.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/2] kvm: x86: Allow userspace to handle emulation errors
  2021-04-23 17:37               ` Sean Christopherson
@ 2021-04-23 17:55                 ` David Edmondson
  2021-04-23 17:57                   ` Jim Mattson
  0 siblings, 1 reply; 17+ messages in thread
From: David Edmondson @ 2021-04-23 17:55 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Aaron Lewis, Jim Mattson, kvm list

On Friday, 2021-04-23 at 17:37:53 GMT, Sean Christopherson wrote:

> On Fri, Apr 23, 2021, David Edmondson wrote:
>> On Friday, 2021-04-23 at 15:33:47 GMT, Sean Christopherson wrote:
>> 
>> > On Thu, Apr 22, 2021, David Edmondson wrote:
>> >> Agreed. As Jim indicated in his other reply, there should be no new data
>> >> leaked by not zeroing the bytes.
>> >> 
>> >> For now at least, this is not a performance critical path, so clearing
>> >> the payload doesn't seem too onerous.
>> >
>> > I feel quite strongly that KVM should _not_ touch the unused bytes.
>> 
>> I'm fine with that, but...
>> 
>> > As Jim pointed out, a stream of 0x0 0x0 0x0 ... is not benign, it will
>> > decode to one or more ADD instructions.  Arguably 0x90, 0xcc, or an
>> > undending stream of prefixes would be more appropriate so that it's
>> > less likely for userspace to decode a bogus instruction.
>> 
>> ...I don't understand this position. If the user-level instruction
>> decoder starts interpreting bytes that the kernel did *not* indicate as
>> valid (by setting insn_size to include them), it's broken.
>
> Yes, so what's the point of clearing the unused bytes?

Given that it doesn't prevent any known leakage, it's purely aesthetic,
which is why I'm happy not to bother.

> Doing so won't magically fix a broken userspace.  That's why I argue
> that 0x90 or 0xcc would be more appropriate; there's at least a
> non-zero chance that it will help userspace avoid doing something
> completely broken.

Perhaps an invalid instruction would be more useful in this respect, but
INT03 fills a similar purpose.

> On the other hand, userspace can guard against a broken _KVM_ by initializing
> vcpu->run with a known pattern and logging if KVM exits to userspace with
> seemingly bogus data.  Crushing the unused bytes to zero defeats userspace's
> sanity check, e.g. if the actual memcpy() of the instruction bytes copies the
> wrong number of bytes, then userspace's magic pattern will be lost and debugging
> the KVM bug will be that much harder.
>
> This is very much not a theoretical problem, I have debugged two separate KVM
> bugs in the last few months where KVM completely failed to set
> vcpu->run->exit_reason before exiting to userspace.  The exit_reason is a bit of
> a special case because it's disturbingly easy for KVM to get confused over return
> values and unintentionally exit to userspace, but it's not a big stretch to
> imagine a bug where KVM provides incomplete data.

Understood.

So is the conclusion that KVM should copy only insn_size bytes rather
than the full 15?

dme.
-- 
But they'll laugh at you in Jackson, and I'll be dancin' on a Pony Keg.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/2] kvm: x86: Allow userspace to handle emulation errors
  2021-04-23 17:55                 ` David Edmondson
@ 2021-04-23 17:57                   ` Jim Mattson
  2021-04-23 18:01                     ` Jim Mattson
  0 siblings, 1 reply; 17+ messages in thread
From: Jim Mattson @ 2021-04-23 17:57 UTC (permalink / raw)
  To: David Edmondson; +Cc: Sean Christopherson, Aaron Lewis, kvm list

On Fri, Apr 23, 2021 at 10:55 AM David Edmondson <dme@dme.org> wrote:
>
> On Friday, 2021-04-23 at 17:37:53 GMT, Sean Christopherson wrote:
>
> > On Fri, Apr 23, 2021, David Edmondson wrote:
> >> On Friday, 2021-04-23 at 15:33:47 GMT, Sean Christopherson wrote:
> >>
> >> > On Thu, Apr 22, 2021, David Edmondson wrote:
> >> >> Agreed. As Jim indicated in his other reply, there should be no new data
> >> >> leaked by not zeroing the bytes.
> >> >>
> >> >> For now at least, this is not a performance critical path, so clearing
> >> >> the payload doesn't seem too onerous.
> >> >
> >> > I feel quite strongly that KVM should _not_ touch the unused bytes.
> >>
> >> I'm fine with that, but...
> >>
> >> > As Jim pointed out, a stream of 0x0 0x0 0x0 ... is not benign, it will
> >> > decode to one or more ADD instructions.  Arguably 0x90, 0xcc, or an
> >> > undending stream of prefixes would be more appropriate so that it's
> >> > less likely for userspace to decode a bogus instruction.
> >>
> >> ...I don't understand this position. If the user-level instruction
> >> decoder starts interpreting bytes that the kernel did *not* indicate as
> >> valid (by setting insn_size to include them), it's broken.
> >
> > Yes, so what's the point of clearing the unused bytes?
>
> Given that it doesn't prevent any known leakage, it's purely aesthetic,
> which is why I'm happy not to bother.
>
> > Doing so won't magically fix a broken userspace.  That's why I argue
> > that 0x90 or 0xcc would be more appropriate; there's at least a
> > non-zero chance that it will help userspace avoid doing something
> > completely broken.
>
> Perhaps an invalid instruction would be more useful in this respect, but
> INT03 fills a similar purpose.
>
> > On the other hand, userspace can guard against a broken _KVM_ by initializing
> > vcpu->run with a known pattern and logging if KVM exits to userspace with
> > seemingly bogus data.  Crushing the unused bytes to zero defeats userspace's
> > sanity check, e.g. if the actual memcpy() of the instruction bytes copies the
> > wrong number of bytes, then userspace's magic pattern will be lost and debugging
> > the KVM bug will be that much harder.
> >
> > This is very much not a theoretical problem, I have debugged two separate KVM
> > bugs in the last few months where KVM completely failed to set
> > vcpu->run->exit_reason before exiting to userspace.  The exit_reason is a bit of
> > a special case because it's disturbingly easy for KVM to get confused over return
> > values and unintentionally exit to userspace, but it's not a big stretch to
> > imagine a bug where KVM provides incomplete data.
>
> Understood.
>
> So is the conclusion that KVM should copy only insn_size bytes rather
> than the full 15?

Insn_size should almost always be 15. It will only be less when the
emulator hits a page crossing before fetching 15 bytes and it can't
fetch from the second page.

> dme.
> --
> But they'll laugh at you in Jackson, and I'll be dancin' on a Pony Keg.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/2] kvm: x86: Allow userspace to handle emulation errors
  2021-04-23 17:57                   ` Jim Mattson
@ 2021-04-23 18:01                     ` Jim Mattson
  2021-04-23 18:43                       ` Aaron Lewis
  0 siblings, 1 reply; 17+ messages in thread
From: Jim Mattson @ 2021-04-23 18:01 UTC (permalink / raw)
  To: David Edmondson; +Cc: Sean Christopherson, Aaron Lewis, kvm list

On Fri, Apr 23, 2021 at 10:57 AM Jim Mattson <jmattson@google.com> wrote:
>
> On Fri, Apr 23, 2021 at 10:55 AM David Edmondson <dme@dme.org> wrote:
> >
> > On Friday, 2021-04-23 at 17:37:53 GMT, Sean Christopherson wrote:
> >
> > > On Fri, Apr 23, 2021, David Edmondson wrote:
> > >> On Friday, 2021-04-23 at 15:33:47 GMT, Sean Christopherson wrote:
> > >>
> > >> > On Thu, Apr 22, 2021, David Edmondson wrote:
> > >> >> Agreed. As Jim indicated in his other reply, there should be no new data
> > >> >> leaked by not zeroing the bytes.
> > >> >>
> > >> >> For now at least, this is not a performance critical path, so clearing
> > >> >> the payload doesn't seem too onerous.
> > >> >
> > >> > I feel quite strongly that KVM should _not_ touch the unused bytes.
> > >>
> > >> I'm fine with that, but...
> > >>
> > >> > As Jim pointed out, a stream of 0x0 0x0 0x0 ... is not benign, it will
> > >> > decode to one or more ADD instructions.  Arguably 0x90, 0xcc, or an
> > >> > undending stream of prefixes would be more appropriate so that it's
> > >> > less likely for userspace to decode a bogus instruction.
> > >>
> > >> ...I don't understand this position. If the user-level instruction
> > >> decoder starts interpreting bytes that the kernel did *not* indicate as
> > >> valid (by setting insn_size to include them), it's broken.
> > >
> > > Yes, so what's the point of clearing the unused bytes?
> >
> > Given that it doesn't prevent any known leakage, it's purely aesthetic,
> > which is why I'm happy not to bother.
> >
> > > Doing so won't magically fix a broken userspace.  That's why I argue
> > > that 0x90 or 0xcc would be more appropriate; there's at least a
> > > non-zero chance that it will help userspace avoid doing something
> > > completely broken.
> >
> > Perhaps an invalid instruction would be more useful in this respect, but
> > INT03 fills a similar purpose.
> >
> > > On the other hand, userspace can guard against a broken _KVM_ by initializing
> > > vcpu->run with a known pattern and logging if KVM exits to userspace with
> > > seemingly bogus data.  Crushing the unused bytes to zero defeats userspace's
> > > sanity check, e.g. if the actual memcpy() of the instruction bytes copies the
> > > wrong number of bytes, then userspace's magic pattern will be lost and debugging
> > > the KVM bug will be that much harder.
> > >
> > > This is very much not a theoretical problem, I have debugged two separate KVM
> > > bugs in the last few months where KVM completely failed to set
> > > vcpu->run->exit_reason before exiting to userspace.  The exit_reason is a bit of
> > > a special case because it's disturbingly easy for KVM to get confused over return
> > > values and unintentionally exit to userspace, but it's not a big stretch to
> > > imagine a bug where KVM provides incomplete data.
> >
> > Understood.
> >
> > So is the conclusion that KVM should copy only insn_size bytes rather
> > than the full 15?
>
> Insn_size should almost always be 15. It will only be less when the
> emulator hits a page crossing before fetching 15 bytes and it can't
> fetch from the second page.

Oh, or if the CS limit is reached. (cf. AMD's APM, volume 2, section
15.8.4: Nested and intercepted #PF).


> > dme.
> > --
> > But they'll laugh at you in Jackson, and I'll be dancin' on a Pony Keg.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/2] kvm: x86: Allow userspace to handle emulation errors
  2021-04-23 18:01                     ` Jim Mattson
@ 2021-04-23 18:43                       ` Aaron Lewis
  0 siblings, 0 replies; 17+ messages in thread
From: Aaron Lewis @ 2021-04-23 18:43 UTC (permalink / raw)
  To: Jim Mattson; +Cc: David Edmondson, Sean Christopherson, kvm list

> > >
> > > So is the conclusion that KVM should copy only insn_size bytes rather
> > > than the full 15?
> >
> > Insn_size should almost always be 15. It will only be less when the
> > emulator hits a page crossing before fetching 15 bytes and it can't
> > fetch from the second page.
>
> Oh, or if the CS limit is reached. (cf. AMD's APM, volume 2, section
> 15.8.4: Nested and intercepted #PF).

To sum this up as I understand it.  I'm _not_ going to clear
'run->internal.data' to zero.  I'll leave it to userspace to clear
vcpu->run.  I'll copy over 'insn_size' bytes rather than
'sizeof(ctxt->fetch.data)' bytes to
'run->emulation_failure.insn_bytes', and if 'insn_size' < 15, I'll
stamp the remaining bytes with 0x90.

Let me know if I missed anything.

>
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2021-04-23 18:43 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-21 12:28 [PATCH v2 1/2] kvm: x86: Allow userspace to handle emulation errors Aaron Lewis
2021-04-21 12:28 ` [PATCH v2 2/2] selftests: kvm: Allows " Aaron Lewis
2021-04-21 14:03 ` [PATCH v2 1/2] kvm: x86: Allow " David Edmondson
2021-04-21 16:24   ` Aaron Lewis
2021-04-21 17:10     ` David Edmondson
2021-04-21 19:01       ` Aaron Lewis
2021-04-22  8:07         ` David Edmondson
2021-04-23 15:33           ` Sean Christopherson
2021-04-23 17:23             ` David Edmondson
2021-04-23 17:37               ` Sean Christopherson
2021-04-23 17:55                 ` David Edmondson
2021-04-23 17:57                   ` Jim Mattson
2021-04-23 18:01                     ` Jim Mattson
2021-04-23 18:43                       ` Aaron Lewis
2021-04-22 12:57 ` Jim Mattson
2021-04-23  4:14   ` Aaron Lewis
2021-04-23 16:43     ` Jim Mattson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.