* [PATCH v3 00/12] Allow userspace to manage MSRs
@ 2020-08-18 21:15 Aaron Lewis
  2020-08-18 21:15 ` [PATCH v3 01/12] KVM: x86: Deflect unknown MSR accesses to user space Aaron Lewis
                   ` (11 more replies)
  0 siblings, 12 replies; 50+ messages in thread
From: Aaron Lewis @ 2020-08-18 21:15 UTC (permalink / raw)
  To: jmattson, graf; +Cc: pshier, oupton, kvm, Aaron Lewis

This series makes it possible for userspace to manage MSRs by having KVM
forward select MSRs to it when rdmsr and wrmsr are executed in the guest.
Userspace can set this up by calling the ioctl KVM_SET_EXIT_MSRS with a
list of MSRs it wants to manage.  When KVM encounters any of these MSRs,
it forwards them to userspace for processing.  Userspace can then emulate
the read or write, or inject a #GP into the guest if needed, as in the
sketch below.
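
The following is a minimal sketch of the intended flow from a VMM's point
of view (error handling omitted; emulate_rdmsr() and emulate_wrmsr() are
hypothetical helpers, and "run" is the mmap'ed kvm_run structure):

  struct kvm_msr_list *list = malloc(sizeof(*list) + sizeof(__u32));

  list->nmsrs = 1;
  list->indices[0] = 0xc0000084;          /* MSR_SYSCALL_MASK */

  /* Must be called before any vCPU is created. */
  ioctl(vm_fd, KVM_SET_EXIT_MSRS, list);

  for (;;) {
          ioctl(vcpu_fd, KVM_RUN, 0);

          switch (run->exit_reason) {
          case KVM_EXIT_X86_RDMSR:
                  run->msr.data = emulate_rdmsr(run->msr.index);
                  break;
          case KVM_EXIT_X86_WRMSR:
                  emulate_wrmsr(run->msr.index, run->msr.data);
                  break;
          /* ... other exit reasons ... */
          }
  }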

This series includes the kernel changes needed to implement this feature
and a test that exercises this behavior.  Also included is an
implementation of exception handling in selftests, which allows the test
to exercise throwing a #GP.

v1 -> v2:

  - Added support for generic instruction emulator bouncing to userspace when
    rdmsr or wrmsr are called, and userspace has asked to manage the MSR.
    These changes are committed in patch 3, and are based on changes made by
    Alexander Graf <graf@amazon.com>.
  - Added tests to exercise the code paths for em_{rdmsr,wrmsr} and
    emulator_{get,set}_msr.  These changes are committed in patch 8.

v2 -> v3:

  - Added the series by Alexander Graf <graf@amazon.com> to the beginning of
    this series (patches 1-3).  The two have a lot of overlap, so it made sense
    to combine them to simplify merging them both upstream.  Alex's changes
    account for the first 3 commits in this series.  As a result of incorporating
    those changes, commit 05/12 required some refactoring.
  - Split exception handling in selftests into its own commit (patch 09/12).
  - Split the changes to ucall_get() into its own commit based on Andrew
    Jones's suggestion, and added support for aarch64 and s390x.

Aaron Lewis (12):
  KVM: x86: Deflect unknown MSR accesses to user space
  KVM: x86: Introduce allow list for MSR emulation
  KVM: selftests: Add test for user space MSR handling
  KVM: x86: Add ioctl for accepting a userspace provided MSR list
  KVM: x86: Add support for exiting to userspace on rdmsr or wrmsr
  KVM: x86: Prepare MSR bitmaps for userspace tracked MSRs
  KVM: x86: Ensure the MSR bitmap never clears userspace tracked MSRs
  selftests: kvm: Fix the segment descriptor layout to match the actual
    layout
  selftests: kvm: Clear uc so UCALL_NONE is being properly reported
  selftests: kvm: Add exception handling to selftests
  selftests: kvm: Add a test to exercise the userspace MSR list
  selftests: kvm: Add emulated rdmsr, wrmsr tests

 Documentation/virt/kvm/api.rst                | 181 +++++++-
 arch/x86/include/asm/kvm_host.h               |  18 +
 arch/x86/include/uapi/asm/kvm.h               |  15 +
 arch/x86/kvm/emulate.c                        |  18 +-
 arch/x86/kvm/svm/svm.c                        |  93 ++--
 arch/x86/kvm/trace.h                          |  24 +
 arch/x86/kvm/vmx/nested.c                     |   2 +-
 arch/x86/kvm/vmx/vmx.c                        |  94 ++--
 arch/x86/kvm/vmx/vmx.h                        |   2 +-
 arch/x86/kvm/x86.c                            | 379 +++++++++++++++-
 include/trace/events/kvm.h                    |   2 +-
 include/uapi/linux/kvm.h                      |  17 +
 tools/testing/selftests/kvm/.gitignore        |   1 +
 tools/testing/selftests/kvm/Makefile          |  21 +-
 .../selftests/kvm/include/x86_64/processor.h  |  29 +-
 .../testing/selftests/kvm/lib/aarch64/ucall.c |   3 +
 tools/testing/selftests/kvm/lib/kvm_util.c    |  17 +
 .../selftests/kvm/lib/kvm_util_internal.h     |   2 +
 tools/testing/selftests/kvm/lib/s390x/ucall.c |   3 +
 .../selftests/kvm/lib/x86_64/handlers.S       |  81 ++++
 .../selftests/kvm/lib/x86_64/processor.c      | 168 ++++++-
 .../testing/selftests/kvm/lib/x86_64/ucall.c  |   3 +
 .../selftests/kvm/x86_64/user_msr_test.c      | 221 +++++++++
 .../selftests/kvm/x86_64/userspace_msr_exit.c | 421 ++++++++++++++++++
 24 files changed, 1719 insertions(+), 96 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/lib/x86_64/handlers.S
 create mode 100644 tools/testing/selftests/kvm/x86_64/user_msr_test.c
 create mode 100644 tools/testing/selftests/kvm/x86_64/userspace_msr_exit.c

-- 
2.28.0.220.ged08abb693-goog



* [PATCH v3 01/12] KVM: x86: Deflect unknown MSR accesses to user space
  2020-08-18 21:15 [PATCH v3 00/12] Allow userspace to manage MSRs Aaron Lewis
@ 2020-08-18 21:15 ` Aaron Lewis
  2020-08-19  8:42   ` Alexander Graf
  2020-08-18 21:15 ` [PATCH v3 02/12] KVM: x86: Introduce allow list for MSR emulation Aaron Lewis
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 50+ messages in thread
From: Aaron Lewis @ 2020-08-18 21:15 UTC (permalink / raw)
  To: jmattson, graf; +Cc: pshier, oupton, kvm, Aaron Lewis

MSRs are weird. Some of them are normal control registers, such as EFER.
Some, however, are registers that really are model specific, not very
interesting to virtualization workloads, and not performance critical.
Still others are really just windows into package configuration.

Out of these MSRs, only the first category is necessary to implement in
kernel space. Rarely accessed MSRs, MSRs that should be fine-tuned against
certain CPU models, and MSRs that contain information on the package level
are much better suited for user space to process. However, over time we have
accumulated a lot of MSRs that are not in the first category, but are still
handled by in-kernel KVM code.

This patch adds a generic interface to handle WRMSR and RDMSR from user
space. With this, any future MSR that is part of the latter categories can
be handled in user space.

Furthermore, it allows us to replace the existing "ignore_msrs" logic with
something that applies per-VM rather than to the full system. That way you
can run productive VMs in parallel with experimental ones where you don't
care about proper MSR handling.
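
As an illustration, the userspace half could look like the following
fragment of a KVM_RUN exit handler (known_msr(), my_rdmsr() and my_wrmsr()
are hypothetical VMM helpers; "run" is the mmap'ed kvm_run structure):

  case KVM_EXIT_X86_RDMSR:
          if (known_msr(run->msr.index))
                  run->msr.data = my_rdmsr(run->msr.index);
          else
                  run->msr.error = 1;     /* KVM injects a #GP on re-entry */
          break;
  case KVM_EXIT_X86_WRMSR:
          if (!my_wrmsr(run->msr.index, run->msr.data))
                  run->msr.error = 1;
          break;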

Signed-off-by: Alexander Graf <graf@amazon.com>
Reviewed-by: Jim Mattson <jmattson@google.com>

---

v1 -> v2:

  - s/ETRAP_TO_USER_SPACE/ENOENT/g
  - deflect all #GP injection events to user space, not just unknown MSRs.
    That way we can also deflect allowlist errors later
  - fix emulator case

v2 -> v3:

  - return r if r == X86EMUL_IO_NEEDED
  - s/KVM_EXIT_RDMSR/KVM_EXIT_X86_RDMSR/g
  - s/KVM_EXIT_WRMSR/KVM_EXIT_X86_WRMSR/g
  - Use complete_userspace_io logic instead of reply field
  - Simplify trapping code

v3 -> v4:

  - Mention exit reasons in the mandatory re-entry section of the API
    documentation
  - Clear padding bytes
  - Generalize get/set deflect functions
  - Remove redundant pending_user_msr field

---
 Documentation/virt/kvm/api.rst  |  66 +++++++++++++++++++-
 arch/x86/include/asm/kvm_host.h |   3 +
 arch/x86/kvm/emulate.c          |  18 +++++-
 arch/x86/kvm/x86.c              | 106 ++++++++++++++++++++++++++++++--
 include/trace/events/kvm.h      |   2 +-
 include/uapi/linux/kvm.h        |  10 +++
 6 files changed, 196 insertions(+), 9 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index eb3a1316f03e..aad51c33fcae 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -4869,8 +4869,8 @@ to the byte array.
 
 .. note::
 
-      For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_PAPR and
-      KVM_EXIT_EPR the corresponding
+      For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_PAPR,
+      KVM_EXIT_EPR, KVM_EXIT_X86_RDMSR and KVM_EXIT_X86_WRMSR the corresponding
 
 operations are complete (and guest state is consistent) only after userspace
 has re-entered the kernel with KVM_RUN.  The kernel side will first finish
@@ -5163,6 +5163,35 @@ Note that KVM does not skip the faulting instruction as it does for
 KVM_EXIT_MMIO, but userspace has to emulate any change to the processing state
 if it decides to decode and emulate the instruction.
 
+::
+
+               /* KVM_EXIT_X86_RDMSR / KVM_EXIT_X86_WRMSR */
+               struct {
+                       __u8 error;
+                       __u8 pad[3];
+                       __u32 index;
+                       __u64 data;
+               } msr;
+
+Used on x86 systems. When the VM capability KVM_CAP_X86_USER_SPACE_MSR is
+enabled, MSR accesses to registers that would invoke a #GP by KVM kernel code
+will instead trigger a KVM_EXIT_X86_RDMSR exit for reads and KVM_EXIT_X86_WRMSR
+exit for writes.
+
+For KVM_EXIT_X86_RDMSR, the "index" field tells user space which MSR the guest
+wants to read. To respond to this request with a successful read, user space
+writes the respective data into the "data" field and must continue guest
+execution to ensure the read data is transferred into guest register state.
+
+If the RDMSR request was unsuccessful, user space indicates that with a "1" in
+the "error" field. This will inject a #GP into the guest when the VCPU is
+executed again.
+
+For KVM_EXIT_X86_WRMSR, the "index" field tells user space which MSR the guest
+wants to write. Once finished processing the event, user space must continue
+vCPU execution. If the MSR write was unsuccessful, user space also sets the
+"error" field to "1".
+
 ::
 
 		/* Fix the size of the union. */
@@ -5852,6 +5881,28 @@ controlled by the kvm module parameter halt_poll_ns. This capability allows
 the maximum halt time to specified on a per-VM basis, effectively overriding
 the module parameter for the target VM.
 
+7.21 KVM_CAP_X86_USER_SPACE_MSR
+-------------------------------
+
+:Architectures: x86
+:Target: VM
+:Parameters: args[0] is 1 if user space MSR handling is enabled, 0 otherwise
+:Returns: 0 on success; -1 on error
+
+This capability enables trapping of #GP invoking RDMSR and WRMSR instructions
+into user space.
+
+When a guest requests to read or write an MSR, KVM may not implement all MSRs
+that are relevant to a respective system. It also does not differentiate by
+CPU type.
+
+To allow more fine grained control over MSR handling, user space may enable
+this capability. With it enabled, MSR accesses that would usually trigger
+a #GP event inside the guest by KVM will instead trigger KVM_EXIT_X86_RDMSR
+and KVM_EXIT_X86_WRMSR exit notifications which user space can then handle to
+implement model specific MSR handling and/or user notifications to inform
+a user that an MSR was not handled.
+
 8. Other capabilities.
 ======================
 
@@ -6159,3 +6210,14 @@ KVM can therefore start protected VMs.
 This capability governs the KVM_S390_PV_COMMAND ioctl and the
 KVM_MP_STATE_LOAD MP_STATE. KVM_SET_MP_STATE can fail for protected
 guests when the state change is invalid.
+
+8.24 KVM_CAP_X86_USER_SPACE_MSR
+-------------------------------
+
+:Architectures: x86
+
+This capability indicates that KVM supports deflection of MSR reads and
+writes to user space. It can be enabled on a VM level. If enabled, MSR
+accesses that would usually trigger a #GP by KVM into the guest will
+instead get bounced to user space through the KVM_EXIT_X86_RDMSR and
+KVM_EXIT_X86_WRMSR exit notifications.
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 5ab3af7275d8..02a102c60dff 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -961,6 +961,9 @@ struct kvm_arch {
 	bool guest_can_read_msr_platform_info;
 	bool exception_payload_enabled;
 
+	/* Deflect RDMSR and WRMSR to user space when they trigger a #GP */
+	bool user_space_msr_enabled;
+
 	struct kvm_pmu_event_filter *pmu_event_filter;
 	struct task_struct *nx_lpage_recovery_thread;
 };
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index d0e2825ae617..744ab9c92b73 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -3689,11 +3689,18 @@ static int em_dr_write(struct x86_emulate_ctxt *ctxt)
 
 static int em_wrmsr(struct x86_emulate_ctxt *ctxt)
 {
+	u64 msr_index = reg_read(ctxt, VCPU_REGS_RCX);
 	u64 msr_data;
+	int r;
 
 	msr_data = (u32)reg_read(ctxt, VCPU_REGS_RAX)
 		| ((u64)reg_read(ctxt, VCPU_REGS_RDX) << 32);
-	if (ctxt->ops->set_msr(ctxt, reg_read(ctxt, VCPU_REGS_RCX), msr_data))
+	r = ctxt->ops->set_msr(ctxt, msr_index, msr_data);
+
+	if (r == X86EMUL_IO_NEEDED)
+		return r;
+
+	if (r)
 		return emulate_gp(ctxt, 0);
 
 	return X86EMUL_CONTINUE;
@@ -3701,9 +3708,16 @@ static int em_wrmsr(struct x86_emulate_ctxt *ctxt)
 
 static int em_rdmsr(struct x86_emulate_ctxt *ctxt)
 {
+	u64 msr_index = reg_read(ctxt, VCPU_REGS_RCX);
 	u64 msr_data;
+	int r;
+
+	r = ctxt->ops->get_msr(ctxt, msr_index, &msr_data);
+
+	if (r == X86EMUL_IO_NEEDED)
+		return r;
 
-	if (ctxt->ops->get_msr(ctxt, reg_read(ctxt, VCPU_REGS_RCX), &msr_data))
+	if (r)
 		return emulate_gp(ctxt, 0);
 
 	*reg_write(ctxt, VCPU_REGS_RAX) = (u32)msr_data;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 599d73206299..5d94a95fb66b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1588,12 +1588,75 @@ int kvm_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data)
 }
 EXPORT_SYMBOL_GPL(kvm_set_msr);
 
+static int complete_emulated_msr(struct kvm_vcpu *vcpu, bool is_read)
+{
+	if (vcpu->run->msr.error) {
+		kvm_inject_gp(vcpu, 0);
+	} else if (is_read) {
+		kvm_rax_write(vcpu, (u32)vcpu->run->msr.data);
+		kvm_rdx_write(vcpu, vcpu->run->msr.data >> 32);
+	}
+
+	return kvm_skip_emulated_instruction(vcpu);
+}
+
+static int complete_emulated_rdmsr(struct kvm_vcpu *vcpu)
+{
+	return complete_emulated_msr(vcpu, true);
+}
+
+static int complete_emulated_wrmsr(struct kvm_vcpu *vcpu)
+{
+	return complete_emulated_msr(vcpu, false);
+}
+
+static int kvm_msr_user_space(struct kvm_vcpu *vcpu, u32 index,
+			      u32 exit_reason, u64 data,
+			      int (*completion)(struct kvm_vcpu *vcpu))
+{
+	if (!vcpu->kvm->arch.user_space_msr_enabled)
+		return 0;
+
+	vcpu->run->exit_reason = exit_reason;
+	vcpu->run->msr.error = 0;
+	vcpu->run->msr.pad[0] = 0;
+	vcpu->run->msr.pad[1] = 0;
+	vcpu->run->msr.pad[2] = 0;
+	vcpu->run->msr.index = index;
+	vcpu->run->msr.data = data;
+	vcpu->arch.complete_userspace_io = completion;
+
+	return 1;
+}
+
+static int kvm_get_msr_user_space(struct kvm_vcpu *vcpu, u32 index)
+{
+	return kvm_msr_user_space(vcpu, index, KVM_EXIT_X86_RDMSR, 0,
+				  complete_emulated_rdmsr);
+}
+
+static int kvm_set_msr_user_space(struct kvm_vcpu *vcpu, u32 index, u64 data)
+{
+	return kvm_msr_user_space(vcpu, index, KVM_EXIT_X86_WRMSR, data,
+				  complete_emulated_wrmsr);
+}
+
 int kvm_emulate_rdmsr(struct kvm_vcpu *vcpu)
 {
 	u32 ecx = kvm_rcx_read(vcpu);
 	u64 data;
+	int r;
 
-	if (kvm_get_msr(vcpu, ecx, &data)) {
+	r = kvm_get_msr(vcpu, ecx, &data);
+
+	/* MSR read failed? See if we should ask user space */
+	if (r && kvm_get_msr_user_space(vcpu, ecx)) {
+		/* Bounce to user space */
+		return 0;
+	}
+
+	/* MSR read failed? Inject a #GP */
+	if (r) {
 		trace_kvm_msr_read_ex(ecx);
 		kvm_inject_gp(vcpu, 0);
 		return 1;
@@ -1611,8 +1674,18 @@ int kvm_emulate_wrmsr(struct kvm_vcpu *vcpu)
 {
 	u32 ecx = kvm_rcx_read(vcpu);
 	u64 data = kvm_read_edx_eax(vcpu);
+	int r;
+
+	r = kvm_set_msr(vcpu, ecx, data);
+
+	/* MSR write failed? See if we should ask user space */
+	if (r && kvm_set_msr_user_space(vcpu, ecx, data)) {
+		/* Bounce to user space */
+		return 0;
+	}
 
-	if (kvm_set_msr(vcpu, ecx, data)) {
+	/* MSR write failed? Inject a #GP */
+	if (r) {
 		trace_kvm_msr_write_ex(ecx, data);
 		kvm_inject_gp(vcpu, 0);
 		return 1;
@@ -3516,6 +3589,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_EXCEPTION_PAYLOAD:
 	case KVM_CAP_SET_GUEST_DEBUG:
 	case KVM_CAP_LAST_CPU:
+	case KVM_CAP_X86_USER_SPACE_MSR:
 		r = 1;
 		break;
 	case KVM_CAP_SYNC_REGS:
@@ -5033,6 +5107,10 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		kvm->arch.exception_payload_enabled = cap->args[0];
 		r = 0;
 		break;
+	case KVM_CAP_X86_USER_SPACE_MSR:
+		kvm->arch.user_space_msr_enabled = cap->args[0];
+		r = 0;
+		break;
 	default:
 		r = -EINVAL;
 		break;
@@ -6362,13 +6440,33 @@ static void emulator_set_segment(struct x86_emulate_ctxt *ctxt, u16 selector,
 static int emulator_get_msr(struct x86_emulate_ctxt *ctxt,
 			    u32 msr_index, u64 *pdata)
 {
-	return kvm_get_msr(emul_to_vcpu(ctxt), msr_index, pdata);
+	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
+	int r;
+
+	r = kvm_get_msr(vcpu, msr_index, pdata);
+
+	if (r && kvm_get_msr_user_space(vcpu, msr_index)) {
+		/* Bounce to user space */
+		return X86EMUL_IO_NEEDED;
+	}
+
+	return r;
 }
 
 static int emulator_set_msr(struct x86_emulate_ctxt *ctxt,
 			    u32 msr_index, u64 data)
 {
-	return kvm_set_msr(emul_to_vcpu(ctxt), msr_index, data);
+	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
+	int r;
+
+	r = kvm_set_msr(emul_to_vcpu(ctxt), msr_index, data);
+
+	if (r && kvm_set_msr_user_space(vcpu, msr_index, data)) {
+		/* Bounce to user space */
+		return X86EMUL_IO_NEEDED;
+	}
+
+	return r;
 }
 
 static u64 emulator_get_smbase(struct x86_emulate_ctxt *ctxt)
diff --git a/include/trace/events/kvm.h b/include/trace/events/kvm.h
index 9417a34aad08..26cfb0fa8e7e 100644
--- a/include/trace/events/kvm.h
+++ b/include/trace/events/kvm.h
@@ -17,7 +17,7 @@
 	ERSN(NMI), ERSN(INTERNAL_ERROR), ERSN(OSI), ERSN(PAPR_HCALL),	\
 	ERSN(S390_UCONTROL), ERSN(WATCHDOG), ERSN(S390_TSCH), ERSN(EPR),\
 	ERSN(SYSTEM_EVENT), ERSN(S390_STSI), ERSN(IOAPIC_EOI),          \
-	ERSN(HYPERV), ERSN(ARM_NISV)
+	ERSN(HYPERV), ERSN(ARM_NISV), ERSN(X86_RDMSR), ERSN(X86_WRMSR)
 
 TRACE_EVENT(kvm_userspace_exit,
 	    TP_PROTO(__u32 reason, int errno),
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index f6d86033c4fa..6470c0c1e77a 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -248,6 +248,8 @@ struct kvm_hyperv_exit {
 #define KVM_EXIT_IOAPIC_EOI       26
 #define KVM_EXIT_HYPERV           27
 #define KVM_EXIT_ARM_NISV         28
+#define KVM_EXIT_X86_RDMSR        29
+#define KVM_EXIT_X86_WRMSR        30
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
@@ -413,6 +415,13 @@ struct kvm_run {
 			__u64 esr_iss;
 			__u64 fault_ipa;
 		} arm_nisv;
+		/* KVM_EXIT_X86_RDMSR / KVM_EXIT_X86_WRMSR */
+		struct {
+			__u8 error;
+			__u8 pad[3];
+			__u32 index;
+			__u64 data;
+		} msr;
 		/* Fix the size of the union. */
 		char padding[256];
 	};
@@ -1035,6 +1044,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_LAST_CPU 184
 #define KVM_CAP_SMALLER_MAXPHYADDR 185
 #define KVM_CAP_S390_DIAG318 186
+#define KVM_CAP_X86_USER_SPACE_MSR 187
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.28.0.220.ged08abb693-goog



* [PATCH v3 02/12] KVM: x86: Introduce allow list for MSR emulation
  2020-08-18 21:15 [PATCH v3 00/12] Allow userspace to manage MSRs Aaron Lewis
  2020-08-18 21:15 ` [PATCH v3 01/12] KVM: x86: Deflect unknown MSR accesses to user space Aaron Lewis
@ 2020-08-18 21:15 ` Aaron Lewis
  2020-08-19  8:53   ` Alexander Graf
  2020-08-31 10:39     ` Dan Carpenter
  2020-08-18 21:15 ` [PATCH v3 03/12] KVM: selftests: Add test for user space MSR handling Aaron Lewis
                   ` (9 subsequent siblings)
  11 siblings, 2 replies; 50+ messages in thread
From: Aaron Lewis @ 2020-08-18 21:15 UTC (permalink / raw)
  To: jmattson, graf; +Cc: pshier, oupton, kvm, Aaron Lewis, KarimAllah Ahmed

It's not desirable to have all MSRs always handled by KVM kernel space. Some
MSRs would be useful to handle in user space, either to emulate behavior (like
uCode updates) or to differentiate whether they are valid based on the CPU
model.

To allow user space to specify which MSRs it wants to see handled by KVM,
this patch introduces a new ioctl to push allow lists of bitmaps into
KVM. Based on these bitmaps, KVM can then decide whether to reject MSR access.
With the addition of KVM_CAP_X86_USER_SPACE_MSR it can also deflect the
denied MSR events to user space to operate on.

If no allowlist is populated, MSR handling stays identical to before.
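
To illustrate, a VMM might populate a single allow list like this (sketch
only; the range and the denied MSR, MSR_IA32_POWER_CTL at 0x1fc, are
arbitrary examples):

  size_t bitmap_size = 0x3000 / 8;        /* covers MSRs 0x0 - 0x2fff */
  struct kvm_msr_allowlist *al = malloc(sizeof(*al) + bitmap_size);
  __u32 idx = 0x1fc;                      /* MSR_IA32_POWER_CTL */

  al->flags = KVM_MSR_ALLOW_WRITE;
  al->base = 0;
  al->nmsrs = 0x3000;
  memset(al->bitmap, 0xff, bitmap_size);       /* 1 == handle in KVM */
  al->bitmap[idx / 8] &= ~(1 << (idx % 8));    /* deny writes to this MSR */

  ioctl(vm_fd, KVM_X86_ADD_MSR_ALLOWLIST, al);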

Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
Signed-off-by: Alexander Graf <graf@amazon.com>

---

v2 -> v3:

  - document flags for KVM_X86_ADD_MSR_ALLOWLIST
  - generalize exit path, always unlock when returning
  - s/KVM_CAP_ADD_MSR_ALLOWLIST/KVM_CAP_X86_MSR_ALLOWLIST/g
  - Add KVM_X86_CLEAR_MSR_ALLOWLIST

v3 -> v4:
  - lock allow check and clearing
  - free bitmaps on clear

v4 -> v5:

  - use srcu

---
 Documentation/virt/kvm/api.rst  |  91 ++++++++++++++++++
 arch/x86/include/asm/kvm_host.h |  10 ++
 arch/x86/include/uapi/asm/kvm.h |  15 +++
 arch/x86/kvm/x86.c              | 160 ++++++++++++++++++++++++++++++++
 include/uapi/linux/kvm.h        |   5 +
 5 files changed, 281 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index aad51c33fcae..91ce3e4b5b2e 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -4704,6 +4704,82 @@ KVM_PV_VM_VERIFY
   Verify the integrity of the unpacked image. Only if this succeeds,
   KVM is allowed to start protected VCPUs.
 
+4.126 KVM_X86_ADD_MSR_ALLOWLIST
+-------------------------------
+
+:Capability: KVM_CAP_X86_MSR_ALLOWLIST
+:Architectures: x86
+:Type: vm ioctl
+:Parameters: struct kvm_msr_allowlist
+:Returns: 0 on success, < 0 on error
+
+::
+
+  struct kvm_msr_allowlist {
+         __u32 flags;
+         __u32 nmsrs; /* number of msrs in bitmap */
+         __u32 base;  /* base address for the MSRs bitmap */
+         __u32 pad;
+
+         __u8 bitmap[0]; /* a set bit allows that the operation set in flags */
+  };
+
+flags values:
+
+KVM_MSR_ALLOW_READ
+
+  Filter read accesses to MSRs using the given bitmap. A 0 in the bitmap
+  indicates that a read should immediately fail, while a 1 indicates that
+  a read should be handled by the normal KVM MSR emulation logic.
+
+KVM_MSR_ALLOW_WRITE
+
+  Filter write accesses to MSRs using the given bitmap. A 0 in the bitmap
+  indicates that a write should immediately fail, while a 1 indicates that
+  a write should be handled by the normal KVM MSR emulation logic.
+
+KVM_MSR_ALLOW_READ | KVM_MSR_ALLOW_WRITE
+
+  Filter both read and write accesses to MSRs using the given bitmap. A 0
+  in the bitmap indicates that both reads and writes should immediately fail,
+  while a 1 indicates that reads and writes should be handled by the normal
+  KVM MSR emulation logic.
+
+This ioctl allows user space to define a set of bitmaps of MSR ranges to
+specify whether a certain MSR access is allowed or not.
+
+If this ioctl has never been invoked, MSR accesses are not guarded and the
+old KVM in-kernel emulation behavior is fully preserved.
+
+As soon as the first allow list is specified, only allowed MSR accesses
+are permitted inside of KVM's MSR code.
+
+Each allowlist specifies a range of MSRs to potentially allow access on.
+The range goes from MSR index [base .. base+nmsrs). The flags field
+indicates whether reads, writes or both reads and writes are permitted
+by setting a 1 bit in the bitmap for the corresponding MSR index.
+
+If an MSR access is not permitted through the allow list, it generates a
+#GP inside the guest. When combined with KVM_CAP_X86_USER_SPACE_MSR, that
+allows user space to deflect and potentially handle various MSR accesses
+into user space.
+
+4.127 KVM_X86_CLEAR_MSR_ALLOWLIST
+---------------------------------
+
+:Capability: KVM_CAP_X86_MSR_ALLOWLIST
+:Architectures: x86
+:Type: vm ioctl
+:Parameters: none
+:Returns: 0
+
+This ioctl resets all internal MSR allow lists. After this call, no allow
+list is present and the guest would execute as if no allow lists were set,
+so all MSRs are considered allowed and thus handled by the in-kernel MSR
+emulation logic.
+
+No vCPU may be in running state when calling this ioctl.
+
 
 5. The kvm_run structure
 ========================
@@ -6221,3 +6297,18 @@ writes to user space. It can be enabled on a VM level. If enabled, MSR
 accesses that would usually trigger a #GP by KVM into the guest will
 instead get bounced to user space through the KVM_EXIT_X86_RDMSR and
 KVM_EXIT_X86_WRMSR exit notifications.
+
+8.25 KVM_CAP_X86_MSR_ALLOWLIST
+------------------------------
+
+:Architectures: x86
+
+This capability indicates that KVM supports emulation of only select MSR
+registers. With this capability exposed, KVM exports two new VM ioctls:
+KVM_X86_ADD_MSR_ALLOWLIST which user space can call to specify bitmaps of MSR
+ranges that KVM should emulate in kernel space and KVM_X86_CLEAR_MSR_ALLOWLIST
+which user space can call to remove all MSR allow lists from the VM context.
+
+In combination with KVM_CAP_X86_USER_SPACE_MSR, this allows user space to
+trap and emulate MSRs that are outside of the scope of KVM as well as
+limit the attack surface on KVM's MSR emulation code.
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 02a102c60dff..1ee8468c913c 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -860,6 +860,13 @@ struct kvm_hv {
 	struct kvm_hv_syndbg hv_syndbg;
 };
 
+struct msr_bitmap_range {
+	u32 flags;
+	u32 nmsrs;
+	u32 base;
+	unsigned long *bitmap;
+};
+
 enum kvm_irqchip_mode {
 	KVM_IRQCHIP_NONE,
 	KVM_IRQCHIP_KERNEL,       /* created with KVM_CREATE_IRQCHIP */
@@ -964,6 +971,9 @@ struct kvm_arch {
 	/* Deflect RDMSR and WRMSR to user space when they trigger a #GP */
 	bool user_space_msr_enabled;
 
+	struct msr_bitmap_range msr_allowlist_ranges[10];
+	int msr_allowlist_ranges_count;
+
 	struct kvm_pmu_event_filter *pmu_event_filter;
 	struct task_struct *nx_lpage_recovery_thread;
 };
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 0780f97c1850..c33fb1d72d52 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -192,6 +192,21 @@ struct kvm_msr_list {
 	__u32 indices[0];
 };
 
+#define KVM_MSR_ALLOW_READ  (1 << 0)
+#define KVM_MSR_ALLOW_WRITE (1 << 1)
+
+/* Maximum size of the of the bitmap in bytes */
+#define KVM_MSR_ALLOWLIST_MAX_LEN 0x600
+
+/* for KVM_X86_ADD_MSR_ALLOWLIST */
+struct kvm_msr_allowlist {
+	__u32 flags;
+	__u32 nmsrs; /* number of msrs in bitmap */
+	__u32 base;  /* base address for the MSRs bitmap */
+	__u32 pad;
+
+	__u8 bitmap[0]; /* a set bit allows that the operation set in flags */
+};
 
 struct kvm_cpuid_entry {
 	__u32 function;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5d94a95fb66b..c46a709be532 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1486,6 +1486,39 @@ void kvm_enable_efer_bits(u64 mask)
 }
 EXPORT_SYMBOL_GPL(kvm_enable_efer_bits);
 
+static bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type)
+{
+	struct kvm *kvm = vcpu->kvm;
+	struct msr_bitmap_range *ranges = kvm->arch.msr_allowlist_ranges;
+	u32 count = kvm->arch.msr_allowlist_ranges_count;
+	u32 i;
+	bool r = false;
+	int idx;
+
+	/* MSR allowlist not set up, allow everything */
+	if (!count)
+		return true;
+
+	/* Prevent collision with clear_msr_allowlist */
+	idx = srcu_read_lock(&kvm->srcu);
+
+	for (i = 0; i < count; i++) {
+		u32 start = ranges[i].base;
+		u32 end = start + ranges[i].nmsrs;
+		u32 flags = ranges[i].flags;
+		unsigned long *bitmap = ranges[i].bitmap;
+
+		if ((index >= start) && (index < end) && (flags & type)) {
+			r = !!test_bit(index - start, bitmap);
+			break;
+		}
+	}
+
+	srcu_read_unlock(&kvm->srcu, idx);
+
+	return r;
+}
+
 /*
  * Write @data into the MSR specified by @index.  Select MSR specific fault
  * checks are bypassed if @host_initiated is %true.
@@ -1497,6 +1530,9 @@ static int __kvm_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data,
 {
 	struct msr_data msr;
 
+	if (!host_initiated && !kvm_msr_allowed(vcpu, index, KVM_MSR_ALLOW_WRITE))
+		return -ENOENT;
+
 	switch (index) {
 	case MSR_FS_BASE:
 	case MSR_GS_BASE:
@@ -1553,6 +1589,9 @@ int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data,
 	struct msr_data msr;
 	int ret;
 
+	if (!host_initiated && !kvm_msr_allowed(vcpu, index, KVM_MSR_ALLOW_READ))
+		return -ENOENT;
+
 	msr.index = index;
 	msr.host_initiated = host_initiated;
 
@@ -3590,6 +3629,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_SET_GUEST_DEBUG:
 	case KVM_CAP_LAST_CPU:
 	case KVM_CAP_X86_USER_SPACE_MSR:
+	case KVM_CAP_X86_MSR_ALLOWLIST:
 		r = 1;
 		break;
 	case KVM_CAP_SYNC_REGS:
@@ -5118,6 +5158,116 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 	return r;
 }
 
+static bool msr_range_overlaps(struct kvm *kvm, struct msr_bitmap_range *range)
+{
+	struct msr_bitmap_range *ranges = kvm->arch.msr_allowlist_ranges;
+	u32 i, count = kvm->arch.msr_allowlist_ranges_count;
+	bool r = false;
+
+	for (i = 0; i < count; i++) {
+		u32 start = max(range->base, ranges[i].base);
+		u32 end = min(range->base + range->nmsrs,
+			      ranges[i].base + ranges[i].nmsrs);
+
+		if ((start < end) && (range->flags & ranges[i].flags)) {
+			r = true;
+			break;
+		}
+	}
+
+	return r;
+}
+
+static int kvm_vm_ioctl_add_msr_allowlist(struct kvm *kvm, void __user *argp)
+{
+	struct msr_bitmap_range *ranges = kvm->arch.msr_allowlist_ranges;
+	struct kvm_msr_allowlist __user *user_msr_allowlist = argp;
+	struct msr_bitmap_range range;
+	struct kvm_msr_allowlist kernel_msr_allowlist;
+	unsigned long *bitmap = NULL;
+	size_t bitmap_size;
+	int r = 0;
+
+	if (copy_from_user(&kernel_msr_allowlist, user_msr_allowlist,
+			   sizeof(kernel_msr_allowlist))) {
+		r = -EFAULT;
+		goto out;
+	}
+
+	bitmap_size = BITS_TO_LONGS(kernel_msr_allowlist.nmsrs) * sizeof(long);
+	if (bitmap_size > KVM_MSR_ALLOWLIST_MAX_LEN) {
+		r = -EINVAL;
+		goto out;
+	}
+
+	bitmap = memdup_user(user_msr_allowlist->bitmap, bitmap_size);
+	if (IS_ERR(bitmap)) {
+		r = PTR_ERR(bitmap);
+		goto out;
+	}
+
+	range = (struct msr_bitmap_range) {
+		.flags = kernel_msr_allowlist.flags,
+		.base = kernel_msr_allowlist.base,
+		.nmsrs = kernel_msr_allowlist.nmsrs,
+		.bitmap = bitmap,
+	};
+
+	if (range.flags & ~(KVM_MSR_ALLOW_READ | KVM_MSR_ALLOW_WRITE)) {
+		r = -EINVAL;
+		goto out;
+	}
+
+	/*
+	 * Protect from concurrent calls to this function that could trigger
+	 * a TOCTOU violation on kvm->arch.msr_allowlist_ranges_count.
+	 */
+	mutex_lock(&kvm->lock);
+
+	if (kvm->arch.msr_allowlist_ranges_count >=
+	    ARRAY_SIZE(kvm->arch.msr_allowlist_ranges)) {
+		r = -E2BIG;
+		goto out_locked;
+	}
+
+	if (msr_range_overlaps(kvm, &range)) {
+		r = -EINVAL;
+		goto out_locked;
+	}
+
+	/* Everything ok, add this range identifier to our global pool */
+	ranges[kvm->arch.msr_allowlist_ranges_count] = range;
+	/* Make sure we filled the array before we tell anyone to walk it */
+	smp_wmb();
+	kvm->arch.msr_allowlist_ranges_count++;
+
+out_locked:
+	mutex_unlock(&kvm->lock);
+out:
+	if (r)
+		kfree(bitmap);
+
+	return r;
+}
+
+static int kvm_vm_ioctl_clear_msr_allowlist(struct kvm *kvm)
+{
+	int i;
+	u32 count = kvm->arch.msr_allowlist_ranges_count;
+	struct msr_bitmap_range ranges[10];
+
+	mutex_lock(&kvm->lock);
+	kvm->arch.msr_allowlist_ranges_count = 0;
+	memcpy(ranges, kvm->arch.msr_allowlist_ranges, count * sizeof(ranges[0]));
+	mutex_unlock(&kvm->lock);
+	synchronize_srcu(&kvm->srcu);
+
+	for (i = 0; i < count; i++)
+		kfree(ranges[i].bitmap);
+
+	return 0;
+}
+
 long kvm_arch_vm_ioctl(struct file *filp,
 		       unsigned int ioctl, unsigned long arg)
 {
@@ -5424,6 +5574,12 @@ long kvm_arch_vm_ioctl(struct file *filp,
 	case KVM_SET_PMU_EVENT_FILTER:
 		r = kvm_vm_ioctl_set_pmu_event_filter(kvm, argp);
 		break;
+	case KVM_X86_ADD_MSR_ALLOWLIST:
+		r = kvm_vm_ioctl_add_msr_allowlist(kvm, argp);
+		break;
+	case KVM_X86_CLEAR_MSR_ALLOWLIST:
+		r = kvm_vm_ioctl_clear_msr_allowlist(kvm);
+		break;
 	default:
 		r = -ENOTTY;
 	}
@@ -10123,6 +10279,8 @@ void kvm_arch_pre_destroy_vm(struct kvm *kvm)
 
 void kvm_arch_destroy_vm(struct kvm *kvm)
 {
+	int i;
+
 	if (current->mm == kvm->mm) {
 		/*
 		 * Free memory regions allocated on behalf of userspace,
@@ -10139,6 +10297,8 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
 	}
 	if (kvm_x86_ops.vm_destroy)
 		kvm_x86_ops.vm_destroy(kvm);
+	for (i = 0; i < kvm->arch.msr_allowlist_ranges_count; i++)
+		kfree(kvm->arch.msr_allowlist_ranges[i].bitmap);
 	kvm_pic_destroy(kvm);
 	kvm_ioapic_destroy(kvm);
 	kvm_free_vcpus(kvm);
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 6470c0c1e77a..374021dc4e61 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1045,6 +1045,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_SMALLER_MAXPHYADDR 185
 #define KVM_CAP_S390_DIAG318 186
 #define KVM_CAP_X86_USER_SPACE_MSR 187
+#define KVM_CAP_X86_MSR_ALLOWLIST 188
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -1546,6 +1547,10 @@ struct kvm_pv_cmd {
 /* Available with KVM_CAP_S390_PROTECTED */
 #define KVM_S390_PV_COMMAND		_IOWR(KVMIO, 0xc5, struct kvm_pv_cmd)
 
+/* Available with KVM_CAP_X86_MSR_ALLOWLIST */
+#define KVM_X86_ADD_MSR_ALLOWLIST      _IOW(KVMIO,  0xc6, struct kvm_msr_allowlist)
+#define KVM_X86_CLEAR_MSR_ALLOWLIST    _IO(KVMIO,  0xc7)
+
 /* Secure Encrypted Virtualization command */
 enum sev_cmd_id {
 	/* Guest initialization commands */
-- 
2.28.0.220.ged08abb693-goog



* [PATCH v3 03/12] KVM: selftests: Add test for user space MSR handling
  2020-08-18 21:15 [PATCH v3 00/12] Allow userspace to manage MSRs Aaron Lewis
  2020-08-18 21:15 ` [PATCH v3 01/12] KVM: x86: Deflect unknown MSR accesses to user space Aaron Lewis
  2020-08-18 21:15 ` [PATCH v3 02/12] KVM: x86: Introduce allow list for MSR emulation Aaron Lewis
@ 2020-08-18 21:15 ` Aaron Lewis
  2020-08-18 21:15 ` [PATCH v3 04/12] KVM: x86: Add ioctl for accepting a userspace provided MSR list Aaron Lewis
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 50+ messages in thread
From: Aaron Lewis @ 2020-08-18 21:15 UTC (permalink / raw)
  To: jmattson, graf; +Cc: pshier, oupton, kvm, Aaron Lewis

Now that we have the ability to handle MSRs from user space and also to
select which ones we want to prevent in-kernel KVM code from handling,
let's add a selftest to showcase and verify the API.

Signed-off-by: Alexander Graf <graf@amazon.com>

---

v2 -> v3:

  - s/KVM_CAP_ADD_MSR_ALLOWLIST/KVM_CAP_X86_MSR_ALLOWLIST/g
  - Add test to clear whitelist
  - Adjust to reply-less API
  - Fix asserts
  - Actually trap on MSR_IA32_POWER_CTL writes

---
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/x86_64/user_msr_test.c      | 221 ++++++++++++++++++
 2 files changed, 222 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/x86_64/user_msr_test.c

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 4a166588d99f..80d5c348354c 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -55,6 +55,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/vmx_set_nested_state_test
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_tsc_adjust_test
 TEST_GEN_PROGS_x86_64 += x86_64/xss_msr_test
 TEST_GEN_PROGS_x86_64 += x86_64/debug_regs
+TEST_GEN_PROGS_x86_64 += x86_64/user_msr_test
 TEST_GEN_PROGS_x86_64 += clear_dirty_log_test
 TEST_GEN_PROGS_x86_64 += demand_paging_test
 TEST_GEN_PROGS_x86_64 += dirty_log_test
diff --git a/tools/testing/selftests/kvm/x86_64/user_msr_test.c b/tools/testing/selftests/kvm/x86_64/user_msr_test.c
new file mode 100644
index 000000000000..999544c674be
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/user_msr_test.c
@@ -0,0 +1,221 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * tests for KVM_CAP_X86_USER_SPACE_MSR and KVM_X86_ADD_MSR_ALLOWLIST
+ *
+ * Copyright (C) 2020, Amazon Inc.
+ *
+ * This is a functional test to verify that we can deflect MSR events
+ * into user space.
+ */
+#define _GNU_SOURCE /* for program_invocation_short_name */
+#include <fcntl.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/ioctl.h>
+
+#include "test_util.h"
+
+#include "kvm_util.h"
+#include "processor.h"
+
+#define VCPU_ID                  5
+
+u32 msr_reads, msr_writes;
+
+struct range_desc {
+       struct kvm_msr_allowlist allow;
+       void (*populate)(struct kvm_msr_allowlist *range);
+};
+
+static void populate_c0000000_read(struct kvm_msr_allowlist *range)
+{
+       u8 *bitmap = range->bitmap;
+       u32 idx = MSR_SYSCALL_MASK & (KVM_MSR_ALLOWLIST_MAX_LEN - 1);
+
+       bitmap[idx / 8] &= ~(1 << (idx % 8));
+}
+
+static void populate_00000000_write(struct kvm_msr_allowlist *range)
+{
+       u8 *bitmap = range->bitmap;
+       u32 idx = MSR_IA32_POWER_CTL & (KVM_MSR_ALLOWLIST_MAX_LEN - 1);
+
+       bitmap[idx / 8] &= ~(1 << (idx % 8));
+}
+
+struct range_desc ranges[] = {
+       {
+               .allow = {
+                       .flags = KVM_MSR_ALLOW_READ,
+                       .base = 0x00000000,
+                       .nmsrs = KVM_MSR_ALLOWLIST_MAX_LEN * BITS_PER_BYTE,
+               },
+       }, {
+               .allow = {
+                       .flags = KVM_MSR_ALLOW_WRITE,
+                       .base = 0x00000000,
+                       .nmsrs = KVM_MSR_ALLOWLIST_MAX_LEN * BITS_PER_BYTE,
+               },
+               .populate = populate_00000000_write,
+       }, {
+               .allow = {
+                       .flags = KVM_MSR_ALLOW_READ | KVM_MSR_ALLOW_WRITE,
+                       .base = 0x40000000,
+                       .nmsrs = KVM_MSR_ALLOWLIST_MAX_LEN * BITS_PER_BYTE,
+               },
+       }, {
+               .allow = {
+                       .flags = KVM_MSR_ALLOW_READ,
+                       .base = 0xc0000000,
+                       .nmsrs = KVM_MSR_ALLOWLIST_MAX_LEN * BITS_PER_BYTE,
+               },
+               .populate = populate_c0000000_read,
+       }, {
+               .allow = {
+                       .flags = KVM_MSR_ALLOW_WRITE,
+                       .base = 0xc0000000,
+                       .nmsrs = KVM_MSR_ALLOWLIST_MAX_LEN * BITS_PER_BYTE,
+               },
+       },
+};
+
+static void guest_msr_calls(bool trapped)
+{
+       /* This goes into the in-kernel emulation */
+       wrmsr(MSR_SYSCALL_MASK, 0);
+
+       if (trapped) {
+               /* This goes into user space emulation */
+               GUEST_ASSERT(rdmsr(MSR_SYSCALL_MASK) == MSR_SYSCALL_MASK);
+       } else {
+               GUEST_ASSERT(rdmsr(MSR_SYSCALL_MASK) != MSR_SYSCALL_MASK);
+       }
+
+       /* If trapped == true, this goes into user space emulation */
+       wrmsr(MSR_IA32_POWER_CTL, 0x1234);
+
+       /* This goes into the in-kernel emulation */
+       rdmsr(MSR_IA32_POWER_CTL);
+}
+
+static void guest_code(void)
+{
+       guest_msr_calls(true);
+
+       /*
+        * Disable allow listing, so that the kernel
+        * handles everything in the next round
+        */
+       GUEST_SYNC(0);
+
+       guest_msr_calls(false);
+
+       GUEST_DONE();
+}
+
+static int handle_ucall(struct kvm_vm *vm)
+{
+       struct ucall uc;
+
+       switch (get_ucall(vm, VCPU_ID, &uc)) {
+       case UCALL_ABORT:
+               TEST_FAIL("Guest assertion not met");
+               break;
+       case UCALL_SYNC:
+               vm_ioctl(vm, KVM_X86_CLEAR_MSR_ALLOWLIST, NULL);
+               break;
+       case UCALL_DONE:
+               return 1;
+       default:
+               TEST_FAIL("Unknown ucall %lu", uc.cmd);
+       }
+
+       return 0;
+}
+
+static void handle_rdmsr(struct kvm_run *run)
+{
+       run->msr.data = run->msr.index;
+       msr_reads++;
+}
+
+static void handle_wrmsr(struct kvm_run *run)
+{
+       /* ignore */
+       msr_writes++;
+}
+
+int main(int argc, char *argv[])
+{
+       struct kvm_enable_cap cap = {
+               .cap = KVM_CAP_X86_USER_SPACE_MSR,
+               .args[0] = 1,
+       };
+       struct kvm_vm *vm;
+       struct kvm_run *run;
+       int rc;
+       int i;
+
+       /* Tell stdout not to buffer its content */
+       setbuf(stdout, NULL);
+
+       /* Create VM */
+       vm = vm_create_default(VCPU_ID, 0, guest_code);
+       vcpu_set_cpuid(vm, VCPU_ID, kvm_get_supported_cpuid());
+       run = vcpu_state(vm, VCPU_ID);
+
+       rc = kvm_check_cap(KVM_CAP_X86_USER_SPACE_MSR);
+       TEST_ASSERT(rc, "KVM_CAP_X86_USER_SPACE_MSR is available");
+       vm_enable_cap(vm, &cap);
+
+       rc = kvm_check_cap(KVM_CAP_X86_MSR_ALLOWLIST);
+       TEST_ASSERT(rc, "KVM_CAP_X86_MSR_ALLOWLIST is available");
+
+       /* Set up MSR allowlist */
+       for (i = 0; i < ARRAY_SIZE(ranges); i++) {
+               struct kvm_msr_allowlist *a = &ranges[i].allow;
+               u32 bitmap_size = a->nmsrs / BITS_PER_BYTE;
+               struct kvm_msr_allowlist *range = malloc(sizeof(*a) + bitmap_size);
+
+               TEST_ASSERT(range, "range alloc failed (%ld bytes)\n", sizeof(*a) + bitmap_size);
+
+               *range = *a;
+
+               /* Allow everything by default */
+               memset(range->bitmap, 0xff, bitmap_size);
+
+               if (ranges[i].populate)
+                       ranges[i].populate(range);
+
+               vm_ioctl(vm, KVM_X86_ADD_MSR_ALLOWLIST, range);
+       }
+
+       while (1) {
+               rc = _vcpu_run(vm, VCPU_ID);
+
+               TEST_ASSERT(rc == 0, "vcpu_run failed: %d\n", rc);
+
+               switch (run->exit_reason) {
+               case KVM_EXIT_X86_RDMSR:
+                       handle_rdmsr(run);
+                       break;
+               case KVM_EXIT_X86_WRMSR:
+                       handle_wrmsr(run);
+                       break;
+               case KVM_EXIT_IO:
+                       if (handle_ucall(vm))
+                               goto done;
+                       break;
+               }
+
+       }
+
+done:
+       TEST_ASSERT(msr_reads == 1, "Handled 1 rdmsr in user space");
+       TEST_ASSERT(msr_writes == 1, "Handled 1 wrmsr in user space");
+
+       kvm_vm_free(vm);
+
+       return 0;
+}
-- 
2.28.0.220.ged08abb693-goog



* [PATCH v3 04/12] KVM: x86: Add ioctl for accepting a userspace provided MSR list
  2020-08-18 21:15 [PATCH v3 00/12] Allow userspace to manage MSRs Aaron Lewis
                   ` (2 preceding siblings ...)
  2020-08-18 21:15 ` [PATCH v3 03/12] KVM: selftests: Add test for user space MSR handling Aaron Lewis
@ 2020-08-18 21:15 ` Aaron Lewis
  2020-08-19  9:00   ` Alexander Graf
  2020-08-18 21:15 ` [PATCH v3 05/12] KVM: x86: Add support for exiting to userspace on rdmsr or wrmsr Aaron Lewis
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 50+ messages in thread
From: Aaron Lewis @ 2020-08-18 21:15 UTC (permalink / raw)
  To: jmattson, graf; +Cc: pshier, oupton, kvm, Aaron Lewis

Add KVM_SET_EXIT_MSRS ioctl to allow userspace to pass in a list of MSRs
that force an exit to userspace when rdmsr or wrmsr are used by the
guest.

KVM_SET_EXIT_MSRS will need to be called before any vCPUs are
created to protect the 'user_exit_msrs' list from being mutated while
vCPUs are running.

Add KVM_CAP_SET_MSR_EXITS to advertise that the feature exists.
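
For example (sketch only), the ordering constraint means the call has to
sit between VM and vCPU creation, with "list" a populated struct
kvm_msr_list as in the cover letter sketch:

  vm_fd = ioctl(kvm_fd, KVM_CREATE_VM, 0);

  if (ioctl(vm_fd, KVM_SET_EXIT_MSRS, list) < 0) {
          /* errno == EINVAL: a vCPU already exists */
          /* errno == EEXIST: an MSR list was already supplied */
  }

  vcpu_fd = ioctl(vm_fd, KVM_CREATE_VCPU, 0);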

Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
---
 Documentation/virt/kvm/api.rst  | 24 +++++++++++++++++++
 arch/x86/include/asm/kvm_host.h |  2 ++
 arch/x86/kvm/x86.c              | 41 +++++++++++++++++++++++++++++++++
 include/uapi/linux/kvm.h        |  2 ++
 4 files changed, 69 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 91ce3e4b5b2e..e3cf1e971d0f 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -1010,6 +1010,30 @@ such as migration.
 :Parameters: struct kvm_vcpu_event (out)
 :Returns: 0 on success, -1 on error
 
+4.32 KVM_SET_EXIT_MSRS
+----------------------
+
+:Capability: KVM_CAP_SET_MSR_EXITS
+:Architectures: x86
+:Type: vm ioctl
+:Parameters: struct kvm_msr_list (in)
+:Returns: 0 on success, -1 on error
+
+Sets the userspace MSR list which is used to track which MSRs KVM should send
+to userspace to be serviced when the guest executes rdmsr or wrmsr.
+
+This ioctl needs to be called before any vCPUs are set up; otherwise the list
+of MSRs will not be accepted and an EINVAL error will be returned.  Also, if a
+list of MSRs has already been supplied and this ioctl is called again, an
+EEXIST error will be returned.
+
+::
+
+  struct kvm_msr_list {
+          __u32 nmsrs;
+          __u32 indices[0];
+  };
+
 X86:
 ^^^^
 
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 1ee8468c913c..6c4c5b972395 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -976,6 +976,8 @@ struct kvm_arch {
 
 	struct kvm_pmu_event_filter *pmu_event_filter;
 	struct task_struct *nx_lpage_recovery_thread;
+
+	struct kvm_msr_list *user_exit_msrs;
 };
 
 struct kvm_vm_stat {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c46a709be532..e349d51d5d65 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3570,6 +3570,42 @@ static inline bool kvm_can_mwait_in_guest(void)
 		boot_cpu_has(X86_FEATURE_ARAT);
 }
 
+static int kvm_vm_ioctl_set_exit_msrs(struct kvm *kvm,
+				      struct kvm_msr_list __user *user_msr_list)
+{
+	struct kvm_msr_list *msr_list, hdr;
+	size_t indices_size;
+
+	if (kvm->arch.user_exit_msrs != NULL)
+		return -EEXIST;
+
+	if (kvm->created_vcpus)
+		return -EINVAL;
+
+	if (copy_from_user(&hdr, user_msr_list, sizeof(hdr)))
+		return -EFAULT;
+
+	if (hdr.nmsrs >= MAX_IO_MSRS)
+		return -E2BIG;
+
+	indices_size = sizeof(hdr.indices[0]) * hdr.nmsrs;
+	msr_list = kvzalloc(sizeof(struct kvm_msr_list) + indices_size,
+			    GFP_KERNEL_ACCOUNT);
+	if (!msr_list)
+		return -ENOMEM;
+	msr_list->nmsrs = hdr.nmsrs;
+
+	if (copy_from_user(msr_list->indices, user_msr_list->indices,
+			   indices_size)) {
+		kvfree(msr_list);
+		return -EFAULT;
+	}
+
+	kvm->arch.user_exit_msrs = msr_list;
+
+	return 0;
+}
+
 int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 {
 	int r = 0;
@@ -3630,6 +3666,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_LAST_CPU:
 	case KVM_CAP_X86_USER_SPACE_MSR:
 	case KVM_CAP_X86_MSR_ALLOWLIST:
+	case KVM_CAP_SET_MSR_EXITS:
 		r = 1;
 		break;
 	case KVM_CAP_SYNC_REGS:
@@ -5532,6 +5569,10 @@ long kvm_arch_vm_ioctl(struct file *filp,
 		r = 0;
 		break;
 	}
+	case KVM_SET_EXIT_MSRS: {
+		r = kvm_vm_ioctl_set_exit_msrs(kvm, argp);
+		break;
+	}
 	case KVM_MEMORY_ENCRYPT_OP: {
 		r = -ENOTTY;
 		if (kvm_x86_ops.mem_enc_op)
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 374021dc4e61..7d47d518a5d4 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1046,6 +1046,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_S390_DIAG318 186
 #define KVM_CAP_X86_USER_SPACE_MSR 187
 #define KVM_CAP_X86_MSR_ALLOWLIST 188
+#define KVM_CAP_SET_MSR_EXITS 189
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -1386,6 +1387,7 @@ struct kvm_s390_ucas_mapping {
 /* Available with KVM_CAP_PMU_EVENT_FILTER */
 #define KVM_SET_PMU_EVENT_FILTER  _IOW(KVMIO,  0xb2, struct kvm_pmu_event_filter)
 #define KVM_PPC_SVM_OFF		  _IO(KVMIO,  0xb3)
+#define KVM_SET_EXIT_MSRS	_IOW(KVMIO, 0xb4, struct kvm_msr_list)
 
 /* ioctl for vm fd */
 #define KVM_CREATE_DEVICE	  _IOWR(KVMIO,  0xe0, struct kvm_create_device)
-- 
2.28.0.220.ged08abb693-goog



* [PATCH v3 05/12] KVM: x86: Add support for exiting to userspace on rdmsr or wrmsr
  2020-08-18 21:15 [PATCH v3 00/12] Allow userspace to manage MSRs Aaron Lewis
                   ` (3 preceding siblings ...)
  2020-08-18 21:15 ` [PATCH v3 04/12] KVM: x86: Add ioctl for accepting a userspace provided MSR list Aaron Lewis
@ 2020-08-18 21:15 ` Aaron Lewis
  2020-08-19 10:25   ` Alexander Graf
  2020-08-20 18:17   ` Jim Mattson
  2020-08-18 21:15 ` [PATCH v3 06/12] KVM: x86: Prepare MSR bitmaps for userspace tracked MSRs Aaron Lewis
                   ` (6 subsequent siblings)
  11 siblings, 2 replies; 50+ messages in thread
From: Aaron Lewis @ 2020-08-18 21:15 UTC (permalink / raw)
  To: jmattson, graf; +Cc: pshier, oupton, kvm, Aaron Lewis

Add support for exiting to userspace on a rdmsr or wrmsr instruction if
the MSR being read from or written to is in the user_exit_msrs list.

Signed-off-by: Aaron Lewis <aaronlewis@google.com>
---

v2 -> v3

  - Refactored commit based on Alexander Graf's changes in the first commit
    in this series.  Changes made were:
      - Updated member 'inject_gp' to 'error' based on struct msr in kvm_run.
      - Move flag 'vcpu->kvm->arch.user_space_msr_enabled' out of
        kvm_msr_user_space() to allow it to work with both methods that bounce
        to userspace (msr list and #GP fallback).  Updated caller functions
        to account for this change.
      - trace_kvm_msr has been moved up and combined with a previous call in
        complete_emulated_msr() based on the suggestion made by Alexander
        Graf <graf@amazon.com>.

---
 arch/x86/kvm/trace.h | 24 ++++++++++++++
 arch/x86/kvm/x86.c   | 76 ++++++++++++++++++++++++++++++++++++++------
 2 files changed, 90 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index b66432b015d2..755610befbb5 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -367,6 +367,30 @@ TRACE_EVENT(kvm_msr,
 #define trace_kvm_msr_read_ex(ecx)         trace_kvm_msr(0, ecx, 0, true)
 #define trace_kvm_msr_write_ex(ecx, data)  trace_kvm_msr(1, ecx, data, true)
 
+TRACE_EVENT(kvm_userspace_msr,
+	TP_PROTO(bool is_write, u8 error, u32 index, u64 data),
+	TP_ARGS(is_write, error, index, data),
+
+	TP_STRUCT__entry(
+		__field(bool,	is_write)
+		__field(u8,	error)
+		__field(u32,	index)
+		__field(u64,	data)
+	),
+
+	TP_fast_assign(
+		__entry->is_write	= is_write;
+		__entry->error	= error;
+		__entry->index		= index;
+		__entry->data		= data;
+	),
+
+	TP_printk("userspace %s %x = 0x%llx, %s",
+		  __entry->is_write ? "wrmsr" : "rdmsr",
+		  __entry->index, __entry->data,
+		  __entry->error ? "error" : "no_error")
+);
+
 /*
  * Tracepoint for guest CR access.
  */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e349d51d5d65..b370b3f4b4f3 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -109,6 +109,8 @@ static void __kvm_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags);
 static void store_regs(struct kvm_vcpu *vcpu);
 static int sync_regs(struct kvm_vcpu *vcpu);
 
+bool kvm_msr_user_exit(struct kvm *kvm, u32 index);
+
 struct kvm_x86_ops kvm_x86_ops __read_mostly;
 EXPORT_SYMBOL_GPL(kvm_x86_ops);
 
@@ -1629,11 +1631,19 @@ EXPORT_SYMBOL_GPL(kvm_set_msr);
 
 static int complete_emulated_msr(struct kvm_vcpu *vcpu, bool is_read)
 {
-	if (vcpu->run->msr.error) {
+	u32 ecx = vcpu->run->msr.index;
+	u64 data = vcpu->run->msr.data;
+	u8 error = vcpu->run->msr.error;
+
+	trace_kvm_userspace_msr(!is_read, error, ecx, data);
+	trace_kvm_msr(!is_read, ecx, data, !!error);
+
+	if (error) {
 		kvm_inject_gp(vcpu, 0);
+		return 1;
 	} else if (is_read) {
-		kvm_rax_write(vcpu, (u32)vcpu->run->msr.data);
-		kvm_rdx_write(vcpu, vcpu->run->msr.data >> 32);
+		kvm_rax_write(vcpu, (u32)data);
+		kvm_rdx_write(vcpu, data >> 32);
 	}
 
 	return kvm_skip_emulated_instruction(vcpu);
@@ -1653,9 +1663,6 @@ static int kvm_msr_user_space(struct kvm_vcpu *vcpu, u32 index,
 			      u32 exit_reason, u64 data,
 			      int (*completion)(struct kvm_vcpu *vcpu))
 {
-	if (!vcpu->kvm->arch.user_space_msr_enabled)
-		return 0;
-
 	vcpu->run->exit_reason = exit_reason;
 	vcpu->run->msr.error = 0;
 	vcpu->run->msr.pad[0] = 0;
@@ -1686,10 +1693,18 @@ int kvm_emulate_rdmsr(struct kvm_vcpu *vcpu)
 	u64 data;
 	int r;
 
+	if (kvm_msr_user_exit(vcpu->kvm, ecx)) {
+		kvm_get_msr_user_space(vcpu, ecx);
+		/* Bounce to user space */
+		return 0;
+	}
+
+
 	r = kvm_get_msr(vcpu, ecx, &data);
 
 	/* MSR read failed? See if we should ask user space */
-	if (r && kvm_get_msr_user_space(vcpu, ecx)) {
+	if (r && vcpu->kvm->arch.user_space_msr_enabled) {
+		kvm_get_msr_user_space(vcpu, ecx);
 		/* Bounce to user space */
 		return 0;
 	}
@@ -1715,10 +1730,17 @@ int kvm_emulate_wrmsr(struct kvm_vcpu *vcpu)
 	u64 data = kvm_read_edx_eax(vcpu);
 	int r;
 
+	if (kvm_msr_user_exit(vcpu->kvm, ecx)) {
+		kvm_set_msr_user_space(vcpu, ecx, data);
+		/* Bounce to user space */
+		return 0;
+	}
+
 	r = kvm_set_msr(vcpu, ecx, data);
 
 	/* MSR write failed? See if we should ask user space */
-	if (r && kvm_set_msr_user_space(vcpu, ecx, data)) {
+	if (r && vcpu->kvm->arch.user_space_msr_enabled) {
+		kvm_set_msr_user_space(vcpu, ecx, data);
 		/* Bounce to user space */
 		return 0;
 	}
@@ -3606,6 +3628,25 @@ static int kvm_vm_ioctl_set_exit_msrs(struct kvm *kvm,
 	return 0;
 }
 
+bool kvm_msr_user_exit(struct kvm *kvm, u32 index)
+{
+	struct kvm_msr_list *exit_msrs;
+	int i;
+
+	exit_msrs = kvm->arch.user_exit_msrs;
+
+	if (!exit_msrs)
+		return false;
+
+	for (i = 0; i < exit_msrs->nmsrs; ++i) {
+		if (exit_msrs->indices[i] == index)
+			return true;
+	}
+
+	return false;
+}
+EXPORT_SYMBOL_GPL(kvm_msr_user_exit);
+
 int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 {
 	int r = 0;
@@ -6640,9 +6681,16 @@ static int emulator_get_msr(struct x86_emulate_ctxt *ctxt,
 	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
 	int r;
 
+	if (kvm_msr_user_exit(vcpu->kvm, msr_index)) {
+		kvm_get_msr_user_space(vcpu, msr_index);
+		/* Bounce to user space */
+		return X86EMUL_IO_NEEDED;
+	}
+
 	r = kvm_get_msr(vcpu, msr_index, pdata);
 
-	if (r && kvm_get_msr_user_space(vcpu, msr_index)) {
+	if (r && vcpu->kvm->arch.user_space_msr_enabled) {
+		kvm_get_msr_user_space(vcpu, msr_index);
 		/* Bounce to user space */
 		return X86EMUL_IO_NEEDED;
 	}
@@ -6656,9 +6704,16 @@ static int emulator_set_msr(struct x86_emulate_ctxt *ctxt,
 	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
 	int r;
 
+	if (kvm_msr_user_exit(vcpu->kvm, msr_index)) {
+		kvm_set_msr_user_space(vcpu, msr_index, data);
+		/* Bounce to user space */
+		return X86EMUL_IO_NEEDED;
+	}
+
 	r = kvm_set_msr(emul_to_vcpu(ctxt), msr_index, data);
 
-	if (r && kvm_set_msr_user_space(vcpu, msr_index, data)) {
+	if (r && vcpu->kvm->arch.user_space_msr_enabled) {
+		kvm_set_msr_user_space(vcpu, msr_index, data);
 		/* Bounce to user space */
 		return X86EMUL_IO_NEEDED;
 	}
@@ -11090,3 +11145,4 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_avic_unaccelerated_access);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_avic_incomplete_ipi);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_avic_ga_log);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_apicv_update_request);
+EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_userspace_msr);
-- 
2.28.0.220.ged08abb693-goog



* [PATCH v3 06/12] KVM: x86: Prepare MSR bitmaps for userspace tracked MSRs
  2020-08-18 21:15 [PATCH v3 00/12] Allow userspace to manage MSRs Aaron Lewis
                   ` (4 preceding siblings ...)
  2020-08-18 21:15 ` [PATCH v3 05/12] KVM: x86: Add support for exiting to userspace on rdmsr or wrmsr Aaron Lewis
@ 2020-08-18 21:15 ` Aaron Lewis
  2020-08-18 21:15 ` [PATCH v3 07/12] KVM: x86: Ensure the MSR bitmap never clears " Aaron Lewis
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 50+ messages in thread
From: Aaron Lewis @ 2020-08-18 21:15 UTC (permalink / raw)
  To: jmattson, graf; +Cc: pshier, oupton, kvm, Aaron Lewis

Prepare vmx and svm for a subsequent change that ensures the MSR permission
bitmap intercepts any MSR that userspace is tracking, so that a guest access
to such an MSR always forces a VM exit.

Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
---
 arch/x86/kvm/svm/svm.c    | 48 +++++++++++-----------
 arch/x86/kvm/vmx/nested.c |  2 +-
 arch/x86/kvm/vmx/vmx.c    | 83 +++++++++++++++++++--------------------
 arch/x86/kvm/vmx/vmx.h    |  2 +-
 4 files changed, 67 insertions(+), 68 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 03dd7bac8034..56e9cf284c2a 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -564,7 +564,7 @@ static bool valid_msr_intercept(u32 index)
 	return false;
 }
 
-static bool msr_write_intercepted(struct kvm_vcpu *vcpu, unsigned msr)
+static bool msr_write_intercepted(struct kvm_vcpu *vcpu, u32 msr)
 {
 	u8 bit_write;
 	unsigned long tmp;
@@ -583,9 +583,11 @@ static bool msr_write_intercepted(struct kvm_vcpu *vcpu, unsigned msr)
 	return !!test_bit(bit_write,  &tmp);
 }
 
-static void set_msr_interception(u32 *msrpm, unsigned msr,
-				 int read, int write)
+static void set_msr_interception(struct kvm_vcpu *vcpu, u32 msr, int read,
+				 int write)
 {
+	struct vcpu_svm *svm = to_svm(vcpu);
+	u32 *msrpm = svm->msrpm;
 	u8 bit_read, bit_write;
 	unsigned long tmp;
 	u32 offset;
@@ -609,7 +611,7 @@ static void set_msr_interception(u32 *msrpm, unsigned msr,
 	msrpm[offset] = tmp;
 }
 
-static void svm_vcpu_init_msrpm(u32 *msrpm)
+static void svm_vcpu_init_msrpm(struct kvm_vcpu *vcpu, u32 *msrpm)
 {
 	int i;
 
@@ -619,7 +621,7 @@ static void svm_vcpu_init_msrpm(u32 *msrpm)
 		if (!direct_access_msrs[i].always)
 			continue;
 
-		set_msr_interception(msrpm, direct_access_msrs[i].index, 1, 1);
+		set_msr_interception(vcpu, direct_access_msrs[i].index, 1, 1);
 	}
 }
 
@@ -666,26 +668,26 @@ static void init_msrpm_offsets(void)
 	}
 }
 
-static void svm_enable_lbrv(struct vcpu_svm *svm)
+static void svm_enable_lbrv(struct kvm_vcpu *vcpu)
 {
-	u32 *msrpm = svm->msrpm;
+	struct vcpu_svm *svm = to_svm(vcpu);
 
 	svm->vmcb->control.virt_ext |= LBR_CTL_ENABLE_MASK;
-	set_msr_interception(msrpm, MSR_IA32_LASTBRANCHFROMIP, 1, 1);
-	set_msr_interception(msrpm, MSR_IA32_LASTBRANCHTOIP, 1, 1);
-	set_msr_interception(msrpm, MSR_IA32_LASTINTFROMIP, 1, 1);
-	set_msr_interception(msrpm, MSR_IA32_LASTINTTOIP, 1, 1);
+	set_msr_interception(vcpu, MSR_IA32_LASTBRANCHFROMIP, 1, 1);
+	set_msr_interception(vcpu, MSR_IA32_LASTBRANCHTOIP, 1, 1);
+	set_msr_interception(vcpu, MSR_IA32_LASTINTFROMIP, 1, 1);
+	set_msr_interception(vcpu, MSR_IA32_LASTINTTOIP, 1, 1);
 }
 
-static void svm_disable_lbrv(struct vcpu_svm *svm)
+static void svm_disable_lbrv(struct kvm_vcpu *vcpu)
 {
-	u32 *msrpm = svm->msrpm;
+	struct vcpu_svm *svm = to_svm(vcpu);
 
 	svm->vmcb->control.virt_ext &= ~LBR_CTL_ENABLE_MASK;
-	set_msr_interception(msrpm, MSR_IA32_LASTBRANCHFROMIP, 0, 0);
-	set_msr_interception(msrpm, MSR_IA32_LASTBRANCHTOIP, 0, 0);
-	set_msr_interception(msrpm, MSR_IA32_LASTINTFROMIP, 0, 0);
-	set_msr_interception(msrpm, MSR_IA32_LASTINTTOIP, 0, 0);
+	set_msr_interception(vcpu, MSR_IA32_LASTBRANCHFROMIP, 0, 0);
+	set_msr_interception(vcpu, MSR_IA32_LASTBRANCHTOIP, 0, 0);
+	set_msr_interception(vcpu, MSR_IA32_LASTINTFROMIP, 0, 0);
+	set_msr_interception(vcpu, MSR_IA32_LASTINTTOIP, 0, 0);
 }
 
 void disable_nmi_singlestep(struct vcpu_svm *svm)
@@ -1211,10 +1213,10 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
 	clear_page(svm->nested.hsave);
 
 	svm->msrpm = page_address(msrpm_pages);
-	svm_vcpu_init_msrpm(svm->msrpm);
+	svm_vcpu_init_msrpm(vcpu, svm->msrpm);
 
 	svm->nested.msrpm = page_address(nested_msrpm_pages);
-	svm_vcpu_init_msrpm(svm->nested.msrpm);
+	svm_vcpu_init_msrpm(vcpu, svm->nested.msrpm);
 
 	svm->vmcb = page_address(page);
 	clear_page(svm->vmcb);
@@ -2556,7 +2558,7 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 		 * We update the L1 MSR bit as well since it will end up
 		 * touching the MSR anyway now.
 		 */
-		set_msr_interception(svm->msrpm, MSR_IA32_SPEC_CTRL, 1, 1);
+		set_msr_interception(vcpu, MSR_IA32_SPEC_CTRL, 1, 1);
 		break;
 	case MSR_IA32_PRED_CMD:
 		if (!msr->host_initiated &&
@@ -2571,7 +2573,7 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 			break;
 
 		wrmsrl(MSR_IA32_PRED_CMD, PRED_CMD_IBPB);
-		set_msr_interception(svm->msrpm, MSR_IA32_PRED_CMD, 0, 1);
+		set_msr_interception(vcpu, MSR_IA32_PRED_CMD, 0, 1);
 		break;
 	case MSR_AMD64_VIRT_SPEC_CTRL:
 		if (!msr->host_initiated &&
@@ -2635,9 +2637,9 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 		svm->vmcb->save.dbgctl = data;
 		vmcb_mark_dirty(svm->vmcb, VMCB_LBR);
 		if (data & (1ULL<<0))
-			svm_enable_lbrv(svm);
+			svm_enable_lbrv(vcpu);
 		else
-			svm_disable_lbrv(svm);
+			svm_disable_lbrv(vcpu);
 		break;
 	case MSR_VM_HSAVE_PA:
 		svm->nested.hsave_msr = data;
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 23b58c28a1c9..d50eaaf36f70 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4752,7 +4752,7 @@ static int enter_vmx_operation(struct kvm_vcpu *vcpu)
 
 	if (vmx_pt_mode_is_host_guest()) {
 		vmx->pt_desc.guest.ctl = 0;
-		pt_update_intercept_for_msr(vmx);
+		pt_update_intercept_for_msr(vcpu);
 	}
 
 	return 0;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 46ba2e03a892..de03df72e742 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -343,7 +343,7 @@ module_param_cb(vmentry_l1d_flush, &vmentry_l1d_flush_ops, NULL, 0644);
 
 static bool guest_state_valid(struct kvm_vcpu *vcpu);
 static u32 vmx_segment_access_rights(struct kvm_segment *var);
-static __always_inline void vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
+static __always_inline void vmx_disable_intercept_for_msr(struct kvm_vcpu *vcpu,
 							  u32 msr, int type);
 
 void vmx_vmexit(void);
@@ -2082,7 +2082,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		 * in the merging. We update the vmcs01 here for L1 as well
 		 * since it will end up touching the MSR anyway now.
 		 */
-		vmx_disable_intercept_for_msr(vmx->vmcs01.msr_bitmap,
+		vmx_disable_intercept_for_msr(vcpu,
 					      MSR_IA32_SPEC_CTRL,
 					      MSR_TYPE_RW);
 		break;
@@ -2118,8 +2118,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		 * vmcs02.msr_bitmap here since it gets completely overwritten
 		 * in the merging.
 		 */
-		vmx_disable_intercept_for_msr(vmx->vmcs01.msr_bitmap, MSR_IA32_PRED_CMD,
-					      MSR_TYPE_W);
+		vmx_disable_intercept_for_msr(vcpu, MSR_IA32_PRED_CMD, MSR_TYPE_W);
 		break;
 	case MSR_IA32_CR_PAT:
 		if (!kvm_pat_valid(data))
@@ -2169,7 +2168,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			return 1;
 		vmcs_write64(GUEST_IA32_RTIT_CTL, data);
 		vmx->pt_desc.guest.ctl = data;
-		pt_update_intercept_for_msr(vmx);
+		pt_update_intercept_for_msr(vcpu);
 		break;
 	case MSR_IA32_RTIT_STATUS:
 		if (!pt_can_write_msr(vmx))
@@ -3688,9 +3687,11 @@ void free_vpid(int vpid)
 	spin_unlock(&vmx_vpid_lock);
 }
 
-static __always_inline void vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
+static __always_inline void vmx_disable_intercept_for_msr(struct kvm_vcpu *vcpu,
 							  u32 msr, int type)
 {
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	unsigned long *msr_bitmap = vmx->vmcs01.msr_bitmap;
 	int f = sizeof(unsigned long);
 
 	if (!cpu_has_vmx_msr_bitmap())
@@ -3726,9 +3727,11 @@ static __always_inline void vmx_disable_intercept_for_msr(unsigned long *msr_bit
 	}
 }
 
-static __always_inline void vmx_enable_intercept_for_msr(unsigned long *msr_bitmap,
+static __always_inline void vmx_enable_intercept_for_msr(struct kvm_vcpu *vcpu,
 							 u32 msr, int type)
 {
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	unsigned long *msr_bitmap = vmx->vmcs01.msr_bitmap;
 	int f = sizeof(unsigned long);
 
 	if (!cpu_has_vmx_msr_bitmap())
@@ -3764,13 +3767,13 @@ static __always_inline void vmx_enable_intercept_for_msr(unsigned long *msr_bitm
 	}
 }
 
-static __always_inline void vmx_set_intercept_for_msr(unsigned long *msr_bitmap,
-			     			      u32 msr, int type, bool value)
+static __always_inline void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu,
+						      u32 msr, int type, bool value)
 {
 	if (value)
-		vmx_enable_intercept_for_msr(msr_bitmap, msr, type);
+		vmx_enable_intercept_for_msr(vcpu, msr, type);
 	else
-		vmx_disable_intercept_for_msr(msr_bitmap, msr, type);
+		vmx_disable_intercept_for_msr(vcpu, msr, type);
 }
 
 static u8 vmx_msr_bitmap_mode(struct kvm_vcpu *vcpu)
@@ -3788,8 +3791,8 @@ static u8 vmx_msr_bitmap_mode(struct kvm_vcpu *vcpu)
 	return mode;
 }
 
-static void vmx_update_msr_bitmap_x2apic(unsigned long *msr_bitmap,
-					 u8 mode)
+static void vmx_update_msr_bitmap_x2apic(struct kvm_vcpu *vcpu,
+					 unsigned long *msr_bitmap, u8 mode)
 {
 	int msr;
 
@@ -3804,11 +3807,11 @@ static void vmx_update_msr_bitmap_x2apic(unsigned long *msr_bitmap,
 		 * TPR reads and writes can be virtualized even if virtual interrupt
 		 * delivery is not in use.
 		 */
-		vmx_disable_intercept_for_msr(msr_bitmap, X2APIC_MSR(APIC_TASKPRI), MSR_TYPE_RW);
+		vmx_disable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_TASKPRI), MSR_TYPE_RW);
 		if (mode & MSR_BITMAP_MODE_X2APIC_APICV) {
-			vmx_enable_intercept_for_msr(msr_bitmap, X2APIC_MSR(APIC_TMCCT), MSR_TYPE_R);
-			vmx_disable_intercept_for_msr(msr_bitmap, X2APIC_MSR(APIC_EOI), MSR_TYPE_W);
-			vmx_disable_intercept_for_msr(msr_bitmap, X2APIC_MSR(APIC_SELF_IPI), MSR_TYPE_W);
+			vmx_enable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_TMCCT), MSR_TYPE_R);
+			vmx_disable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_EOI), MSR_TYPE_W);
+			vmx_disable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_SELF_IPI), MSR_TYPE_W);
 		}
 	}
 }
@@ -3824,30 +3827,24 @@ void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu)
 		return;
 
 	if (changed & (MSR_BITMAP_MODE_X2APIC | MSR_BITMAP_MODE_X2APIC_APICV))
-		vmx_update_msr_bitmap_x2apic(msr_bitmap, mode);
+		vmx_update_msr_bitmap_x2apic(vcpu, msr_bitmap, mode);
 
 	vmx->msr_bitmap_mode = mode;
 }
 
-void pt_update_intercept_for_msr(struct vcpu_vmx *vmx)
+void pt_update_intercept_for_msr(struct kvm_vcpu *vcpu)
 {
-	unsigned long *msr_bitmap = vmx->vmcs01.msr_bitmap;
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	bool flag = !(vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN);
 	u32 i;
 
-	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_RTIT_STATUS,
-							MSR_TYPE_RW, flag);
-	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_RTIT_OUTPUT_BASE,
-							MSR_TYPE_RW, flag);
-	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_RTIT_OUTPUT_MASK,
-							MSR_TYPE_RW, flag);
-	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_RTIT_CR3_MATCH,
-							MSR_TYPE_RW, flag);
+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_RTIT_STATUS, MSR_TYPE_RW, flag);
+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_RTIT_OUTPUT_BASE, MSR_TYPE_RW, flag);
+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_RTIT_OUTPUT_MASK, MSR_TYPE_RW, flag);
+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_RTIT_CR3_MATCH, MSR_TYPE_RW, flag);
 	for (i = 0; i < vmx->pt_desc.addr_range; i++) {
-		vmx_set_intercept_for_msr(msr_bitmap,
-			MSR_IA32_RTIT_ADDR0_A + i * 2, MSR_TYPE_RW, flag);
-		vmx_set_intercept_for_msr(msr_bitmap,
-			MSR_IA32_RTIT_ADDR0_B + i * 2, MSR_TYPE_RW, flag);
+		vmx_set_intercept_for_msr(vcpu, MSR_IA32_RTIT_ADDR0_A + i * 2, MSR_TYPE_RW, flag);
+		vmx_set_intercept_for_msr(vcpu, MSR_IA32_RTIT_ADDR0_B + i * 2, MSR_TYPE_RW, flag);
 	}
 }
 
@@ -6980,18 +6977,18 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
 		goto free_pml;
 
 	msr_bitmap = vmx->vmcs01.msr_bitmap;
-	vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_TSC, MSR_TYPE_R);
-	vmx_disable_intercept_for_msr(msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW);
-	vmx_disable_intercept_for_msr(msr_bitmap, MSR_GS_BASE, MSR_TYPE_RW);
-	vmx_disable_intercept_for_msr(msr_bitmap, MSR_KERNEL_GS_BASE, MSR_TYPE_RW);
-	vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_CS, MSR_TYPE_RW);
-	vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_ESP, MSR_TYPE_RW);
-	vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_EIP, MSR_TYPE_RW);
+	vmx_disable_intercept_for_msr(vcpu, MSR_IA32_TSC, MSR_TYPE_R);
+	vmx_disable_intercept_for_msr(vcpu, MSR_FS_BASE, MSR_TYPE_RW);
+	vmx_disable_intercept_for_msr(vcpu, MSR_GS_BASE, MSR_TYPE_RW);
+	vmx_disable_intercept_for_msr(vcpu, MSR_KERNEL_GS_BASE, MSR_TYPE_RW);
+	vmx_disable_intercept_for_msr(vcpu, MSR_IA32_SYSENTER_CS, MSR_TYPE_RW);
+	vmx_disable_intercept_for_msr(vcpu, MSR_IA32_SYSENTER_ESP, MSR_TYPE_RW);
+	vmx_disable_intercept_for_msr(vcpu, MSR_IA32_SYSENTER_EIP, MSR_TYPE_RW);
 	if (kvm_cstate_in_guest(vcpu->kvm)) {
-		vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C1_RES, MSR_TYPE_R);
-		vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C3_RESIDENCY, MSR_TYPE_R);
-		vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C6_RESIDENCY, MSR_TYPE_R);
-		vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C7_RESIDENCY, MSR_TYPE_R);
+		vmx_disable_intercept_for_msr(vcpu, MSR_CORE_C1_RES, MSR_TYPE_R);
+		vmx_disable_intercept_for_msr(vcpu, MSR_CORE_C3_RESIDENCY, MSR_TYPE_R);
+		vmx_disable_intercept_for_msr(vcpu, MSR_CORE_C6_RESIDENCY, MSR_TYPE_R);
+		vmx_disable_intercept_for_msr(vcpu, MSR_CORE_C7_RESIDENCY, MSR_TYPE_R);
 	}
 	vmx->msr_bitmap_mode = 0;
 
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 26175a4759fa..8767d5c30bf1 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -351,7 +351,7 @@ bool vmx_get_nmi_mask(struct kvm_vcpu *vcpu);
 void vmx_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked);
 void vmx_set_virtual_apic_mode(struct kvm_vcpu *vcpu);
 struct shared_msr_entry *find_msr_entry(struct vcpu_vmx *vmx, u32 msr);
-void pt_update_intercept_for_msr(struct vcpu_vmx *vmx);
+void pt_update_intercept_for_msr(struct kvm_vcpu *vcpu);
 void vmx_update_host_rsp(struct vcpu_vmx *vmx, unsigned long host_rsp);
 int vmx_find_msr_index(struct vmx_msrs *m, u32 msr);
 int vmx_handle_memory_failure(struct kvm_vcpu *vcpu, int r,
-- 
2.28.0.220.ged08abb693-goog


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 07/12] KVM: x86: Ensure the MSR bitmap never clears userspace tracked MSRs
  2020-08-18 21:15 [PATCH v3 00/12] Allow userspace to manage MSRs Aaron Lewis
                   ` (5 preceding siblings ...)
  2020-08-18 21:15 ` [PATCH v3 06/12] KVM: x86: Prepare MSR bitmaps for userspace tracked MSRs Aaron Lewis
@ 2020-08-18 21:15 ` Aaron Lewis
  2020-08-19  1:12     ` kernel test robot
                     ` (3 more replies)
  2020-08-18 21:15 ` [PATCH v3 08/12] selftests: kvm: Fix the segment descriptor layout to match the actual layout Aaron Lewis
                   ` (4 subsequent siblings)
  11 siblings, 4 replies; 50+ messages in thread
From: Aaron Lewis @ 2020-08-18 21:15 UTC (permalink / raw)
  To: jmattson, graf; +Cc: pshier, oupton, kvm, Aaron Lewis

SDM volume 3: 24.6.9 "MSR-Bitmap Address" and APM volume 2: 15.11 "MSR
intercepts" describe MSR permission bitmaps.  Permission bitmaps are
used to control whether an execution of rdmsr or wrmsr will cause a
vm exit.  Userspace-tracked MSRs must cause a vm exit so the host can
forward the access to userspace.  This change adds vmx/svm support to
ensure the permission bitmaps are properly set to cause a vm exit to
the host when rdmsr or wrmsr is used on any of the userspace-tracked
MSRs.  Also, to avoid setting the bitmaps repeatedly,
kvm_make_request() is used to coalesce the updates into a single pass.
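
For reference, the coalescing follows KVM's usual deferred-request
pattern; a minimal sketch using the names introduced below:

	/* Producer: any intercept update queues one deferred pass. */
	kvm_make_request(KVM_REQ_USER_MSR_UPDATE, vcpu);

	/* Consumer: drained once in vcpu_enter_guest() before reentry. */
	if (kvm_check_request(KVM_REQ_USER_MSR_UPDATE, vcpu))
		kvm_set_user_msr_intercepts(vcpu);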

Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
---
 arch/x86/include/asm/kvm_host.h |  3 ++
 arch/x86/kvm/svm/svm.c          | 49 ++++++++++++++++++++++++++-------
 arch/x86/kvm/vmx/vmx.c          | 13 ++++++++-
 arch/x86/kvm/x86.c              | 16 +++++++++++
 4 files changed, 70 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 6c4c5b972395..65e9dcc19cc2 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -87,6 +87,7 @@
 #define KVM_REQ_HV_TLB_FLUSH \
 	KVM_ARCH_REQ_FLAGS(27, KVM_REQUEST_NO_WAKEUP)
 #define KVM_REQ_APF_READY		KVM_ARCH_REQ(28)
+#define KVM_REQ_USER_MSR_UPDATE KVM_ARCH_REQ(29)
 
 #define CR0_RESERVED_BITS                                               \
 	(~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \
@@ -1242,6 +1243,8 @@ struct kvm_x86_ops {
 	int (*enable_direct_tlbflush)(struct kvm_vcpu *vcpu);
 
 	void (*migrate_timers)(struct kvm_vcpu *vcpu);
+
+	void (*set_user_msr_intercept)(struct kvm_vcpu *vcpu, u32 msr);
 };
 
 struct kvm_x86_nested_ops {
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 56e9cf284c2a..c49d121ee102 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -583,13 +583,27 @@ static bool msr_write_intercepted(struct kvm_vcpu *vcpu, u32 msr)
 	return !!test_bit(bit_write,  &tmp);
 }
 
+static void __set_msr_interception(u32 *msrpm, u32 msr, int read, int write,
+				   u32 offset)
+{
+	u8 bit_read, bit_write;
+	unsigned long tmp;
+
+	bit_read  = 2 * (msr & 0x0f);
+	bit_write = 2 * (msr & 0x0f) + 1;
+	tmp       = msrpm[offset];
+
+	read  ? clear_bit(bit_read,  &tmp) : set_bit(bit_read,  &tmp);
+	write ? clear_bit(bit_write, &tmp) : set_bit(bit_write, &tmp);
+
+	msrpm[offset] = tmp;
+}
+
 static void set_msr_interception(struct kvm_vcpu *vcpu, u32 msr, int read,
 				 int write)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 	u32 *msrpm = svm->msrpm;
-	u8 bit_read, bit_write;
-	unsigned long tmp;
 	u32 offset;
 
 	/*
@@ -598,17 +612,30 @@ static void set_msr_interception(struct kvm_vcpu *vcpu, u32 msr, int read,
 	 */
 	WARN_ON(!valid_msr_intercept(msr));
 
-	offset    = svm_msrpm_offset(msr);
-	bit_read  = 2 * (msr & 0x0f);
-	bit_write = 2 * (msr & 0x0f) + 1;
-	tmp       = msrpm[offset];
-
+	offset = svm_msrpm_offset(msr);
 	BUG_ON(offset == MSR_INVALID);
 
-	read  ? clear_bit(bit_read,  &tmp) : set_bit(bit_read,  &tmp);
-	write ? clear_bit(bit_write, &tmp) : set_bit(bit_write, &tmp);
+	__set_msr_interception(msrpm, msr, read, write, offset);
 
-	msrpm[offset] = tmp;
+	if (read || write)
+		kvm_make_request(KVM_REQ_USER_MSR_UPDATE, vcpu);
+}
+
+static void set_user_msr_interception(struct kvm_vcpu *vcpu, u32 msr, int read,
+				      int write)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+	u32 *msrpm = svm->msrpm;
+	u32 offset;
+
+	offset = svm_msrpm_offset(msr);
+	if (offset != MSR_INVALID)
+		__set_msr_interception(msrpm, msr, read, write, offset);
+}
+
+static void svm_set_user_msr_intercept(struct kvm_vcpu *vcpu, u32 msr)
+{
+	set_user_msr_interception(vcpu, msr, 0, 0);
 }
 
 static void svm_vcpu_init_msrpm(struct kvm_vcpu *vcpu, u32 *msrpm)
@@ -4153,6 +4180,8 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
 	.need_emulation_on_page_fault = svm_need_emulation_on_page_fault,
 
 	.apic_init_signal_blocked = svm_apic_init_signal_blocked,
+
+	.set_user_msr_intercept = svm_set_user_msr_intercept,
 };
 
 static struct kvm_x86_init_ops svm_init_ops __initdata = {
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index de03df72e742..12478ea7aac7 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3725,6 +3725,10 @@ static __always_inline void vmx_disable_intercept_for_msr(struct kvm_vcpu *vcpu,
 			__clear_bit(msr, msr_bitmap + 0xc00 / f);
 
 	}
+
+	if (type & (MSR_TYPE_R | MSR_TYPE_W))
+		kvm_make_request(KVM_REQ_USER_MSR_UPDATE, vcpu);
 }
 
 static __always_inline void vmx_enable_intercept_for_msr(struct kvm_vcpu *vcpu,
@@ -3792,7 +3796,7 @@ static u8 vmx_msr_bitmap_mode(struct kvm_vcpu *vcpu)
 }
 
 static void vmx_update_msr_bitmap_x2apic(struct kvm_vcpu *vcpu,
-					 unsigned long *msr_bitmap, u8 mode)
+					unsigned long *msr_bitmap, u8 mode)
 {
 	int msr;
 
@@ -3816,6 +3820,11 @@ static void vmx_update_msr_bitmap_x2apic(struct kvm_vcpu *vcpu,
 	}
 }
 
+static void vmx_set_user_msr_intercept(struct kvm_vcpu *vcpu, u32 msr)
+{
+	vmx_enable_intercept_for_msr(vcpu, msr, MSR_TYPE_RW);
+}
+
 void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -8002,6 +8011,8 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
 	.need_emulation_on_page_fault = vmx_need_emulation_on_page_fault,
 	.apic_init_signal_blocked = vmx_apic_init_signal_blocked,
 	.migrate_timers = vmx_migrate_timers,
+
+	.set_user_msr_intercept = vmx_set_user_msr_intercept,
 };
 
 static __init int hardware_setup(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b370b3f4b4f3..44cbcf22ec36 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3647,6 +3647,19 @@ bool kvm_msr_user_exit(struct kvm *kvm, u32 index)
 }
 EXPORT_SYMBOL_GPL(kvm_msr_user_exit);
 
+static void kvm_set_user_msr_intercepts(struct kvm_vcpu *vcpu)
+{
+	struct kvm_msr_list *msr_list = vcpu->kvm->arch.user_exit_msrs;
+	u32 i, msr;
+
+	if (msr_list) {
+		for (i = 0; i < msr_list->nmsrs; i++) {
+			msr = msr_list->indices[i];
+			kvm_x86_ops.set_user_msr_intercept(vcpu, msr);
+		}
+	}
+}
+
 int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 {
 	int r = 0;
@@ -8823,6 +8836,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 			kvm_vcpu_update_apicv(vcpu);
 		if (kvm_check_request(KVM_REQ_APF_READY, vcpu))
 			kvm_check_async_pf_completion(vcpu);
+
+		if (kvm_check_request(KVM_REQ_USER_MSR_UPDATE, vcpu))
+			kvm_set_user_msr_intercepts(vcpu);
 	}
 
 	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
-- 
2.28.0.220.ged08abb693-goog


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 08/12] selftests: kvm: Fix the segment descriptor layout to match the actual layout
  2020-08-18 21:15 [PATCH v3 00/12] Allow userspace to manage MSRs Aaron Lewis
                   ` (6 preceding siblings ...)
  2020-08-18 21:15 ` [PATCH v3 07/12] KVM: x86: Ensure the MSR bitmap never clears " Aaron Lewis
@ 2020-08-18 21:15 ` Aaron Lewis
  2020-08-18 21:15 ` [PATCH v3 09/12] selftests: kvm: Clear uc so UCALL_NONE is being properly reported Aaron Lewis
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 50+ messages in thread
From: Aaron Lewis @ 2020-08-18 21:15 UTC (permalink / raw)
  To: jmattson, graf; +Cc: pshier, oupton, kvm, Aaron Lewis

Fix the layout of 'struct desc64' to match the layout described in the
SDM Vol 3, 3.4.5 Segment Descriptors, Figure 3-8.  The test added later
in this series relies on this and crashes if this layout is not correct.
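
For reference, the access byte (byte 5) of a segment descriptor per SDM
Figure 3-8 is laid out, from bit 0 up, as:

	bits 0-3: type
	bit    4: s    (0 = system, 1 = code/data)
	bits 5-6: dpl
	bit    7: p

As a concrete check, a 64-bit code segment (type=0xb, s=1, dpl=0, p=1)
must pack to 0x9b; with 's' ordered below 'type' as before, those bits
land in the wrong positions and the guest faults on the bogus
descriptor.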

Signed-off-by: Aaron Lewis <aaronlewis@google.com>
---

v2 -> v3

  - Pulled changes to kvm_seg_fill_gdt_64bit() from a subsequent commit to
    here.

---
 tools/testing/selftests/kvm/include/x86_64/processor.h | 2 +-
 tools/testing/selftests/kvm/lib/x86_64/processor.c     | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index 82b7fe16a824..0a65e7bb5249 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -59,7 +59,7 @@ struct gpr64_regs {
 struct desc64 {
 	uint16_t limit0;
 	uint16_t base0;
-	unsigned base1:8, s:1, type:4, dpl:2, p:1;
+	unsigned base1:8, type:4, s:1, dpl:2, p:1;
 	unsigned limit1:4, avl:1, l:1, db:1, g:1, base2:8;
 	uint32_t base3;
 	uint32_t zero1;
diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
index f6eb34eaa0d2..1ccf6c9b3476 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
@@ -392,11 +392,12 @@ static void kvm_seg_fill_gdt_64bit(struct kvm_vm *vm, struct kvm_segment *segp)
 	desc->limit0 = segp->limit & 0xFFFF;
 	desc->base0 = segp->base & 0xFFFF;
 	desc->base1 = segp->base >> 16;
-	desc->s = segp->s;
 	desc->type = segp->type;
+	desc->s = segp->s;
 	desc->dpl = segp->dpl;
 	desc->p = segp->present;
 	desc->limit1 = segp->limit >> 16;
+	desc->avl = segp->avl;
 	desc->l = segp->l;
 	desc->db = segp->db;
 	desc->g = segp->g;
-- 
2.28.0.220.ged08abb693-goog


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 09/12] selftests: kvm: Clear uc so UCALL_NONE is being properly reported
  2020-08-18 21:15 [PATCH v3 00/12] Allow userspace to manage MSRs Aaron Lewis
                   ` (7 preceding siblings ...)
  2020-08-18 21:15 ` [PATCH v3 08/12] selftests: kvm: Fix the segment descriptor layout to match the actual layout Aaron Lewis
@ 2020-08-18 21:15 ` Aaron Lewis
  2020-08-19  9:13   ` Andrew Jones
  2020-08-18 21:15 ` [PATCH v3 10/12] selftests: kvm: Add exception handling to selftests Aaron Lewis
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 50+ messages in thread
From: Aaron Lewis @ 2020-08-18 21:15 UTC (permalink / raw)
  To: jmattson, graf; +Cc: pshier, oupton, kvm, Aaron Lewis

Ensure the out parameter 'uc' in get_ucall() properly reports
UCALL_NONE when the call fails.  The return value is reported
correctly; however, the out parameter 'uc' is not.  Clear the struct
to ensure the correct value is reported in the out parameter.
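
A minimal sketch of the failure mode being fixed (caller-side view, not
part of the patch):

	struct ucall uc;	/* uninitialized stack memory */

	if (get_ucall(vm, VCPU_ID, &uc) == UCALL_NONE) {
		/*
		 * The return value is UCALL_NONE, but without the memset
		 * added below, uc.cmd still holds stack garbage.
		 */
	}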

Signed-off-by: Aaron Lewis <aaronlewis@google.com>
---

v2 -> v3

 - This commit is new to the series.  This was added to have the ucall changes
   separate from the exception handling changes and the addition of the test.
 - Added support on aarch64 and s390x as well.

---
 tools/testing/selftests/kvm/lib/aarch64/ucall.c | 3 +++
 tools/testing/selftests/kvm/lib/s390x/ucall.c   | 3 +++
 tools/testing/selftests/kvm/lib/x86_64/ucall.c  | 3 +++
 3 files changed, 9 insertions(+)

diff --git a/tools/testing/selftests/kvm/lib/aarch64/ucall.c b/tools/testing/selftests/kvm/lib/aarch64/ucall.c
index c8e0ec20d3bf..2f37b90ee1a9 100644
--- a/tools/testing/selftests/kvm/lib/aarch64/ucall.c
+++ b/tools/testing/selftests/kvm/lib/aarch64/ucall.c
@@ -94,6 +94,9 @@ uint64_t get_ucall(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc)
 	struct kvm_run *run = vcpu_state(vm, vcpu_id);
 	struct ucall ucall = {};
 
+	if (uc)
+		memset(uc, 0, sizeof(*uc));
+
 	if (run->exit_reason == KVM_EXIT_MMIO &&
 	    run->mmio.phys_addr == (uint64_t)ucall_exit_mmio_addr) {
 		vm_vaddr_t gva;
diff --git a/tools/testing/selftests/kvm/lib/s390x/ucall.c b/tools/testing/selftests/kvm/lib/s390x/ucall.c
index fd589dc9bfab..9d3b0f15249a 100644
--- a/tools/testing/selftests/kvm/lib/s390x/ucall.c
+++ b/tools/testing/selftests/kvm/lib/s390x/ucall.c
@@ -38,6 +38,9 @@ uint64_t get_ucall(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc)
 	struct kvm_run *run = vcpu_state(vm, vcpu_id);
 	struct ucall ucall = {};
 
+	if (uc)
+		memset(uc, 0, sizeof(*uc));
+
 	if (run->exit_reason == KVM_EXIT_S390_SIEIC &&
 	    run->s390_sieic.icptcode == 4 &&
 	    (run->s390_sieic.ipa >> 8) == 0x83 &&    /* 0x83 means DIAGNOSE */
diff --git a/tools/testing/selftests/kvm/lib/x86_64/ucall.c b/tools/testing/selftests/kvm/lib/x86_64/ucall.c
index da4d89ad5419..a3489973e290 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/ucall.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/ucall.c
@@ -40,6 +40,9 @@ uint64_t get_ucall(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc)
 	struct kvm_run *run = vcpu_state(vm, vcpu_id);
 	struct ucall ucall = {};
 
+	if (uc)
+		memset(uc, 0, sizeof(*uc));
+
 	if (run->exit_reason == KVM_EXIT_IO && run->io.port == UCALL_PIO_PORT) {
 		struct kvm_regs regs;
 
-- 
2.28.0.220.ged08abb693-goog


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 10/12] selftests: kvm: Add exception handling to selftests
  2020-08-18 21:15 [PATCH v3 00/12] Allow userspace to manage MSRs Aaron Lewis
                   ` (8 preceding siblings ...)
  2020-08-18 21:15 ` [PATCH v3 09/12] selftests: kvm: Clear uc so UCALL_NONE is being properly reported Aaron Lewis
@ 2020-08-18 21:15 ` Aaron Lewis
  2020-08-18 21:15 ` [PATCH v3 11/12] selftests: kvm: Add a test to exercise the userspace MSR list Aaron Lewis
  2020-08-18 21:15 ` [PATCH v3 12/12] selftests: kvm: Add emulated rdmsr, wrmsr tests Aaron Lewis
  11 siblings, 0 replies; 50+ messages in thread
From: Aaron Lewis @ 2020-08-18 21:15 UTC (permalink / raw)
  To: jmattson, graf; +Cc: pshier, oupton, kvm, Aaron Lewis

Add the infrastructure needed to enable exception handling in selftests.
This allows any of the exception and interrupt vectors to be overridden
in the guest.
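
A typical caller looks roughly like this (a sketch; the selftest added
later in the series is the real user):

	static void guest_gp_handler(struct ex_regs *regs)
	{
		/* Fix up guest state, e.g. skip the faulting instruction. */
	}

	vm_init_descriptor_tables(vm);
	vcpu_init_descriptor_tables(vm, VCPU_ID);
	vm_handle_exception(vm, GP_VECTOR, guest_gp_handler);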

Signed-off-by: Aaron Lewis <aaronlewis@google.com>
---

v2 -> v3

  - This commit is new to the series.  It was added so that all the
    infrastructure changes needed to support exception handling stand
    alone in one commit.  The selftest that was included with this change
    is now in its own commit.
  - Removed 'dummy' variable.  This was added to match other regs structs, but
    wasn't needed.  Removed stack adjustment for this in handlers.S as well.

---
 tools/testing/selftests/kvm/Makefile          |  19 ++--
 .../selftests/kvm/include/x86_64/processor.h  |  24 +++++
 tools/testing/selftests/kvm/lib/kvm_util.c    |  15 +++
 .../selftests/kvm/lib/kvm_util_internal.h     |   2 +
 .../selftests/kvm/lib/x86_64/handlers.S       |  81 ++++++++++++++
 .../selftests/kvm/lib/x86_64/processor.c      | 100 +++++++++++++++++-
 6 files changed, 232 insertions(+), 9 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/lib/x86_64/handlers.S

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 80d5c348354c..6ba4f61a9765 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -34,7 +34,7 @@ ifeq ($(ARCH),s390)
 endif
 
 LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/sparsebit.c lib/test_util.c
-LIBKVM_x86_64 = lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c
+LIBKVM_x86_64 = lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c lib/x86_64/handlers.S
 LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c
 LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c
 
@@ -109,14 +109,21 @@ LDFLAGS += -pthread $(no-pie-option) $(pgste-option)
 include ../lib.mk
 
 STATIC_LIBS := $(OUTPUT)/libkvm.a
-LIBKVM_OBJ := $(patsubst %.c, $(OUTPUT)/%.o, $(LIBKVM))
-EXTRA_CLEAN += $(LIBKVM_OBJ) $(STATIC_LIBS) cscope.*
+LIBKVM_C := $(filter %.c,$(LIBKVM))
+LIBKVM_S := $(filter %.S,$(LIBKVM))
+LIBKVM_C_OBJ := $(patsubst %.c, $(OUTPUT)/%.o, $(LIBKVM_C))
+LIBKVM_S_OBJ := $(patsubst %.S, $(OUTPUT)/%.o, $(LIBKVM_S))
+EXTRA_CLEAN += $(LIBKVM_C_OBJ) $(LIBKVM_S_OBJ) $(STATIC_LIBS) cscope.*
+
+x := $(shell mkdir -p $(sort $(dir $(LIBKVM_C_OBJ) $(LIBKVM_S_OBJ))))
+$(LIBKVM_C_OBJ): $(OUTPUT)/%.o: %.c
+	$(CC) $(CFLAGS) $(CPPFLAGS) $(TARGET_ARCH) -c $< -o $@
 
-x := $(shell mkdir -p $(sort $(dir $(LIBKVM_OBJ))))
-$(LIBKVM_OBJ): $(OUTPUT)/%.o: %.c
+$(LIBKVM_S_OBJ): $(OUTPUT)/%.o: %.S
 	$(CC) $(CFLAGS) $(CPPFLAGS) $(TARGET_ARCH) -c $< -o $@
 
-$(OUTPUT)/libkvm.a: $(LIBKVM_OBJ)
+LIBKVM_OBJS = $(LIBKVM_C_OBJ) $(LIBKVM_S_OBJ)
+$(OUTPUT)/libkvm.a: $(LIBKVM_OBJS)
 	$(AR) crs $@ $^
 
 x := $(shell mkdir -p $(sort $(dir $(TEST_GEN_PROGS))))
diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index 0a65e7bb5249..02530dc6339b 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -36,6 +36,8 @@
 #define X86_CR4_SMAP		(1ul << 21)
 #define X86_CR4_PKE		(1ul << 22)
 
+#define UNEXPECTED_VECTOR_PORT 0xfff0u
+
 /* General Registers in 64-Bit Mode */
 struct gpr64_regs {
 	u64 rax;
@@ -239,6 +241,11 @@ static inline struct desc_ptr get_idt(void)
 	return idt;
 }
 
+static inline void outl(uint16_t port, uint32_t value)
+{
+	__asm__ __volatile__("outl %%eax, %%dx" : : "d"(port), "a"(value));
+}
+
 #define SET_XMM(__var, __xmm) \
 	asm volatile("movq %0, %%"#__xmm : : "r"(__var) : #__xmm)
 
@@ -338,6 +345,23 @@ uint32_t kvm_get_cpuid_max_basic(void);
 uint32_t kvm_get_cpuid_max_extended(void);
 void kvm_get_cpu_address_width(unsigned int *pa_bits, unsigned int *va_bits);
 
+struct ex_regs {
+	uint64_t rax, rcx, rdx, rbx;
+	uint64_t rbp, rsi, rdi;
+	uint64_t r8, r9, r10, r11;
+	uint64_t r12, r13, r14, r15;
+	uint64_t vector;
+	uint64_t error_code;
+	uint64_t rip;
+	uint64_t cs;
+	uint64_t rflags;
+};
+
+void vm_init_descriptor_tables(struct kvm_vm *vm);
+void vcpu_init_descriptor_tables(struct kvm_vm *vm, uint32_t vcpuid);
+void vm_handle_exception(struct kvm_vm *vm, int vector,
+			void (*handler)(struct ex_regs *));
+
 /*
  * Basic CPU control in CR0
  */
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 74776ee228f2..9eed3fc21c39 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -1195,6 +1195,21 @@ int _vcpu_run(struct kvm_vm *vm, uint32_t vcpuid)
 	do {
 		rc = ioctl(vcpu->fd, KVM_RUN, NULL);
 	} while (rc == -1 && errno == EINTR);
+
+#ifdef __x86_64__
+	if (vcpu_state(vm, vcpuid)->exit_reason == KVM_EXIT_IO
+		&& vcpu_state(vm, vcpuid)->io.port == UNEXPECTED_VECTOR_PORT
+		&& vcpu_state(vm, vcpuid)->io.size == 4) {
+		/* Grab pointer to io data */
+		uint32_t *data = (void *)vcpu_state(vm, vcpuid)
+			+ vcpu_state(vm, vcpuid)->io.data_offset;
+
+		TEST_ASSERT(false,
+			    "Unexpected vectored event in guest (vector:0x%x)",
+			    *data);
+	}
+#endif
+
 	return rc;
 }
 
diff --git a/tools/testing/selftests/kvm/lib/kvm_util_internal.h b/tools/testing/selftests/kvm/lib/kvm_util_internal.h
index 2ef446520748..f07d383d03a1 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util_internal.h
+++ b/tools/testing/selftests/kvm/lib/kvm_util_internal.h
@@ -50,6 +50,8 @@ struct kvm_vm {
 	vm_paddr_t pgd;
 	vm_vaddr_t gdt;
 	vm_vaddr_t tss;
+	vm_vaddr_t idt;
+	vm_vaddr_t handlers;
 };
 
 struct vcpu *vcpu_find(struct kvm_vm *vm, uint32_t vcpuid);
diff --git a/tools/testing/selftests/kvm/lib/x86_64/handlers.S b/tools/testing/selftests/kvm/lib/x86_64/handlers.S
new file mode 100644
index 000000000000..aaf7bc7d2ce1
--- /dev/null
+++ b/tools/testing/selftests/kvm/lib/x86_64/handlers.S
@@ -0,0 +1,81 @@
+handle_exception:
+	push %r15
+	push %r14
+	push %r13
+	push %r12
+	push %r11
+	push %r10
+	push %r9
+	push %r8
+
+	push %rdi
+	push %rsi
+	push %rbp
+	push %rbx
+	push %rdx
+	push %rcx
+	push %rax
+	mov %rsp, %rdi
+
+	call route_exception
+
+	pop %rax
+	pop %rcx
+	pop %rdx
+	pop %rbx
+	pop %rbp
+	pop %rsi
+	pop %rdi
+	pop %r8
+	pop %r9
+	pop %r10
+	pop %r11
+	pop %r12
+	pop %r13
+	pop %r14
+	pop %r15
+
+	/* Discard vector and error code. */
+	add $16, %rsp
+	iretq
+
+/*
+ * Build the handle_exception wrappers which push the vector/error code on the
+ * stack and an array of pointers to those wrappers.
+ */
+.pushsection .rodata
+.globl idt_handlers
+idt_handlers:
+.popsection
+
+.macro HANDLERS has_error from to
+	vector = \from
+	.rept \to - \from + 1
+	.align 8
+
+	/* Fetch current address and append it to idt_handlers. */
+	current_handler = .
+.pushsection .rodata
+.quad current_handler
+.popsection
+
+	.if ! \has_error
+	pushq $0
+	.endif
+	pushq $vector
+	jmp handle_exception
+	vector = vector + 1
+	.endr
+.endm
+
+.global idt_handler_code
+idt_handler_code:
+	HANDLERS has_error=0 from=0  to=7
+	HANDLERS has_error=1 from=8  to=8
+	HANDLERS has_error=0 from=9  to=9
+	HANDLERS has_error=1 from=10 to=14
+	HANDLERS has_error=0 from=15 to=16
+	HANDLERS has_error=1 from=17 to=17
+	HANDLERS has_error=0 from=18 to=255
+
+.section        .note.GNU-stack, "", %progbits
diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
index 1ccf6c9b3476..c15817b36267 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
@@ -12,6 +12,13 @@
 #include "../kvm_util_internal.h"
 #include "processor.h"
 
+#ifndef NUM_INTERRUPTS
+#define NUM_INTERRUPTS 256
+#endif
+
+#define DEFAULT_CODE_SELECTOR 0x8
+#define DEFAULT_DATA_SELECTOR 0x10
+
 /* Minimum physical address used for virtual translation tables. */
 #define KVM_GUEST_PAGE_TABLE_MIN_PADDR 0x180000
 
@@ -557,9 +564,9 @@ static void vcpu_setup(struct kvm_vm *vm, int vcpuid, int pgd_memslot, int gdt_m
 		sregs.efer |= (EFER_LME | EFER_LMA | EFER_NX);
 
 		kvm_seg_set_unusable(&sregs.ldt);
-		kvm_seg_set_kernel_code_64bit(vm, 0x8, &sregs.cs);
-		kvm_seg_set_kernel_data_64bit(vm, 0x10, &sregs.ds);
-		kvm_seg_set_kernel_data_64bit(vm, 0x10, &sregs.es);
+		kvm_seg_set_kernel_code_64bit(vm, DEFAULT_CODE_SELECTOR, &sregs.cs);
+		kvm_seg_set_kernel_data_64bit(vm, DEFAULT_DATA_SELECTOR, &sregs.ds);
+		kvm_seg_set_kernel_data_64bit(vm, DEFAULT_DATA_SELECTOR, &sregs.es);
 		kvm_setup_tss_64bit(vm, &sregs.tr, 0x18, gdt_memslot, pgd_memslot);
 		break;
 
@@ -1119,3 +1126,90 @@ void kvm_get_cpu_address_width(unsigned int *pa_bits, unsigned int *va_bits)
 		*va_bits = (entry->eax >> 8) & 0xff;
 	}
 }
+
+struct idt_entry {
+	uint16_t offset0;
+	uint16_t selector;
+	uint16_t ist : 3;
+	uint16_t : 5;
+	uint16_t type : 4;
+	uint16_t : 1;
+	uint16_t dpl : 2;
+	uint16_t p : 1;
+	uint16_t offset1;
+	uint32_t offset2;
+	uint32_t reserved;
+};
+
+static void set_idt_entry(struct kvm_vm *vm, int vector, unsigned long addr,
+			  int dpl, unsigned short selector)
+{
+	struct idt_entry *base =
+		(struct idt_entry *)addr_gva2hva(vm, vm->idt);
+	struct idt_entry *e = &base[vector];
+
+	memset(e, 0, sizeof(*e));
+	e->offset0 = addr;
+	e->selector = selector;
+	e->ist = 0;
+	e->type = 14;
+	e->dpl = dpl;
+	e->p = 1;
+	e->offset1 = addr >> 16;
+	e->offset2 = addr >> 32;
+}
+
+void kvm_exit_unexpected_vector(uint32_t value)
+{
+	outl(UNEXPECTED_VECTOR_PORT, value);
+}
+
+void route_exception(struct ex_regs *regs)
+{
+	typedef void(*handler)(struct ex_regs *);
+	handler *handlers;
+
+	handlers = (handler *)rdmsr(MSR_GS_BASE);
+
+	if (handlers[regs->vector]) {
+		handlers[regs->vector](regs);
+		return;
+	}
+
+	kvm_exit_unexpected_vector(regs->vector);
+}
+
+void vm_init_descriptor_tables(struct kvm_vm *vm)
+{
+	extern void *idt_handlers;
+	int i;
+
+	vm->idt = vm_vaddr_alloc(vm, getpagesize(), 0x2000, 0, 0);
+	vm->handlers = vm_vaddr_alloc(vm, 256 * sizeof(void *), 0x2000, 0, 0);
+	/* Handlers have the same address in both address spaces. */
+	for (i = 0; i < NUM_INTERRUPTS; i++)
+		set_idt_entry(vm, i, (unsigned long)(&idt_handlers)[i], 0,
+			DEFAULT_CODE_SELECTOR);
+}
+
+void vcpu_init_descriptor_tables(struct kvm_vm *vm, uint32_t vcpuid)
+{
+	struct kvm_sregs sregs;
+
+	vcpu_sregs_get(vm, vcpuid, &sregs);
+	sregs.idt.base = vm->idt;
+	sregs.idt.limit = NUM_INTERRUPTS * sizeof(struct idt_entry) - 1;
+	sregs.gdt.base = vm->gdt;
+	sregs.gdt.limit = getpagesize() - 1;
+	/* Use GS Base to pass the pointer to the handlers to the guest. */
+	kvm_seg_set_kernel_data_64bit(NULL, DEFAULT_DATA_SELECTOR, &sregs.gs);
+	sregs.gs.base = (unsigned long) vm->handlers;
+	vcpu_sregs_set(vm, vcpuid, &sregs);
+}
+
+void vm_handle_exception(struct kvm_vm *vm, int vector,
+			 void (*handler)(struct ex_regs *))
+{
+	vm_vaddr_t *handlers = (vm_vaddr_t *)addr_gva2hva(vm, vm->handlers);
+
+	handlers[vector] = (vm_vaddr_t)handler;
+}
-- 
2.28.0.220.ged08abb693-goog


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 11/12] selftests: kvm: Add a test to exercise the userspace MSR list
  2020-08-18 21:15 [PATCH v3 00/12] Allow userspace to manage MSRs Aaron Lewis
                   ` (9 preceding siblings ...)
  2020-08-18 21:15 ` [PATCH v3 10/12] selftests: kvm: Add exception handling to selftests Aaron Lewis
@ 2020-08-18 21:15 ` Aaron Lewis
  2020-08-18 21:15 ` [PATCH v3 12/12] selftests: kvm: Add emulated rdmsr, wrmsr tests Aaron Lewis
  11 siblings, 0 replies; 50+ messages in thread
From: Aaron Lewis @ 2020-08-18 21:15 UTC (permalink / raw)
  To: jmattson, graf; +Cc: pshier, oupton, kvm, Aaron Lewis

Add a selftest verifying that, once the KVM_SET_EXIT_MSRS ioctl has been
called with an MSR list, the guest exits to the host and then to
userspace whenever an MSR in that list is read from or written to.

This test uses 3 MSRs to test these new features:
  1. MSR_IA32_XSS, an MSR the kernel knows about.
  2. MSR_IA32_FLUSH_CMD, an MSR the kernel does not know about.
  3. MSR_NON_EXISTENT, an MSR invented in this test for the purposes of
     passing a fake MSR from the guest to userspace and having the guest
     be able to read from and write to it, with userspace handling it.
     KVM just acts as a pass-through.

Userspace is also able to inject a #GP.  This is demonstrated when
MSR_IA32_XSS and MSR_IA32_FLUSH_CMD are misused in the test: when that
happens, userspace asks KVM to throw a #GP in the guest.  The test
installs its own #GP handler so the fault can be handled gracefully.
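
The injection itself is a single flag on the userspace side; a condensed
sketch of the exit handling (mirroring process_wrmsr() below):

	case MSR_IA32_XSS:
		if (run->msr.data != 0)
			run->msr.error = 1;	/* KVM injects a #GP on reentry */
		break;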

Signed-off-by: Aaron Lewis <aaronlewis@google.com>
---

v2 -> v3

 - Simplified this change by removing exception handling support from it.
   This commit now just implements the test needed to verify the changes made
   in this series.

---
 tools/testing/selftests/kvm/.gitignore        |   1 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/include/x86_64/processor.h  |   3 +
 tools/testing/selftests/kvm/lib/kvm_util.c    |   2 +
 .../selftests/kvm/lib/x86_64/processor.c      |  65 ++++
 .../selftests/kvm/x86_64/userspace_msr_exit.c | 279 ++++++++++++++++++
 6 files changed, 351 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/x86_64/userspace_msr_exit.c

diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore
index 452787152748..33619f915857 100644
--- a/tools/testing/selftests/kvm/.gitignore
+++ b/tools/testing/selftests/kvm/.gitignore
@@ -14,6 +14,7 @@
 /x86_64/vmx_preemption_timer_test
 /x86_64/svm_vmcall_test
 /x86_64/sync_regs_test
+/x86_64/userspace_msr_exit
 /x86_64/vmx_close_while_nested_test
 /x86_64/vmx_dirty_log_test
 /x86_64/vmx_set_nested_state_test
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 6ba4f61a9765..15536d98fe02 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -49,6 +49,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/state_test
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_preemption_timer_test
 TEST_GEN_PROGS_x86_64 += x86_64/svm_vmcall_test
 TEST_GEN_PROGS_x86_64 += x86_64/sync_regs_test
+TEST_GEN_PROGS_x86_64 += x86_64/userspace_msr_exit
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_close_while_nested_test
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_dirty_log_test
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_set_nested_state_test
diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index 02530dc6339b..df3ceb1af166 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -341,6 +341,9 @@ int _vcpu_set_msr(struct kvm_vm *vm, uint32_t vcpuid, uint64_t msr_index,
 void vcpu_set_msr(struct kvm_vm *vm, uint32_t vcpuid, uint64_t msr_index,
 	  	  uint64_t msr_value);
 
+void kvm_set_exit_msrs(struct kvm_vm *vm, uint32_t nmsrs,
+	uint32_t msr_indices[]);
+
 uint32_t kvm_get_cpuid_max_basic(void);
 uint32_t kvm_get_cpuid_max_extended(void);
 void kvm_get_cpu_address_width(unsigned int *pa_bits, unsigned int *va_bits);
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 9eed3fc21c39..f8dde1cdbef0 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -1605,6 +1605,8 @@ static struct exit_reason {
 	{KVM_EXIT_INTERNAL_ERROR, "INTERNAL_ERROR"},
 	{KVM_EXIT_OSI, "OSI"},
 	{KVM_EXIT_PAPR_HCALL, "PAPR_HCALL"},
+	{KVM_EXIT_X86_RDMSR, "RDMSR"},
+	{KVM_EXIT_X86_WRMSR, "WRMSR"},
 #ifdef KVM_EXIT_MEMORY_NOT_PRESENT
 	{KVM_EXIT_MEMORY_NOT_PRESENT, "MEMORY_NOT_PRESENT"},
 #endif
diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
index c15817b36267..7022528fd938 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
@@ -851,6 +851,71 @@ void vcpu_set_msr(struct kvm_vm *vm, uint32_t vcpuid, uint64_t msr_index,
 		"  rc: %i errno: %i", r, errno);
 }
 
+/*
+ * __KVM Set Exit MSR
+ *
+ * Input Args:
+ *   vm - Virtual Machine
+ *   nmsrs - Number of msrs in msr_indices
+ *   msr_indices[] - List of msrs.
+ *
+ * Output Args: None
+ *
+ * Return: The result of KVM_SET_EXIT_MSRS.
+ *
+ * Sets a list of MSRs that will force an exit to userspace when
+ * any of them are read from or written to by the guest.
+ */
+int __kvm_set_exit_msrs(struct kvm_vm *vm, uint32_t nmsrs,
+	uint32_t msr_indices[])
+{
+	const uint32_t max_nmsrs = 256;
+	struct kvm_msr_list *msr_list;
+	uint32_t i;
+	int r;
+
+	TEST_ASSERT(nmsrs <= max_nmsrs,
+		"'nmsrs' is too large.  Max is %u, currently %u.\n",
+		max_nmsrs, nmsrs);
+	uint32_t msr_list_byte_size = sizeof(struct kvm_msr_list) +
+				      (sizeof(msr_list->indices[0]) * nmsrs);
+	msr_list = alloca(msr_list_byte_size);
+	memset(msr_list, 0, msr_list_byte_size);
+
+	msr_list->nmsrs = nmsrs;
+	for (i = 0; i < nmsrs; i++)
+		msr_list->indices[i] = msr_indices[i];
+
+	r = ioctl(vm->fd, KVM_SET_EXIT_MSRS, msr_list);
+
+	return r;
+}
+
+/*
+ * KVM Set Exit MSR
+ *
+ * Input Args:
+ *   vm - Virtual Machine
+ *   nmsrs - Number of msrs in msr_indices
+ *   msr_indices[] - List of msrs.
+ *
+ * Output Args: None
+ *
+ * Return: None
+ *
+ * Sets a list of MSRs that will force an exit to userspace when
+ * any of them are read from or written to by the guest.
+ */
+void kvm_set_exit_msrs(struct kvm_vm *vm, uint32_t nmsrs,
+	uint32_t msr_indices[])
+{
+	int r;
+
+	r = __kvm_set_exit_msrs(vm, nmsrs, msr_indices);
+	TEST_ASSERT(r == 0, "KVM_SET_EXIT_MSRS IOCTL failed,\n"
+		"  rc: %i errno: %i", r, errno);
+}
+
 void vcpu_args_set(struct kvm_vm *vm, uint32_t vcpuid, unsigned int num, ...)
 {
 	va_list ap;
diff --git a/tools/testing/selftests/kvm/x86_64/userspace_msr_exit.c b/tools/testing/selftests/kvm/x86_64/userspace_msr_exit.c
new file mode 100644
index 000000000000..79acfe004e78
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/userspace_msr_exit.c
@@ -0,0 +1,279 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020, Google LLC.
+ *
+ * Tests for exiting into userspace on registered MSRs
+ */
+
+#define _GNU_SOURCE /* for program_invocation_short_name */
+#include <sys/ioctl.h>
+
+#include "test_util.h"
+#include "kvm_util.h"
+#include "vmx.h"
+
+#define VCPU_ID	      1
+
+#define MSR_NON_EXISTENT 0x474f4f00
+
+uint32_t msrs[] = {
+	/* Test an MSR the kernel knows about. */
+	MSR_IA32_XSS,
+	/* Test an MSR the kernel doesn't know about. */
+	MSR_IA32_FLUSH_CMD,
+	/* Test a fabricated MSR that no one knows about. */
+	MSR_NON_EXISTENT,
+};
+uint32_t nmsrs = ARRAY_SIZE(msrs);
+
+uint64_t msr_non_existent_data;
+int guest_exception_count;
+
+/*
+ * Note: Force test_rdmsr() to not be inlined to prevent the labels,
+ * rdmsr_start and rdmsr_end, from being defined multiple times.
+ */
+static noinline uint64_t test_rdmsr(uint32_t msr)
+{
+	uint32_t a, d;
+
+	guest_exception_count = 0;
+
+	__asm__ __volatile__("rdmsr_start: rdmsr; rdmsr_end:" :
+			"=a"(a), "=d"(d) : "c"(msr) : "memory");
+
+	return a | ((uint64_t) d << 32);
+}
+
+/*
+ * Note: Force test_wrmsr() to not be inlined to prevent the labels,
+ * wrmsr_start and wrmsr_end, from being defined multiple times.
+ */
+static noinline void test_wrmsr(uint32_t msr, uint64_t value)
+{
+	uint32_t a = value;
+	uint32_t d = value >> 32;
+
+	guest_exception_count = 0;
+
+	__asm__ __volatile__("wrmsr_start: wrmsr; wrmsr_end:" ::
+			"a"(a), "d"(d), "c"(msr) : "memory");
+}
+
+extern char rdmsr_start, rdmsr_end;
+extern char wrmsr_start, wrmsr_end;
+
+
+static void guest_code(void)
+{
+	uint64_t data;
+
+	/*
+	 * Test userspace intercepting rdmsr / wrmsr for MSR_IA32_XSS.
+	 *
+	 * A GP is thrown if anything other than 0 is written to
+	 * MSR_IA32_XSS.
+	 */
+	data = test_rdmsr(MSR_IA32_XSS);
+	GUEST_ASSERT(data == 0);
+	GUEST_ASSERT(guest_exception_count == 0);
+
+	test_wrmsr(MSR_IA32_XSS, 0);
+	GUEST_ASSERT(guest_exception_count == 0);
+
+	test_wrmsr(MSR_IA32_XSS, 1);
+	GUEST_ASSERT(guest_exception_count == 1);
+
+	/*
+	 * Test userspace intercepting rdmsr / wrmsr for MSR_IA32_FLUSH_CMD.
+	 *
+	 * A GP is thrown if MSR_IA32_FLUSH_CMD is read
+	 * from or if a value other than 1 is written to it.
+	 */
+	test_rdmsr(MSR_IA32_FLUSH_CMD);
+	GUEST_ASSERT(guest_exception_count == 1);
+
+	test_wrmsr(MSR_IA32_FLUSH_CMD, 0);
+	GUEST_ASSERT(guest_exception_count == 1);
+
+	test_wrmsr(MSR_IA32_FLUSH_CMD, 1);
+	GUEST_ASSERT(guest_exception_count == 0);
+
+	/*
+	 * Test userspace intercepting rdmsr / wrmsr for MSR_NON_EXISTENT.
+	 *
+	 * Test that a fabricated MSR can pass through the kernel
+	 * and be handled in userspace.
+	 */
+	test_wrmsr(MSR_NON_EXISTENT, 2);
+	GUEST_ASSERT(guest_exception_count == 0);
+
+	data = test_rdmsr(MSR_NON_EXISTENT);
+	GUEST_ASSERT(data == 2);
+	GUEST_ASSERT(guest_exception_count == 0);
+
+	GUEST_DONE();
+}
+
+static void guest_gp_handler(struct ex_regs *regs)
+{
+	if (regs->rip == (uintptr_t)&rdmsr_start) {
+		regs->rip = (uintptr_t)&rdmsr_end;
+		regs->rax = 0;
+		regs->rdx = 0;
+	} else if (regs->rip == (uintptr_t)&wrmsr_start) {
+		regs->rip = (uintptr_t)&wrmsr_end;
+	} else {
+		GUEST_ASSERT(!"RIP is at an unknown location!");
+	}
+
+	++guest_exception_count;
+}
+
+static void run_guest(struct kvm_vm *vm)
+{
+	int rc;
+
+	rc = _vcpu_run(vm, VCPU_ID);
+	TEST_ASSERT(rc == 0, "vcpu_run failed: %d\n", rc);
+}
+
+static void check_for_guest_assert(struct kvm_vm *vm)
+{
+	struct kvm_run *run = vcpu_state(vm, VCPU_ID);
+	struct ucall uc;
+
+	if (run->exit_reason == KVM_EXIT_IO &&
+		get_ucall(vm, VCPU_ID, &uc) == UCALL_ABORT) {
+			TEST_FAIL("%s at %s:%ld", (const char *)uc.args[0],
+				__FILE__, uc.args[1]);
+	}
+}
+
+static void process_rdmsr(struct kvm_vm *vm, uint32_t msr_index)
+{
+	struct kvm_run *run = vcpu_state(vm, VCPU_ID);
+
+	check_for_guest_assert(vm);
+
+	TEST_ASSERT(run->exit_reason == KVM_EXIT_X86_RDMSR,
+		    "Unexpected exit reason: %u (%s),\n",
+		    run->exit_reason,
+		    exit_reason_str(run->exit_reason));
+	TEST_ASSERT(run->msr.index == msr_index,
+			"Unexpected msr (0x%04x), expected 0x%04x",
+			run->msr.index, msr_index);
+
+	switch (run->msr.index) {
+	case MSR_IA32_XSS:
+		run->msr.data = 0;
+		break;
+	case MSR_IA32_FLUSH_CMD:
+		run->msr.error = 1;
+		break;
+	case MSR_NON_EXISTENT:
+		run->msr.data = msr_non_existent_data;
+		break;
+	default:
+		TEST_ASSERT(false, "Unexpected MSR: 0x%04x", run->msr.index);
+	}
+}
+
+static void process_wrmsr(struct kvm_vm *vm, uint32_t msr_index)
+{
+	struct kvm_run *run = vcpu_state(vm, VCPU_ID);
+
+	check_for_guest_assert(vm);
+
+	TEST_ASSERT(run->exit_reason == KVM_EXIT_X86_WRMSR,
+		    "Unexpected exit reason: %u (%s),\n",
+		    run->exit_reason,
+		    exit_reason_str(run->exit_reason));
+	TEST_ASSERT(run->msr.index == msr_index,
+			"Unexpected msr (0x%04x), expected 0x%04x",
+			run->msr.index, msr_index);
+
+	switch (run->msr.index) {
+	case MSR_IA32_XSS:
+		if (run->msr.data != 0)
+			run->msr.error = 1;
+		break;
+	case MSR_IA32_FLUSH_CMD:
+		if (run->msr.data != 1)
+			run->msr.error = 1;
+		break;
+	case MSR_NON_EXISTENT:
+		msr_non_existent_data = run->msr.data;
+		break;
+	default:
+		TEST_ASSERT(false, "Unexpected MSR: 0x%04x", run->msr.index);
+	}
+}
+
+static void process_ucall_done(struct kvm_vm *vm)
+{
+	struct kvm_run *run = vcpu_state(vm, VCPU_ID);
+	struct ucall uc;
+
+	TEST_ASSERT(run->exit_reason == KVM_EXIT_IO,
+		    "Unexpected exit reason: %u (%s)",
+		    run->exit_reason,
+		    exit_reason_str(run->exit_reason));
+
+	TEST_ASSERT(get_ucall(vm, VCPU_ID, &uc) == UCALL_DONE,
+		    "Unexpected ucall command: %lu, expected UCALL_DONE (%d)",
+		    uc.cmd, UCALL_DONE);
+}
+
+static void run_guest_then_process_rdmsr(struct kvm_vm *vm, uint32_t msr_index)
+{
+	run_guest(vm);
+	process_rdmsr(vm, msr_index);
+}
+
+static void run_guest_then_process_wrmsr(struct kvm_vm *vm, uint32_t msr_index)
+{
+	run_guest(vm);
+	process_wrmsr(vm, msr_index);
+}
+
+static void run_guest_then_process_ucall_done(struct kvm_vm *vm)
+{
+	run_guest(vm);
+	process_ucall_done(vm);
+}
+
+int main(int argc, char *argv[])
+{
+	struct kvm_vm *vm;
+
+	vm = vm_create(VM_MODE_DEFAULT, DEFAULT_GUEST_PHY_PAGES, O_RDWR);
+	kvm_vm_elf_load(vm, program_invocation_name, 0, 0);
+	vm_create_irqchip(vm);
+
+	kvm_set_exit_msrs(vm, nmsrs, msrs);
+
+	vm_vcpu_add_default(vm, VCPU_ID, guest_code);
+
+	vm_init_descriptor_tables(vm);
+	vcpu_init_descriptor_tables(vm, VCPU_ID);
+
+	vm_handle_exception(vm, GP_VECTOR, guest_gp_handler);
+
+	/* Process guest code userspace exits */
+	run_guest_then_process_rdmsr(vm, MSR_IA32_XSS);
+	run_guest_then_process_wrmsr(vm, MSR_IA32_XSS);
+	run_guest_then_process_wrmsr(vm, MSR_IA32_XSS);
+
+	run_guest_then_process_rdmsr(vm, MSR_IA32_FLUSH_CMD);
+	run_guest_then_process_wrmsr(vm, MSR_IA32_FLUSH_CMD);
+	run_guest_then_process_wrmsr(vm, MSR_IA32_FLUSH_CMD);
+
+	run_guest_then_process_wrmsr(vm, MSR_NON_EXISTENT);
+	run_guest_then_process_rdmsr(vm, MSR_NON_EXISTENT);
+
+	run_guest_then_process_ucall_done(vm);
+
+	kvm_vm_free(vm);
+	return 0;
+}
-- 
2.28.0.220.ged08abb693-goog


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 12/12] selftests: kvm: Add emulated rdmsr, wrmsr tests
  2020-08-18 21:15 [PATCH v3 00/12] Allow userspace to manage MSRs Aaron Lewis
                   ` (10 preceding siblings ...)
  2020-08-18 21:15 ` [PATCH v3 11/12] selftests: kvm: Add a test to exercise the userspace MSR list Aaron Lewis
@ 2020-08-18 21:15 ` Aaron Lewis
  11 siblings, 0 replies; 50+ messages in thread
From: Aaron Lewis @ 2020-08-18 21:15 UTC (permalink / raw)
  To: jmattson, graf; +Cc: pshier, oupton, kvm, Aaron Lewis

Add tests to exercise the code paths for em_{rdmsr,wrmsr} and
emulator_{get,set}_msr.  For the generic instruction emulator to work,
the module parameter kvm.force_emulation_prefix=1 has to be enabled.
If it isn't, these tests are skipped.
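
The availability probe is simply the forced emulation prefix in front of
a nop, with a #UD handler recording the outcome; condensed from the test
below:

	__asm__ __volatile__(KVM_FEP "nop");	/* #UDs if the prefix is off */

	static void guest_ud_handler(struct ex_regs *regs)
	{
		fep_available = 0;
		regs->rip += KVM_FEP_LENGTH;	/* skip the 5-byte prefix */
	}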

Signed-off-by: Aaron Lewis <aaronlewis@google.com>
---

v1 -> v2

 - This commit was added to test the changes Alexander Graf
   <graf@amazon.com> made to rdmsr and wrmsr when they go through the
   generic instruction emulator.

---
 .../selftests/kvm/x86_64/userspace_msr_exit.c | 158 +++++++++++++++++-
 1 file changed, 150 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/userspace_msr_exit.c b/tools/testing/selftests/kvm/x86_64/userspace_msr_exit.c
index 79acfe004e78..04e695eb8ed5 100644
--- a/tools/testing/selftests/kvm/x86_64/userspace_msr_exit.c
+++ b/tools/testing/selftests/kvm/x86_64/userspace_msr_exit.c
@@ -12,8 +12,12 @@
 #include "kvm_util.h"
 #include "vmx.h"
 
-#define VCPU_ID	      1
+/* Forced emulation prefix, used to invoke the emulator unconditionally. */
+#define KVM_FEP "ud2; .byte 'k', 'v', 'm';"
+#define KVM_FEP_LENGTH 5
+static int fep_available = 1;
 
+#define VCPU_ID	      1
 #define MSR_NON_EXISTENT 0x474f4f00
 
 uint32_t msrs[] = {
@@ -63,6 +67,39 @@ static noinline void test_wrmsr(uint32_t msr, uint64_t value)
 extern char rdmsr_start, rdmsr_end;
 extern char wrmsr_start, wrmsr_end;
 
+/*
+ * Note: Force test_em_rdmsr() to not be inlined to prevent the labels,
+ * em_rdmsr_start and em_rdmsr_end, from being defined multiple times.
+ */
+static noinline uint64_t test_em_rdmsr(uint32_t msr)
+{
+	uint32_t a, d;
+
+	guest_exception_count = 0;
+
+	__asm__ __volatile__(KVM_FEP "em_rdmsr_start: rdmsr; em_rdmsr_end:" :
+			"=a"(a), "=d"(d) : "c"(msr) : "memory");
+
+	return a | ((uint64_t) d << 32);
+}
+
+/*
+ * Note: Force test_em_wrmsr() to not be inlined to prevent the labels,
+ * em_wrmsr_start and em_wrmsr_end, from being defined multiple times.
+ */
+static noinline void test_em_wrmsr(uint32_t msr, uint64_t value)
+{
+	uint32_t a = value;
+	uint32_t d = value >> 32;
+
+	guest_exception_count = 0;
+
+	__asm__ __volatile__(KVM_FEP "em_wrmsr_start: wrmsr; em_wrmsr_end:" ::
+			"a"(a), "d"(d), "c"(msr) : "memory");
+}
+
+extern char em_rdmsr_start, em_rdmsr_end;
+extern char em_wrmsr_start, em_wrmsr_end;
 
 static void guest_code(void)
 {
@@ -112,17 +149,55 @@ static void guest_code(void)
 	GUEST_ASSERT(data == 2);
 	GUEST_ASSERT(guest_exception_count == 0);
 
+	/*
+	 * Test to see if the instruction emulator is available (ie: the module
+	 * parameter 'kvm.force_emulation_prefix=1' is set).  This instruction
+	 * will #UD if it isn't available.
+	 */
+	__asm__ __volatile__(KVM_FEP "nop");
+
+	if (fep_available) {
+		/* Let userspace know we aren't done. */
+		GUEST_SYNC(0);
+
+		/*
+		 * Now run the same tests with the instruction emulator.
+		 */
+		data = test_em_rdmsr(MSR_IA32_XSS);
+		GUEST_ASSERT(data == 0);
+		GUEST_ASSERT(guest_exception_count == 0);
+		test_em_wrmsr(MSR_IA32_XSS, 0);
+		GUEST_ASSERT(guest_exception_count == 0);
+		test_em_wrmsr(MSR_IA32_XSS, 1);
+		GUEST_ASSERT(guest_exception_count == 1);
+
+		test_em_rdmsr(MSR_IA32_FLUSH_CMD);
+		GUEST_ASSERT(guest_exception_count == 1);
+		test_em_wrmsr(MSR_IA32_FLUSH_CMD, 0);
+		GUEST_ASSERT(guest_exception_count == 1);
+		test_em_wrmsr(MSR_IA32_FLUSH_CMD, 1);
+		GUEST_ASSERT(guest_exception_count == 0);
+
+		test_em_wrmsr(MSR_NON_EXISTENT, 2);
+		GUEST_ASSERT(guest_exception_count == 0);
+		data = test_em_rdmsr(MSR_NON_EXISTENT);
+		GUEST_ASSERT(data == 2);
+		GUEST_ASSERT(guest_exception_count == 0);
+	}
+
 	GUEST_DONE();
 }
 
-static void guest_gp_handler(struct ex_regs *regs)
+static void __guest_gp_handler(struct ex_regs *regs,
+			       char *r_start, char *r_end,
+			       char *w_start, char *w_end)
 {
-	if (regs->rip == (uintptr_t)&rdmsr_start) {
-		regs->rip = (uintptr_t)&rdmsr_end;
+	if (regs->rip == (uintptr_t)r_start) {
+		regs->rip = (uintptr_t)r_end;
 		regs->rax = 0;
 		regs->rdx = 0;
-	} else if (regs->rip == (uintptr_t)&wrmsr_start) {
-		regs->rip = (uintptr_t)&wrmsr_end;
+	} else if (regs->rip == (uintptr_t)w_start) {
+		regs->rip = (uintptr_t)w_end;
 	} else {
 		GUEST_ASSERT(!"RIP is at an unknown location!");
 	}
@@ -130,6 +205,24 @@ static void guest_gp_handler(struct ex_regs *regs)
 	++guest_exception_count;
 }
 
+static void guest_gp_handler(struct ex_regs *regs)
+{
+	__guest_gp_handler(regs, &rdmsr_start, &rdmsr_end,
+			   &wrmsr_start, &wrmsr_end);
+}
+
+static void guest_fep_gp_handler(struct ex_regs *regs)
+{
+	__guest_gp_handler(regs, &em_rdmsr_start, &em_rdmsr_end,
+			   &em_wrmsr_start, &em_wrmsr_end);
+}
+
+static void guest_ud_handler(struct ex_regs *regs)
+{
+	fep_available = 0;
+	regs->rip += KVM_FEP_LENGTH;
+}
+
 static void run_guest(struct kvm_vm *vm)
 {
 	int rc;
@@ -225,6 +318,32 @@ static void process_ucall_done(struct kvm_vm *vm)
 		    uc.cmd, UCALL_DONE);
 }
 
+static uint64_t process_ucall(struct kvm_vm *vm)
+{
+	struct kvm_run *run = vcpu_state(vm, VCPU_ID);
+	struct ucall uc = {};
+
+	TEST_ASSERT(run->exit_reason == KVM_EXIT_IO,
+		    "Unexpected exit reason: %u (%s)",
+		    run->exit_reason,
+		    exit_reason_str(run->exit_reason));
+
+	switch (get_ucall(vm, VCPU_ID, &uc)) {
+	case UCALL_SYNC:
+		break;
+	case UCALL_ABORT:
+		check_for_guest_assert(vm);
+		break;
+	case UCALL_DONE:
+		process_ucall_done(vm);
+		break;
+	default:
+		TEST_ASSERT(false, "Unexpected ucall");
+	}
+
+	return uc.cmd;
+}
+
 static void run_guest_then_process_rdmsr(struct kvm_vm *vm, uint32_t msr_index)
 {
 	run_guest(vm);
@@ -260,7 +379,7 @@ int main(int argc, char *argv[])
 
 	vm_handle_exception(vm, GP_VECTOR, guest_gp_handler);
 
-	/* Process guest code userspace exits */
+	/* Process guest code userspace exits. */
 	run_guest_then_process_rdmsr(vm, MSR_IA32_XSS);
 	run_guest_then_process_wrmsr(vm, MSR_IA32_XSS);
 	run_guest_then_process_wrmsr(vm, MSR_IA32_XSS);
@@ -272,7 +391,30 @@ int main(int argc, char *argv[])
 	run_guest_then_process_wrmsr(vm, MSR_NON_EXISTENT);
 	run_guest_then_process_rdmsr(vm, MSR_NON_EXISTENT);
 
-	run_guest_then_process_ucall_done(vm);
+	vm_handle_exception(vm, UD_VECTOR, guest_ud_handler);
+	run_guest(vm);
+	vm_handle_exception(vm, UD_VECTOR, NULL);
+
+	if (process_ucall(vm) != UCALL_DONE) {
+		vm_handle_exception(vm, GP_VECTOR, guest_fep_gp_handler);
+
+		/* Process emulated rdmsr and wrmsr instructions. */
+		run_guest_then_process_rdmsr(vm, MSR_IA32_XSS);
+		run_guest_then_process_wrmsr(vm, MSR_IA32_XSS);
+		run_guest_then_process_wrmsr(vm, MSR_IA32_XSS);
+
+		run_guest_then_process_rdmsr(vm, MSR_IA32_FLUSH_CMD);
+		run_guest_then_process_wrmsr(vm, MSR_IA32_FLUSH_CMD);
+		run_guest_then_process_wrmsr(vm, MSR_IA32_FLUSH_CMD);
+
+		run_guest_then_process_wrmsr(vm, MSR_NON_EXISTENT);
+		run_guest_then_process_rdmsr(vm, MSR_NON_EXISTENT);
+
+		/* Confirm the guest completed without issues. */
+		run_guest_then_process_ucall_done(vm);
+	} else {
+		printf("To run the instruction emulated tests set the module parameter 'kvm.force_emulation_prefix=1'\n");
+	}
 
 	kvm_vm_free(vm);
 	return 0;
-- 
2.28.0.220.ged08abb693-goog


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 07/12] KVM: x86: Ensure the MSR bitmap never clears userspace tracked MSRs
  2020-08-18 21:15 ` [PATCH v3 07/12] KVM: x86: Ensure the MSR bitmap never clears " Aaron Lewis
@ 2020-08-19  1:12     ` kernel test robot
  2020-08-19  1:12     ` kernel test robot
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 50+ messages in thread
From: kernel test robot @ 2020-08-19  1:12 UTC (permalink / raw)
  To: Aaron Lewis, jmattson, graf; +Cc: kbuild-all, pshier, oupton, kvm, Aaron Lewis

[-- Attachment #1: Type: text/plain, Size: 12871 bytes --]

Hi Aaron,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on kvm/linux-next]
[also build test WARNING on v5.9-rc1 next-20200818]
[cannot apply to kvms390/next vhost/linux-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Aaron-Lewis/Allow-userspace-to-manage-MSRs/20200819-051903
base:   https://git.kernel.org/pub/scm/virt/kvm/kvm.git linux-next
config: i386-randconfig-s001-20200818 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
reproduce:
        # apt-get install sparse
        # sparse version: v0.6.2-183-gaa6ede3b-dirty
        # save the attached .config to linux build tree
        make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=i386 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>


sparse warnings: (new ones prefixed by >>)

>> arch/x86/kvm/vmx/vmx.c:3823:6: sparse: sparse: symbol 'vmx_set_user_msr_intercept' was not declared. Should it be static?
   arch/x86/kvm/vmx/vmx.c: note: in included file:
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (110011 becomes 11)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (110011 becomes 11)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (100110 becomes 110)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (100490 becomes 490)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (100310 becomes 310)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (100510 becomes 510)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (100410 becomes 410)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (100490 becomes 490)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (100310 becomes 310)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (100510 becomes 510)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (100410 becomes 410)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (30203 becomes 203)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (30203 becomes 203)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (30283 becomes 283)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (30283 becomes 283)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1b019b becomes 19b)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1b021b becomes 21b)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1b029b becomes 29b)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1b031b becomes 31b)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1b041b becomes 41b)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (80c88 becomes c88)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1a081a becomes 81a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1a081a becomes 81a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1a081a becomes 81a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (120912 becomes 912)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (120912 becomes 912)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (120912 becomes 912)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (110311 becomes 311)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (120992 becomes 992)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (120992 becomes 992)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (100610 becomes 610)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (100690 becomes 690)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (100590 becomes 590)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (80408 becomes 408)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (120a92 becomes a92)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1a099a becomes 99a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1a091a becomes 91a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (a048a becomes 48a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (120a92 becomes a92)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1a099a becomes 99a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1a091a becomes 91a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (a048a becomes 48a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (a010a becomes 10a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (a050a becomes 50a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1a071a becomes 71a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1a079a becomes 79a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1a001a becomes 1a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1a009a becomes 9a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1a011a becomes 11a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1a081a becomes 81a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1a081a becomes 81a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1a011a becomes 11a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (180198 becomes 198)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1a011a becomes 11a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1a051a becomes 51a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (120392 becomes 392)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (120892 becomes 892)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1a081a becomes 81a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1a081a becomes 81a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1a011a becomes 11a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1a011a becomes 11a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (100490 becomes 490)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (100490 becomes 490)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (a028a becomes 28a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (a030a becomes 30a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (a038a becomes 38a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (a040a becomes 40a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (a028a becomes 28a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (a030a becomes 30a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (a038a becomes 38a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (a040a becomes 40a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (100090 becomes 90)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (100090 becomes 90)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (180118 becomes 118)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1a001a becomes 1a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (80688 becomes 688)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1a009a becomes 9a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (100790 becomes 790)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (100790 becomes 790)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (180198 becomes 198)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1a011a becomes 11a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (120492 becomes 492)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1a061a becomes 61a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (120492 becomes 492)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1a061a becomes 61a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (120412 becomes 412)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1a059a becomes 59a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (120412 becomes 412)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1a059a becomes 59a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (20402 becomes 402)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1b001b becomes 1b)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1b009b becomes 9b)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (1b011b becomes 11b)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (30083 becomes 83)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (30183 becomes 183)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (30003 becomes 3)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (30103 becomes 103)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (30303 becomes 303)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: too many warnings
--
>> arch/x86/kvm/svm/svm.c:636:6: sparse: sparse: symbol 'svm_set_user_msr_intercept' was not declared. Should it be static?
   arch/x86/kvm/svm/svm.c:471:17: sparse: sparse: cast truncates bits from constant value (100000000 becomes 0)
   arch/x86/kvm/svm/svm.c:471:17: sparse: sparse: cast truncates bits from constant value (100000000 becomes 0)
   arch/x86/kvm/svm/svm.c:471:17: sparse: sparse: cast truncates bits from constant value (100000000 becomes 0)

Please review and possibly fold the followup patch.

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 38955 bytes --]

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH] KVM: x86: vmx_set_user_msr_intercept() can be static
  2020-08-18 21:15 ` [PATCH v3 07/12] KVM: x86: Ensure the MSR bitmap never clears " Aaron Lewis
@ 2020-08-19  1:12     ` kernel test robot
  2020-08-19  1:12     ` kernel test robot
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 50+ messages in thread
From: kernel test robot @ 2020-08-19  1:12 UTC (permalink / raw)
  To: Aaron Lewis, jmattson, graf; +Cc: kbuild-all, pshier, oupton, kvm, Aaron Lewis


Signed-off-by: kernel test robot <lkp@intel.com>
---
 svm/svm.c |    2 +-
 vmx/vmx.c |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index c49d121ee1021..144724c0b4111 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -633,7 +633,7 @@ static void set_user_msr_interception(struct kvm_vcpu *vcpu, u32 msr, int read,
 		__set_msr_interception(msrpm, msr, read, write, offset);
 }
 
-void svm_set_user_msr_intercept(struct kvm_vcpu *vcpu, u32 msr)
+static void svm_set_user_msr_intercept(struct kvm_vcpu *vcpu, u32 msr)
 {
 	set_user_msr_interception(vcpu, msr, 0, 0);
 }
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 12478ea7aac71..20f432c698bcd 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3820,7 +3820,7 @@ static void vmx_update_msr_bitmap_x2apic(struct kvm_vcpu *vcpu,
 	}
 }
 
-void vmx_set_user_msr_intercept(struct kvm_vcpu *vcpu, u32 msr)
+static void vmx_set_user_msr_intercept(struct kvm_vcpu *vcpu, u32 msr)
 {
 	vmx_enable_intercept_for_msr(vcpu, msr, MSR_TYPE_RW);
 }

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 01/12] KVM: x86: Deflect unknown MSR accesses to user space
  2020-08-18 21:15 ` [PATCH v3 01/12] KVM: x86: Deflect unknown MSR accesses to user space Aaron Lewis
@ 2020-08-19  8:42   ` Alexander Graf
  0 siblings, 0 replies; 50+ messages in thread
From: Alexander Graf @ 2020-08-19  8:42 UTC (permalink / raw)
  To: Aaron Lewis, jmattson; +Cc: pshier, oupton, kvm



On 18.08.20 23:15, Aaron Lewis wrote:
> 
> MSRs are weird. Some of them are normal control registers, such as EFER.
> Some however are registers that really are model specific, not very
> interesting to virtualization workloads, and not performance critical.
> Others again are really just windows into package configuration.
> 
> Out of these MSRs, only the first category is necessary to implement in
> kernel space. Rarely accessed MSRs, MSRs that should be fine tunes against
> certain CPU models and MSRs that contain information on the package level
> are much better suited for user space to process. However, over time we have
> accumulated a lot of MSRs that are not the first category, but still handled
> by in-kernel KVM code.
> 
> This patch adds a generic interface to handle WRMSR and RDMSR from user
> space. With this, any future MSR that is part of the latter categories can
> be handled in user space.
> 
> Furthermore, it allows us to replace the existing "ignore_msrs" logic with
> something that applies per-VM rather than on the full system. That way you
> can run productive VMs in parallel to experimental ones where you don't care
> about proper MSR handling.
> 
> Signed-off-by: Alexander Graf <graf@amazon.com>
> Reviewed-by: Jim Mattson <jmattson@google.com>

You need to add your Signed-off-by line here as well :).


Alex



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 02/12] KVM: x86: Introduce allow list for MSR emulation
  2020-08-18 21:15 ` [PATCH v3 02/12] KVM: x86: Introduce allow list for MSR emulation Aaron Lewis
@ 2020-08-19  8:53   ` Alexander Graf
  2020-08-31 10:39     ` Dan Carpenter
  1 sibling, 0 replies; 50+ messages in thread
From: Alexander Graf @ 2020-08-19  8:53 UTC (permalink / raw)
  To: Aaron Lewis, jmattson; +Cc: pshier, oupton, kvm, KarimAllah Ahmed



On 18.08.20 23:15, Aaron Lewis wrote:
> 
> It's not desirable to have all MSRs always handled by KVM kernel space. Some
> MSRs would be useful to handle in user space to either emulate behavior (like
> uCode updates) or differentiate whether they are valid based on the CPU model.
> 
> To allow user space to specify which MSRs it wants to see handled by KVM,
> this patch introduces a new ioctl to push allow lists of bitmaps into
> KVM. Based on these bitmaps, KVM can then decide whether to reject MSR access.
> With the addition of KVM_CAP_X86_USER_SPACE_MSR it can also deflect the
> denied MSR events to user space to operate on.
> 
> If no allowlist is populated, MSR handling stays identical to before.
> 
> Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
> Signed-off-by: Alexander Graf <graf@amazon.com>

Same here, SoB line is missing.

I also see that you didn't address the nits you had on this patch:

[...]

 >> +  Filter booth read and write accesses to MSRs using the given bitmap. A 0
 >> +  in the bitmap indicates that both reads and writes should immediately fail,
 >> +  while a 1 indicates that reads and writes should be handled by the normal
 >> +  KVM MSR emulation logic.
 >
 > nit: Filter both

[...]

 >> +/* Maximum size of the of the bitmap in bytes */
 >
 > nit: "of the" is repeated twice



Feel free to change them in your patch set and add a note between the SoB 
lines:

Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
Signed-off-by: Alexander Graf <graf@amazon.com>
[aaronlewis: s/of the of the/of the/, s/booth/both/]
Signed-off-by: Aaron Lewis <aaronlewis@google.com>


Alex



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 04/12] KVM: x86: Add ioctl for accepting a userspace provided MSR list
  2020-08-18 21:15 ` [PATCH v3 04/12] KVM: x86: Add ioctl for accepting a userspace provided MSR list Aaron Lewis
@ 2020-08-19  9:00   ` Alexander Graf
  2020-08-20 17:30     ` Jim Mattson
  0 siblings, 1 reply; 50+ messages in thread
From: Alexander Graf @ 2020-08-19  9:00 UTC (permalink / raw)
  To: Aaron Lewis, jmattson; +Cc: pshier, oupton, kvm



On 18.08.20 23:15, Aaron Lewis wrote:
> 
> Add KVM_SET_EXIT_MSRS ioctl to allow userspace to pass in a list of MSRs
> that force an exit to userspace when rdmsr or wrmsr are used by the
> guest.
> 
> KVM_SET_EXIT_MSRS will need to be called before any vCPUs are
> created to protect the 'user_exit_msrs' list from being mutated while
> vCPUs are running.
> 
> Add KVM_CAP_SET_MSR_EXITS to identify the feature exists.
> 
> Signed-off-by: Aaron Lewis <aaronlewis@google.com>
> Reviewed-by: Oliver Upton <oupton@google.com>

Why would we still need this with the allow list and user space #GP 
deflection logic in place?


Alex



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 09/12] selftests: kvm: Clear uc so UCALL_NONE is being properly reported
  2020-08-18 21:15 ` [PATCH v3 09/12] selftests: kvm: Clear uc so UCALL_NONE is being properly reported Aaron Lewis
@ 2020-08-19  9:13   ` Andrew Jones
  0 siblings, 0 replies; 50+ messages in thread
From: Andrew Jones @ 2020-08-19  9:13 UTC (permalink / raw)
  To: Aaron Lewis; +Cc: jmattson, graf, pshier, oupton, kvm

On Tue, Aug 18, 2020 at 02:15:31PM -0700, Aaron Lewis wrote:
> Ensure the out value 'uc' in get_ucall() is properly reporting
> UCALL_NONE if the call fails.  The return value will be correctly
> reported, however, the out parameter 'uc' will not be.  Clear the struct
> to ensure the correct value is being reported in the out parameter.
> 
> Signed-off-by: Aaron Lewis <aaronlewis@google.com>
> ---
> 
> v2 -> v3
> 
>  - This commit is new to the series.  This was added to have the ucall changes
>    separate from the exception handling changes and the addition of the test.
>  - Added support on aarch64 and s390x as well.
> 
> ---
>  tools/testing/selftests/kvm/lib/aarch64/ucall.c | 3 +++
>  tools/testing/selftests/kvm/lib/s390x/ucall.c   | 3 +++
>  tools/testing/selftests/kvm/lib/x86_64/ucall.c  | 3 +++
>  3 files changed, 9 insertions(+)
> 
> diff --git a/tools/testing/selftests/kvm/lib/aarch64/ucall.c b/tools/testing/selftests/kvm/lib/aarch64/ucall.c
> index c8e0ec20d3bf..2f37b90ee1a9 100644
> --- a/tools/testing/selftests/kvm/lib/aarch64/ucall.c
> +++ b/tools/testing/selftests/kvm/lib/aarch64/ucall.c
> @@ -94,6 +94,9 @@ uint64_t get_ucall(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc)
>  	struct kvm_run *run = vcpu_state(vm, vcpu_id);
>  	struct ucall ucall = {};
>  
> +	if (uc)
> +		memset(uc, 0, sizeof(*uc));
> +
>  	if (run->exit_reason == KVM_EXIT_MMIO &&
>  	    run->mmio.phys_addr == (uint64_t)ucall_exit_mmio_addr) {
>  		vm_vaddr_t gva;
> diff --git a/tools/testing/selftests/kvm/lib/s390x/ucall.c b/tools/testing/selftests/kvm/lib/s390x/ucall.c
> index fd589dc9bfab..9d3b0f15249a 100644
> --- a/tools/testing/selftests/kvm/lib/s390x/ucall.c
> +++ b/tools/testing/selftests/kvm/lib/s390x/ucall.c
> @@ -38,6 +38,9 @@ uint64_t get_ucall(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc)
>  	struct kvm_run *run = vcpu_state(vm, vcpu_id);
>  	struct ucall ucall = {};
>  
> +	if (uc)
> +		memset(uc, 0, sizeof(*uc));
> +
>  	if (run->exit_reason == KVM_EXIT_S390_SIEIC &&
>  	    run->s390_sieic.icptcode == 4 &&
>  	    (run->s390_sieic.ipa >> 8) == 0x83 &&    /* 0x83 means DIAGNOSE */
> diff --git a/tools/testing/selftests/kvm/lib/x86_64/ucall.c b/tools/testing/selftests/kvm/lib/x86_64/ucall.c
> index da4d89ad5419..a3489973e290 100644
> --- a/tools/testing/selftests/kvm/lib/x86_64/ucall.c
> +++ b/tools/testing/selftests/kvm/lib/x86_64/ucall.c
> @@ -40,6 +40,9 @@ uint64_t get_ucall(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc)
>  	struct kvm_run *run = vcpu_state(vm, vcpu_id);
>  	struct ucall ucall = {};
>  
> +	if (uc)
> +		memset(uc, 0, sizeof(*uc));
> +
>  	if (run->exit_reason == KVM_EXIT_IO && run->io.port == UCALL_PIO_PORT) {
>  		struct kvm_regs regs;
>  
> -- 
> 2.28.0.220.ged08abb693-goog
>

Reviewed-by: Andrew Jones <drjones@redhat.com>


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 05/12] KVM: x86: Add support for exiting to userspace on rdmsr or wrmsr
  2020-08-18 21:15 ` [PATCH v3 05/12] KVM: x86: Add support for exiting to userspace on rdmsr or wrmsr Aaron Lewis
@ 2020-08-19 10:25   ` Alexander Graf
  2020-08-20 18:17   ` Jim Mattson
  1 sibling, 0 replies; 50+ messages in thread
From: Alexander Graf @ 2020-08-19 10:25 UTC (permalink / raw)
  To: Aaron Lewis, jmattson; +Cc: pshier, oupton, kvm



On 18.08.20 23:15, Aaron Lewis wrote:
> 
> Add support for exiting to userspace on a rdmsr or wrmsr instruction if
> the MSR being read from or written to is in the user_exit_msrs list.
> 
> Signed-off-by: Aaron Lewis <aaronlewis@google.com>

Again, this patch should be redundant with the allow list?

Alex



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 07/12] KVM: x86: Ensure the MSR bitmap never clears userspace tracked MSRs
  2020-08-18 21:15 ` [PATCH v3 07/12] KVM: x86: Ensure the MSR bitmap never clears " Aaron Lewis
  2020-08-19  1:12     ` kernel test robot
  2020-08-19  1:12     ` kernel test robot
@ 2020-08-19 15:26   ` Alexander Graf
  2020-08-20  0:18     ` Aaron Lewis
  2020-08-26 15:48     ` kernel test robot
  3 siblings, 1 reply; 50+ messages in thread
From: Alexander Graf @ 2020-08-19 15:26 UTC (permalink / raw)
  To: Aaron Lewis, jmattson; +Cc: pshier, oupton, kvm



On 18.08.20 23:15, Aaron Lewis wrote:
> 
> SDM volume 3: 24.6.9 "MSR-Bitmap Address" and APM volume 2: 15.11 "MSR
> intercepts" describe MSR permission bitmaps.  Permission bitmaps are
> used to control whether an execution of rdmsr or wrmsr will cause a
> vm exit.  For userspace tracked MSRs it is required they cause a vm
> exit, so the host is able to forward the MSR to userspace.  This change
> adds vmx/svm support to ensure the permission bitmap is properly set to
> cause a vm_exit to the host when rdmsr or wrmsr is used by one of the
> userspace tracked MSRs.  Also, to avoid repeatedly setting them,
> kvm_make_request() is used to coalesce these into a single call.
> 
> Signed-off-by: Aaron Lewis <aaronlewis@google.com>
> Reviewed-by: Oliver Upton <oupton@google.com>

This is incomplete, as it doesn't cover all of the x2apic registers. 
There are also a few MSRs that IIRC are handled differently from this 
logic, such as EFER.

I'm really curious if this is worth the effort? I would be inclined to 
say that MSRs that KVM has direct access for need special handling one 
way or another.


Alex



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 07/12] KVM: x86: Ensure the MSR bitmap never clears userspace tracked MSRs
  2020-08-19 15:26   ` [PATCH v3 07/12] KVM: x86: Ensure the MSR bitmap never clears userspace tracked MSRs Alexander Graf
@ 2020-08-20  0:18     ` Aaron Lewis
  2020-08-20 22:04       ` Alexander Graf
  0 siblings, 1 reply; 50+ messages in thread
From: Aaron Lewis @ 2020-08-20  0:18 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Jim Mattson, kvm list

On Wed, Aug 19, 2020 at 8:26 AM Alexander Graf <graf@amazon.com> wrote:
>
>
>
> On 18.08.20 23:15, Aaron Lewis wrote:
> >
> > SDM volume 3: 24.6.9 "MSR-Bitmap Address" and APM volume 2: 15.11 "MSR
> > intercepts" describe MSR permission bitmaps.  Permission bitmaps are
> > used to control whether an execution of rdmsr or wrmsr will cause a
> > vm exit.  For userspace tracked MSRs it is required they cause a vm
> > exit, so the host is able to forward the MSR to userspace.  This change
> > adds vmx/svm support to ensure the permission bitmap is properly set to
> > cause a vm_exit to the host when rdmsr or wrmsr is used by one of the
> > userspace tracked MSRs.  Also, to avoid repeatedly setting them,
> > kvm_make_request() is used to coalesce these into a single call.
> >
> > Signed-off-by: Aaron Lewis <aaronlewis@google.com>
> > Reviewed-by: Oliver Upton <oupton@google.com>
>
> This is incomplete, as it doesn't cover all of the x2apic registers.
> There are also a few MSRs that IIRC are handled differently from this
> logic, such as EFER.
>
> I'm really curious if this is worth the effort? I would be inclined to
> say that MSRs that KVM has direct access for need special handling one
> way or another.
>

Can you please elaborate on this?  It was my understanding that the
permission bitmap covers the x2apic registers.  Also, I’m not sure how
EFER is handled differently, but I see there is a separate
conversation on that.

This effort does seem worthwhile as it ensures userspace is able to
manage the MSRs it is requesting, and will remain that way in the
future.
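
For reference, a rough sketch (mine, not code from this series) of how
the intercept bit for a given MSR is located in the 4-KByte VMX bitmap
that SDM 24.6.9 describes; the x2apic MSRs (0x800-0x8ff) fall inside
the low range, which is why I'd expect the bitmap to cover them:

#include <stdbool.h>
#include <stdint.h>

/*
 * Per SDM vol. 3, 24.6.9: read bitmaps occupy the first 2 KB, write
 * bitmaps the second 2 KB; within each, the low MSR range comes first
 * and the high (0xc0000000-based) range is offset by 1 KB.
 */
static bool msr_intercepted(const uint8_t *bitmap, uint32_t msr, bool write)
{
	uint32_t base = write ? 2048 : 0;

	if (msr <= 0x1fff) {
		/* low range, including the x2apic MSRs at 0x800-0x8ff */
	} else if (msr >= 0xc0000000 && msr <= 0xc0001fff) {
		base += 1024;
		msr -= 0xc0000000;
	} else {
		return true;	/* MSRs outside both ranges always exit */
	}

	return bitmap[base + msr / 8] & (1u << (msr % 8));
}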


>
> Alex
>
>
>
> Amazon Development Center Germany GmbH
> Krausenstr. 38
> 10117 Berlin
> Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
> Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
> Sitz: Berlin
> Ust-ID: DE 289 237 879
>
>

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 04/12] KVM: x86: Add ioctl for accepting a userspace provided MSR list
  2020-08-19  9:00   ` Alexander Graf
@ 2020-08-20 17:30     ` Jim Mattson
  2020-08-20 21:49       ` Alexander Graf
  0 siblings, 1 reply; 50+ messages in thread
From: Jim Mattson @ 2020-08-20 17:30 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Aaron Lewis, Peter Shier, Oliver Upton, kvm list

On Wed, Aug 19, 2020 at 2:00 AM Alexander Graf <graf@amazon.com> wrote:

> Why would we still need this with the allow list and user space #GP
> deflection logic in place?

Conversion to an allow list is cumbersome when you have a short deny
list. Suppose that I want to implement the following deny list:
{IA32_ARCH_CAPABILITIES, HV_X64_MSR_REFERENCE_TSC,
MSR_GOOGLE_TRUE_TIME, MSR_GOOGLE_FDR_TRACE, MSR_GOOGLE_HBI}. What
would the corresponding allow list look like? Given your current
implementation, I don't think the corresponding allow list can
actually be constructed. I want to allow 2^32-5 MSRs, but I can allow
at most 122880, if I've done the math correctly. (10 ranges, each
spanning at most 0x600 bytes worth of bitmap: 10 * 0x600 * 8 = 122880.)

Perhaps we should adopt allow/deny rules similar to those accepted by
most firewalls. Instead of ports, we have MSR indices. Instead of
protocols, we have READ, WRITE, or READ/WRITE. Suppose that we
supported up to <n> rules of the form: {start index, end index, access
modes, allow or deny}? Rules would be processed in the order given,
and the first rule that matched a given access would take precedence.
Finally, userspace could specify the default behavior (either allow or
deny) for any MSR access that didn't match any of the rules.
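
A minimal sketch of what I have in mind (all names here are made up
for illustration; this is not an existing uAPI):

#include <stdbool.h>
#include <stdint.h>

#define MSR_RULE_READ  (1u << 0)
#define MSR_RULE_WRITE (1u << 1)

struct msr_filter_rule {
	uint32_t base;    /* first MSR index covered */
	uint32_t nmsrs;   /* number of consecutive indices covered */
	uint8_t access;   /* MSR_RULE_READ, MSR_RULE_WRITE, or both */
	bool allow;       /* true = handle in KVM, false = bounce/deny */
};

/* First matching rule wins; unmatched accesses use the default policy. */
static bool msr_access_allowed(const struct msr_filter_rule *rules,
			       int nrules, bool default_allow,
			       uint32_t msr, uint8_t access)
{
	for (int i = 0; i < nrules; i++) {
		if (msr - rules[i].base < rules[i].nmsrs &&
		    (rules[i].access & access))
			return rules[i].allow;
	}
	return default_allow;
}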

Thoughts?

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 05/12] KVM: x86: Add support for exiting to userspace on rdmsr or wrmsr
  2020-08-18 21:15 ` [PATCH v3 05/12] KVM: x86: Add support for exiting to userspace on rdmsr or wrmsr Aaron Lewis
  2020-08-19 10:25   ` Alexander Graf
@ 2020-08-20 18:17   ` Jim Mattson
  2020-08-20 21:59     ` Alexander Graf
  1 sibling, 1 reply; 50+ messages in thread
From: Jim Mattson @ 2020-08-20 18:17 UTC (permalink / raw)
  To: Aaron Lewis; +Cc: Alexander Graf, Peter Shier, Oliver Upton, kvm list

On Tue, Aug 18, 2020 at 2:16 PM Aaron Lewis <aaronlewis@google.com> wrote:
>
> Add support for exiting to userspace on a rdmsr or wrmsr instruction if
> the MSR being read from or written to is in the user_exit_msrs list.
>
> Signed-off-by: Aaron Lewis <aaronlewis@google.com>
> ---
>
> v2 -> v3
>
>   - Refactored commit based on Alexander Graf's changes in the first commit
>     in this series.  Changes made were:
>       - Updated member 'inject_gp' to 'error' based on struct msr in kvm_run.
>       - Move flag 'vcpu->kvm->arch.user_space_msr_enabled' out of
>         kvm_msr_user_space() to allow it to work with both methods that bounce
>         to userspace (msr list and #GP fallback).  Updated caller functions
>         to account for this change.
>       - trace_kvm_msr has been moved up and combined with a previous call in
>         complete_emulated_msr() based on the suggestion made by Alexander
>         Graf <graf@amazon.com>.
>
> ---

> @@ -1653,9 +1663,6 @@ static int kvm_msr_user_space(struct kvm_vcpu *vcpu, u32 index,
>                               u32 exit_reason, u64 data,
>                               int (*completion)(struct kvm_vcpu *vcpu))
>  {
> -       if (!vcpu->kvm->arch.user_space_msr_enabled)
> -               return 0;
> -
>         vcpu->run->exit_reason = exit_reason;
>         vcpu->run->msr.error = 0;
>         vcpu->run->msr.pad[0] = 0;
> @@ -1686,10 +1693,18 @@ int kvm_emulate_rdmsr(struct kvm_vcpu *vcpu)
>         u64 data;
>         int r;
>
> +       if (kvm_msr_user_exit(vcpu->kvm, ecx)) {
> +               kvm_get_msr_user_space(vcpu, ecx);
> +               /* Bounce to user space */
> +               return 0;
> +       }
> +
> +
>         r = kvm_get_msr(vcpu, ecx, &data);
>
>         /* MSR read failed? See if we should ask user space */
> -       if (r && kvm_get_msr_user_space(vcpu, ecx)) {
> +       if (r && vcpu->kvm->arch.user_space_msr_enabled) {
> +               kvm_get_msr_user_space(vcpu, ecx);
>                 /* Bounce to user space */
>                 return 0;
>         }

The before and after bounce to userspace is unfortunate. If we can
consolidate the allow/deny list checks at the top of kvm_get_msr(),
and we can tell why kvm_get_msr() failed (e.g. -EPERM=disallowed,
-ENOENT=unknown MSR, -EINVAL=illegal access), then we can eliminate
the first bounce to userspace above. -EPERM would always go to
userspace. -ENOENT would go to userspace if userspace asked to handle
unknown MSRs. -EINVAL would go to userspace if userspace asked to
handle all #GPs. (Yes; I'd still like to be able to distinguish
between "unknown MSR" and "illegal value." Otherwise, it seems
impossible for userspace to know how to proceed.)

(You may ask, "why would you get -EINVAL on a RDMSR?" This would be
the case if you tried to read a write-only MSR, like IA32_FLUSH_CMD.)
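
Roughly, the consolidated flow could look like this (my sketch, not
actual KVM code; the -EPERM/-ENOENT/-EINVAL convention is the proposal
above, and I'm assuming a single user_space_msr_enabled knob gates both
the unknown-MSR and illegal-access cases):

int kvm_emulate_rdmsr(struct kvm_vcpu *vcpu)
{
	u32 ecx = kvm_rcx_read(vcpu);
	u64 data;
	int r;

	r = kvm_get_msr(vcpu, ecx, &data);	/* allow/deny checked inside */
	switch (r) {
	case 0:
		break;				/* handled in kernel */
	case -EPERM:				/* denied by the list */
		kvm_get_msr_user_space(vcpu, ecx);
		return 0;			/* bounce to user space */
	case -ENOENT:				/* unknown MSR */
	case -EINVAL:				/* illegal access (e.g. write-only MSR) */
		if (vcpu->kvm->arch.user_space_msr_enabled) {
			kvm_get_msr_user_space(vcpu, ecx);
			return 0;
		}
		/* fallthrough */
	default:
		kvm_inject_gp(vcpu, 0);
		return 1;
	}

	kvm_rax_write(vcpu, data & -1u);
	kvm_rdx_write(vcpu, (data >> 32) & -1u);
	return kvm_skip_emulated_instruction(vcpu);
}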

> @@ -1715,10 +1730,17 @@ int kvm_emulate_wrmsr(struct kvm_vcpu *vcpu)
>         u64 data = kvm_read_edx_eax(vcpu);
>         int r;
>
> +       if (kvm_msr_user_exit(vcpu->kvm, ecx)) {
> +               kvm_set_msr_user_space(vcpu, ecx, data);
> +               /* Bounce to user space */
> +               return 0;
> +       }
> +
>         r = kvm_set_msr(vcpu, ecx, data);
>
>         /* MSR write failed? See if we should ask user space */
> -       if (r && kvm_set_msr_user_space(vcpu, ecx, data)) {
> +       if (r && vcpu->kvm->arch.user_space_msr_enabled) {
> +               kvm_set_msr_user_space(vcpu, ecx, data);
>                 /* Bounce to user space */
>                 return 0;
>         }

Same idea as above.

> @@ -3606,6 +3628,25 @@ static int kvm_vm_ioctl_set_exit_msrs(struct kvm *kvm,
>         return 0;
>  }
>
> +bool kvm_msr_user_exit(struct kvm *kvm, u32 index)
> +{
> +       struct kvm_msr_list *exit_msrs;
> +       int i;
> +
> +       exit_msrs = kvm->arch.user_exit_msrs;
> +
> +       if (!exit_msrs)
> +               return false;
> +
> +       for (i = 0; i < exit_msrs->nmsrs; ++i) {
> +               if (exit_msrs->indices[i] == index)
> +                       return true;
> +       }
> +
> +       return false;
> +}
> +EXPORT_SYMBOL_GPL(kvm_msr_user_exit);

I think this should probably be scrapped, along with Alexander's
allow-list check, in favor of a rule-based allow/deny list approach as
I outlined in an earlier message today.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 04/12] KVM: x86: Add ioctl for accepting a userspace provided MSR list
  2020-08-20 17:30     ` Jim Mattson
@ 2020-08-20 21:49       ` Alexander Graf
  2020-08-20 22:28         ` Jim Mattson
  0 siblings, 1 reply; 50+ messages in thread
From: Alexander Graf @ 2020-08-20 21:49 UTC (permalink / raw)
  To: Jim Mattson; +Cc: Aaron Lewis, Peter Shier, Oliver Upton, kvm list

Hi Jim,

On 20.08.20 19:30, Jim Mattson wrote:
> 
> 
> On Wed, Aug 19, 2020 at 2:00 AM Alexander Graf <graf@amazon.com> wrote:
> 
>> Why would we still need this with the allow list and user space #GP
>> deflection logic in place?
> 
> Conversion to an allow list is cumbersome when you have a short deny
> list. Suppose that I want to implement the following deny list:
> {IA32_ARCH_CAPABILITIES, HV_X64_MSR_REFERENCE_TSC,
> MSR_GOOGLE_TRUE_TIME, MSR_GOOGLE_FDR_TRACE, MSR_GOOGLE_HBI}. What
> would the corresponding deny list look like? Given your current
> implementation, I don't think the corresponding allow list can
> actually be constructed. I want to allow 2^32-5 MSRs, but I can allow
> at most 122880, if I've done the math correctly. (10 ranges, each
> spanning at most 0x600 bytes worth of bitmap.)

There are only very few MSR ranges that actually contain MSRs. So in your 
to allow all MSRs that Linux knows about in msr-index.h, you would need

   [0x00000000 - 0x00002000]
   [0x40000000 - 0x40000200]
   [0x4b564d00 - 0x4b564e00]
   [0x80868000 - 0x80868020]
   [0xc0000000 - 0xc0000200]
   [0xc0010000 - 0xc0012000]
   [0xc0020000 - 0xc0020010]

which are 7 regions. For good measure, you can probably pad every one of 
them to the full 0x3000 MSRs they can span.

For MSRs that KVM actually handles in-kernel (others don't need to be 
allowed), the list shrinks to 5:

   [0x00000000 - 0x00001000]
   [0x40000000 - 0x40000200]
   [0x4b564d00 - 0x4b564e00]
   [0xc0000000 - 0xc0000200]
   [0xc0010000 - 0xc0012000]

Let's extend them a bit to make reasoning easier:

   [0x00000000 - 0x00003000]
   [0x40000000 - 0x40003000]
   [0x4b564d00 - 0x4b567000]
   [0xc0000000 - 0xc0003000]
   [0xc0010000 - 0xc0013000]

What are the odds that you will want to implicitly (without a new CAP, 
that would need user space adjustments anyway) have a random new MSR 
handled in-kernel with an identifier that is outside of those ranges?

I'm fairly confident that trends towards 0.

The only real downside I can see is that we just wasted ~8kb of RAM. 
Nothing I would really get hung up on though.
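
For illustration, those five padded ranges as {base, nmsrs} pairs (a
sketch only; with the bitmap-based allow list each entry would also
carry an all-ones bitmap of nmsrs bits):

static const struct { u32 base; u32 nmsrs; } allow_ranges[] = {
    { 0x00000000, 0x3000 },
    { 0x40000000, 0x3000 },
    { 0x4b564d00, 0x3000 },
    { 0xc0000000, 0x3000 },
    { 0xc0010000, 0x3000 },
};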

> Perhaps we should adopt allow/deny rules similar to those accepted by
> most firewalls. Instead of ports, we have MSR indices. Instead of
> protocols, we have READ, WRITE, or READ/WRITE. Suppose that we
> supported up to <n> rules of the form: {start index, end index, access
> modes, allow or deny}? Rules would be processed in the order given,
> and the first rule that matched a given access would take precedence.
> Finally, userspace could specify the default behavior (either allow or
> deny) for any MSR access that didn't match any of the rules.
> 
> Thoughts?

That wouldn't scale well if you want to allow all architecturally useful 
MSRs in a purely allow list fashion. You'd have to create hundreds of 
rules - or at least a few dozen if you combine contiguous ranges.

If you really desperately believe a deny list is a better fit for your 
use case, we could redesign the interface differently:

struct msr_set_accesslist {
#define MSR_ACCESSLIST_DEFAULT_ALLOW 0
#define MSR_ACCESSLIST_DEFAULT_DENY  1
     u32 flags;
     struct {
         u32 flags;
         u32 nmsrs; /* MSRs in bitmap */
         u32 base; /* first MSR address to bitmap */
         void *bitmap; /* pointer to bitmap, 1 means allow, 0 deny */
     } lists[10];
};

which means in your use case, you can do

u64 deny = 0;
struct msr_set_accesslist access = {
     .flags = MSR_ACCESSLIST_DEFAULT_ALLOW,
     .lists = {
         {
             .nmsrs = 1,
             .base = IA32_ARCH_CAPABILITIES,
             .bitmap = &deny,
         }, {
             .nmsrs = 1,
             .base = HV_X64_MSR_REFERENCE_TSC,
             .bitmap = &deny,
         }, {
             .nmsrs = 1,
             /* can probably be combined with the ones below? */
             .base = MSR_GOOGLE_TRUE_TIME,
             .bitmap = &deny,
         }, {
             .nmsrs = 1,
             .base = MSR_GOOGLE_FDR_TRACE,
             .bitmap = &deny,
         }, {
             .nmsrs = 1,
             .base = MSR_GOOGLE_HBI,
             .bitmap = &deny,
         },
     }
};

msr_set_accesslist(kvm_fd, &access);

while I can do the same dance as before, but with a single call rather 
than multiple ones.

What do you think?


Alex




* Re: [PATCH v3 05/12] KVM: x86: Add support for exiting to userspace on rdmsr or wrmsr
  2020-08-20 18:17   ` Jim Mattson
@ 2020-08-20 21:59     ` Alexander Graf
  2020-08-20 22:55       ` Jim Mattson
  0 siblings, 1 reply; 50+ messages in thread
From: Alexander Graf @ 2020-08-20 21:59 UTC (permalink / raw)
  To: Jim Mattson, Aaron Lewis; +Cc: Peter Shier, Oliver Upton, kvm list



On 20.08.20 20:17, Jim Mattson wrote:
> 
> On Tue, Aug 18, 2020 at 2:16 PM Aaron Lewis <aaronlewis@google.com> wrote:
>>
>> Add support for exiting to userspace on a rdmsr or wrmsr instruction if
>> the MSR being read from or written to is in the user_exit_msrs list.
>>
>> Signed-off-by: Aaron Lewis <aaronlewis@google.com>
>> ---
>>
>> v2 -> v3
>>
>>    - Refactored commit based on Alexander Graf's changes in the first commit
>>      in this series.  Changes made were:
>>        - Updated member 'inject_gp' to 'error' based on struct msr in kvm_run.
>>        - Move flag 'vcpu->kvm->arch.user_space_msr_enabled' out of
>>          kvm_msr_user_space() to allow it to work with both methods that bounce
>>          to userspace (msr list and #GP fallback).  Updated caller functions
>>          to account for this change.
>>        - trace_kvm_msr has been moved up and combine with a previous call in
>>          complete_emulated_msr() based on the suggestion made by Alexander
>>          Graf <graf@amazon.com>.
>>
>> ---
> 
>> @@ -1653,9 +1663,6 @@ static int kvm_msr_user_space(struct kvm_vcpu *vcpu, u32 index,
>>                                u32 exit_reason, u64 data,
>>                                int (*completion)(struct kvm_vcpu *vcpu))
>>   {
>> -       if (!vcpu->kvm->arch.user_space_msr_enabled)
>> -               return 0;
>> -
>>          vcpu->run->exit_reason = exit_reason;
>>          vcpu->run->msr.error = 0;
>>          vcpu->run->msr.pad[0] = 0;
>> @@ -1686,10 +1693,18 @@ int kvm_emulate_rdmsr(struct kvm_vcpu *vcpu)
>>          u64 data;
>>          int r;
>>
>> +       if (kvm_msr_user_exit(vcpu->kvm, ecx)) {
>> +               kvm_get_msr_user_space(vcpu, ecx);
>> +               /* Bounce to user space */
>> +               return 0;
>> +       }
>> +
>> +
>>          r = kvm_get_msr(vcpu, ecx, &data);
>>
>>          /* MSR read failed? See if we should ask user space */
>> -       if (r && kvm_get_msr_user_space(vcpu, ecx)) {
>> +       if (r && vcpu->kvm->arch.user_space_msr_enabled) {
>> +               kvm_get_msr_user_space(vcpu, ecx);
>>                  /* Bounce to user space */
>>                  return 0;
>>          }
> 
> The before and after bounce to userspace is unfortunate. If we can
> consolidate the allow/deny list checks at the top of kvm_get_msr(),
> and we can tell why kvm_get_msr() failed (e.g. -EPERM=disallowed,
> -ENOENT=unknown MSR, -EINVAL=illegal access), then we can eliminate
> the first bounce to userspace above. -EPERM would always go to
> userspace. -ENOENT would go to userspace if userspace asked to handle
> unknown MSRs. -EINVAL would go to userspace if userspace asked to
> handle all #GPs. (Yes; I'd still like to be able to distinguish
> between "unknown MSR" and "illegal value." Otherwise, it seems
> impossible for userspace to know how to proceed.)
> 
> (You may ask, "why would you get -EINVAL on a RDMSR?" This would be
> the case if you tried to read a write-only MSR, like IA32_FLUSH_CMD.)

Do we really need to do all of this dance of differentiating in kernel 
space between an exit that's there because user space asked for the exit 
and an MSR access that would just generate a #GP?

At the end of the day, user space *knows* which MSRs it asked to 
receive. It can filter for them super easily.
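A rough sketch of that filtering on the user space side (the
msr.index/msr.data field names and the two helpers are assumptions):

    case KVM_EXIT_X86_RDMSR:
            if (msr_in_my_exit_list(run->msr.index)) {
                    run->msr.data  = my_emulate_rdmsr(run->msr.index);
                    run->msr.error = 0;
            } else {
                    run->msr.error = 1;     /* let KVM inject the #GP */
            }
            break;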


Alex




* Re: [PATCH v3 07/12] KVM: x86: Ensure the MSR bitmap never clears userspace tracked MSRs
  2020-08-20  0:18     ` Aaron Lewis
@ 2020-08-20 22:04       ` Alexander Graf
  2020-08-20 22:35         ` Jim Mattson
  0 siblings, 1 reply; 50+ messages in thread
From: Alexander Graf @ 2020-08-20 22:04 UTC (permalink / raw)
  To: Aaron Lewis; +Cc: Jim Mattson, kvm list



On 20.08.20 02:18, Aaron Lewis wrote:
> 
> On Wed, Aug 19, 2020 at 8:26 AM Alexander Graf <graf@amazon.com> wrote:
>>
>>
>>
>> On 18.08.20 23:15, Aaron Lewis wrote:
>>>
>>> SDM volume 3: 24.6.9 "MSR-Bitmap Address" and APM volume 2: 15.11 "MSR
>>> intercepts" describe MSR permission bitmaps.  Permission bitmaps are
>>> used to control whether an execution of rdmsr or wrmsr will cause a
>>> vm exit.  For userspace tracked MSRs it is required they cause a vm
>>> exit, so the host is able to forward the MSR to userspace.  This change
>>> adds vmx/svm support to ensure the permission bitmap is properly set to
>>> cause a vm_exit to the host when rdmsr or wrmsr is used by one of the
>>> userspace tracked MSRs.  Also, to avoid repeatedly setting them,
>>> kvm_make_request() is used to coalesce these into a single call.
>>>
>>> Signed-off-by: Aaron Lewis <aaronlewis@google.com>
>>> Reviewed-by: Oliver Upton <oupton@google.com>
>>
>> This is incomplete, as it doesn't cover all of the x2apic registers.
>> There are also a few MSRs that IIRC are handled differently from this
>> logic, such as EFER.
>>
>> I'm really curious if this is worth the effort? I would be inclined to
>> say that MSRs that KVM has direct access for need special handling one
>> way or another.
>>
> 
> Can you please elaborate on this?  It was my understanding that the
> permission bitmap covers the x2apic registers.  Also, I’m not sure how

So x2apic MSR passthrough is configured specially:

 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kvm/vmx/vmx.c#n3796

and I think not handled by this patch?

> EFER is handled differently, but I see there is a separate
> conversation on that.

EFER is a really special beast in VT.

> This effort does seem worthwhile as it ensures userspace is able to
> manage the MSRs it is requesting, and will remain that way in the
> future.

I couldn't see why any of the passthrough MSRs are relevant to user 
space, but I tend to agree that it makes everything more consistent.


Alex




* Re: [PATCH v3 04/12] KVM: x86: Add ioctl for accepting a userspace provided MSR list
  2020-08-20 21:49       ` Alexander Graf
@ 2020-08-20 22:28         ` Jim Mattson
  0 siblings, 0 replies; 50+ messages in thread
From: Jim Mattson @ 2020-08-20 22:28 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Aaron Lewis, Peter Shier, Oliver Upton, kvm list

On Thu, Aug 20, 2020 at 2:49 PM Alexander Graf <graf@amazon.com> wrote:

> The only real downside I can see is that we just wasted ~8kb of RAM.
> Nothing I would really get hung up on though.

I also suspect that the MSR permission bitmap modifications are going
to be a bit more expensive with 4kb (6kb on AMD) of pertinent
allow-bitmaps than they would be with a few bytes of pertinent
deny-bitmaps.

> If you really desperately believe a deny list is a better fit for your
> use case, we could redesign the interface differently:
>
> struct msr_set_accesslist {
> #define MSR_ACCESSLIST_DEFAULT_ALLOW 0
> #define MSR_ACCESSLIST_DEFAULT_DENY  1
>      u32 flags;
>      struct {
>          u32 flags;
>          u32 nmsrs; /* MSRs in bitmap */
>          u32 base; /* first MSR address to bitmap */
>          void *bitmap; /* pointer to bitmap, 1 means allow, 0 deny */
>      } lists[10];
> };
>
> which means in your use case, you can do
>
> u64 deny = 0;
> struct msr_set_accesslist access = {
>      .flags = MSR_ACCESSLIST_DEFAULT_ALLOW,
>      .lists = {
>          {
>              .nmsrs = 1,
>              .base = IA32_ARCH_CAPABILITIES,
>              .bitmap = &deny,
>          }, {
>              .nmsrs = 1,
>              .base = HV_X64_MSR_REFERENCE_TSC,
>              .bitmap = &deny,
>          }, {
>              .nmsrs = 1,
>              /* can probably be combined with the ones below? */
>              .base = MSR_GOOGLE_TRUE_TIME,
>              .bitmap = &deny,
>          }, {
>              .nmsrs = 1,
>              .base = MSR_GOOGLE_FDR_TRACE,
>              .bitmap = &deny,
>          }, {
>              .nmsrs = 1,
>              .base = MSR_GOOGLE_HBI,
>              .bitmap = &deny,
>          },
>      }
> };
>
> msr_set_accesslist(kvm_fd, &access);
>
> while I can do the same dance as before, but with a single call rather
> than multiple ones.
>
> What do you think?

I like it. I think this suits our use case well.


* Re: [PATCH v3 07/12] KVM: x86: Ensure the MSR bitmap never clears userspace tracked MSRs
  2020-08-20 22:04       ` Alexander Graf
@ 2020-08-20 22:35         ` Jim Mattson
  2020-08-21 14:27           ` Aaron Lewis
  0 siblings, 1 reply; 50+ messages in thread
From: Jim Mattson @ 2020-08-20 22:35 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Aaron Lewis, kvm list

On Thu, Aug 20, 2020 at 3:04 PM Alexander Graf <graf@amazon.com> wrote:
>
>
>
> On 20.08.20 02:18, Aaron Lewis wrote:
> >
> > On Wed, Aug 19, 2020 at 8:26 AM Alexander Graf <graf@amazon.com> wrote:
> >>
> >>
> >>
> >> On 18.08.20 23:15, Aaron Lewis wrote:
> >>>
> >>> SDM volume 3: 24.6.9 "MSR-Bitmap Address" and APM volume 2: 15.11 "MSR
> >>> intercepts" describe MSR permission bitmaps.  Permission bitmaps are
> >>> used to control whether an execution of rdmsr or wrmsr will cause a
> >>> vm exit.  For userspace tracked MSRs it is required they cause a vm
> >>> exit, so the host is able to forward the MSR to userspace.  This change
> >>> adds vmx/svm support to ensure the permission bitmap is properly set to
> >>> cause a vm_exit to the host when rdmsr or wrmsr is used by one of the
> >>> userspace tracked MSRs.  Also, to avoid repeatedly setting them,
> >>> kvm_make_request() is used to coalesce these into a single call.
> >>>
> >>> Signed-off-by: Aaron Lewis <aaronlewis@google.com>
> >>> Reviewed-by: Oliver Upton <oupton@google.com>
> >>
> >> This is incomplete, as it doesn't cover all of the x2apic registers.
> >> There are also a few MSRs that IIRC are handled differently from this
> >> logic, such as EFER.
> >>
> >> I'm really curious if this is worth the effort? I would be inclined to
> >> say that MSRs that KVM has direct access for need special handling one
> >> way or another.
> >>
> >
> > Can you please elaborate on this?  It was my understanding that the
> > permission bitmap covers the x2apic registers.  Also, I’m not sure how
>
> So x2apic MSR passthrough is configured specially:
>
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kvm/vmx/vmx.c#n3796
>
> and I think not handled by this patch?

By happenstance only, I think, since there is also a call there to
vmx_disable_intercept_for_msr() for the TPR when x2APIC is enabled.


* Re: [PATCH v3 05/12] KVM: x86: Add support for exiting to userspace on rdmsr or wrmsr
  2020-08-20 21:59     ` Alexander Graf
@ 2020-08-20 22:55       ` Jim Mattson
  2020-08-21 17:58         ` Jim Mattson
  0 siblings, 1 reply; 50+ messages in thread
From: Jim Mattson @ 2020-08-20 22:55 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Aaron Lewis, Peter Shier, Oliver Upton, kvm list

On Thu, Aug 20, 2020 at 2:59 PM Alexander Graf <graf@amazon.com> wrote:

> Do we really need to do all of this dance of differentiating in kernel
> space between an exit that's there because user space asked for the exit
> and an MSR access that would just generate a #GP?
>
> At the end of the day, user space *knows* which MSRs it asked to
> receive. It can filter for them super easily.

If no one else has an opinion, I can let this go. :-)

However, to make the right decision in kvm_emulate_{rdmsr,wrmsr}
(without the unfortunate before and after checks that Aaron added),
kvm_{get,set}_msr should at least distinguish between "permission
denied" and "raise #GP," so I can provide a deny list without asking
for userspace exits on #GP.


* Re: [PATCH v3 07/12] KVM: x86: Ensure the MSR bitmap never clears userspace tracked MSRs
  2020-08-20 22:35         ` Jim Mattson
@ 2020-08-21 14:27           ` Aaron Lewis
  2020-08-21 16:07             ` Alexander Graf
  0 siblings, 1 reply; 50+ messages in thread
From: Aaron Lewis @ 2020-08-21 14:27 UTC (permalink / raw)
  To: Jim Mattson; +Cc: Alexander Graf, kvm list

On Thu, Aug 20, 2020 at 3:35 PM Jim Mattson <jmattson@google.com> wrote:
>
> On Thu, Aug 20, 2020 at 3:04 PM Alexander Graf <graf@amazon.com> wrote:
> >
> >
> >
> > On 20.08.20 02:18, Aaron Lewis wrote:
> > >
> > > On Wed, Aug 19, 2020 at 8:26 AM Alexander Graf <graf@amazon.com> wrote:
> > >>
> > >>
> > >>
> > >> On 18.08.20 23:15, Aaron Lewis wrote:
> > >>>
> > >>> SDM volume 3: 24.6.9 "MSR-Bitmap Address" and APM volume 2: 15.11 "MSR
> > >>> intercepts" describe MSR permission bitmaps.  Permission bitmaps are
> > >>> used to control whether an execution of rdmsr or wrmsr will cause a
> > >>> vm exit.  For userspace tracked MSRs it is required they cause a vm
> > >>> exit, so the host is able to forward the MSR to userspace.  This change
> > >>> adds vmx/svm support to ensure the permission bitmap is properly set to
> > >>> cause a vm_exit to the host when rdmsr or wrmsr is used by one of the
> > >>> userspace tracked MSRs.  Also, to avoid repeatedly setting them,
> > >>> kvm_make_request() is used to coalesce these into a single call.
> > >>>
> > >>> Signed-off-by: Aaron Lewis <aaronlewis@google.com>
> > >>> Reviewed-by: Oliver Upton <oupton@google.com>
> > >>
> > >> This is incomplete, as it doesn't cover all of the x2apic registers.
> > >> There are also a few MSRs that IIRC are handled differently from this
> > >> logic, such as EFER.
> > >>
> > >> I'm really curious if this is worth the effort? I would be inclined to
> > >> say that MSRs that KVM has direct access for need special handling one
> > >> way or another.
> > >>
> > >
> > > Can you please elaborate on this?  It was my understanding that the
> > > permission bitmap covers the x2apic registers.  Also, I’m not sure how
> >
> > So x2apic MSR passthrough is configured specially:
> >
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kvm/vmx/vmx.c#n3796
> >
> > and I think not handled by this patch?
>
> By happenstance only, I think, since there is also a call there to
> vmx_disable_intercept_for_msr() for the TPR when x2APIC is enabled.

If we want to be more explicit about it we could add
kvm_make_request(KVM_REQ_USER_MSR_UPDATE, vcpu) after the bitmap is
modified, but that doesn't seem to be necessary: as Jim pointed out,
there are calls to vmx_disable_intercept_for_msr() there which will
set the update request for us.  And we only have to worry about that
if the bitmap is cleared, which means MSR_BITMAP_MODE_X2APIC_APICV is
set, and that flag can only be set if MSR_BITMAP_MODE_X2APIC is set.
So, AFAICT, that is covered by my changes.


* Re: [PATCH v3 07/12] KVM: x86: Ensure the MSR bitmap never clears userspace tracked MSRs
  2020-08-21 14:27           ` Aaron Lewis
@ 2020-08-21 16:07             ` Alexander Graf
  2020-08-21 16:43               ` Aaron Lewis
  0 siblings, 1 reply; 50+ messages in thread
From: Alexander Graf @ 2020-08-21 16:07 UTC (permalink / raw)
  To: Aaron Lewis, Jim Mattson; +Cc: kvm list



On 21.08.20 16:27, Aaron Lewis wrote:
> 
> On Thu, Aug 20, 2020 at 3:35 PM Jim Mattson <jmattson@google.com> wrote:
>>
>> On Thu, Aug 20, 2020 at 3:04 PM Alexander Graf <graf@amazon.com> wrote:
>>>
>>>
>>>
>>> On 20.08.20 02:18, Aaron Lewis wrote:
>>>>
>>>> On Wed, Aug 19, 2020 at 8:26 AM Alexander Graf <graf@amazon.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 18.08.20 23:15, Aaron Lewis wrote:
>>>>>>
>>>>>> SDM volume 3: 24.6.9 "MSR-Bitmap Address" and APM volume 2: 15.11 "MSR
>>>>>> intercepts" describe MSR permission bitmaps.  Permission bitmaps are
>>>>>> used to control whether an execution of rdmsr or wrmsr will cause a
>>>>>> vm exit.  For userspace tracked MSRs it is required they cause a vm
>>>>>> exit, so the host is able to forward the MSR to userspace.  This change
>>>>>> adds vmx/svm support to ensure the permission bitmap is properly set to
>>>>>> cause a vm_exit to the host when rdmsr or wrmsr is used by one of the
>>>>>> userspace tracked MSRs.  Also, to avoid repeatedly setting them,
>>>>>> kvm_make_request() is used to coalesce these into a single call.
>>>>>>
>>>>>> Signed-off-by: Aaron Lewis <aaronlewis@google.com>
>>>>>> Reviewed-by: Oliver Upton <oupton@google.com>
>>>>>
>>>>> This is incomplete, as it doesn't cover all of the x2apic registers.
>>>>> There are also a few MSRs that IIRC are handled differently from this
>>>>> logic, such as EFER.
>>>>>
>>>>> I'm really curious if this is worth the effort? I would be inclined to
>>>>> say that MSRs that KVM has direct access for need special handling one
>>>>> way or another.
>>>>>
>>>>
>>>> Can you please elaborate on this?  It was my understanding that the
>>>> permission bitmap covers the x2apic registers.  Also, I’m not sure how
>>>
>>> So x2apic MSR passthrough is configured specially:
>>>
>>>
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kvm/vmx/vmx.c#n3796
>>>
>>> and I think not handled by this patch?
>>
>> By happenstance only, I think, since there is also a call there to
>> vmx_disable_intercept_for_msr() for the TPR when x2APIC is enabled.
> 
> If we want to be more explicit about it we could add
> kvm_make_request(KVM_REQ_USER_MSR_UPDATE, vcpu) after the bitmap is
> modified, but that doesn't seem to be necessary: as Jim pointed out,
> there are calls to vmx_disable_intercept_for_msr() there which will
> set the update request for us.  And we only have to worry about that
> if the bitmap is cleared, which means MSR_BITMAP_MODE_X2APIC_APICV is
> set, and that flag can only be set if MSR_BITMAP_MODE_X2APIC is set.
> So, AFAICT, that is covered by my changes.
> 

I don't understand - for most x2APIC MSRs, 
vmx_{en,dis}able_intercept_for_msr() never gets called, no?


Alex




* Re: [PATCH v3 07/12] KVM: x86: Ensure the MSR bitmap never clears userspace tracked MSRs
  2020-08-21 16:07             ` Alexander Graf
@ 2020-08-21 16:43               ` Aaron Lewis
  0 siblings, 0 replies; 50+ messages in thread
From: Aaron Lewis @ 2020-08-21 16:43 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Jim Mattson, kvm list

On Fri, Aug 21, 2020 at 9:08 AM Alexander Graf <graf@amazon.com> wrote:
>
>
>
> On 21.08.20 16:27, Aaron Lewis wrote:
> >
> > On Thu, Aug 20, 2020 at 3:35 PM Jim Mattson <jmattson@google.com> wrote:
> >>
> >> On Thu, Aug 20, 2020 at 3:04 PM Alexander Graf <graf@amazon.com> wrote:
> >>>
> >>>
> >>>
> >>> On 20.08.20 02:18, Aaron Lewis wrote:
> >>>>
> >>>> On Wed, Aug 19, 2020 at 8:26 AM Alexander Graf <graf@amazon.com> wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>> On 18.08.20 23:15, Aaron Lewis wrote:
> >>>>>>
> >>>>>> SDM volume 3: 24.6.9 "MSR-Bitmap Address" and APM volume 2: 15.11 "MSR
> >>>>>> intercepts" describe MSR permission bitmaps.  Permission bitmaps are
> >>>>>> used to control whether an execution of rdmsr or wrmsr will cause a
> >>>>>> vm exit.  For userspace tracked MSRs it is required they cause a vm
> >>>>>> exit, so the host is able to forward the MSR to userspace.  This change
> >>>>>> adds vmx/svm support to ensure the permission bitmap is properly set to
> >>>>>> cause a vm_exit to the host when rdmsr or wrmsr is used by one of the
> >>>>>> userspace tracked MSRs.  Also, to avoid repeatedly setting them,
> >>>>>> kvm_make_request() is used to coalesce these into a single call.
> >>>>>>
> >>>>>> Signed-off-by: Aaron Lewis <aaronlewis@google.com>
> >>>>>> Reviewed-by: Oliver Upton <oupton@google.com>
> >>>>>
> >>>>> This is incomplete, as it doesn't cover all of the x2apic registers.
> >>>>> There are also a few MSRs that IIRC are handled differently from this
> >>>>> logic, such as EFER.
> >>>>>
> >>>>> I'm really curious if this is worth the effort? I would be inclined to
> >>>>> say that MSRs that KVM has direct access for need special handling one
> >>>>> way or another.
> >>>>>
> >>>>
> >>>> Can you please elaborate on this?  It was my understanding that the
> >>>> permission bitmap covers the x2apic registers.  Also, I’m not sure how
> >>>
> >>> So x2apic MSR passthrough is configured specially:
> >>>
> >>>
> >>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kvm/vmx/vmx.c#n3796
> >>>
> >>> and I think not handled by this patch?
> >>
> >> By happenstance only, I think, since there is also a call there to
> >> vmx_disable_intercept_for_msr() for the TPR when x2APIC is enabled.
> >
> > If we want to be more explicit about it we could add
> > kvm_make_request(KVM_REQ_USER_MSR_UPDATE, vcpu) after the bitmap is
> > modified, but that doesn't seem to be necessary: as Jim pointed out,
> > there are calls to vmx_disable_intercept_for_msr() there which will
> > set the update request for us.  And we only have to worry about that
> > if the bitmap is cleared, which means MSR_BITMAP_MODE_X2APIC_APICV is
> > set, and that flag can only be set if MSR_BITMAP_MODE_X2APIC is set.
> > So, AFAICT, that is covered by my changes.
> >
>
> I don't understand - for most x2APIC MSRs,
> vmx_{en,dis}able_intercept_for_msr() never gets called, no?
>

Sorry, to be clear: we just need it to be called once for us to be
covered.  When it's invoked we set a flag, so on the next VM entry we
run over the list of MSRs userspace cares about and ensure they are
all set correctly.  So, if the x2APIC permission bitmaps are modified
directly and cleared, we just need vmx_disable_intercept_for_msr() to
be called at least once to tell us to run over the list and set the
bits for all MSRs userspace is tracking.
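
Roughly, the shape of the mechanism (sketch; the kvm_x86_ops member
name is an assumption, the rest matches this series):

    /* any change to a permission bitmap just raises the request ... */
    kvm_make_request(KVM_REQ_USER_MSR_UPDATE, vcpu);

    /* ... and the next VM entry services it once for all tracked MSRs */
    if (kvm_check_request(KVM_REQ_USER_MSR_UPDATE, vcpu)) {
            struct kvm_msr_list *msrs = vcpu->kvm->arch.user_exit_msrs;
            int i;

            for (i = 0; i < msrs->nmsrs; i++)
                    kvm_x86_ops.set_user_msr_intercept(vcpu,
                                                       msrs->indices[i]);
    }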


* Re: [PATCH v3 05/12] KVM: x86: Add support for exiting to userspace on rdmsr or wrmsr
  2020-08-20 22:55       ` Jim Mattson
@ 2020-08-21 17:58         ` Jim Mattson
  2020-08-24  1:35           ` Alexander Graf
  0 siblings, 1 reply; 50+ messages in thread
From: Jim Mattson @ 2020-08-21 17:58 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Aaron Lewis, Peter Shier, Oliver Upton, kvm list

On Thu, Aug 20, 2020 at 3:55 PM Jim Mattson <jmattson@google.com> wrote:
>
> On Thu, Aug 20, 2020 at 2:59 PM Alexander Graf <graf@amazon.com> wrote:
>
> > Do we really need to do all of this dance of differentiating in kernel
> > space between an exit that's there because user space asked for the exit
> > and an MSR access that would just generate a #GP?
> >
> > At the end of the day, user space *knows* which MSRs it asked to
> > receive. It can filter for them super easily.
>
> If no one else has an opinion, I can let this go. :-)
>
> However, to make the right decision in kvm_emulate_{rdmsr,wrmsr}
> (without the unfortunate before and after checks that Aaron added),
> kvm_{get,set}_msr should at least distinguish between "permission
> denied" and "raise #GP," so I can provide a deny list without asking
> for userspace exits on #GP.

Actually, I think this whole discussion is moot. You no longer need
the first ioctl (ask for a userspace exit on #GP). The allow/deny list
is sufficient. Moreover, the allow/deny list checks can be in
kvm_emulate_{rdmsr,wrmsr} before the call to kvm_{get,set}_msr, so we
needn't be concerned with distinguishable error values either.
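i.e. roughly (sketch; msr_allowed() stands in for whatever the list
lookup ends up being called):

    int kvm_emulate_rdmsr(struct kvm_vcpu *vcpu)
    {
            u32 ecx = kvm_rcx_read(vcpu);
            u64 data;

            if (!msr_allowed(vcpu->kvm, ecx)) {
                    kvm_get_msr_user_space(vcpu, ecx);
                    /* Bounce to user space */
                    return 0;
            }

            if (kvm_get_msr(vcpu, ecx, &data)) {
                    /* plain #GP, no error value to inspect */
                    kvm_inject_gp(vcpu, 0);
                    return 1;
            }

            kvm_rax_write(vcpu, data & -1u);
            kvm_rdx_write(vcpu, (data >> 32) & -1u);
            return kvm_skip_emulated_instruction(vcpu);
    }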


* Re: [PATCH v3 05/12] KVM: x86: Add support for exiting to userspace on rdmsr or wrmsr
  2020-08-21 17:58         ` Jim Mattson
@ 2020-08-24  1:35           ` Alexander Graf
  2020-08-24 17:23             ` Jim Mattson
  0 siblings, 1 reply; 50+ messages in thread
From: Alexander Graf @ 2020-08-24  1:35 UTC (permalink / raw)
  To: Jim Mattson; +Cc: Aaron Lewis, Peter Shier, Oliver Upton, kvm list



On 21.08.20 19:58, Jim Mattson wrote:
> 
> On Thu, Aug 20, 2020 at 3:55 PM Jim Mattson <jmattson@google.com> wrote:
>>
>> On Thu, Aug 20, 2020 at 2:59 PM Alexander Graf <graf@amazon.com> wrote:
>>
>>> Do we really need to do all of this dance of differentiating in kernel
>>> space between an exit that's there because user space asked for the exit
>>> and an MSR access that would just generate a #GP?
>>>
>>> At the end of the day, user space *knows* which MSRs it asked to
>>> receive. It can filter for them super easily.
>>
>> If no one else has an opinion, I can let this go. :-)
>>
>> However, to make the right decision in kvm_emulate_{rdmsr,wrmsr}
>> (without the unfortunate before and after checks that Aaron added),
>> kvm_{get,set}_msr should at least distinguish between "permission
>> denied" and "raise #GP," so I can provide a deny list without asking
>> for userspace exits on #GP.
> 
> Actually, I think this whole discussion is moot. You no longer need
> the first ioctl (ask for a userspace exit on #GP). The allow/deny list
> is sufficient. Moreover, the allow/deny list checks can be in
> kvm_emulate_{rdmsr,wrmsr} before the call to kvm_{get,set}_msr, so we
> needn't be concerned with distinguishable error values either.
> 

I also care about cases where I allow in-kernel handling, but for 
whatever reason there still would be a #GP injected into the guest. I 
want to record those events and be able to later have data that tells me
why something went wrong.

So yes, for your use case you do not care about the distinction between 
"deny MSR access" and "report invalid MSR access". However, I do care :).

My stance on this is again that it's trivial to handle a few invalid MSR 
#GPs from user space and just not report anything. It should come at 
almost negligible performance cost, no?

As for your argumentation above, we have a second call chain into 
kvm_{get,set}_msr from the x86 emulator which you'd also need to cover.

One thing we could do I guess is to add a parameter to ENABLE_CAP on 
KVM_CAP_X86_USER_SPACE_MSR so that it only bounces on certain return 
values, such as -ENOENT. I still fail to see cases where that's 
genuinely beneficial though.


Alex




* Re: [PATCH v3 05/12] KVM: x86: Add support for exiting to userspace on rdmsr or wrmsr
  2020-08-24  1:35           ` Alexander Graf
@ 2020-08-24 17:23             ` Jim Mattson
  2020-08-24 18:09               ` Alexander Graf
  0 siblings, 1 reply; 50+ messages in thread
From: Jim Mattson @ 2020-08-24 17:23 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Aaron Lewis, Peter Shier, Oliver Upton, kvm list

On Sun, Aug 23, 2020 at 6:35 PM Alexander Graf <graf@amazon.com> wrote:
>
>
>
> On 21.08.20 19:58, Jim Mattson wrote:
> >
> > On Thu, Aug 20, 2020 at 3:55 PM Jim Mattson <jmattson@google.com> wrote:
> >>
> >> On Thu, Aug 20, 2020 at 2:59 PM Alexander Graf <graf@amazon.com> wrote:
> >>
> >>> Do we really need to do all of this dance of differentiating in kernel
> >>> space between an exit that's there because user space asked for the exit
> >>> and an MSR access that would just generate a #GP?
> >>>
> >>> At the end of the day, user space *knows* which MSRs it asked to
> >>> receive. It can filter for them super easily.
> >>
> >> If no one else has an opinion, I can let this go. :-)
> >>
> >> However, to make the right decision in kvm_emulate_{rdmsr,wrmsr}
> >> (without the unfortunate before and after checks that Aaron added),
> >> kvm_{get,set}_msr should at least distinguish between "permission
> >> denied" and "raise #GP," so I can provide a deny list without asking
> >> for userspace exits on #GP.
> >
> > Actually, I think this whole discussion is moot. You no longer need
> > the first ioctl (ask for a userspace exit on #GP). The allow/deny list
> > is sufficient. Moreover, the allow/deny list checks can be in
> > kvm_emulate_{rdmsr,wrmsr} before the call to kvm_{get,set}_msr, so we
> > needn't be concerned with distinguishable error values either.
> >
>
> I also care about cases where I allow in-kernel handling, but for
> whatever reason there still would be a #GP injected into the guest. I
> want to record those events and be able to later have data that tells me
> why something went wrong.
>
> So yes, for your use case you do not care about the distinction between
> "deny MSR access" and "report invalid MSR access". However, I do care :).

In that case, I'm going to continue to hold a hard line on the
distinction between a #GP for an invalid MSR access and the #GP for an
unknown MSR. If, for instance, you wanted to implement ignore_msrs in
userspace, as you've proposed in the past, this would be extremely
helpful. Without it, userspace gets an exit because (1) the MSR access
isn't in the allow list, (2) the MSR access is invalid, or (3) the MSR
is unknown to kvm. As you've pointed out, it is easy for userspace to
distinguish (1) from the others, since it provided the allow/deny list
in the first place. But how do you distinguish (2) from (3) without
replicating the logic in the kernel?
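
With distinguishable error codes from kvm_get_msr(), the whole thing
collapses into one dispatch after the call (sketch, reusing the
-EPERM/-ENOENT/-EINVAL mapping from earlier in the thread):

    r = kvm_get_msr(vcpu, ecx, &data);
    switch (r) {
    case -EPERM:    /* (1) denied by the allow/deny list: always bounce */
            kvm_get_msr_user_space(vcpu, ecx);
            return 0;
    case -EINVAL:   /* (2) invalid access, e.g. reading a write-only MSR */
    case -ENOENT:   /* (3) unknown to kvm */
            if (vcpu->kvm->arch.user_space_msr_enabled) {
                    kvm_get_msr_user_space(vcpu, ecx);
                    return 0;
            }
            kvm_inject_gp(vcpu, 0);
            return 1;
    }
    /* r == 0: success */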

> My stance on this is again that it's trivial to handle a few invalid MSR
> #GPs from user space and just not report anything. It should come at
> almost negligible performance cost, no?

Yes, the performance cost should be negligible, but what is the point?
We're trying to design a good API here, aren't we?

> As for your argumentation above, we have a second call chain into
> kvm_{get,set}_msr from the x86 emulator which you'd also need to cover.
>
> One thing we could do I guess is to add a parameter to ENABLE_CAP on
> KVM_CAP_X86_USER_SPACE_MSR so that it only bounces on certain return
> values, such as -ENOENT. I still fail to see cases where that's
> genuinely beneficial though.

I'd like to see two completely independent APIs, so that I can just
request a bounce on -EPERM through a deny list.  I think it's useful
to distinguish between -ENOENT and -EINVAL, but I have no issues with
both causing an exit to userspace, if userspace has requested exits on
MSR #GPs.


* Re: [PATCH v3 05/12] KVM: x86: Add support for exiting to userspace on rdmsr or wrmsr
  2020-08-24 17:23             ` Jim Mattson
@ 2020-08-24 18:09               ` Alexander Graf
  2020-08-24 18:34                 ` Jim Mattson
  0 siblings, 1 reply; 50+ messages in thread
From: Alexander Graf @ 2020-08-24 18:09 UTC (permalink / raw)
  To: Jim Mattson; +Cc: Aaron Lewis, Peter Shier, Oliver Upton, kvm list



On 24.08.20 19:23, Jim Mattson wrote:
> 
> On Sun, Aug 23, 2020 at 6:35 PM Alexander Graf <graf@amazon.com> wrote:
>>
>>
>>
>> On 21.08.20 19:58, Jim Mattson wrote:
>>>
>>> On Thu, Aug 20, 2020 at 3:55 PM Jim Mattson <jmattson@google.com> wrote:
>>>>
>>>> On Thu, Aug 20, 2020 at 2:59 PM Alexander Graf <graf@amazon.com> wrote:
>>>>
>>>>> Do we really need to do all of this dance of differentiating in kernel
>>>>> space between an exit that's there because user space asked for the exit
>>>>> and an MSR access that would just generate a #GP?
>>>>>
>>>>> At the end of the day, user space *knows* which MSRs it asked to
>>>>> receive. It can filter for them super easily.
>>>>
>>>> If no one else has an opinion, I can let this go. :-)
>>>>
>>>> However, to make the right decision in kvm_emulate_{rdmsr,wrmsr}
>>>> (without the unfortunate before and after checks that Aaron added),
>>>> kvm_{get,set}_msr should at least distinguish between "permission
>>>> denied" and "raise #GP," so I can provide a deny list without asking
>>>> for userspace exits on #GP.
>>>
>>> Actually, I think this whole discussion is moot. You no longer need
>>> the first ioctl (ask for a userspace exit on #GP). The allow/deny list
>>> is sufficient. Moreover, the allow/deny list checks can be in
>>> kvm_emulate_{rdmsr,wrmsr} before the call to kvm_{get,set}_msr, so we
>>> needn't be concerned with distinguishable error values either.
>>>
>>
>> I also care about cases where I allow in-kernel handling, but for
>> whatever reason there still would be a #GP injected into the guest. I
>> want to record those events and be able to later have data that tells me
>> why something went wrong.
>>
>> So yes, for your use case you do not care about the distinction between
>> "deny MSR access" and "report invalid MSR access". However, I do care :).
> 
> In that case, I'm going to continue to hold a hard line on the
> distinction between a #GP for an invalid MSR access and the #GP for an
> unknown MSR. If, for instance, you wanted to implement ignore_msrs in
> userspace, as you've proposed in the past, this would be extremely
> helpful. Without it, userspace gets an exit because (1) the MSR access
> isn't in the allow list, (2) the MSR access is invalid, or (3) the MSR
> is unknown to kvm. As you've pointed out, it is easy for userspace to
> distinguish (1) from the others, since it provided the allow/deny list
> in the first place. But how do you distinguish (2) from (3) without
> replicating the logic in the kernel?
> 
>> My stance on this is again that it's trivial to handle a few invalid MSR
>> #GPs from user space and just not report anything. It should come at
>> almost negligible performance cost, no?
> 
> Yes, the performance cost should be negligible, but what is the point?
> We're trying to design a good API here, aren't we?
> 
>> As for your argumentation above, we have a second call chain into
>> kvm_{get,set}_msr from the x86 emulator which you'd also need to cover.
>>
>> One thing we could do I guess is to add a parameter to ENABLE_CAP on
>> KVM_CAP_X86_USER_SPACE_MSR so that it only bounces on certain return
>> values, such as -ENOENT. I still fail to see cases where that's
>> genuinely beneficial though.
> 
> I'd like to see two completely independent APIs, so that I can just
> request a bounce on -EPERM through a deny list.  I think it's useful

Where would that bounce to? Which user space event does that trigger? 
Yet another one? Wouldn't 4 exit reasons just for MSR traps be a bit 
much? :)

> to distinguish between -ENOENT and -EINVAL, but I have no issues with
> both causing an exit to userspace, if userspace has requested exits on
> MSR #GPs.

So imagine we took the first argument to ENABLE_CAP as a filter:

   (1<<0) REPORT_ENOENT
   (1<<1) REPORT_EINVAL
   (1<<2) REPORT_EPERM
   (1<<31) REPORT_ANY

Then we also add the reason to the kvm_run exit response and user space 
can differentiate easily between the different events.
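
From user space that could then look roughly like this (the REPORT_*
names are from the proposal above; msr.index and msr.reason are the
proposed new fields, and handle_rdmsr() is a placeholder):

    struct kvm_enable_cap cap = {
            .cap = KVM_CAP_X86_USER_SPACE_MSR,
            .args[0] = REPORT_ENOENT | REPORT_EINVAL, /* no -EPERM exits */
    };

    ioctl(vm_fd, KVM_ENABLE_CAP, &cap);

    /* later, in the run loop */
    if (run->exit_reason == KVM_EXIT_X86_RDMSR)
            handle_rdmsr(run->msr.index, run->msr.reason);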


Alex




* Re: [PATCH v3 05/12] KVM: x86: Add support for exiting to userspace on rdmsr or wrmsr
  2020-08-24 18:09               ` Alexander Graf
@ 2020-08-24 18:34                 ` Jim Mattson
  0 siblings, 0 replies; 50+ messages in thread
From: Jim Mattson @ 2020-08-24 18:34 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Aaron Lewis, Peter Shier, Oliver Upton, kvm list

On Mon, Aug 24, 2020 at 11:09 AM Alexander Graf <graf@amazon.com> wrote:
>
>
>
> On 24.08.20 19:23, Jim Mattson wrote:
> >
> > On Sun, Aug 23, 2020 at 6:35 PM Alexander Graf <graf@amazon.com> wrote:
> >>
> >>
> >>
> >> On 21.08.20 19:58, Jim Mattson wrote:
> >>>
> >>> On Thu, Aug 20, 2020 at 3:55 PM Jim Mattson <jmattson@google.com> wrote:
> >>>>
> >>>> On Thu, Aug 20, 2020 at 2:59 PM Alexander Graf <graf@amazon.com> wrote:
> >>>>
> >>>>> Do we really need to do all of this dance of differentiating in kernel
> >>>>> space between an exit that's there because user space asked for the exit
> >>>>> and an MSR access that would just generate a #GP?
> >>>>>
> >>>>> At the end of the day, user space *knows* which MSRs it asked to
> >>>>> receive. It can filter for them super easily.
> >>>>
> >>>> If no one else has an opinion, I can let this go. :-)
> >>>>
> >>>> However, to make the right decision in kvm_emulate_{rdmsr,wrmsr}
> >>>> (without the unfortunate before and after checks that Aaron added),
> >>>> kvm_{get,set}_msr should at least distinguish between "permission
> >>>> denied" and "raise #GP," so I can provide a deny list without asking
> >>>> for userspace exits on #GP.
> >>>
> >>> Actually, I think this whole discussion is moot. You no longer need
> >>> the first ioctl (ask for a userspace exit on #GP). The allow/deny list
> >>> is sufficient. Moreover, the allow/deny list checks can be in
> >>> kvm_emulate_{rdmsr,wrmsr} before the call to kvm_{get,set}_msr, so we
> >>> needn't be concerned with distinguishable error values either.
> >>>
> >>
> >> I also care about cases where I allow in-kernel handling, but for
> >> whatever reason there still would be a #GP injected into the guest. I
> >> want to record those events and be able to later have data that tells me
> >> why something went wrong.
> >>
> >> So yes, for your use case you do not care about the distinction between
> >> "deny MSR access" and "report invalid MSR access". However, I do care :).
> >
> > In that case, I'm going to continue to hold a hard line on the
> > distinction between a #GP for an invalid MSR access and the #GP for an
> > unknown MSR. If, for instance, you wanted to implement ignore_msrs in
> > userspace, as you've proposed in the past, this would be extremely
> > helpful. Without it, userspace gets an exit because (1) the MSR access
> > isn't in the allow list, (2) the MSR access is invalid, or (3) the MSR
> > is unknown to kvm. As you've pointed out, it is easy for userspace to
> > distinguish (1) from the others, since it provided the allow/deny list
> > in the first place. But how do you distinguish (2) from (3) without
> > replicating the logic in the kernel?
> >
> >> My stance on this is again that it's trivial to handle a few invalid MSR
> >> #GPs from user space and just not report anything. It should come at
> >> almost negligible performance cost, no?
> >
> > Yes, the performance cost should be negligible, but what is the point?
> > We're trying to design a good API here, aren't we?
> >
> >> As for your argumentation above, we have a second call chain into
> >> kvm_{get,set}_msr from the x86 emulator which you'd also need to cover.
> >>
> >> One thing we could do I guess is to add a parameter to ENABLE_CAP on
> >> KVM_CAP_X86_USER_SPACE_MSR so that it only bounces on certain return
> >> values, such as -ENOENT. I still fail to see cases where that's
> >> genuinely beneficial though.
> >
> > I'd like to see two completely independent APIs, so that I can just
> > request a bounce on -EPERM through a deny list.  I think it's useful
>
> Where would that bounce to? Which user space event does that trigger?
> Yet another one? Wouldn't 4 exit reasons just for MSR traps be a bit
> much? :)

All of the exits are either KVM_EXIT_X86_RDMSR or KVM_EXIT_X86_WRMSR.
Or, we could put the direction in the msr struct and just have one
exit reason.

> > to distinguish between -ENOENT and -EINVAL, but I have no issues with
> > both causing an exit to userspace, if userspace has requested exits on
> > MSR #GPs.
>
> So imagine we took the first argument to ENABLE_CAP as a filter:
>
>    (1<<0) REPORT_ENOENT
>    (1<<1) REPORT_EINVAL
>    (1<<2) REPORT_EPERM
>    (1<<31) REPORT_ANY
>
> Then we also add the reason to the kvm_run exit response and user space
> can differentiate easily between the different events.

I think this works well. I still have to call both APIs to satisfy my
use case, but I'm willing to cave on that request. (I just realized
that there is a very good use case for an allow/deny list *without*
exits to userspace: prohibiting kvm from doing cross-vendor MSR
emulation.)


* Re: [PATCH v3 07/12] KVM: x86: Ensure the MSR bitmap never clears userspace tracked MSRs
  2020-08-18 21:15 ` [PATCH v3 07/12] KVM: x86: Ensure the MSR bitmap never clears " Aaron Lewis
@ 2020-08-26 15:48     ` kernel test robot
  2020-08-19  1:12     ` kernel test robot
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 50+ messages in thread
From: kernel test robot @ 2020-08-26 15:48 UTC (permalink / raw)
  To: Aaron Lewis, jmattson, graf
  Cc: kbuild-all, clang-built-linux, pshier, oupton, kvm, Aaron Lewis

[-- Attachment #1: Type: text/plain, Size: 2403 bytes --]

Hi Aaron,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on kvm/linux-next]
[also build test WARNING on v5.9-rc2 next-20200826]
[cannot apply to kvms390/next vhost/linux-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Aaron-Lewis/Allow-userspace-to-manage-MSRs/20200819-051903
base:   https://git.kernel.org/pub/scm/virt/kvm/kvm.git linux-next
config: x86_64-randconfig-a001-20200826 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 7cfcecece0e0430937cf529ce74d3a071a4dedc6)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install x86_64 cross compiling tool for clang build
        # apt-get install binutils-x86-64-linux-gnu
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> arch/x86/kvm/vmx/vmx.c:3823:6: warning: no previous prototype for function 'vmx_set_user_msr_intercept' [-Wmissing-prototypes]
   void vmx_set_user_msr_intercept(struct kvm_vcpu *vcpu, u32 msr)
        ^
   arch/x86/kvm/vmx/vmx.c:3823:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   void vmx_set_user_msr_intercept(struct kvm_vcpu *vcpu, u32 msr)
   ^
   static 
   1 warning generated.

# https://github.com/0day-ci/linux/commit/e78e2c7f2ae3e9e6be9768f1616b043406ae24dd
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Aaron-Lewis/Allow-userspace-to-manage-MSRs/20200819-051903
git checkout e78e2c7f2ae3e9e6be9768f1616b043406ae24dd
vim +/vmx_set_user_msr_intercept +3823 arch/x86/kvm/vmx/vmx.c

  3822	
> 3823	void vmx_set_user_msr_intercept(struct kvm_vcpu *vcpu, u32 msr)
  3824	{
  3825		vmx_enable_intercept_for_msr(vcpu, msr, MSR_TYPE_RW);
  3826	}
  3827	


[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 35987 bytes --]


* Re: [PATCH v3 02/12] KVM: x86: Introduce allow list for MSR emulation
  2020-08-18 21:15 ` [PATCH v3 02/12] KVM: x86: Introduce allow list for MSR emulation Aaron Lewis
  2020-08-19  8:53   ` Alexander Graf
@ 2020-08-31 10:39     ` Dan Carpenter
  1 sibling, 0 replies; 50+ messages in thread
From: Dan Carpenter @ 2020-08-31 10:39 UTC (permalink / raw)
  To: kbuild, Aaron Lewis, jmattson, graf
  Cc: lkp, kbuild-all, pshier, oupton, kvm, Aaron Lewis, KarimAllah Ahmed

[-- Attachment #1: Type: text/plain, Size: 6537 bytes --]

Hi Aaron,

url:    https://github.com/0day-ci/linux/commits/Aaron-Lewis/Allow-userspace-to-manage-MSRs/20200819-051903
base:   https://git.kernel.org/pub/scm/virt/kvm/kvm.git linux-next
config: x86_64-randconfig-m001-20200827 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

smatch warnings:
arch/x86/kvm/x86.c:5248 kvm_vm_ioctl_add_msr_allowlist() error: 'bitmap' dereferencing possible ERR_PTR()

# https://github.com/0day-ci/linux/commit/107c87325cf461b7b1bd07bb6ddbaf808a8d8a2a
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Aaron-Lewis/Allow-userspace-to-manage-MSRs/20200819-051903
git checkout 107c87325cf461b7b1bd07bb6ddbaf808a8d8a2a
vim +/bitmap +5248 arch/x86/kvm/x86.c

107c87325cf461 Aaron Lewis 2020-08-18  5181  static int kvm_vm_ioctl_add_msr_allowlist(struct kvm *kvm, void __user *argp)
107c87325cf461 Aaron Lewis 2020-08-18  5182  {
107c87325cf461 Aaron Lewis 2020-08-18  5183  	struct msr_bitmap_range *ranges = kvm->arch.msr_allowlist_ranges;
107c87325cf461 Aaron Lewis 2020-08-18  5184  	struct kvm_msr_allowlist __user *user_msr_allowlist = argp;
107c87325cf461 Aaron Lewis 2020-08-18  5185  	struct msr_bitmap_range range;
107c87325cf461 Aaron Lewis 2020-08-18  5186  	struct kvm_msr_allowlist kernel_msr_allowlist;
107c87325cf461 Aaron Lewis 2020-08-18  5187  	unsigned long *bitmap = NULL;
107c87325cf461 Aaron Lewis 2020-08-18  5188  	size_t bitmap_size;
107c87325cf461 Aaron Lewis 2020-08-18  5189  	int r = 0;
107c87325cf461 Aaron Lewis 2020-08-18  5190  
107c87325cf461 Aaron Lewis 2020-08-18  5191  	if (copy_from_user(&kernel_msr_allowlist, user_msr_allowlist,
107c87325cf461 Aaron Lewis 2020-08-18  5192  			   sizeof(kernel_msr_allowlist))) {
107c87325cf461 Aaron Lewis 2020-08-18  5193  		r = -EFAULT;
107c87325cf461 Aaron Lewis 2020-08-18  5194  		goto out;
107c87325cf461 Aaron Lewis 2020-08-18  5195  	}
107c87325cf461 Aaron Lewis 2020-08-18  5196  
107c87325cf461 Aaron Lewis 2020-08-18  5197  	bitmap_size = BITS_TO_LONGS(kernel_msr_allowlist.nmsrs) * sizeof(long);
                                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
On 32 bit systems BITS_TO_LONGS() can overflow if
kernel_msr_allowlist.nmsrs is larger than ULONG_MAX - BITS_PER_LONG.
In that case bitmap_size is zero.

107c87325cf461 Aaron Lewis 2020-08-18  5198  	if (bitmap_size > KVM_MSR_ALLOWLIST_MAX_LEN) {
107c87325cf461 Aaron Lewis 2020-08-18  5199  		r = -EINVAL;
107c87325cf461 Aaron Lewis 2020-08-18  5200  		goto out;
107c87325cf461 Aaron Lewis 2020-08-18  5201  	}
107c87325cf461 Aaron Lewis 2020-08-18  5202  
107c87325cf461 Aaron Lewis 2020-08-18  5203  	bitmap = memdup_user(user_msr_allowlist->bitmap, bitmap_size);
107c87325cf461 Aaron Lewis 2020-08-18  5204  	if (IS_ERR(bitmap)) {
107c87325cf461 Aaron Lewis 2020-08-18  5205  		r = PTR_ERR(bitmap);
107c87325cf461 Aaron Lewis 2020-08-18  5206  		goto out;
                                                        ^^^^^^^^
"out" is always a vague label name.  It's better style to return
directly instead of doing a complicated no-op.

	if (IS_ERR(bitmap))
		return PTR_ERR(bitmap);

107c87325cf461 Aaron Lewis 2020-08-18  5207  	}
107c87325cf461 Aaron Lewis 2020-08-18  5208  
107c87325cf461 Aaron Lewis 2020-08-18  5209  	range = (struct msr_bitmap_range) {
107c87325cf461 Aaron Lewis 2020-08-18  5210  		.flags = kernel_msr_allowlist.flags,
107c87325cf461 Aaron Lewis 2020-08-18  5211  		.base = kernel_msr_allowlist.base,
107c87325cf461 Aaron Lewis 2020-08-18  5212  		.nmsrs = kernel_msr_allowlist.nmsrs,
107c87325cf461 Aaron Lewis 2020-08-18  5213  		.bitmap = bitmap,

In case of overflow, "bitmap" is ZERO_SIZE_PTR (i.e. (void *)16) and
.nmsrs is a very high number.

107c87325cf461 Aaron Lewis 2020-08-18  5214  	};
107c87325cf461 Aaron Lewis 2020-08-18  5215  
107c87325cf461 Aaron Lewis 2020-08-18  5216  	if (range.flags & ~(KVM_MSR_ALLOW_READ | KVM_MSR_ALLOW_WRITE)) {
107c87325cf461 Aaron Lewis 2020-08-18  5217  		r = -EINVAL;
107c87325cf461 Aaron Lewis 2020-08-18  5218  		goto out;
107c87325cf461 Aaron Lewis 2020-08-18  5219  	}
107c87325cf461 Aaron Lewis 2020-08-18  5220  
107c87325cf461 Aaron Lewis 2020-08-18  5221  	/*
107c87325cf461 Aaron Lewis 2020-08-18  5222  	 * Protect from concurrent calls to this function that could trigger
107c87325cf461 Aaron Lewis 2020-08-18  5223  	 * a TOCTOU violation on kvm->arch.msr_allowlist_ranges_count.
107c87325cf461 Aaron Lewis 2020-08-18  5224  	 */
107c87325cf461 Aaron Lewis 2020-08-18  5225  	mutex_lock(&kvm->lock);
107c87325cf461 Aaron Lewis 2020-08-18  5226  
107c87325cf461 Aaron Lewis 2020-08-18  5227  	if (kvm->arch.msr_allowlist_ranges_count >=
107c87325cf461 Aaron Lewis 2020-08-18  5228  	    ARRAY_SIZE(kvm->arch.msr_allowlist_ranges)) {
107c87325cf461 Aaron Lewis 2020-08-18  5229  		r = -E2BIG;
107c87325cf461 Aaron Lewis 2020-08-18  5230  		goto out_locked;
107c87325cf461 Aaron Lewis 2020-08-18  5231  	}
107c87325cf461 Aaron Lewis 2020-08-18  5232  
107c87325cf461 Aaron Lewis 2020-08-18  5233  	if (msr_range_overlaps(kvm, &range)) {
107c87325cf461 Aaron Lewis 2020-08-18  5234  		r = -EINVAL;
107c87325cf461 Aaron Lewis 2020-08-18  5235  		goto out_locked;
107c87325cf461 Aaron Lewis 2020-08-18  5236  	}
107c87325cf461 Aaron Lewis 2020-08-18  5237  
107c87325cf461 Aaron Lewis 2020-08-18  5238  	/* Everything ok, add this range identifier to our global pool */
107c87325cf461 Aaron Lewis 2020-08-18  5239  	ranges[kvm->arch.msr_allowlist_ranges_count] = range;
107c87325cf461 Aaron Lewis 2020-08-18  5240  	/* Make sure we filled the array before we tell anyone to walk it */
107c87325cf461 Aaron Lewis 2020-08-18  5241  	smp_wmb();
107c87325cf461 Aaron Lewis 2020-08-18  5242  	kvm->arch.msr_allowlist_ranges_count++;
107c87325cf461 Aaron Lewis 2020-08-18  5243  
107c87325cf461 Aaron Lewis 2020-08-18  5244  out_locked:
107c87325cf461 Aaron Lewis 2020-08-18  5245  	mutex_unlock(&kvm->lock);
107c87325cf461 Aaron Lewis 2020-08-18  5246  out:
107c87325cf461 Aaron Lewis 2020-08-18  5247  	if (r)
107c87325cf461 Aaron Lewis 2020-08-18 @5248  		kfree(bitmap);
107c87325cf461 Aaron Lewis 2020-08-18  5249  
107c87325cf461 Aaron Lewis 2020-08-18  5250  	return r;
107c87325cf461 Aaron Lewis 2020-08-18  5251  }
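
Both findings could be addressed along these lines (sketch; assumes
KVM_MSR_ALLOWLIST_MAX_LEN is a multiple of sizeof(long), so the bound
on nmsrs subsumes the old bitmap_size check):

	/* bound nmsrs up front so BITS_TO_LONGS() cannot overflow on 32 bit */
	if (!kernel_msr_allowlist.nmsrs ||
	    kernel_msr_allowlist.nmsrs > KVM_MSR_ALLOWLIST_MAX_LEN * BITS_PER_BYTE)
		return -EINVAL;
	bitmap_size = BITS_TO_LONGS(kernel_msr_allowlist.nmsrs) * sizeof(long);

	bitmap = memdup_user(user_msr_allowlist->bitmap, bitmap_size);
	if (IS_ERR(bitmap))
		return PTR_ERR(bitmap);	/* never kfree() an ERR_PTR */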


[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 43464 bytes --]

^ permalink raw reply	[flat|nested] 50+ messages in thread
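
To make the 32-bit wraparound concrete, here is a small userspace model
of the arithmetic (illustrative only: BITS_TO_LONGS() and the kernel
types are open-coded with fixed 32-bit widths):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	uint32_t nmsrs = 0xffffffffu;	/* userspace-controlled */
	uint32_t bits_per_long = 32;	/* 32-bit unsigned long */

	/* BITS_TO_LONGS(n) rounds up: (n + BITS_PER_LONG - 1) / BITS_PER_LONG.
	 * The addition is what wraps: 0xffffffff + 31 becomes 30. */
	uint32_t longs = (nmsrs + bits_per_long - 1) / bits_per_long;
	uint32_t bitmap_size = longs * 4;	/* sizeof(long) == 4 */

	/* Prints 0: a zero size passes "> KVM_MSR_ALLOWLIST_MAX_LEN", and
	 * memdup_user(ptr, 0) then returns ZERO_SIZE_PTR ((void *)16),
	 * which IS_ERR() does not catch. */
	printf("bitmap_size = %u\n", (unsigned int)bitmap_size);
	return 0;
}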

* Re: [PATCH v3 02/12] KVM: x86: Introduce allow list for MSR emulation
  2020-08-31 10:39     ` Dan Carpenter
@ 2020-09-01 19:13     ` Alexander Graf
  2020-09-02  7:31         ` Dan Carpenter
  -1 siblings, 1 reply; 50+ messages in thread
From: Alexander Graf @ 2020-09-01 19:13 UTC (permalink / raw)
  To: Dan Carpenter, kbuild, Aaron Lewis, jmattson
  Cc: lkp, kbuild-all, pshier, oupton, kvm, KarimAllah Ahmed



On 31.08.20 12:39, Dan Carpenter wrote:
> 
> Hi Aaron,
> 
> url:    https://github.com/0day-ci/linux/commits/Aaron-Lewis/Allow-userspace-to-manage-MSRs/20200819-051903
> base:   https://git.kernel.org/pub/scm/virt/kvm/kvm.git linux-next
> config: x86_64-randconfig-m001-20200827 (attached as .config)
> compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
> 
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

Thanks a bunch for looking at this! I'd squash in the change with the 
actual patch as it's tiny, so I'm not sure how attribution would work in 
that case.

> 
> smatch warnings:
> arch/x86/kvm/x86.c:5248 kvm_vm_ioctl_add_msr_allowlist() error: 'bitmap' dereferencing possible ERR_PTR()
> 
> # https://github.com/0day-ci/linux/commit/107c87325cf461b7b1bd07bb6ddbaf808a8d8a2a
> git remote add linux-review https://github.com/0day-ci/linux
> git fetch --no-tags linux-review Aaron-Lewis/Allow-userspace-to-manage-MSRs/20200819-051903
> git checkout 107c87325cf461b7b1bd07bb6ddbaf808a8d8a2a
> vim +/bitmap +5248 arch/x86/kvm/x86.c
> 
> 107c87325cf461 Aaron Lewis 2020-08-18  5181  static int kvm_vm_ioctl_add_msr_allowlist(struct kvm *kvm, void __user *argp)
> 107c87325cf461 Aaron Lewis 2020-08-18  5182  {
> 107c87325cf461 Aaron Lewis 2020-08-18  5183     struct msr_bitmap_range *ranges = kvm->arch.msr_allowlist_ranges;
> 107c87325cf461 Aaron Lewis 2020-08-18  5184     struct kvm_msr_allowlist __user *user_msr_allowlist = argp;
> 107c87325cf461 Aaron Lewis 2020-08-18  5185     struct msr_bitmap_range range;
> 107c87325cf461 Aaron Lewis 2020-08-18  5186     struct kvm_msr_allowlist kernel_msr_allowlist;
> 107c87325cf461 Aaron Lewis 2020-08-18  5187     unsigned long *bitmap = NULL;
> 107c87325cf461 Aaron Lewis 2020-08-18  5188     size_t bitmap_size;
> 107c87325cf461 Aaron Lewis 2020-08-18  5189     int r = 0;
> 107c87325cf461 Aaron Lewis 2020-08-18  5190
> 107c87325cf461 Aaron Lewis 2020-08-18  5191     if (copy_from_user(&kernel_msr_allowlist, user_msr_allowlist,
> 107c87325cf461 Aaron Lewis 2020-08-18  5192                        sizeof(kernel_msr_allowlist))) {
> 107c87325cf461 Aaron Lewis 2020-08-18  5193             r = -EFAULT;
> 107c87325cf461 Aaron Lewis 2020-08-18  5194             goto out;
> 107c87325cf461 Aaron Lewis 2020-08-18  5195     }
> 107c87325cf461 Aaron Lewis 2020-08-18  5196
> 107c87325cf461 Aaron Lewis 2020-08-18  5197     bitmap_size = BITS_TO_LONGS(kernel_msr_allowlist.nmsrs) * sizeof(long);
>                                                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> On 32 bit systems the BITS_TO_LONGS() can integer overflow if
> kernel_msr_allowlist.nmsrs is larger than ULONG_MAX - bits_per_long.  In
> that case bitmap_size is zero.

Nice catch! It should be enough to ...

> 
> 107c87325cf461 Aaron Lewis 2020-08-18  5198     if (bitmap_size > KVM_MSR_ALLOWLIST_MAX_LEN) {

... add a check for !bitmap_size here as well then, right?

> 107c87325cf461 Aaron Lewis 2020-08-18  5199             r = -EINVAL;
> 107c87325cf461 Aaron Lewis 2020-08-18  5200             goto out;
> 107c87325cf461 Aaron Lewis 2020-08-18  5201     }
> 107c87325cf461 Aaron Lewis 2020-08-18  5202
> 107c87325cf461 Aaron Lewis 2020-08-18  5203     bitmap = memdup_user(user_msr_allowlist->bitmap, bitmap_size);
> 107c87325cf461 Aaron Lewis 2020-08-18  5204     if (IS_ERR(bitmap)) {
> 107c87325cf461 Aaron Lewis 2020-08-18  5205             r = PTR_ERR(bitmap);
> 107c87325cf461 Aaron Lewis 2020-08-18  5206             goto out;
>                                                          ^^^^^^^^
> "out" is always a vague label name.  It's better style to return
> directly instead of doing a complicated no-op.
> 
>          if (IS_ERR(bitmap))
>                  return PTR_ERR(bitmap);

I agree 100% :). In fact, I agree so much that I already did change it 
for v6 last week, just did not send it out yet.

> 
> 107c87325cf461 Aaron Lewis 2020-08-18  5207     }
> 107c87325cf461 Aaron Lewis 2020-08-18  5208
> 107c87325cf461 Aaron Lewis 2020-08-18  5209     range = (struct msr_bitmap_range) {
> 107c87325cf461 Aaron Lewis 2020-08-18  5210             .flags = kernel_msr_allowlist.flags,
> 107c87325cf461 Aaron Lewis 2020-08-18  5211             .base = kernel_msr_allowlist.base,
> 107c87325cf461 Aaron Lewis 2020-08-18  5212             .nmsrs = kernel_msr_allowlist.nmsrs,
> 107c87325cf461 Aaron Lewis 2020-08-18  5213             .bitmap = bitmap,
> 
> In the overflow case bitmap_size is zero, so memdup_user() returns
> ZERO_SIZE_PTR ((void *)16), "bitmap" points at that, and .nmsrs is a
> very high number.

The overflow case should disappear with the additional check above, right?


Alex



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




^ permalink raw reply	[flat|nested] 50+ messages in thread
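
In code, the check Graf proposes folds into the existing size
validation roughly like this (a sketch, not the v6 patch itself):

	bitmap_size = BITS_TO_LONGS(kernel_msr_allowlist.nmsrs) * sizeof(long);
	/* !bitmap_size rejects both nmsrs == 0 and the 32-bit wraparound */
	if (!bitmap_size || bitmap_size > KVM_MSR_ALLOWLIST_MAX_LEN)
		return -EINVAL;

With bitmap_size guaranteed non-zero, memdup_user() can no longer hand
back ZERO_SIZE_PTR, so the bogus range described in Dan's report cannot
be constructed.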

* Re: [PATCH v3 02/12] KVM: x86: Introduce allow list for MSR emulation
  2020-09-01 19:13     ` Alexander Graf
@ 2020-09-02  7:31         ` Dan Carpenter
  0 siblings, 0 replies; 50+ messages in thread
From: Dan Carpenter @ 2020-09-02  7:31 UTC (permalink / raw)
  To: Alexander Graf
  Cc: kbuild, Aaron Lewis, jmattson, lkp, kbuild-all, pshier, oupton,
	kvm, KarimAllah Ahmed

On Tue, Sep 01, 2020 at 09:13:10PM +0200, Alexander Graf wrote:
> 
> 
> On 31.08.20 12:39, Dan Carpenter wrote:
> > 
> > Hi Aaron,
> > 
> > url:    https://github.com/0day-ci/linux/commits/Aaron-Lewis/Allow-userspace-to-manage-MSRs/20200819-051903
> > base:   https://git.kernel.org/pub/scm/virt/kvm/kvm.git  linux-next
> > config: x86_64-randconfig-m001-20200827 (attached as .config)
> > compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
> > 
> > If you fix the issue, kindly add following tag as appropriate
> > Reported-by: kernel test robot <lkp@intel.com>
> > Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> 
> Thanks a bunch for looking at this! I'd squash in the change with the actual
> patch as it's tiny, so I'm not sure how attribution would work in that case.

Yep.  No problem.  This is just a template that gets sent to everyone.

> 
> > 
> > smatch warnings:
> > arch/x86/kvm/x86.c:5248 kvm_vm_ioctl_add_msr_allowlist() error: 'bitmap' dereferencing possible ERR_PTR()
> > 
> > # https://github.com/0day-ci/linux/commit/107c87325cf461b7b1bd07bb6ddbaf808a8d8a2a
> > git remote add linux-review https://github.com/0day-ci/linux git fetch
> > --no-tags linux-review
> > Aaron-Lewis/Allow-userspace-to-manage-MSRs/20200819-051903
> > git checkout 107c87325cf461b7b1bd07bb6ddbaf808a8d8a2a
> > vim +/bitmap +5248 arch/x86/kvm/x86.c
> > 
> > 107c87325cf461 Aaron Lewis 2020-08-18  5181  static int kvm_vm_ioctl_add_msr_allowlist(struct kvm *kvm, void __user *argp)
> > 107c87325cf461 Aaron Lewis 2020-08-18  5182  {
> > 107c87325cf461 Aaron Lewis 2020-08-18  5183     struct msr_bitmap_range *ranges = kvm->arch.msr_allowlist_ranges;
> > 107c87325cf461 Aaron Lewis 2020-08-18  5184     struct kvm_msr_allowlist __user *user_msr_allowlist = argp;
> > 107c87325cf461 Aaron Lewis 2020-08-18  5185     struct msr_bitmap_range range;
> > 107c87325cf461 Aaron Lewis 2020-08-18  5186     struct kvm_msr_allowlist kernel_msr_allowlist;
> > 107c87325cf461 Aaron Lewis 2020-08-18  5187     unsigned long *bitmap = NULL;
> > 107c87325cf461 Aaron Lewis 2020-08-18  5188     size_t bitmap_size;
> > 107c87325cf461 Aaron Lewis 2020-08-18  5189     int r = 0;
> > 107c87325cf461 Aaron Lewis 2020-08-18  5190
> > 107c87325cf461 Aaron Lewis 2020-08-18  5191     if (copy_from_user(&kernel_msr_allowlist, user_msr_allowlist,
> > 107c87325cf461 Aaron Lewis 2020-08-18  5192                        sizeof(kernel_msr_allowlist))) {
> > 107c87325cf461 Aaron Lewis 2020-08-18  5193             r = -EFAULT;
> > 107c87325cf461 Aaron Lewis 2020-08-18  5194             goto out;
> > 107c87325cf461 Aaron Lewis 2020-08-18  5195     }
> > 107c87325cf461 Aaron Lewis 2020-08-18  5196
> > 107c87325cf461 Aaron Lewis 2020-08-18  5197     bitmap_size = BITS_TO_LONGS(kernel_msr_allowlist.nmsrs) * sizeof(long);
> >                                                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > On 32 bit systems the BITS_TO_LONGS() can integer overflow if
> > kernel_msr_allowlist.nmsrs is larger than ULONG_MAX - bits_per_long.  In
> > that case bitmap_size is zero.
> 
> Nice catch! It should be enough to ...
> 
> > 
> > 107c87325cf461 Aaron Lewis 2020-08-18  5198     if (bitmap_size > KVM_MSR_ALLOWLIST_MAX_LEN) {
> 
> ... add a check for !bitmap_size here as well then, right?

Yup.

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 50+ messages in thread
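
Putting the two agreed changes together, the prologue of the function
could look like the sketch below: sizes are validated before anything
is allocated, so every error can be returned directly and the cleanup
path only ever frees a real allocation (names follow the patch; this is
not the code as merged):

static int kvm_vm_ioctl_add_msr_allowlist(struct kvm *kvm, void __user *argp)
{
	struct kvm_msr_allowlist __user *user_msr_allowlist = argp;
	struct kvm_msr_allowlist kernel_msr_allowlist;
	unsigned long *bitmap;
	size_t bitmap_size;

	if (copy_from_user(&kernel_msr_allowlist, user_msr_allowlist,
			   sizeof(kernel_msr_allowlist)))
		return -EFAULT;

	bitmap_size = BITS_TO_LONGS(kernel_msr_allowlist.nmsrs) * sizeof(long);
	if (!bitmap_size || bitmap_size > KVM_MSR_ALLOWLIST_MAX_LEN)
		return -EINVAL;

	bitmap = memdup_user(user_msr_allowlist->bitmap, bitmap_size);
	if (IS_ERR(bitmap))
		return PTR_ERR(bitmap);	/* no kfree() on an ERR_PTR */

	/* ... range construction and the locked section as in the patch;
	 * the remaining error paths kfree() a real allocation only ... */
}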

* Re: [PATCH v3 02/12] KVM: x86: Introduce allow list for MSR emulation
@ 2020-08-27 14:06 kernel test robot
  0 siblings, 0 replies; 50+ messages in thread
From: kernel test robot @ 2020-08-27 14:06 UTC (permalink / raw)
  To: kbuild

[-- Attachment #1: Type: text/plain, Size: 7021 bytes --]

CC: kbuild-all@lists.01.org
In-Reply-To: <20200818211533.849501-3-aaronlewis@google.com>
References: <20200818211533.849501-3-aaronlewis@google.com>
TO: Aaron Lewis <aaronlewis@google.com>
TO: jmattson@google.com
TO: graf@amazon.com
CC: pshier@google.com
CC: oupton@google.com
CC: kvm@vger.kernel.org
CC: Aaron Lewis <aaronlewis@google.com>
CC: KarimAllah Ahmed <karahmed@amazon.de>

Hi Aaron,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on kvm/linux-next]
[also build test WARNING on v5.9-rc2 next-20200827]
[cannot apply to kvms390/next vhost/linux-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Aaron-Lewis/Allow-userspace-to-manage-MSRs/20200819-051903
base:   https://git.kernel.org/pub/scm/virt/kvm/kvm.git linux-next
:::::: branch date: 9 days ago
:::::: commit date: 9 days ago
config: x86_64-randconfig-m001-20200827 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

smatch warnings:
arch/x86/kvm/x86.c:5248 kvm_vm_ioctl_add_msr_allowlist() error: 'bitmap' dereferencing possible ERR_PTR()

# https://github.com/0day-ci/linux/commit/107c87325cf461b7b1bd07bb6ddbaf808a8d8a2a
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Aaron-Lewis/Allow-userspace-to-manage-MSRs/20200819-051903
git checkout 107c87325cf461b7b1bd07bb6ddbaf808a8d8a2a
vim +/bitmap +5248 arch/x86/kvm/x86.c

107c87325cf461 Aaron Lewis 2020-08-18  5180  
107c87325cf461 Aaron Lewis 2020-08-18  5181  static int kvm_vm_ioctl_add_msr_allowlist(struct kvm *kvm, void __user *argp)
107c87325cf461 Aaron Lewis 2020-08-18  5182  {
107c87325cf461 Aaron Lewis 2020-08-18  5183  	struct msr_bitmap_range *ranges = kvm->arch.msr_allowlist_ranges;
107c87325cf461 Aaron Lewis 2020-08-18  5184  	struct kvm_msr_allowlist __user *user_msr_allowlist = argp;
107c87325cf461 Aaron Lewis 2020-08-18  5185  	struct msr_bitmap_range range;
107c87325cf461 Aaron Lewis 2020-08-18  5186  	struct kvm_msr_allowlist kernel_msr_allowlist;
107c87325cf461 Aaron Lewis 2020-08-18  5187  	unsigned long *bitmap = NULL;
107c87325cf461 Aaron Lewis 2020-08-18  5188  	size_t bitmap_size;
107c87325cf461 Aaron Lewis 2020-08-18  5189  	int r = 0;
107c87325cf461 Aaron Lewis 2020-08-18  5190  
107c87325cf461 Aaron Lewis 2020-08-18  5191  	if (copy_from_user(&kernel_msr_allowlist, user_msr_allowlist,
107c87325cf461 Aaron Lewis 2020-08-18  5192  			   sizeof(kernel_msr_allowlist))) {
107c87325cf461 Aaron Lewis 2020-08-18  5193  		r = -EFAULT;
107c87325cf461 Aaron Lewis 2020-08-18  5194  		goto out;
107c87325cf461 Aaron Lewis 2020-08-18  5195  	}
107c87325cf461 Aaron Lewis 2020-08-18  5196  
107c87325cf461 Aaron Lewis 2020-08-18  5197  	bitmap_size = BITS_TO_LONGS(kernel_msr_allowlist.nmsrs) * sizeof(long);
107c87325cf461 Aaron Lewis 2020-08-18  5198  	if (bitmap_size > KVM_MSR_ALLOWLIST_MAX_LEN) {
107c87325cf461 Aaron Lewis 2020-08-18  5199  		r = -EINVAL;
107c87325cf461 Aaron Lewis 2020-08-18  5200  		goto out;
107c87325cf461 Aaron Lewis 2020-08-18  5201  	}
107c87325cf461 Aaron Lewis 2020-08-18  5202  
107c87325cf461 Aaron Lewis 2020-08-18  5203  	bitmap = memdup_user(user_msr_allowlist->bitmap, bitmap_size);
107c87325cf461 Aaron Lewis 2020-08-18  5204  	if (IS_ERR(bitmap)) {
107c87325cf461 Aaron Lewis 2020-08-18  5205  		r = PTR_ERR(bitmap);
107c87325cf461 Aaron Lewis 2020-08-18  5206  		goto out;
107c87325cf461 Aaron Lewis 2020-08-18  5207  	}
107c87325cf461 Aaron Lewis 2020-08-18  5208  
107c87325cf461 Aaron Lewis 2020-08-18  5209  	range = (struct msr_bitmap_range) {
107c87325cf461 Aaron Lewis 2020-08-18  5210  		.flags = kernel_msr_allowlist.flags,
107c87325cf461 Aaron Lewis 2020-08-18  5211  		.base = kernel_msr_allowlist.base,
107c87325cf461 Aaron Lewis 2020-08-18  5212  		.nmsrs = kernel_msr_allowlist.nmsrs,
107c87325cf461 Aaron Lewis 2020-08-18  5213  		.bitmap = bitmap,
107c87325cf461 Aaron Lewis 2020-08-18  5214  	};
107c87325cf461 Aaron Lewis 2020-08-18  5215  
107c87325cf461 Aaron Lewis 2020-08-18  5216  	if (range.flags & ~(KVM_MSR_ALLOW_READ | KVM_MSR_ALLOW_WRITE)) {
107c87325cf461 Aaron Lewis 2020-08-18  5217  		r = -EINVAL;
107c87325cf461 Aaron Lewis 2020-08-18  5218  		goto out;
107c87325cf461 Aaron Lewis 2020-08-18  5219  	}
107c87325cf461 Aaron Lewis 2020-08-18  5220  
107c87325cf461 Aaron Lewis 2020-08-18  5221  	/*
107c87325cf461 Aaron Lewis 2020-08-18  5222  	 * Protect from concurrent calls to this function that could trigger
107c87325cf461 Aaron Lewis 2020-08-18  5223  	 * a TOCTOU violation on kvm->arch.msr_allowlist_ranges_count.
107c87325cf461 Aaron Lewis 2020-08-18  5224  	 */
107c87325cf461 Aaron Lewis 2020-08-18  5225  	mutex_lock(&kvm->lock);
107c87325cf461 Aaron Lewis 2020-08-18  5226  
107c87325cf461 Aaron Lewis 2020-08-18  5227  	if (kvm->arch.msr_allowlist_ranges_count >=
107c87325cf461 Aaron Lewis 2020-08-18  5228  	    ARRAY_SIZE(kvm->arch.msr_allowlist_ranges)) {
107c87325cf461 Aaron Lewis 2020-08-18  5229  		r = -E2BIG;
107c87325cf461 Aaron Lewis 2020-08-18  5230  		goto out_locked;
107c87325cf461 Aaron Lewis 2020-08-18  5231  	}
107c87325cf461 Aaron Lewis 2020-08-18  5232  
107c87325cf461 Aaron Lewis 2020-08-18  5233  	if (msr_range_overlaps(kvm, &range)) {
107c87325cf461 Aaron Lewis 2020-08-18  5234  		r = -EINVAL;
107c87325cf461 Aaron Lewis 2020-08-18  5235  		goto out_locked;
107c87325cf461 Aaron Lewis 2020-08-18  5236  	}
107c87325cf461 Aaron Lewis 2020-08-18  5237  
107c87325cf461 Aaron Lewis 2020-08-18  5238  	/* Everything ok, add this range identifier to our global pool */
107c87325cf461 Aaron Lewis 2020-08-18  5239  	ranges[kvm->arch.msr_allowlist_ranges_count] = range;
107c87325cf461 Aaron Lewis 2020-08-18  5240  	/* Make sure we filled the array before we tell anyone to walk it */
107c87325cf461 Aaron Lewis 2020-08-18  5241  	smp_wmb();
107c87325cf461 Aaron Lewis 2020-08-18  5242  	kvm->arch.msr_allowlist_ranges_count++;
107c87325cf461 Aaron Lewis 2020-08-18  5243  
107c87325cf461 Aaron Lewis 2020-08-18  5244  out_locked:
107c87325cf461 Aaron Lewis 2020-08-18  5245  	mutex_unlock(&kvm->lock);
107c87325cf461 Aaron Lewis 2020-08-18  5246  out:
107c87325cf461 Aaron Lewis 2020-08-18  5247  	if (r)
107c87325cf461 Aaron Lewis 2020-08-18 @5248  		kfree(bitmap);
107c87325cf461 Aaron Lewis 2020-08-18  5249  
107c87325cf461 Aaron Lewis 2020-08-18  5250  	return r;
107c87325cf461 Aaron Lewis 2020-08-18  5251  }
107c87325cf461 Aaron Lewis 2020-08-18  5252  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 43464 bytes --]

^ permalink raw reply	[flat|nested] 50+ messages in thread
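
One ordering detail in the listing above deserves a note: the
smp_wmb() publishes the new range before the count is bumped, which
implies a lockless reader ordered the other way around.  A sketch of
what such a reader could look like (the function name and the matching
logic are illustrative assumptions, not code from this series):

static bool msr_allowlist_contains(struct kvm *kvm, u32 msr)
{
	u32 count = kvm->arch.msr_allowlist_ranges_count;
	u32 i;

	/* Pairs with the smp_wmb() in kvm_vm_ioctl_add_msr_allowlist():
	 * load the count first, then the already-published entries. */
	smp_rmb();

	for (i = 0; i < count; i++) {
		struct msr_bitmap_range *range =
			&kvm->arch.msr_allowlist_ranges[i];

		/* KVM_MSR_ALLOW_{READ,WRITE} flag handling elided */
		if (msr >= range->base && msr - range->base < range->nmsrs)
			return test_bit(msr - range->base, range->bitmap);
	}

	return false;
}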

end of thread, other threads:[~2020-09-02  7:32 UTC | newest]

Thread overview: 50+ messages
2020-08-18 21:15 [PATCH v3 00/12] Allow userspace to manage MSRs Aaron Lewis
2020-08-18 21:15 ` [PATCH v3 01/12] KVM: x86: Deflect unknown MSR accesses to user space Aaron Lewis
2020-08-19  8:42   ` Alexander Graf
2020-08-18 21:15 ` [PATCH v3 02/12] KVM: x86: Introduce allow list for MSR emulation Aaron Lewis
2020-08-19  8:53   ` Alexander Graf
2020-08-31 10:39   ` Dan Carpenter
2020-09-01 19:13     ` Alexander Graf
2020-09-02  7:31       ` Dan Carpenter
2020-08-18 21:15 ` [PATCH v3 03/12] KVM: selftests: Add test for user space MSR handling Aaron Lewis
2020-08-18 21:15 ` [PATCH v3 04/12] KVM: x86: Add ioctl for accepting a userspace provided MSR list Aaron Lewis
2020-08-19  9:00   ` Alexander Graf
2020-08-20 17:30     ` Jim Mattson
2020-08-20 21:49       ` Alexander Graf
2020-08-20 22:28         ` Jim Mattson
2020-08-18 21:15 ` [PATCH v3 05/12] KVM: x86: Add support for exiting to userspace on rdmsr or wrmsr Aaron Lewis
2020-08-19 10:25   ` Alexander Graf
2020-08-20 18:17   ` Jim Mattson
2020-08-20 21:59     ` Alexander Graf
2020-08-20 22:55       ` Jim Mattson
2020-08-21 17:58         ` Jim Mattson
2020-08-24  1:35           ` Alexander Graf
2020-08-24 17:23             ` Jim Mattson
2020-08-24 18:09               ` Alexander Graf
2020-08-24 18:34                 ` Jim Mattson
2020-08-18 21:15 ` [PATCH v3 06/12] KVM: x86: Prepare MSR bitmaps for userspace tracked MSRs Aaron Lewis
2020-08-18 21:15 ` [PATCH v3 07/12] KVM: x86: Ensure the MSR bitmap never clears " Aaron Lewis
2020-08-19  1:12   ` kernel test robot
2020-08-19  1:12     ` kernel test robot
2020-08-19  1:12   ` [RFC PATCH] KVM: x86: vmx_set_user_msr_intercept() can be static kernel test robot
2020-08-19  1:12     ` kernel test robot
2020-08-19 15:26   ` [PATCH v3 07/12] KVM: x86: Ensure the MSR bitmap never clears userspace tracked MSRs Alexander Graf
2020-08-20  0:18     ` Aaron Lewis
2020-08-20 22:04       ` Alexander Graf
2020-08-20 22:35         ` Jim Mattson
2020-08-21 14:27           ` Aaron Lewis
2020-08-21 16:07             ` Alexander Graf
2020-08-21 16:43               ` Aaron Lewis
2020-08-26 15:48   ` kernel test robot
2020-08-26 15:48     ` kernel test robot
2020-08-18 21:15 ` [PATCH v3 08/12] selftests: kvm: Fix the segment descriptor layout to match the actual layout Aaron Lewis
2020-08-18 21:15 ` [PATCH v3 09/12] selftests: kvm: Clear uc so UCALL_NONE is being properly reported Aaron Lewis
2020-08-19  9:13   ` Andrew Jones
2020-08-18 21:15 ` [PATCH v3 10/12] selftests: kvm: Add exception handling to selftests Aaron Lewis
2020-08-18 21:15 ` [PATCH v3 11/12] selftests: kvm: Add a test to exercise the userspace MSR list Aaron Lewis
2020-08-18 21:15 ` [PATCH v3 12/12] selftests: kvm: Add emulated rdmsr, wrmsr tests Aaron Lewis
2020-08-27 14:06 [PATCH v3 02/12] KVM: x86: Introduce allow list for MSR emulation kernel test robot
