All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/4] Restrict PV features to only enabled guests
@ 2020-08-06 15:14 Oliver Upton
  2020-08-06 15:14 ` [PATCH v3 1/4] kvm: x86: encapsulate wrmsr(MSR_KVM_SYSTEM_TIME) emulation in helper fn Oliver Upton
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Oliver Upton @ 2020-08-06 15:14 UTC (permalink / raw)
  To: kvm; +Cc: Paolo Bonzini, Sean Christopherson, Oliver Upton

To date, KVM has allowed guests to use paravirtual interfaces regardless
of the configured CPUID. While almost any guest will consult the
KVM_CPUID_FEATURES leaf _before_ using PV features, it is still
undesirable to have such interfaces silently present.

This series aims to address the issue by adding explicit checks against
the guest's CPUID when servicing any paravirtual feature. Since this
effectively changes the guest/hypervisor ABI, a KVM_CAP is warranted to
guard the new behavior.

Patches 1-2 refactor some of the PV code in anticipation of the change.
Patch 3 introduces the checks + KVM_CAP. Finally, patch 4 fixes some doc
typos that were noticed when working on this series.

v1 => v2:
 - Strip Change-Id footers (checkpatch is your friend!)

v2 => v3:
 - Mark kvm_write_system_time() as static

Parent commit: f3633c268354 ("Merge tag 'kvm-s390-next-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-next-5.6")

Oliver Upton (4):
  kvm: x86: encapsulate wrmsr(MSR_KVM_SYSTEM_TIME) emulation in helper
    fn
  kvm: x86: set wall_clock in kvm_write_wall_clock()
  kvm: x86: only provide PV features if enabled in guest's CPUID
  Documentation: kvm: fix some typos in cpuid.rst

 Documentation/virt/kvm/api.rst   |  11 +++
 Documentation/virt/kvm/cpuid.rst |  88 +++++++++++-----------
 arch/x86/include/asm/kvm_host.h  |   6 ++
 arch/x86/kvm/cpuid.h             |  16 ++++
 arch/x86/kvm/x86.c               | 122 +++++++++++++++++++++++--------
 include/uapi/linux/kvm.h         |   1 +
 6 files changed, 171 insertions(+), 73 deletions(-)

-- 
2.28.0.236.gb10cc79966-goog


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v3 1/4] kvm: x86: encapsulate wrmsr(MSR_KVM_SYSTEM_TIME) emulation in helper fn
  2020-08-06 15:14 [PATCH v3 0/4] Restrict PV features to only enabled guests Oliver Upton
@ 2020-08-06 15:14 ` Oliver Upton
  2020-08-14  0:18   ` Wanpeng Li
  2020-08-06 15:14 ` [PATCH v3 2/4] kvm: x86: set wall_clock in kvm_write_wall_clock() Oliver Upton
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 9+ messages in thread
From: Oliver Upton @ 2020-08-06 15:14 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, Sean Christopherson, Oliver Upton, Jim Mattson,
	Peter Shier

No functional change intended.

Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Signed-off-by: Oliver Upton <oupton@google.com>
---
 arch/x86/kvm/x86.c | 58 +++++++++++++++++++++++++---------------------
 1 file changed, 32 insertions(+), 26 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index dc4370394ab8..5ba713108686 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1822,6 +1822,34 @@ static void kvm_write_wall_clock(struct kvm *kvm, gpa_t wall_clock)
 	kvm_write_guest(kvm, wall_clock, &version, sizeof(version));
 }
 
+static void kvm_write_system_time(struct kvm_vcpu *vcpu, gpa_t system_time,
+				  bool old_msr, bool host_initiated)
+{
+	struct kvm_arch *ka = &vcpu->kvm->arch;
+
+	if (vcpu->vcpu_id == 0 && !host_initiated) {
+		if (ka->boot_vcpu_runs_old_kvmclock && old_msr)
+			kvm_make_request(KVM_REQ_MASTERCLOCK_UPDATE, vcpu);
+
+		ka->boot_vcpu_runs_old_kvmclock = old_msr;
+	}
+
+	vcpu->arch.time = system_time;
+	kvm_make_request(KVM_REQ_GLOBAL_CLOCK_UPDATE, vcpu);
+
+	/* we verify if the enable bit is set... */
+	vcpu->arch.pv_time_enabled = false;
+	if (!(system_time & 1))
+		return;
+
+	if (!kvm_gfn_to_hva_cache_init(vcpu->kvm,
+				       &vcpu->arch.pv_time, system_time & ~1ULL,
+				       sizeof(struct pvclock_vcpu_time_info)))
+		vcpu->arch.pv_time_enabled = true;
+
+	return;
+}
+
 static uint32_t div_frac(uint32_t dividend, uint32_t divisor)
 {
 	do_shl32_div32(dividend, divisor);
@@ -2973,33 +3001,11 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		kvm_write_wall_clock(vcpu->kvm, data);
 		break;
 	case MSR_KVM_SYSTEM_TIME_NEW:
-	case MSR_KVM_SYSTEM_TIME: {
-		struct kvm_arch *ka = &vcpu->kvm->arch;
-
-		if (vcpu->vcpu_id == 0 && !msr_info->host_initiated) {
-			bool tmp = (msr == MSR_KVM_SYSTEM_TIME);
-
-			if (ka->boot_vcpu_runs_old_kvmclock != tmp)
-				kvm_make_request(KVM_REQ_MASTERCLOCK_UPDATE, vcpu);
-
-			ka->boot_vcpu_runs_old_kvmclock = tmp;
-		}
-
-		vcpu->arch.time = data;
-		kvm_make_request(KVM_REQ_GLOBAL_CLOCK_UPDATE, vcpu);
-
-		/* we verify if the enable bit is set... */
-		vcpu->arch.pv_time_enabled = false;
-		if (!(data & 1))
-			break;
-
-		if (!kvm_gfn_to_hva_cache_init(vcpu->kvm,
-		     &vcpu->arch.pv_time, data & ~1ULL,
-		     sizeof(struct pvclock_vcpu_time_info)))
-			vcpu->arch.pv_time_enabled = true;
-
+		kvm_write_system_time(vcpu, data, false, msr_info->host_initiated);
+		break;
+	case MSR_KVM_SYSTEM_TIME:
+		kvm_write_system_time(vcpu, data, true, msr_info->host_initiated);
 		break;
-	}
 	case MSR_KVM_ASYNC_PF_EN:
 		if (kvm_pv_enable_async_pf(vcpu, data))
 			return 1;
-- 
2.28.0.236.gb10cc79966-goog


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v3 2/4] kvm: x86: set wall_clock in kvm_write_wall_clock()
  2020-08-06 15:14 [PATCH v3 0/4] Restrict PV features to only enabled guests Oliver Upton
  2020-08-06 15:14 ` [PATCH v3 1/4] kvm: x86: encapsulate wrmsr(MSR_KVM_SYSTEM_TIME) emulation in helper fn Oliver Upton
@ 2020-08-06 15:14 ` Oliver Upton
  2020-08-06 15:14 ` [PATCH v3 3/4] kvm: x86: only provide PV features if enabled in guest's CPUID Oliver Upton
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Oliver Upton @ 2020-08-06 15:14 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, Sean Christopherson, Oliver Upton, Jim Mattson,
	Peter Shier

Small change to avoid meaningless duplication in the subsequent patch.
No functional change intended.

Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Signed-off-by: Oliver Upton <oupton@google.com>
---
 arch/x86/kvm/x86.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5ba713108686..683ce68d96b2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1790,6 +1790,8 @@ static void kvm_write_wall_clock(struct kvm *kvm, gpa_t wall_clock)
 	struct pvclock_wall_clock wc;
 	u64 wall_nsec;
 
+	kvm->arch.wall_clock = wall_clock;
+
 	if (!wall_clock)
 		return;
 
@@ -2997,7 +2999,6 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		break;
 	case MSR_KVM_WALL_CLOCK_NEW:
 	case MSR_KVM_WALL_CLOCK:
-		vcpu->kvm->arch.wall_clock = data;
 		kvm_write_wall_clock(vcpu->kvm, data);
 		break;
 	case MSR_KVM_SYSTEM_TIME_NEW:
-- 
2.28.0.236.gb10cc79966-goog


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v3 3/4] kvm: x86: only provide PV features if enabled in guest's CPUID
  2020-08-06 15:14 [PATCH v3 0/4] Restrict PV features to only enabled guests Oliver Upton
  2020-08-06 15:14 ` [PATCH v3 1/4] kvm: x86: encapsulate wrmsr(MSR_KVM_SYSTEM_TIME) emulation in helper fn Oliver Upton
  2020-08-06 15:14 ` [PATCH v3 2/4] kvm: x86: set wall_clock in kvm_write_wall_clock() Oliver Upton
@ 2020-08-06 15:14 ` Oliver Upton
  2020-08-14  0:58   ` Wanpeng Li
  2020-08-06 15:14 ` [PATCH v3 4/4] Documentation: kvm: fix some typos in cpuid.rst Oliver Upton
  2020-08-13 16:41 ` [PATCH v3 0/4] Restrict PV features to only enabled guests Oliver Upton
  4 siblings, 1 reply; 9+ messages in thread
From: Oliver Upton @ 2020-08-06 15:14 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, Sean Christopherson, Oliver Upton, Jim Mattson,
	Peter Shier

KVM unconditionally provides PV features to the guest, regardless of the
configured CPUID. An unwitting guest that doesn't check
KVM_CPUID_FEATURES before use could access paravirt features that
userspace did not intend to provide. Fix this by checking the guest's
CPUID before performing any paravirtual operations.

Introduce a capability, KVM_CAP_ENFORCE_PV_FEATURE_CPUID, to gate the
aforementioned enforcement. Migrating a VM from a host w/o this patch to
a host with this patch could silently change the ABI exposed to the
guest, warranting that we default to the old behavior and opt-in for
the new one.

Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Signed-off-by: Oliver Upton <oupton@google.com>
---
 Documentation/virt/kvm/api.rst  | 11 ++++++
 arch/x86/include/asm/kvm_host.h |  6 +++
 arch/x86/kvm/cpuid.h            | 16 ++++++++
 arch/x86/kvm/x86.c              | 67 ++++++++++++++++++++++++++++++---
 include/uapi/linux/kvm.h        |  1 +
 5 files changed, 96 insertions(+), 5 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 644e5326aa50..e8fc6e34f344 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6155,3 +6155,14 @@ KVM can therefore start protected VMs.
 This capability governs the KVM_S390_PV_COMMAND ioctl and the
 KVM_MP_STATE_LOAD MP_STATE. KVM_SET_MP_STATE can fail for protected
 guests when the state change is invalid.
+
+
+8.24 KVM_CAP_ENFORCE_PV_CPUID
+-----------------------------
+
+Architectures: x86
+
+When enabled, KVM will disable paravirtual features provided to the
+guest according to the bits in the KVM_CPUID_FEATURES CPUID leaf
+(0x40000001). Otherwise, a guest may use the paravirtual features
+regardless of what has actually been exposed through the CPUID leaf.
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 5ab3af7275d8..a641c3840a1e 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -788,6 +788,12 @@ struct kvm_vcpu_arch {
 
 	/* AMD MSRC001_0015 Hardware Configuration */
 	u64 msr_hwcr;
+
+	/*
+	 * Indicates whether PV emulation should be disabled if not present in
+	 * the guest's cpuid.
+	 */
+	bool enforce_pv_feature_cpuid;
 };
 
 struct kvm_lpage_info {
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index 3a923ae15f2f..c364c2877583 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -5,6 +5,7 @@
 #include "x86.h"
 #include <asm/cpu.h>
 #include <asm/processor.h>
+#include <uapi/asm/kvm_para.h>
 
 extern u32 kvm_cpu_caps[NCAPINTS] __read_mostly;
 void kvm_set_cpu_caps(void);
@@ -308,4 +309,19 @@ static inline bool page_address_valid(struct kvm_vcpu *vcpu, gpa_t gpa)
 	return PAGE_ALIGNED(gpa) && !(gpa >> cpuid_maxphyaddr(vcpu));
 }
 
+static __always_inline bool guest_pv_has(struct kvm_vcpu *vcpu,
+					   unsigned int kvm_feature)
+{
+	struct kvm_cpuid_entry2 *cpuid;
+
+	if (!vcpu->arch.enforce_pv_feature_cpuid)
+		return true;
+
+	cpuid = kvm_find_cpuid_entry(vcpu, KVM_CPUID_FEATURES, 0);
+	if (!cpuid)
+		return false;
+
+	return cpuid->eax & (1u << kvm_feature);
+}
+
 #endif
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 683ce68d96b2..9900a846dfc0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2763,6 +2763,14 @@ static int kvm_pv_enable_async_pf(struct kvm_vcpu *vcpu, u64 data)
 	if (data & 0x30)
 		return 1;
 
+	if (!guest_pv_has(vcpu, KVM_FEATURE_ASYNC_PF_VMEXIT) &&
+	    (data & KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT))
+		return 1;
+
+	if (!guest_pv_has(vcpu, KVM_FEATURE_ASYNC_PF_INT) &&
+	    (data & KVM_ASYNC_PF_DELIVERY_AS_INT))
+		return 1;
+
 	if (!lapic_in_kernel(vcpu))
 		return 1;
 
@@ -2840,10 +2848,12 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
 	 * Doing a TLB flush here, on the guest's behalf, can avoid
 	 * expensive IPIs.
 	 */
-	trace_kvm_pv_tlb_flush(vcpu->vcpu_id,
-		st->preempted & KVM_VCPU_FLUSH_TLB);
-	if (xchg(&st->preempted, 0) & KVM_VCPU_FLUSH_TLB)
-		kvm_vcpu_flush_tlb_guest(vcpu);
+	if (guest_pv_has(vcpu, KVM_FEATURE_PV_TLB_FLUSH)) {
+		trace_kvm_pv_tlb_flush(vcpu->vcpu_id,
+				       st->preempted & KVM_VCPU_FLUSH_TLB);
+		if (xchg(&st->preempted, 0) & KVM_VCPU_FLUSH_TLB)
+			kvm_vcpu_flush_tlb_guest(vcpu);
+	}
 
 	vcpu->arch.st.preempted = 0;
 
@@ -2998,30 +3008,54 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		vcpu->arch.smi_count = data;
 		break;
 	case MSR_KVM_WALL_CLOCK_NEW:
+		if (!guest_pv_has(vcpu, KVM_FEATURE_CLOCKSOURCE2))
+			return 1;
+
+		kvm_write_wall_clock(vcpu->kvm, data);
+		break;
 	case MSR_KVM_WALL_CLOCK:
+		if (!guest_pv_has(vcpu, KVM_FEATURE_CLOCKSOURCE))
+			return 1;
+
 		kvm_write_wall_clock(vcpu->kvm, data);
 		break;
 	case MSR_KVM_SYSTEM_TIME_NEW:
+		if (!guest_pv_has(vcpu, KVM_FEATURE_CLOCKSOURCE2))
+			return 1;
+
 		kvm_write_system_time(vcpu, data, false, msr_info->host_initiated);
 		break;
 	case MSR_KVM_SYSTEM_TIME:
-		kvm_write_system_time(vcpu, data, true, msr_info->host_initiated);
+		if (!guest_pv_has(vcpu, KVM_FEATURE_CLOCKSOURCE))
+			return 1;
+
+		kvm_write_system_time(vcpu, data, true,  msr_info->host_initiated);
 		break;
 	case MSR_KVM_ASYNC_PF_EN:
+		if (!guest_pv_has(vcpu, KVM_FEATURE_ASYNC_PF))
+			return 1;
+
 		if (kvm_pv_enable_async_pf(vcpu, data))
 			return 1;
 		break;
 	case MSR_KVM_ASYNC_PF_INT:
+		if (!guest_pv_has(vcpu, KVM_FEATURE_ASYNC_PF_INT))
+			return 1;
+
 		if (kvm_pv_enable_async_pf_int(vcpu, data))
 			return 1;
 		break;
 	case MSR_KVM_ASYNC_PF_ACK:
+		if (!guest_pv_has(vcpu, KVM_FEATURE_ASYNC_PF))
+			return 1;
 		if (data & 0x1) {
 			vcpu->arch.apf.pageready_pending = false;
 			kvm_check_async_pf_completion(vcpu);
 		}
 		break;
 	case MSR_KVM_STEAL_TIME:
+		if (!guest_pv_has(vcpu, KVM_FEATURE_STEAL_TIME))
+			return 1;
 
 		if (unlikely(!sched_info_on()))
 			return 1;
@@ -3038,11 +3072,17 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 
 		break;
 	case MSR_KVM_PV_EOI_EN:
+		if (!guest_pv_has(vcpu, KVM_FEATURE_PV_EOI))
+			return 1;
+
 		if (kvm_lapic_enable_pv_eoi(vcpu, data, sizeof(u8)))
 			return 1;
 		break;
 
 	case MSR_KVM_POLL_CONTROL:
+		if (!guest_pv_has(vcpu, KVM_FEATURE_POLL_CONTROL))
+			return 1;
+
 		/* only enable bit supported */
 		if (data & (-1ULL << 1))
 			return 1;
@@ -3522,6 +3562,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_EXCEPTION_PAYLOAD:
 	case KVM_CAP_SET_GUEST_DEBUG:
 	case KVM_CAP_LAST_CPU:
+	case KVM_CAP_ENFORCE_PV_FEATURE_CPUID:
 		r = 1;
 		break;
 	case KVM_CAP_SYNC_REGS:
@@ -4389,6 +4430,11 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
 
 		return kvm_x86_ops.enable_direct_tlbflush(vcpu);
 
+	case KVM_CAP_ENFORCE_PV_FEATURE_CPUID:
+		vcpu->arch.enforce_pv_feature_cpuid = cap->args[0];
+
+		return 0;
+
 	default:
 		return -EINVAL;
 	}
@@ -7723,11 +7769,16 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
 		goto out;
 	}
 
+	ret = -KVM_ENOSYS;
+
 	switch (nr) {
 	case KVM_HC_VAPIC_POLL_IRQ:
 		ret = 0;
 		break;
 	case KVM_HC_KICK_CPU:
+		if (!guest_pv_has(vcpu, KVM_FEATURE_PV_UNHALT))
+			break;
+
 		kvm_pv_kick_cpu_op(vcpu->kvm, a0, a1);
 		kvm_sched_yield(vcpu->kvm, a1);
 		ret = 0;
@@ -7738,9 +7789,15 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
 		break;
 #endif
 	case KVM_HC_SEND_IPI:
+		if (!guest_pv_has(vcpu, KVM_FEATURE_PV_SEND_IPI))
+			break;
+
 		ret = kvm_pv_send_ipi(vcpu->kvm, a0, a1, a2, a3, op_64_bit);
 		break;
 	case KVM_HC_SCHED_YIELD:
+		if (!guest_pv_has(vcpu, KVM_FEATURE_PV_SCHED_YIELD))
+			break;
+
 		kvm_sched_yield(vcpu->kvm, a0);
 		ret = 0;
 		break;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index f6d86033c4fa..48c2d5c10b1e 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1035,6 +1035,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_LAST_CPU 184
 #define KVM_CAP_SMALLER_MAXPHYADDR 185
 #define KVM_CAP_S390_DIAG318 186
+#define KVM_CAP_ENFORCE_PV_FEATURE_CPUID 187
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.28.0.236.gb10cc79966-goog


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v3 4/4] Documentation: kvm: fix some typos in cpuid.rst
  2020-08-06 15:14 [PATCH v3 0/4] Restrict PV features to only enabled guests Oliver Upton
                   ` (2 preceding siblings ...)
  2020-08-06 15:14 ` [PATCH v3 3/4] kvm: x86: only provide PV features if enabled in guest's CPUID Oliver Upton
@ 2020-08-06 15:14 ` Oliver Upton
  2020-08-13 16:41 ` [PATCH v3 0/4] Restrict PV features to only enabled guests Oliver Upton
  4 siblings, 0 replies; 9+ messages in thread
From: Oliver Upton @ 2020-08-06 15:14 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, Sean Christopherson, Oliver Upton, Jim Mattson,
	Peter Shier

Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Signed-off-by: Oliver Upton <oupton@google.com>
---
 Documentation/virt/kvm/cpuid.rst | 88 ++++++++++++++++----------------
 1 file changed, 44 insertions(+), 44 deletions(-)

diff --git a/Documentation/virt/kvm/cpuid.rst b/Documentation/virt/kvm/cpuid.rst
index a7dff9186bed..f1583e682cc8 100644
--- a/Documentation/virt/kvm/cpuid.rst
+++ b/Documentation/virt/kvm/cpuid.rst
@@ -38,64 +38,64 @@ returns::
 
 where ``flag`` is defined as below:
 
-================================= =========== ================================
-flag                              value       meaning
-================================= =========== ================================
-KVM_FEATURE_CLOCKSOURCE           0           kvmclock available at msrs
-                                              0x11 and 0x12
+================================== =========== ================================
+flag                               value       meaning
+================================== =========== ================================
+KVM_FEATURE_CLOCKSOURCE            0           kvmclock available at msrs
+                                               0x11 and 0x12
 
-KVM_FEATURE_NOP_IO_DELAY          1           not necessary to perform delays
-                                              on PIO operations
+KVM_FEATURE_NOP_IO_DELAY           1           not necessary to perform delays
+                                               on PIO operations
 
-KVM_FEATURE_MMU_OP                2           deprecated
+KVM_FEATURE_MMU_OP                 2           deprecated
 
-KVM_FEATURE_CLOCKSOURCE2          3           kvmclock available at msrs
-                                              0x4b564d00 and 0x4b564d01
+KVM_FEATURE_CLOCKSOURCE2           3           kvmclock available at msrs
+                                               0x4b564d00 and 0x4b564d01
 
-KVM_FEATURE_ASYNC_PF              4           async pf can be enabled by
-                                              writing to msr 0x4b564d02
+KVM_FEATURE_ASYNC_PF               4           async pf can be enabled by
+                                               writing to msr 0x4b564d02
 
-KVM_FEATURE_STEAL_TIME            5           steal time can be enabled by
-                                              writing to msr 0x4b564d03
+KVM_FEATURE_STEAL_TIME             5           steal time can be enabled by
+                                               writing to msr 0x4b564d03
 
-KVM_FEATURE_PV_EOI                6           paravirtualized end of interrupt
-                                              handler can be enabled by
-                                              writing to msr 0x4b564d04
+KVM_FEATURE_PV_EOI                 6           paravirtualized end of interrupt
+                                               handler can be enabled by
+                                               writing to msr 0x4b564d04
 
-KVM_FEATURE_PV_UNHAULT            7           guest checks this feature bit
-                                              before enabling paravirtualized
-                                              spinlock support
+KVM_FEATURE_PV_UNHALT              7           guest checks this feature bit
+                                               before enabling paravirtualized
+                                               spinlock support
 
-KVM_FEATURE_PV_TLB_FLUSH          9           guest checks this feature bit
-                                              before enabling paravirtualized
-                                              tlb flush
+KVM_FEATURE_PV_TLB_FLUSH           9           guest checks this feature bit
+                                               before enabling paravirtualized
+                                               tlb flush
 
-KVM_FEATURE_ASYNC_PF_VMEXIT       10          paravirtualized async PF VM EXIT
-                                              can be enabled by setting bit 2
-                                              when writing to msr 0x4b564d02
+KVM_FEATURE_ASYNC_PF_VMEXIT        10          paravirtualized async PF VM EXIT
+                                               can be enabled by setting bit 2
+                                               when writing to msr 0x4b564d02
 
-KVM_FEATURE_PV_SEND_IPI           11          guest checks this feature bit
-                                              before enabling paravirtualized
-                                              sebd IPIs
+KVM_FEATURE_PV_SEND_IPI            11          guest checks this feature bit
+                                               before enabling paravirtualized
+                                               send IPIs
 
-KVM_FEATURE_PV_POLL_CONTROL       12          host-side polling on HLT can
-                                              be disabled by writing
-                                              to msr 0x4b564d05.
+KVM_FEATURE_PV_POLL_CONTROL        12          host-side polling on HLT can
+                                               be disabled by writing
+                                               to msr 0x4b564d05.
 
-KVM_FEATURE_PV_SCHED_YIELD        13          guest checks this feature bit
-                                              before using paravirtualized
-                                              sched yield.
+KVM_FEATURE_PV_SCHED_YIELD         13          guest checks this feature bit
+                                               before using paravirtualized
+                                               sched yield.
 
-KVM_FEATURE_ASYNC_PF_INT          14          guest checks this feature bit
-                                              before using the second async
-                                              pf control msr 0x4b564d06 and
-                                              async pf acknowledgment msr
-                                              0x4b564d07.
+KVM_FEATURE_ASYNC_PF_INT           14          guest checks this feature bit
+                                               before using the second async
+                                               pf control msr 0x4b564d06 and
+                                               async pf acknowledgment msr
+                                               0x4b564d07.
 
-KVM_FEATURE_CLOCSOURCE_STABLE_BIT 24          host will warn if no guest-side
-                                              per-cpu warps are expeced in
-                                              kvmclock
-================================= =========== ================================
+KVM_FEATURE_CLOCKSOURCE_STABLE_BIT 24          host will warn if no guest-side
+                                               per-cpu warps are expected in
+                                               kvmclock
+================================== =========== ================================
 
 ::
 
-- 
2.28.0.236.gb10cc79966-goog


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v3 0/4] Restrict PV features to only enabled guests
  2020-08-06 15:14 [PATCH v3 0/4] Restrict PV features to only enabled guests Oliver Upton
                   ` (3 preceding siblings ...)
  2020-08-06 15:14 ` [PATCH v3 4/4] Documentation: kvm: fix some typos in cpuid.rst Oliver Upton
@ 2020-08-13 16:41 ` Oliver Upton
  4 siblings, 0 replies; 9+ messages in thread
From: Oliver Upton @ 2020-08-13 16:41 UTC (permalink / raw)
  To: kvm list; +Cc: Paolo Bonzini, Sean Christopherson

On Thu, Aug 6, 2020 at 10:14 AM Oliver Upton <oupton@google.com> wrote:
>
> To date, KVM has allowed guests to use paravirtual interfaces regardless
> of the configured CPUID. While almost any guest will consult the
> KVM_CPUID_FEATURES leaf _before_ using PV features, it is still
> undesirable to have such interfaces silently present.
>
> This series aims to address the issue by adding explicit checks against
> the guest's CPUID when servicing any paravirtual feature. Since this
> effectively changes the guest/hypervisor ABI, a KVM_CAP is warranted to
> guard the new behavior.
>
> Patches 1-2 refactor some of the PV code in anticipation of the change.
> Patch 3 introduces the checks + KVM_CAP. Finally, patch 4 fixes some doc
> typos that were noticed when working on this series.
>
> v1 => v2:
>  - Strip Change-Id footers (checkpatch is your friend!)
>
> v2 => v3:
>  - Mark kvm_write_system_time() as static
>
> Parent commit: f3633c268354 ("Merge tag 'kvm-s390-next-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-next-5.6")
>
> Oliver Upton (4):
>   kvm: x86: encapsulate wrmsr(MSR_KVM_SYSTEM_TIME) emulation in helper
>     fn
>   kvm: x86: set wall_clock in kvm_write_wall_clock()
>   kvm: x86: only provide PV features if enabled in guest's CPUID
>   Documentation: kvm: fix some typos in cpuid.rst
>
>  Documentation/virt/kvm/api.rst   |  11 +++
>  Documentation/virt/kvm/cpuid.rst |  88 +++++++++++-----------
>  arch/x86/include/asm/kvm_host.h  |   6 ++
>  arch/x86/kvm/cpuid.h             |  16 ++++
>  arch/x86/kvm/x86.c               | 122 +++++++++++++++++++++++--------
>  include/uapi/linux/kvm.h         |   1 +
>  6 files changed, 171 insertions(+), 73 deletions(-)
>
> --
> 2.28.0.236.gb10cc79966-goog
>

Ping :)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v3 1/4] kvm: x86: encapsulate wrmsr(MSR_KVM_SYSTEM_TIME) emulation in helper fn
  2020-08-06 15:14 ` [PATCH v3 1/4] kvm: x86: encapsulate wrmsr(MSR_KVM_SYSTEM_TIME) emulation in helper fn Oliver Upton
@ 2020-08-14  0:18   ` Wanpeng Li
  0 siblings, 0 replies; 9+ messages in thread
From: Wanpeng Li @ 2020-08-14  0:18 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, Paolo Bonzini, Sean Christopherson, Jim Mattson, Peter Shier

On Fri, 7 Aug 2020 at 01:56, Oliver Upton <oupton@google.com> wrote:
>
> No functional change intended.
>
> Reviewed-by: Jim Mattson <jmattson@google.com>
> Reviewed-by: Peter Shier <pshier@google.com>
> Signed-off-by: Oliver Upton <oupton@google.com>

Reviewed-by: Wanpeng Li <wanpengli@tencent.com>

> ---
>  arch/x86/kvm/x86.c | 58 +++++++++++++++++++++++++---------------------
>  1 file changed, 32 insertions(+), 26 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index dc4370394ab8..5ba713108686 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1822,6 +1822,34 @@ static void kvm_write_wall_clock(struct kvm *kvm, gpa_t wall_clock)
>         kvm_write_guest(kvm, wall_clock, &version, sizeof(version));
>  }
>
> +static void kvm_write_system_time(struct kvm_vcpu *vcpu, gpa_t system_time,
> +                                 bool old_msr, bool host_initiated)
> +{
> +       struct kvm_arch *ka = &vcpu->kvm->arch;
> +
> +       if (vcpu->vcpu_id == 0 && !host_initiated) {
> +               if (ka->boot_vcpu_runs_old_kvmclock && old_msr)
> +                       kvm_make_request(KVM_REQ_MASTERCLOCK_UPDATE, vcpu);
> +
> +               ka->boot_vcpu_runs_old_kvmclock = old_msr;
> +       }
> +
> +       vcpu->arch.time = system_time;
> +       kvm_make_request(KVM_REQ_GLOBAL_CLOCK_UPDATE, vcpu);
> +
> +       /* we verify if the enable bit is set... */
> +       vcpu->arch.pv_time_enabled = false;
> +       if (!(system_time & 1))
> +               return;
> +
> +       if (!kvm_gfn_to_hva_cache_init(vcpu->kvm,
> +                                      &vcpu->arch.pv_time, system_time & ~1ULL,
> +                                      sizeof(struct pvclock_vcpu_time_info)))
> +               vcpu->arch.pv_time_enabled = true;
> +
> +       return;
> +}
> +
>  static uint32_t div_frac(uint32_t dividend, uint32_t divisor)
>  {
>         do_shl32_div32(dividend, divisor);
> @@ -2973,33 +3001,11 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>                 kvm_write_wall_clock(vcpu->kvm, data);
>                 break;
>         case MSR_KVM_SYSTEM_TIME_NEW:
> -       case MSR_KVM_SYSTEM_TIME: {
> -               struct kvm_arch *ka = &vcpu->kvm->arch;
> -
> -               if (vcpu->vcpu_id == 0 && !msr_info->host_initiated) {
> -                       bool tmp = (msr == MSR_KVM_SYSTEM_TIME);
> -
> -                       if (ka->boot_vcpu_runs_old_kvmclock != tmp)
> -                               kvm_make_request(KVM_REQ_MASTERCLOCK_UPDATE, vcpu);
> -
> -                       ka->boot_vcpu_runs_old_kvmclock = tmp;
> -               }
> -
> -               vcpu->arch.time = data;
> -               kvm_make_request(KVM_REQ_GLOBAL_CLOCK_UPDATE, vcpu);
> -
> -               /* we verify if the enable bit is set... */
> -               vcpu->arch.pv_time_enabled = false;
> -               if (!(data & 1))
> -                       break;
> -
> -               if (!kvm_gfn_to_hva_cache_init(vcpu->kvm,
> -                    &vcpu->arch.pv_time, data & ~1ULL,
> -                    sizeof(struct pvclock_vcpu_time_info)))
> -                       vcpu->arch.pv_time_enabled = true;
> -
> +               kvm_write_system_time(vcpu, data, false, msr_info->host_initiated);
> +               break;
> +       case MSR_KVM_SYSTEM_TIME:
> +               kvm_write_system_time(vcpu, data, true, msr_info->host_initiated);
>                 break;
> -       }
>         case MSR_KVM_ASYNC_PF_EN:
>                 if (kvm_pv_enable_async_pf(vcpu, data))
>                         return 1;
> --
> 2.28.0.236.gb10cc79966-goog
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v3 3/4] kvm: x86: only provide PV features if enabled in guest's CPUID
  2020-08-06 15:14 ` [PATCH v3 3/4] kvm: x86: only provide PV features if enabled in guest's CPUID Oliver Upton
@ 2020-08-14  0:58   ` Wanpeng Li
  2020-08-14 15:43     ` Oliver Upton
  0 siblings, 1 reply; 9+ messages in thread
From: Wanpeng Li @ 2020-08-14  0:58 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, Paolo Bonzini, Sean Christopherson, Jim Mattson, Peter Shier

On Fri, 7 Aug 2020 at 01:54, Oliver Upton <oupton@google.com> wrote:
>
> KVM unconditionally provides PV features to the guest, regardless of the
> configured CPUID. An unwitting guest that doesn't check
> KVM_CPUID_FEATURES before use could access paravirt features that
> userspace did not intend to provide. Fix this by checking the guest's
> CPUID before performing any paravirtual operations.
>
> Introduce a capability, KVM_CAP_ENFORCE_PV_FEATURE_CPUID, to gate the
> aforementioned enforcement. Migrating a VM from a host w/o this patch to
> a host with this patch could silently change the ABI exposed to the
> guest, warranting that we default to the old behavior and opt-in for
> the new one.
>
> Reviewed-by: Jim Mattson <jmattson@google.com>
> Reviewed-by: Peter Shier <pshier@google.com>
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
>  Documentation/virt/kvm/api.rst  | 11 ++++++
>  arch/x86/include/asm/kvm_host.h |  6 +++
>  arch/x86/kvm/cpuid.h            | 16 ++++++++
>  arch/x86/kvm/x86.c              | 67 ++++++++++++++++++++++++++++++---
>  include/uapi/linux/kvm.h        |  1 +
>  5 files changed, 96 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 644e5326aa50..e8fc6e34f344 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -6155,3 +6155,14 @@ KVM can therefore start protected VMs.
>  This capability governs the KVM_S390_PV_COMMAND ioctl and the
>  KVM_MP_STATE_LOAD MP_STATE. KVM_SET_MP_STATE can fail for protected
>  guests when the state change is invalid.
> +
> +
> +8.24 KVM_CAP_ENFORCE_PV_CPUID
> +-----------------------------
> +
> +Architectures: x86
> +
> +When enabled, KVM will disable paravirtual features provided to the
> +guest according to the bits in the KVM_CPUID_FEATURES CPUID leaf
> +(0x40000001). Otherwise, a guest may use the paravirtual features
> +regardless of what has actually been exposed through the CPUID leaf.
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 5ab3af7275d8..a641c3840a1e 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -788,6 +788,12 @@ struct kvm_vcpu_arch {
>
>         /* AMD MSRC001_0015 Hardware Configuration */
>         u64 msr_hwcr;
> +
> +       /*
> +        * Indicates whether PV emulation should be disabled if not present in
> +        * the guest's cpuid.
> +        */
> +       bool enforce_pv_feature_cpuid;
>  };
>
>  struct kvm_lpage_info {
> diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
> index 3a923ae15f2f..c364c2877583 100644
> --- a/arch/x86/kvm/cpuid.h
> +++ b/arch/x86/kvm/cpuid.h
> @@ -5,6 +5,7 @@
>  #include "x86.h"
>  #include <asm/cpu.h>
>  #include <asm/processor.h>
> +#include <uapi/asm/kvm_para.h>
>
>  extern u32 kvm_cpu_caps[NCAPINTS] __read_mostly;
>  void kvm_set_cpu_caps(void);
> @@ -308,4 +309,19 @@ static inline bool page_address_valid(struct kvm_vcpu *vcpu, gpa_t gpa)
>         return PAGE_ALIGNED(gpa) && !(gpa >> cpuid_maxphyaddr(vcpu));
>  }
>
> +static __always_inline bool guest_pv_has(struct kvm_vcpu *vcpu,
> +                                          unsigned int kvm_feature)
> +{
> +       struct kvm_cpuid_entry2 *cpuid;
> +
> +       if (!vcpu->arch.enforce_pv_feature_cpuid)
> +               return true;
> +
> +       cpuid = kvm_find_cpuid_entry(vcpu, KVM_CPUID_FEATURES, 0);
> +       if (!cpuid)
> +               return false;
> +
> +       return cpuid->eax & (1u << kvm_feature);
> +}
> +
>  #endif
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 683ce68d96b2..9900a846dfc0 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2763,6 +2763,14 @@ static int kvm_pv_enable_async_pf(struct kvm_vcpu *vcpu, u64 data)
>         if (data & 0x30)
>                 return 1;
>
> +       if (!guest_pv_has(vcpu, KVM_FEATURE_ASYNC_PF_VMEXIT) &&
> +           (data & KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT))
> +               return 1;
> +
> +       if (!guest_pv_has(vcpu, KVM_FEATURE_ASYNC_PF_INT) &&
> +           (data & KVM_ASYNC_PF_DELIVERY_AS_INT))
> +               return 1;
> +
>         if (!lapic_in_kernel(vcpu))
>                 return 1;
>
> @@ -2840,10 +2848,12 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
>          * Doing a TLB flush here, on the guest's behalf, can avoid
>          * expensive IPIs.
>          */
> -       trace_kvm_pv_tlb_flush(vcpu->vcpu_id,
> -               st->preempted & KVM_VCPU_FLUSH_TLB);
> -       if (xchg(&st->preempted, 0) & KVM_VCPU_FLUSH_TLB)
> -               kvm_vcpu_flush_tlb_guest(vcpu);
> +       if (guest_pv_has(vcpu, KVM_FEATURE_PV_TLB_FLUSH)) {

Is it expensive to recalculate it every time if vcpu is
sched_in/sched_out frequently?

> +               trace_kvm_pv_tlb_flush(vcpu->vcpu_id,
> +                                      st->preempted & KVM_VCPU_FLUSH_TLB);
> +               if (xchg(&st->preempted, 0) & KVM_VCPU_FLUSH_TLB)
> +                       kvm_vcpu_flush_tlb_guest(vcpu);
> +       }
>
>         vcpu->arch.st.preempted = 0;
>
> @@ -2998,30 +3008,54 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>                 vcpu->arch.smi_count = data;
>                 break;
>         case MSR_KVM_WALL_CLOCK_NEW:
> +               if (!guest_pv_has(vcpu, KVM_FEATURE_CLOCKSOURCE2))
> +                       return 1;
> +
> +               kvm_write_wall_clock(vcpu->kvm, data);
> +               break;
>         case MSR_KVM_WALL_CLOCK:
> +               if (!guest_pv_has(vcpu, KVM_FEATURE_CLOCKSOURCE))
> +                       return 1;
> +
>                 kvm_write_wall_clock(vcpu->kvm, data);
>                 break;
>         case MSR_KVM_SYSTEM_TIME_NEW:
> +               if (!guest_pv_has(vcpu, KVM_FEATURE_CLOCKSOURCE2))
> +                       return 1;
> +
>                 kvm_write_system_time(vcpu, data, false, msr_info->host_initiated);
>                 break;
>         case MSR_KVM_SYSTEM_TIME:
> -               kvm_write_system_time(vcpu, data, true, msr_info->host_initiated);
> +               if (!guest_pv_has(vcpu, KVM_FEATURE_CLOCKSOURCE))
> +                       return 1;
> +
> +               kvm_write_system_time(vcpu, data, true,  msr_info->host_initiated);
>                 break;
>         case MSR_KVM_ASYNC_PF_EN:
> +               if (!guest_pv_has(vcpu, KVM_FEATURE_ASYNC_PF))
> +                       return 1;
> +
>                 if (kvm_pv_enable_async_pf(vcpu, data))
>                         return 1;
>                 break;
>         case MSR_KVM_ASYNC_PF_INT:
> +               if (!guest_pv_has(vcpu, KVM_FEATURE_ASYNC_PF_INT))
> +                       return 1;
> +
>                 if (kvm_pv_enable_async_pf_int(vcpu, data))
>                         return 1;
>                 break;
>         case MSR_KVM_ASYNC_PF_ACK:
> +               if (!guest_pv_has(vcpu, KVM_FEATURE_ASYNC_PF))
> +                       return 1;
>                 if (data & 0x1) {
>                         vcpu->arch.apf.pageready_pending = false;
>                         kvm_check_async_pf_completion(vcpu);
>                 }
>                 break;
>         case MSR_KVM_STEAL_TIME:
> +               if (!guest_pv_has(vcpu, KVM_FEATURE_STEAL_TIME))
> +                       return 1;
>
>                 if (unlikely(!sched_info_on()))
>                         return 1;
> @@ -3038,11 +3072,17 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>
>                 break;
>         case MSR_KVM_PV_EOI_EN:
> +               if (!guest_pv_has(vcpu, KVM_FEATURE_PV_EOI))
> +                       return 1;
> +
>                 if (kvm_lapic_enable_pv_eoi(vcpu, data, sizeof(u8)))
>                         return 1;
>                 break;
>
>         case MSR_KVM_POLL_CONTROL:
> +               if (!guest_pv_has(vcpu, KVM_FEATURE_POLL_CONTROL))
> +                       return 1;
> +
>                 /* only enable bit supported */
>                 if (data & (-1ULL << 1))
>                         return 1;
> @@ -3522,6 +3562,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>         case KVM_CAP_EXCEPTION_PAYLOAD:
>         case KVM_CAP_SET_GUEST_DEBUG:
>         case KVM_CAP_LAST_CPU:
> +       case KVM_CAP_ENFORCE_PV_FEATURE_CPUID:
>                 r = 1;
>                 break;
>         case KVM_CAP_SYNC_REGS:
> @@ -4389,6 +4430,11 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
>
>                 return kvm_x86_ops.enable_direct_tlbflush(vcpu);
>
> +       case KVM_CAP_ENFORCE_PV_FEATURE_CPUID:
> +               vcpu->arch.enforce_pv_feature_cpuid = cap->args[0];
> +
> +               return 0;
> +
>         default:
>                 return -EINVAL;
>         }
> @@ -7723,11 +7769,16 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
>                 goto out;
>         }
>
> +       ret = -KVM_ENOSYS;
> +
>         switch (nr) {
>         case KVM_HC_VAPIC_POLL_IRQ:
>                 ret = 0;
>                 break;
>         case KVM_HC_KICK_CPU:
> +               if (!guest_pv_has(vcpu, KVM_FEATURE_PV_UNHALT))
> +                       break;
> +
>                 kvm_pv_kick_cpu_op(vcpu->kvm, a0, a1);
>                 kvm_sched_yield(vcpu->kvm, a1);
>                 ret = 0;
> @@ -7738,9 +7789,15 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
>                 break;
>  #endif
>         case KVM_HC_SEND_IPI:
> +               if (!guest_pv_has(vcpu, KVM_FEATURE_PV_SEND_IPI))
> +                       break;
> +
>                 ret = kvm_pv_send_ipi(vcpu->kvm, a0, a1, a2, a3, op_64_bit);
>                 break;
>         case KVM_HC_SCHED_YIELD:
> +               if (!guest_pv_has(vcpu, KVM_FEATURE_PV_SCHED_YIELD))
> +                       break;
> +
>                 kvm_sched_yield(vcpu->kvm, a0);
>                 ret = 0;
>                 break;
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index f6d86033c4fa..48c2d5c10b1e 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1035,6 +1035,7 @@ struct kvm_ppc_resize_hpt {
>  #define KVM_CAP_LAST_CPU 184
>  #define KVM_CAP_SMALLER_MAXPHYADDR 185
>  #define KVM_CAP_S390_DIAG318 186
> +#define KVM_CAP_ENFORCE_PV_FEATURE_CPUID 187
>
>  #ifdef KVM_CAP_IRQ_ROUTING
>
> --
> 2.28.0.236.gb10cc79966-goog
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v3 3/4] kvm: x86: only provide PV features if enabled in guest's CPUID
  2020-08-14  0:58   ` Wanpeng Li
@ 2020-08-14 15:43     ` Oliver Upton
  0 siblings, 0 replies; 9+ messages in thread
From: Oliver Upton @ 2020-08-14 15:43 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: kvm, Paolo Bonzini, Sean Christopherson, Jim Mattson, Peter Shier

On Thu, Aug 13, 2020 at 7:58 PM Wanpeng Li <kernellwp@gmail.com> wrote:
>
> On Fri, 7 Aug 2020 at 01:54, Oliver Upton <oupton@google.com> wrote:
> >
> > KVM unconditionally provides PV features to the guest, regardless of the
> > configured CPUID. An unwitting guest that doesn't check
> > KVM_CPUID_FEATURES before use could access paravirt features that
> > userspace did not intend to provide. Fix this by checking the guest's
> > CPUID before performing any paravirtual operations.
> >
> > Introduce a capability, KVM_CAP_ENFORCE_PV_FEATURE_CPUID, to gate the
> > aforementioned enforcement. Migrating a VM from a host w/o this patch to
> > a host with this patch could silently change the ABI exposed to the
> > guest, warranting that we default to the old behavior and opt-in for
> > the new one.
> >
> > Reviewed-by: Jim Mattson <jmattson@google.com>
> > Reviewed-by: Peter Shier <pshier@google.com>
> > Signed-off-by: Oliver Upton <oupton@google.com>
> > ---
> >  Documentation/virt/kvm/api.rst  | 11 ++++++
> >  arch/x86/include/asm/kvm_host.h |  6 +++
> >  arch/x86/kvm/cpuid.h            | 16 ++++++++
> >  arch/x86/kvm/x86.c              | 67 ++++++++++++++++++++++++++++++---
> >  include/uapi/linux/kvm.h        |  1 +
> >  5 files changed, 96 insertions(+), 5 deletions(-)
> >
> > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> > index 644e5326aa50..e8fc6e34f344 100644
> > --- a/Documentation/virt/kvm/api.rst
> > +++ b/Documentation/virt/kvm/api.rst
> > @@ -6155,3 +6155,14 @@ KVM can therefore start protected VMs.
> >  This capability governs the KVM_S390_PV_COMMAND ioctl and the
> >  KVM_MP_STATE_LOAD MP_STATE. KVM_SET_MP_STATE can fail for protected
> >  guests when the state change is invalid.
> > +
> > +
> > +8.24 KVM_CAP_ENFORCE_PV_CPUID
> > +-----------------------------
> > +
> > +Architectures: x86
> > +
> > +When enabled, KVM will disable paravirtual features provided to the
> > +guest according to the bits in the KVM_CPUID_FEATURES CPUID leaf
> > +(0x40000001). Otherwise, a guest may use the paravirtual features
> > +regardless of what has actually been exposed through the CPUID leaf.
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index 5ab3af7275d8..a641c3840a1e 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -788,6 +788,12 @@ struct kvm_vcpu_arch {
> >
> >         /* AMD MSRC001_0015 Hardware Configuration */
> >         u64 msr_hwcr;
> > +
> > +       /*
> > +        * Indicates whether PV emulation should be disabled if not present in
> > +        * the guest's cpuid.
> > +        */
> > +       bool enforce_pv_feature_cpuid;
> >  };
> >
> >  struct kvm_lpage_info {
> > diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
> > index 3a923ae15f2f..c364c2877583 100644
> > --- a/arch/x86/kvm/cpuid.h
> > +++ b/arch/x86/kvm/cpuid.h
> > @@ -5,6 +5,7 @@
> >  #include "x86.h"
> >  #include <asm/cpu.h>
> >  #include <asm/processor.h>
> > +#include <uapi/asm/kvm_para.h>
> >
> >  extern u32 kvm_cpu_caps[NCAPINTS] __read_mostly;
> >  void kvm_set_cpu_caps(void);
> > @@ -308,4 +309,19 @@ static inline bool page_address_valid(struct kvm_vcpu *vcpu, gpa_t gpa)
> >         return PAGE_ALIGNED(gpa) && !(gpa >> cpuid_maxphyaddr(vcpu));
> >  }
> >
> > +static __always_inline bool guest_pv_has(struct kvm_vcpu *vcpu,
> > +                                          unsigned int kvm_feature)
> > +{
> > +       struct kvm_cpuid_entry2 *cpuid;
> > +
> > +       if (!vcpu->arch.enforce_pv_feature_cpuid)
> > +               return true;
> > +
> > +       cpuid = kvm_find_cpuid_entry(vcpu, KVM_CPUID_FEATURES, 0);
> > +       if (!cpuid)
> > +               return false;
> > +
> > +       return cpuid->eax & (1u << kvm_feature);
> > +}
> > +
> >  #endif
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 683ce68d96b2..9900a846dfc0 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -2763,6 +2763,14 @@ static int kvm_pv_enable_async_pf(struct kvm_vcpu *vcpu, u64 data)
> >         if (data & 0x30)
> >                 return 1;
> >
> > +       if (!guest_pv_has(vcpu, KVM_FEATURE_ASYNC_PF_VMEXIT) &&
> > +           (data & KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT))
> > +               return 1;
> > +
> > +       if (!guest_pv_has(vcpu, KVM_FEATURE_ASYNC_PF_INT) &&
> > +           (data & KVM_ASYNC_PF_DELIVERY_AS_INT))
> > +               return 1;
> > +
> >         if (!lapic_in_kernel(vcpu))
> >                 return 1;
> >
> > @@ -2840,10 +2848,12 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
> >          * Doing a TLB flush here, on the guest's behalf, can avoid
> >          * expensive IPIs.
> >          */
> > -       trace_kvm_pv_tlb_flush(vcpu->vcpu_id,
> > -               st->preempted & KVM_VCPU_FLUSH_TLB);
> > -       if (xchg(&st->preempted, 0) & KVM_VCPU_FLUSH_TLB)
> > -               kvm_vcpu_flush_tlb_guest(vcpu);
> > +       if (guest_pv_has(vcpu, KVM_FEATURE_PV_TLB_FLUSH)) {
>
> Is it expensive to recalculate it every time if vcpu is
> sched_in/sched_out frequently?

Perhaps so, since we wind up iterating the cpuid array every time. We
could instead just save the feature bitmap in kvm_vcpu_arch when the
guest's CPUID is set to avoid the lookup cost.

>
> > +               trace_kvm_pv_tlb_flush(vcpu->vcpu_id,
> > +                                      st->preempted & KVM_VCPU_FLUSH_TLB);
> > +               if (xchg(&st->preempted, 0) & KVM_VCPU_FLUSH_TLB)
> > +                       kvm_vcpu_flush_tlb_guest(vcpu);
> > +       }
> >
> >         vcpu->arch.st.preempted = 0;
> >
> > @@ -2998,30 +3008,54 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> >                 vcpu->arch.smi_count = data;
> >                 break;
> >         case MSR_KVM_WALL_CLOCK_NEW:
> > +               if (!guest_pv_has(vcpu, KVM_FEATURE_CLOCKSOURCE2))
> > +                       return 1;
> > +
> > +               kvm_write_wall_clock(vcpu->kvm, data);
> > +               break;
> >         case MSR_KVM_WALL_CLOCK:
> > +               if (!guest_pv_has(vcpu, KVM_FEATURE_CLOCKSOURCE))
> > +                       return 1;
> > +
> >                 kvm_write_wall_clock(vcpu->kvm, data);
> >                 break;
> >         case MSR_KVM_SYSTEM_TIME_NEW:
> > +               if (!guest_pv_has(vcpu, KVM_FEATURE_CLOCKSOURCE2))
> > +                       return 1;
> > +
> >                 kvm_write_system_time(vcpu, data, false, msr_info->host_initiated);
> >                 break;
> >         case MSR_KVM_SYSTEM_TIME:
> > -               kvm_write_system_time(vcpu, data, true, msr_info->host_initiated);
> > +               if (!guest_pv_has(vcpu, KVM_FEATURE_CLOCKSOURCE))
> > +                       return 1;
> > +
> > +               kvm_write_system_time(vcpu, data, true,  msr_info->host_initiated);
> >                 break;
> >         case MSR_KVM_ASYNC_PF_EN:
> > +               if (!guest_pv_has(vcpu, KVM_FEATURE_ASYNC_PF))
> > +                       return 1;
> > +
> >                 if (kvm_pv_enable_async_pf(vcpu, data))
> >                         return 1;
> >                 break;
> >         case MSR_KVM_ASYNC_PF_INT:
> > +               if (!guest_pv_has(vcpu, KVM_FEATURE_ASYNC_PF_INT))
> > +                       return 1;
> > +
> >                 if (kvm_pv_enable_async_pf_int(vcpu, data))
> >                         return 1;
> >                 break;
> >         case MSR_KVM_ASYNC_PF_ACK:
> > +               if (!guest_pv_has(vcpu, KVM_FEATURE_ASYNC_PF))
> > +                       return 1;
> >                 if (data & 0x1) {
> >                         vcpu->arch.apf.pageready_pending = false;
> >                         kvm_check_async_pf_completion(vcpu);
> >                 }
> >                 break;
> >         case MSR_KVM_STEAL_TIME:
> > +               if (!guest_pv_has(vcpu, KVM_FEATURE_STEAL_TIME))
> > +                       return 1;
> >
> >                 if (unlikely(!sched_info_on()))
> >                         return 1;
> > @@ -3038,11 +3072,17 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> >
> >                 break;
> >         case MSR_KVM_PV_EOI_EN:
> > +               if (!guest_pv_has(vcpu, KVM_FEATURE_PV_EOI))
> > +                       return 1;
> > +
> >                 if (kvm_lapic_enable_pv_eoi(vcpu, data, sizeof(u8)))
> >                         return 1;
> >                 break;
> >
> >         case MSR_KVM_POLL_CONTROL:
> > +               if (!guest_pv_has(vcpu, KVM_FEATURE_POLL_CONTROL))
> > +                       return 1;
> > +
> >                 /* only enable bit supported */
> >                 if (data & (-1ULL << 1))
> >                         return 1;
> > @@ -3522,6 +3562,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> >         case KVM_CAP_EXCEPTION_PAYLOAD:
> >         case KVM_CAP_SET_GUEST_DEBUG:
> >         case KVM_CAP_LAST_CPU:
> > +       case KVM_CAP_ENFORCE_PV_FEATURE_CPUID:
> >                 r = 1;
> >                 break;
> >         case KVM_CAP_SYNC_REGS:
> > @@ -4389,6 +4430,11 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
> >
> >                 return kvm_x86_ops.enable_direct_tlbflush(vcpu);
> >
> > +       case KVM_CAP_ENFORCE_PV_FEATURE_CPUID:
> > +               vcpu->arch.enforce_pv_feature_cpuid = cap->args[0];
> > +
> > +               return 0;
> > +
> >         default:
> >                 return -EINVAL;
> >         }
> > @@ -7723,11 +7769,16 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
> >                 goto out;
> >         }
> >
> > +       ret = -KVM_ENOSYS;
> > +
> >         switch (nr) {
> >         case KVM_HC_VAPIC_POLL_IRQ:
> >                 ret = 0;
> >                 break;
> >         case KVM_HC_KICK_CPU:
> > +               if (!guest_pv_has(vcpu, KVM_FEATURE_PV_UNHALT))
> > +                       break;
> > +
> >                 kvm_pv_kick_cpu_op(vcpu->kvm, a0, a1);
> >                 kvm_sched_yield(vcpu->kvm, a1);
> >                 ret = 0;
> > @@ -7738,9 +7789,15 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
> >                 break;
> >  #endif
> >         case KVM_HC_SEND_IPI:
> > +               if (!guest_pv_has(vcpu, KVM_FEATURE_PV_SEND_IPI))
> > +                       break;
> > +
> >                 ret = kvm_pv_send_ipi(vcpu->kvm, a0, a1, a2, a3, op_64_bit);
> >                 break;
> >         case KVM_HC_SCHED_YIELD:
> > +               if (!guest_pv_has(vcpu, KVM_FEATURE_PV_SCHED_YIELD))
> > +                       break;
> > +
> >                 kvm_sched_yield(vcpu->kvm, a0);
> >                 ret = 0;
> >                 break;
> > diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> > index f6d86033c4fa..48c2d5c10b1e 100644
> > --- a/include/uapi/linux/kvm.h
> > +++ b/include/uapi/linux/kvm.h
> > @@ -1035,6 +1035,7 @@ struct kvm_ppc_resize_hpt {
> >  #define KVM_CAP_LAST_CPU 184
> >  #define KVM_CAP_SMALLER_MAXPHYADDR 185
> >  #define KVM_CAP_S390_DIAG318 186
> > +#define KVM_CAP_ENFORCE_PV_FEATURE_CPUID 187
> >
> >  #ifdef KVM_CAP_IRQ_ROUTING
> >
> > --
> > 2.28.0.236.gb10cc79966-goog
> >

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-08-14 15:44 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-06 15:14 [PATCH v3 0/4] Restrict PV features to only enabled guests Oliver Upton
2020-08-06 15:14 ` [PATCH v3 1/4] kvm: x86: encapsulate wrmsr(MSR_KVM_SYSTEM_TIME) emulation in helper fn Oliver Upton
2020-08-14  0:18   ` Wanpeng Li
2020-08-06 15:14 ` [PATCH v3 2/4] kvm: x86: set wall_clock in kvm_write_wall_clock() Oliver Upton
2020-08-06 15:14 ` [PATCH v3 3/4] kvm: x86: only provide PV features if enabled in guest's CPUID Oliver Upton
2020-08-14  0:58   ` Wanpeng Li
2020-08-14 15:43     ` Oliver Upton
2020-08-06 15:14 ` [PATCH v3 4/4] Documentation: kvm: fix some typos in cpuid.rst Oliver Upton
2020-08-13 16:41 ` [PATCH v3 0/4] Restrict PV features to only enabled guests Oliver Upton

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.