* [RFC PATCH v4 0/7] KVM: x86: add per-vCPU exits disable capability
@ 2022-06-22  0:49 Kechen Lu
  2022-06-22  0:49 ` [RFC PATCH v4 1/7] KVM: x86: only allow exits disable before vCPUs created Kechen Lu
                   ` (6 more replies)
  0 siblings, 7 replies; 18+ messages in thread
From: Kechen Lu @ 2022-06-22  0:49 UTC (permalink / raw)
  To: kvm, pbonzini
  Cc: seanjc, chao.gao, vkuznets, somduttar, kechenl, linux-kernel

Summary
===========
Introduce support for a vCPU-scoped ioctl with the
KVM_CAP_X86_DISABLE_EXITS cap, so that exits can be disabled per vCPU
rather than only for the whole guest, enabling finer-grained control.
This patch series enables vCPU-scoped exits-disable control and
runtime toggling.
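
For illustration, a minimal userspace sketch (not part of the series)
of the per-vCPU usage; vcpu_fds[] and nr_vcpus are assumed to come from
KVM_CREATE_VCPU, and KVM_X86_DISABLE_EXITS_OVERRIDE is introduced in
patch 4:

	struct kvm_enable_cap cap = {
		.cap = KVM_CAP_X86_DISABLE_EXITS,
		.args[0] = KVM_X86_DISABLE_EXITS_HLT |
			   KVM_X86_DISABLE_EXITS_OVERRIDE,
	};
	int i;

	/* Disable HLT exits on the first half of the vCPUs only. */
	for (i = 0; i < nr_vcpus / 2; i++)
		ioctl(vcpu_fds[i], KVM_ENABLE_CAP, &cap);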

Motivation
============
In use cases such as Windows guests running heavy CPU-bound workloads,
disabling HLT VM-exits can mitigate host scheduler context-switch
overhead. Simply disabling HLT exits on all vCPUs can bring performance
benefits, but if no pCPUs are reserved for host threads, it can lead to
forced preemption, because the host no longer knows when to schedule
other host threads that want to run. With this series, HLT exits can be
disabled on only a subset of a guest's vCPUs, which keeps the
performance benefits while remaining resilient to a host stressing
workload running at the same time.

Performance and Testing
=========================
In an experiment combining a host stressing workload with heavy
CPU-bound workloads in a Windows guest, the series shows good
resiliency and a ~3% performance improvement. For example, Passmark
running in a Windows guest with HLT exits disabled on only half of the
vCPUs still shows a 2.4% higher main score versus baseline.

Tested everything on AMD machines.

v3->v4 (Chao Gao) :
- Use a KVM vCPU request, KVM_REQ_DISABLE_EXITS, to perform the arch
  VMCS updating (patch 5)
- Fix selftests redundant arguments (patch 7)
- Merge overlapped fix bits from patch 4 to patch 3

v2->v3 (Sean Christopherson) :
- Reject KVM_CAP_X86_DISABLE_EXITS if userspace disables MWAIT exits
  when MWAIT is not allowed in the guest (patch 3)
- Make userspace able to re-enable previously disabled exits (patch 4)
- Add mwait/pause/cstate exits flag toggling instead of only hlt
  exits (patch 5)
- Add selftests for KVM_CAP_X86_DISABLE_EXITS (patch 7)

v1->v2 (Sean Christopherson) :
- Add an explicit restriction that VM-scoped exits disabling must be
  done before vCPU creation (patch 1)
- Use vCPU ioctl instead of 64bit vCPU bitmask (patch 5), and make exits
  disable flags check purely for vCPU instead of VM (patch 2)

Best Regards,
Kechen

Kechen Lu (4):
  KVM: x86: Move *_in_guest power management flags to vCPU scope
  KVM: x86: add vCPU scoped toggling for disabled exits
  KVM: x86: Add a new guest_debug flag forcing exit to userspace
  KVM: selftests: Add tests for VM and vCPU cap
    KVM_CAP_X86_DISABLE_EXITS

Sean Christopherson (3):
  KVM: x86: only allow exits disable before vCPUs created
  KVM: x86: Reject disabling of MWAIT interception when not allowed
  KVM: x86: Let userspace re-enable previously disabled exits

 Documentation/virt/kvm/api.rst                |   8 +-
 arch/x86/include/asm/kvm-x86-ops.h            |   1 +
 arch/x86/include/asm/kvm_host.h               |   8 +
 arch/x86/kvm/cpuid.c                          |   4 +-
 arch/x86/kvm/lapic.c                          |   7 +-
 arch/x86/kvm/svm/nested.c                     |   4 +-
 arch/x86/kvm/svm/svm.c                        |  44 +++++-
 arch/x86/kvm/vmx/vmx.c                        |  54 ++++++-
 arch/x86/kvm/x86.c                            |  79 ++++++++--
 arch/x86/kvm/x86.h                            |  16 +-
 include/uapi/linux/kvm.h                      |   5 +-
 tools/include/uapi/linux/kvm.h                |   1 +
 tools/testing/selftests/kvm/.gitignore        |   1 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/include/x86_64/svm_util.h   |   1 +
 .../selftests/kvm/x86_64/disable_exits_test.c | 145 ++++++++++++++++++
 16 files changed, 332 insertions(+), 47 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/x86_64/disable_exits_test.c

-- 
2.32.0



* [RFC PATCH v4 1/7] KVM: x86: only allow exits disable before vCPUs created
  2022-06-22  0:49 [RFC PATCH v4 0/7] KVM: x86: add per-vCPU exits disable capability Kechen Lu
@ 2022-06-22  0:49 ` Kechen Lu
  2022-06-22  0:49 ` [RFC PATCH v4 2/7] KVM: x86: Move *_in_guest power management flags to vCPU scope Kechen Lu
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Kechen Lu @ 2022-06-22  0:49 UTC (permalink / raw)
  To: kvm, pbonzini
  Cc: seanjc, chao.gao, vkuznets, somduttar, kechenl, linux-kernel, stable

From: Sean Christopherson <seanjc@google.com>

Since neither VMX nor SVM will update the intercept control bits if
exits are disabled after vCPUs are created, only allow setting the
exits-disable flags before vCPU creation.
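
For illustration (not part of this patch), a userspace sketch of the
resulting behavior; vm_fd is assumed to be a KVM VM fd on which a vCPU
has already been created:

	struct kvm_enable_cap cap = {
		.cap = KVM_CAP_X86_DISABLE_EXITS,
		.args[0] = KVM_X86_DISABLE_EXITS_HLT,
	};
	int ret;

	/* A vCPU already exists, so KVM now rejects the VM-scoped toggle. */
	ret = ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
	/* ret == -1 with errno == EINVAL */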

Fixes: 4d5422cea3b6 ("KVM: X86: Provide a capability to disable MWAIT intercepts")

Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Kechen Lu <kechenl@nvidia.com>
Cc: stable@vger.kernel.org
---
 Documentation/virt/kvm/api.rst | 1 +
 arch/x86/kvm/x86.c             | 6 ++++++
 2 files changed, 7 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 11e00a46c610..d0d8749591a8 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6933,6 +6933,7 @@ branch to guests' 0x200 interrupt vector.
 :Architectures: x86
 :Parameters: args[0] defines which exits are disabled
 :Returns: 0 on success, -EINVAL when args[0] contains invalid exits
+          or if any vCPU has already been created
 
 Valid bits in args[0] are::
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 158b2e135efc..3ac6329e6d43 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6006,6 +6006,10 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		if (cap->args[0] & ~KVM_X86_DISABLE_VALID_EXITS)
 			break;
 
+		mutex_lock(&kvm->lock);
+		if (kvm->created_vcpus)
+			goto disable_exits_unlock;
+
 		if ((cap->args[0] & KVM_X86_DISABLE_EXITS_MWAIT) &&
 			kvm_can_mwait_in_guest())
 			kvm->arch.mwait_in_guest = true;
@@ -6016,6 +6020,8 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		if (cap->args[0] & KVM_X86_DISABLE_EXITS_CSTATE)
 			kvm->arch.cstate_in_guest = true;
 		r = 0;
+disable_exits_unlock:
+		mutex_unlock(&kvm->lock);
 		break;
 	case KVM_CAP_MSR_PLATFORM_INFO:
 		kvm->arch.guest_can_read_msr_platform_info = cap->args[0];
-- 
2.32.0



* [RFC PATCH v4 2/7] KVM: x86: Move *_in_guest power management flags to vCPU scope
  2022-06-22  0:49 [RFC PATCH v4 0/7] KVM: x86: add per-vCPU exits disable capability Kechen Lu
  2022-06-22  0:49 ` [RFC PATCH v4 1/7] KVM: x86: only allow exits disable before vCPUs created Kechen Lu
@ 2022-06-22  0:49 ` Kechen Lu
  2022-06-22  0:49 ` [RFC PATCH v4 3/7] KVM: x86: Reject disabling of MWAIT interception when not allowed Kechen Lu
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Kechen Lu @ 2022-06-22  0:49 UTC (permalink / raw)
  To: kvm, pbonzini
  Cc: seanjc, chao.gao, vkuznets, somduttar, kechenl, linux-kernel

Make the runtime disabled mwait/hlt/pause/cstate exits flags vCPU scope
to allow finer-grained, per-vCPU control.  The VM-scoped control is only
allowed before vCPUs are created, thus preserving the existing behavior
is a simple matter of snapshotting the flags at vCPU creation.

Signed-off-by: Kechen Lu <kechenl@nvidia.com>
Suggested-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/kvm_host.h |  5 +++++
 arch/x86/kvm/cpuid.c            |  4 ++--
 arch/x86/kvm/lapic.c            |  7 +++----
 arch/x86/kvm/svm/nested.c       |  4 ++--
 arch/x86/kvm/svm/svm.c          | 12 ++++++------
 arch/x86/kvm/vmx/vmx.c          | 16 ++++++++--------
 arch/x86/kvm/x86.c              |  6 +++++-
 arch/x86/kvm/x86.h              | 16 ++++++++--------
 8 files changed, 39 insertions(+), 31 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 9217bd6cf0d1..573a39bf7a84 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -924,6 +924,11 @@ struct kvm_vcpu_arch {
 #if IS_ENABLED(CONFIG_HYPERV)
 	hpa_t hv_root_tdp;
 #endif
+
+	bool mwait_in_guest;
+	bool hlt_in_guest;
+	bool pause_in_guest;
+	bool cstate_in_guest;
 };
 
 struct kvm_lpage_info {
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index de6d44e07e34..f013ff4f49c5 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -245,8 +245,8 @@ static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_e
 		best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
 
 	best = __kvm_find_kvm_cpuid_features(vcpu, entries, nent);
-	if (kvm_hlt_in_guest(vcpu->kvm) && best &&
-		(best->eax & (1 << KVM_FEATURE_PV_UNHALT)))
+	if (kvm_hlt_in_guest(vcpu) &&
+	    best && (best->eax & (1 << KVM_FEATURE_PV_UNHALT)))
 		best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT);
 
 	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT)) {
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 0e68b4c937fc..9e29d658a8c2 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -147,14 +147,13 @@ static inline u32 kvm_x2apic_id(struct kvm_lapic *apic)
 static bool kvm_can_post_timer_interrupt(struct kvm_vcpu *vcpu)
 {
 	return pi_inject_timer && kvm_vcpu_apicv_active(vcpu) &&
-		(kvm_mwait_in_guest(vcpu->kvm) || kvm_hlt_in_guest(vcpu->kvm));
+		(kvm_mwait_in_guest(vcpu) || kvm_hlt_in_guest(vcpu));
 }
 
 bool kvm_can_use_hv_timer(struct kvm_vcpu *vcpu)
 {
-	return kvm_x86_ops.set_hv_timer
-	       && !(kvm_mwait_in_guest(vcpu->kvm) ||
-		    kvm_can_post_timer_interrupt(vcpu));
+	return kvm_x86_ops.set_hv_timer &&
+		!(kvm_mwait_in_guest(vcpu) || kvm_can_post_timer_interrupt(vcpu));
 }
 EXPORT_SYMBOL_GPL(kvm_can_use_hv_timer);
 
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index ba7cd26f438f..f143ec757467 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -675,7 +675,7 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm)
 
 	pause_count12 = svm->pause_filter_enabled ? svm->nested.ctl.pause_filter_count : 0;
 	pause_thresh12 = svm->pause_threshold_enabled ? svm->nested.ctl.pause_filter_thresh : 0;
-	if (kvm_pause_in_guest(svm->vcpu.kvm)) {
+	if (kvm_pause_in_guest(&svm->vcpu)) {
 		/* use guest values since host doesn't intercept PAUSE */
 		vmcb02->control.pause_filter_count = pause_count12;
 		vmcb02->control.pause_filter_thresh = pause_thresh12;
@@ -951,7 +951,7 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
 	vmcb12->control.event_inj         = svm->nested.ctl.event_inj;
 	vmcb12->control.event_inj_err     = svm->nested.ctl.event_inj_err;
 
-	if (!kvm_pause_in_guest(vcpu->kvm)) {
+	if (!kvm_pause_in_guest(vcpu)) {
 		vmcb01->control.pause_filter_count = vmcb02->control.pause_filter_count;
 		vmcb_mark_dirty(vmcb01, VMCB_INTERCEPTS);
 
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 87da90360bc7..b32987f54ace 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -921,7 +921,7 @@ static void grow_ple_window(struct kvm_vcpu *vcpu)
 	struct vmcb_control_area *control = &svm->vmcb->control;
 	int old = control->pause_filter_count;
 
-	if (kvm_pause_in_guest(vcpu->kvm))
+	if (kvm_pause_in_guest(vcpu))
 		return;
 
 	control->pause_filter_count = __grow_ple_window(old,
@@ -942,7 +942,7 @@ static void shrink_ple_window(struct kvm_vcpu *vcpu)
 	struct vmcb_control_area *control = &svm->vmcb->control;
 	int old = control->pause_filter_count;
 
-	if (kvm_pause_in_guest(vcpu->kvm))
+	if (kvm_pause_in_guest(vcpu))
 		return;
 
 	control->pause_filter_count =
@@ -1136,12 +1136,12 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
 	svm_set_intercept(svm, INTERCEPT_RDPRU);
 	svm_set_intercept(svm, INTERCEPT_RSM);
 
-	if (!kvm_mwait_in_guest(vcpu->kvm)) {
+	if (!kvm_mwait_in_guest(vcpu)) {
 		svm_set_intercept(svm, INTERCEPT_MONITOR);
 		svm_set_intercept(svm, INTERCEPT_MWAIT);
 	}
 
-	if (!kvm_hlt_in_guest(vcpu->kvm))
+	if (!kvm_hlt_in_guest(vcpu))
 		svm_set_intercept(svm, INTERCEPT_HLT);
 
 	control->iopm_base_pa = __sme_set(iopm_base);
@@ -1185,7 +1185,7 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
 	svm->nested.vmcb12_gpa = INVALID_GPA;
 	svm->nested.last_vmcb12_gpa = INVALID_GPA;
 
-	if (!kvm_pause_in_guest(vcpu->kvm)) {
+	if (!kvm_pause_in_guest(vcpu)) {
 		control->pause_filter_count = pause_filter_count;
 		if (pause_filter_thresh)
 			control->pause_filter_thresh = pause_filter_thresh;
@@ -4269,7 +4269,7 @@ static void svm_handle_exit_irqoff(struct kvm_vcpu *vcpu)
 
 static void svm_sched_in(struct kvm_vcpu *vcpu, int cpu)
 {
-	if (!kvm_pause_in_guest(vcpu->kvm))
+	if (!kvm_pause_in_guest(vcpu))
 		shrink_ple_window(vcpu);
 }
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 553dd2317b9c..f24c9a357f70 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1597,7 +1597,7 @@ static void vmx_clear_hlt(struct kvm_vcpu *vcpu)
 	 * then the instruction is already executing and RIP has already been
 	 * advanced.
 	 */
-	if (kvm_hlt_in_guest(vcpu->kvm) &&
+	if (kvm_hlt_in_guest(vcpu) &&
 			vmcs_read32(GUEST_ACTIVITY_STATE) == GUEST_ACTIVITY_HLT)
 		vmcs_write32(GUEST_ACTIVITY_STATE, GUEST_ACTIVITY_ACTIVE);
 }
@@ -4212,10 +4212,10 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
 		exec_control |= CPU_BASED_CR3_STORE_EXITING |
 				CPU_BASED_CR3_LOAD_EXITING  |
 				CPU_BASED_INVLPG_EXITING;
-	if (kvm_mwait_in_guest(vmx->vcpu.kvm))
+	if (kvm_mwait_in_guest(&vmx->vcpu))
 		exec_control &= ~(CPU_BASED_MWAIT_EXITING |
 				CPU_BASED_MONITOR_EXITING);
-	if (kvm_hlt_in_guest(vmx->vcpu.kvm))
+	if (kvm_hlt_in_guest(&vmx->vcpu))
 		exec_control &= ~CPU_BASED_HLT_EXITING;
 	return exec_control;
 }
@@ -4294,7 +4294,7 @@ static u32 vmx_secondary_exec_control(struct vcpu_vmx *vmx)
 	}
 	if (!enable_unrestricted_guest)
 		exec_control &= ~SECONDARY_EXEC_UNRESTRICTED_GUEST;
-	if (kvm_pause_in_guest(vmx->vcpu.kvm))
+	if (kvm_pause_in_guest(&vmx->vcpu))
 		exec_control &= ~SECONDARY_EXEC_PAUSE_LOOP_EXITING;
 	if (!kvm_vcpu_apicv_active(vcpu))
 		exec_control &= ~(SECONDARY_EXEC_APIC_REGISTER_VIRT |
@@ -4397,7 +4397,7 @@ static void init_vmcs(struct vcpu_vmx *vmx)
 		vmcs_write64(POSTED_INTR_DESC_ADDR, __pa((&vmx->pi_desc)));
 	}
 
-	if (!kvm_pause_in_guest(vmx->vcpu.kvm)) {
+	if (!kvm_pause_in_guest(&vmx->vcpu)) {
 		vmcs_write32(PLE_GAP, ple_gap);
 		vmx->ple_window = ple_window;
 		vmx->ple_window_dirty = true;
@@ -5562,7 +5562,7 @@ static void shrink_ple_window(struct kvm_vcpu *vcpu)
  */
 static int handle_pause(struct kvm_vcpu *vcpu)
 {
-	if (!kvm_pause_in_guest(vcpu->kvm))
+	if (!kvm_pause_in_guest(vcpu))
 		grow_ple_window(vcpu);
 
 	/*
@@ -7059,7 +7059,7 @@ static int vmx_vcpu_create(struct kvm_vcpu *vcpu)
 	vmx_disable_intercept_for_msr(vcpu, MSR_IA32_SYSENTER_CS, MSR_TYPE_RW);
 	vmx_disable_intercept_for_msr(vcpu, MSR_IA32_SYSENTER_ESP, MSR_TYPE_RW);
 	vmx_disable_intercept_for_msr(vcpu, MSR_IA32_SYSENTER_EIP, MSR_TYPE_RW);
-	if (kvm_cstate_in_guest(vcpu->kvm)) {
+	if (kvm_cstate_in_guest(vcpu)) {
 		vmx_disable_intercept_for_msr(vcpu, MSR_CORE_C1_RES, MSR_TYPE_R);
 		vmx_disable_intercept_for_msr(vcpu, MSR_CORE_C3_RESIDENCY, MSR_TYPE_R);
 		vmx_disable_intercept_for_msr(vcpu, MSR_CORE_C6_RESIDENCY, MSR_TYPE_R);
@@ -7597,7 +7597,7 @@ static void vmx_cancel_hv_timer(struct kvm_vcpu *vcpu)
 
 static void vmx_sched_in(struct kvm_vcpu *vcpu, int cpu)
 {
-	if (!kvm_pause_in_guest(vcpu->kvm))
+	if (!kvm_pause_in_guest(vcpu))
 		shrink_ple_window(vcpu);
 }
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 3ac6329e6d43..b419b258ed90 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11355,6 +11355,10 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 #if IS_ENABLED(CONFIG_HYPERV)
 	vcpu->arch.hv_root_tdp = INVALID_PAGE;
 #endif
+	vcpu->arch.mwait_in_guest = vcpu->kvm->arch.mwait_in_guest;
+	vcpu->arch.hlt_in_guest = vcpu->kvm->arch.hlt_in_guest;
+	vcpu->arch.pause_in_guest = vcpu->kvm->arch.pause_in_guest;
+	vcpu->arch.cstate_in_guest = vcpu->kvm->arch.cstate_in_guest;
 
 	r = static_call(kvm_x86_vcpu_create)(vcpu);
 	if (r)
@@ -12539,7 +12543,7 @@ bool kvm_can_do_async_pf(struct kvm_vcpu *vcpu)
 		     vcpu->arch.exception.pending))
 		return false;
 
-	if (kvm_hlt_in_guest(vcpu->kvm) && !kvm_can_deliver_async_pf(vcpu))
+	if (kvm_hlt_in_guest(vcpu) && !kvm_can_deliver_async_pf(vcpu))
 		return false;
 
 	/*
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 588792f00334..a59b73e11726 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -324,24 +324,24 @@ static inline u64 nsec_to_cycles(struct kvm_vcpu *vcpu, u64 nsec)
 	    __rem;						\
 	 })
 
-static inline bool kvm_mwait_in_guest(struct kvm *kvm)
+static inline bool kvm_mwait_in_guest(struct kvm_vcpu *vcpu)
 {
-	return kvm->arch.mwait_in_guest;
+	return vcpu->arch.mwait_in_guest;
 }
 
-static inline bool kvm_hlt_in_guest(struct kvm *kvm)
+static inline bool kvm_hlt_in_guest(struct kvm_vcpu *vcpu)
 {
-	return kvm->arch.hlt_in_guest;
+	return vcpu->arch.hlt_in_guest;
 }
 
-static inline bool kvm_pause_in_guest(struct kvm *kvm)
+static inline bool kvm_pause_in_guest(struct kvm_vcpu *vcpu)
 {
-	return kvm->arch.pause_in_guest;
+	return vcpu->arch.pause_in_guest;
 }
 
-static inline bool kvm_cstate_in_guest(struct kvm *kvm)
+static inline bool kvm_cstate_in_guest(struct kvm_vcpu *vcpu)
 {
-	return kvm->arch.cstate_in_guest;
+	return vcpu->arch.cstate_in_guest;
 }
 
 enum kvm_intr_type {
-- 
2.32.0



* [RFC PATCH v4 3/7] KVM: x86: Reject disabling of MWAIT interception when not allowed
  2022-06-22  0:49 [RFC PATCH v4 0/7] KVM: x86: add per-vCPU exits disable capability Kechen Lu
  2022-06-22  0:49 ` [RFC PATCH v4 1/7] KVM: x86: only allow exits disable before vCPUs created Kechen Lu
  2022-06-22  0:49 ` [RFC PATCH v4 2/7] KVM: x86: Move *_in_guest power management flags to vCPU scope Kechen Lu
@ 2022-06-22  0:49 ` Kechen Lu
  2022-07-20 17:53   ` Sean Christopherson
  2022-06-22  0:49 ` [RFC PATCH v4 4/7] KVM: x86: Let userspace re-enable previously disabled exits Kechen Lu
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: Kechen Lu @ 2022-06-22  0:49 UTC (permalink / raw)
  To: kvm, pbonzini
  Cc: seanjc, chao.gao, vkuznets, somduttar, kechenl, linux-kernel

From: Sean Christopherson <seanjc@google.com>

Reject KVM_CAP_X86_DISABLE_EXITS if userspace attempts to disable MWAIT
exits and KVM previously reported (via KVM_CHECK_EXTENSION) that MWAIT is
not allowed in guest, e.g. because it's not supported or the CPU doesn't
have an always-running APIC timer.

Fixes: 4d5422cea3b6 ("KVM: X86: Provide a capability to disable MWAIT intercepts")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Co-developed-by: Kechen Lu <kechenl@nvidia.com>
Suggested-by: Chao Gao <chao.gao@intel.com>
---
 arch/x86/kvm/x86.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b419b258ed90..6ec01362a7d8 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4199,6 +4199,16 @@ static inline bool kvm_can_mwait_in_guest(void)
 		boot_cpu_has(X86_FEATURE_ARAT);
 }
 
+static u64 kvm_get_allowed_disable_exits(void)
+{
+	u64 r = KVM_X86_DISABLE_VALID_EXITS;
+
+	if (!kvm_can_mwait_in_guest())
+		r &= ~KVM_X86_DISABLE_EXITS_MWAIT;
+
+	return r;
+}
+
 static int kvm_ioctl_get_supported_hv_cpuid(struct kvm_vcpu *vcpu,
 					    struct kvm_cpuid2 __user *cpuid_arg)
 {
@@ -4318,10 +4328,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 		r = KVM_CLOCK_VALID_FLAGS;
 		break;
 	case KVM_CAP_X86_DISABLE_EXITS:
-		r |=  KVM_X86_DISABLE_EXITS_HLT | KVM_X86_DISABLE_EXITS_PAUSE |
-		      KVM_X86_DISABLE_EXITS_CSTATE;
-		if(kvm_can_mwait_in_guest())
-			r |= KVM_X86_DISABLE_EXITS_MWAIT;
+		r |= kvm_get_allowed_disable_exits();
 		break;
 	case KVM_CAP_X86_SMM:
 		/* SMBASE is usually relocated above 1M on modern chipsets,
@@ -6003,15 +6010,14 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		break;
 	case KVM_CAP_X86_DISABLE_EXITS:
 		r = -EINVAL;
-		if (cap->args[0] & ~KVM_X86_DISABLE_VALID_EXITS)
+		if (cap->args[0] & ~kvm_get_allowed_disable_exits())
 			break;
 
 		mutex_lock(&kvm->lock);
 		if (kvm->created_vcpus)
 			goto disable_exits_unlock;
 
-		if ((cap->args[0] & KVM_X86_DISABLE_EXITS_MWAIT) &&
-			kvm_can_mwait_in_guest())
+		if (cap->args[0] & KVM_X86_DISABLE_EXITS_MWAIT)
 			kvm->arch.mwait_in_guest = true;
 		if (cap->args[0] & KVM_X86_DISABLE_EXITS_HLT)
 			kvm->arch.hlt_in_guest = true;
-- 
2.32.0



* [RFC PATCH v4 4/7] KVM: x86: Let userspace re-enable previously disabled exits
  2022-06-22  0:49 [RFC PATCH v4 0/7] KVM: x86: add per-vCPU exits disable capability Kechen Lu
                   ` (2 preceding siblings ...)
  2022-06-22  0:49 ` [RFC PATCH v4 3/7] KVM: x86: Reject disabling of MWAIT interception when not allowed Kechen Lu
@ 2022-06-22  0:49 ` Kechen Lu
  2022-06-22  0:49 ` [RFC PATCH v4 5/7] KVM: x86: add vCPU scoped toggling for " Kechen Lu
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Kechen Lu @ 2022-06-22  0:49 UTC (permalink / raw)
  To: kvm, pbonzini
  Cc: seanjc, chao.gao, vkuznets, somduttar, kechenl, linux-kernel

From: Sean Christopherson <seanjc@google.com>

Add an OVERRIDE flag to KVM_CAP_X86_DISABLE_EXITS to allow userspace to
re-enable exits and/or override previous settings.  There's no real use
case for the per-VM ioctl, but a future per-vCPU variant wants to let
userspace toggle interception while the vCPU is running; add the OVERRIDE
functionality now to provide consistency between the per-VM and
per-vCPU variants.
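
For illustration (not part of this patch), a userspace sketch of the
intended semantics; vm_fd is assumed to be a VM fd with no vCPUs
created yet. With OVERRIDE set, KVM applies the mask verbatim, so a
clear bit re-enables the corresponding exit:

	struct kvm_enable_cap cap = {
		.cap = KVM_CAP_X86_DISABLE_EXITS,
		.args[0] = KVM_X86_DISABLE_EXITS_HLT |
			   KVM_X86_DISABLE_EXITS_PAUSE,
	};

	/* Disable HLT and PAUSE exits. */
	ioctl(vm_fd, KVM_ENABLE_CAP, &cap);

	/* Re-enable PAUSE exits while keeping HLT exits disabled. */
	cap.args[0] = KVM_X86_DISABLE_EXITS_HLT |
		      KVM_X86_DISABLE_EXITS_OVERRIDE;
	ioctl(vm_fd, KVM_ENABLE_CAP, &cap);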

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 Documentation/virt/kvm/api.rst |  5 +++++
 arch/x86/kvm/x86.c             | 32 ++++++++++++++++++++++++--------
 include/uapi/linux/kvm.h       |  4 +++-
 3 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index d0d8749591a8..89e13b6783b5 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6941,6 +6941,7 @@ Valid bits in args[0] are::
   #define KVM_X86_DISABLE_EXITS_HLT              (1 << 1)
   #define KVM_X86_DISABLE_EXITS_PAUSE            (1 << 2)
   #define KVM_X86_DISABLE_EXITS_CSTATE           (1 << 3)
+  #define KVM_X86_DISABLE_EXITS_OVERRIDE         (1ull << 63)
 
 Enabling this capability on a VM provides userspace with a way to no
 longer intercept some instructions for improved latency in some
@@ -6949,6 +6950,10 @@ physical CPUs.  More bits can be added in the future; userspace can
 just pass the KVM_CHECK_EXTENSION result to KVM_ENABLE_CAP to disable
 all such vmexits.
 
+By default, this capability only disables exits.  To re-enable an exit, or to
+override previous settings, userspace can set KVM_X86_DISABLE_EXITS_OVERRIDE,
+in which case KVM will enable/disable according to the mask (a '1' == disable).
+
 Do not enable KVM_FEATURE_PV_UNHALT if you disable HLT exits.
 
 7.14 KVM_CAP_S390_HPAGE_1M
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6ec01362a7d8..fe114e319a89 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5263,6 +5263,28 @@ static int kvm_vcpu_ioctl_device_attr(struct kvm_vcpu *vcpu,
 	return r;
 }
 
+
+#define kvm_ioctl_disable_exits(a, mask)				     \
+({									     \
+	if (!kvm_can_mwait_in_guest())                                       \
+		(mask) &= ~KVM_X86_DISABLE_EXITS_MWAIT;                      \
+	if ((mask) & KVM_X86_DISABLE_EXITS_OVERRIDE) {			     \
+		(a).mwait_in_guest = (mask) & KVM_X86_DISABLE_EXITS_MWAIT;   \
+		(a).hlt_in_guest = (mask) & KVM_X86_DISABLE_EXITS_HLT;	     \
+		(a).pause_in_guest = (mask) & KVM_X86_DISABLE_EXITS_PAUSE;   \
+		(a).cstate_in_guest = (mask) & KVM_X86_DISABLE_EXITS_CSTATE; \
+	} else {							     \
+		if ((mask) & KVM_X86_DISABLE_EXITS_MWAIT)		     \
+			(a).mwait_in_guest = true;			     \
+		if ((mask) & KVM_X86_DISABLE_EXITS_HLT)			     \
+			(a).hlt_in_guest = true;			     \
+		if ((mask) & KVM_X86_DISABLE_EXITS_PAUSE)		     \
+			(a).pause_in_guest = true;			     \
+		if ((mask) & KVM_X86_DISABLE_EXITS_CSTATE)		     \
+			(a).cstate_in_guest = true;			     \
+	}								     \
+})
+
 static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
 				     struct kvm_enable_cap *cap)
 {
@@ -6017,14 +6039,8 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		if (kvm->created_vcpus)
 			goto disable_exits_unlock;
 
-		if (cap->args[0] & KVM_X86_DISABLE_EXITS_MWAIT)
-			kvm->arch.mwait_in_guest = true;
-		if (cap->args[0] & KVM_X86_DISABLE_EXITS_HLT)
-			kvm->arch.hlt_in_guest = true;
-		if (cap->args[0] & KVM_X86_DISABLE_EXITS_PAUSE)
-			kvm->arch.pause_in_guest = true;
-		if (cap->args[0] & KVM_X86_DISABLE_EXITS_CSTATE)
-			kvm->arch.cstate_in_guest = true;
+		kvm_ioctl_disable_exits(kvm->arch, cap->args[0]);
+
 		r = 0;
 disable_exits_unlock:
 		mutex_unlock(&kvm->lock);
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 5088bd9f1922..f2e76e436be5 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -814,10 +814,12 @@ struct kvm_ioeventfd {
 #define KVM_X86_DISABLE_EXITS_HLT            (1 << 1)
 #define KVM_X86_DISABLE_EXITS_PAUSE          (1 << 2)
 #define KVM_X86_DISABLE_EXITS_CSTATE         (1 << 3)
+#define KVM_X86_DISABLE_EXITS_OVERRIDE	     (1ull << 63)
 #define KVM_X86_DISABLE_VALID_EXITS          (KVM_X86_DISABLE_EXITS_MWAIT | \
                                               KVM_X86_DISABLE_EXITS_HLT | \
                                               KVM_X86_DISABLE_EXITS_PAUSE | \
-                                              KVM_X86_DISABLE_EXITS_CSTATE)
+                                              KVM_X86_DISABLE_EXITS_CSTATE | \
+					      KVM_X86_DISABLE_EXITS_OVERRIDE)
 
 /* for KVM_ENABLE_CAP */
 struct kvm_enable_cap {
-- 
2.32.0



* [RFC PATCH v4 5/7] KVM: x86: add vCPU scoped toggling for disabled exits
  2022-06-22  0:49 [RFC PATCH v4 0/7] KVM: x86: add per-vCPU exits disable capability Kechen Lu
                   ` (3 preceding siblings ...)
  2022-06-22  0:49 ` [RFC PATCH v4 4/7] KVM: x86: Let userspace re-enable previously disabled exits Kechen Lu
@ 2022-06-22  0:49 ` Kechen Lu
  2022-07-20 18:41   ` Sean Christopherson
  2022-06-22  0:49 ` [RFC PATCH v4 6/7] KVM: x86: Add a new guest_debug flag forcing exit to userspace Kechen Lu
  2022-06-22  0:49 ` [RFC PATCH v4 7/7] KVM: selftests: Add tests for VM and vCPU cap KVM_CAP_X86_DISABLE_EXITS Kechen Lu
  6 siblings, 1 reply; 18+ messages in thread
From: Kechen Lu @ 2022-06-22  0:49 UTC (permalink / raw)
  To: kvm, pbonzini
  Cc: seanjc, chao.gao, vkuznets, somduttar, kechenl, linux-kernel

Introduce support for a vCPU-scoped ioctl with the
KVM_CAP_X86_DISABLE_EXITS cap, so that exits can be disabled per vCPU
rather than only for the whole guest, enabling finer-grained control.
This patch enables the vCPU-scoped exits-control toggling and aligns
the VM-scoped exits-control behavior with it. Add a new KVM request,
KVM_REQ_DISABLE_EXITS, to guarantee that the VMCS/VMCB is updated
before vCPU entry, especially when toggling the VM-scoped exits.
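
For illustration (not part of this patch), a userspace sketch of the
runtime toggling this enables; vcpu_fd is assumed to be a vCPU fd. The
KVM_REQ_DISABLE_EXITS request guarantees the new intercept
configuration takes effect at the next VM-entry:

	struct kvm_enable_cap cap = {
		.cap = KVM_CAP_X86_DISABLE_EXITS,
		.args[0] = KVM_X86_DISABLE_EXITS_HLT |
			   KVM_X86_DISABLE_EXITS_OVERRIDE,
	};

	/* Disable HLT exits on this vCPU only, while it is running. */
	ioctl(vcpu_fd, KVM_ENABLE_CAP, &cap);

	/* Later, restore HLT interception: OVERRIDE with the bit clear. */
	cap.args[0] = KVM_X86_DISABLE_EXITS_OVERRIDE;
	ioctl(vcpu_fd, KVM_ENABLE_CAP, &cap);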

In use cases such as Windows guests running heavy CPU-bound workloads,
disabling HLT VM-exits can mitigate host scheduler context-switch
overhead. Simply disabling HLT exits on all vCPUs can bring performance
benefits, but if no pCPUs are reserved for host threads, it can lead to
forced preemption, because the host no longer knows when to schedule
other host threads that want to run. With this patch, HLT exits can be
disabled on only a subset of a guest's vCPUs, which keeps the
performance benefits while remaining resilient to a host stressing
workload running at the same time.

In an experiment combining a host stressing workload with heavy
CPU-bound workloads in a Windows guest, it shows good resiliency and a
~3% performance improvement. For example, Passmark running in a Windows
guest with this patch disabling HLT exits on only half of the vCPUs
still shows a 2.4% higher main score versus baseline.

Suggested-by: Sean Christopherson <seanjc@google.com>
Suggested-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Kechen Lu <kechenl@nvidia.com>
---
 Documentation/virt/kvm/api.rst     |  2 +-
 arch/x86/include/asm/kvm-x86-ops.h |  1 +
 arch/x86/include/asm/kvm_host.h    |  3 +++
 arch/x86/kvm/svm/svm.c             | 30 ++++++++++++++++++++++++
 arch/x86/kvm/vmx/vmx.c             | 37 ++++++++++++++++++++++++++++++
 arch/x86/kvm/x86.c                 | 23 +++++++++++++++----
 6 files changed, 91 insertions(+), 5 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 89e13b6783b5..7f614b7d5ad8 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6948,7 +6948,7 @@ longer intercept some instructions for improved latency in some
 workloads, and is suggested when vCPUs are associated to dedicated
 physical CPUs.  More bits can be added in the future; userspace can
 just pass the KVM_CHECK_EXTENSION result to KVM_ENABLE_CAP to disable
-all such vmexits.
+all such vmexits. VM-scoped and vCPU-scoped capabilities are both supported.
 
 By default, this capability only disables exits.  To re-enable an exit, or to
 override previous settings, userspace can set KVM_X86_DISABLE_EXITS_OVERRIDE,
diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index da47f60a4650..c17d417cb3cf 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -128,6 +128,7 @@ KVM_X86_OP(msr_filter_changed)
 KVM_X86_OP(complete_emulated_msr)
 KVM_X86_OP(vcpu_deliver_sipi_vector)
 KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons);
+KVM_X86_OP(update_disabled_exits)
 
 #undef KVM_X86_OP
 #undef KVM_X86_OP_OPTIONAL
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 573a39bf7a84..86baae62af86 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -105,6 +105,7 @@
 	KVM_ARCH_REQ_FLAGS(30, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
 #define KVM_REQ_MMU_FREE_OBSOLETE_ROOTS \
 	KVM_ARCH_REQ_FLAGS(31, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
+#define KVM_REQ_DISABLE_EXITS		KVM_ARCH_REQ(32)
 
 #define CR0_RESERVED_BITS                                               \
 	(~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \
@@ -1584,6 +1585,8 @@ struct kvm_x86_ops {
 	 * Returns vCPU specific APICv inhibit reasons
 	 */
 	unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct kvm_vcpu *vcpu);
+
+	void (*update_disabled_exits)(struct kvm_vcpu *vcpu);
 };
 
 struct kvm_x86_nested_ops {
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index b32987f54ace..7b3d64b3b901 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4589,6 +4589,33 @@ static void svm_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
 	sev_vcpu_deliver_sipi_vector(vcpu, vector);
 }
 
+static void svm_update_disabled_exits(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+	struct vmcb_control_area *control = &svm->vmcb->control;
+
+	if (kvm_hlt_in_guest(vcpu))
+		svm_clr_intercept(svm, INTERCEPT_HLT);
+	else
+		svm_set_intercept(svm, INTERCEPT_HLT);
+
+	if (kvm_mwait_in_guest(vcpu)) {
+		svm_clr_intercept(svm, INTERCEPT_MONITOR);
+		svm_clr_intercept(svm, INTERCEPT_MWAIT);
+	} else {
+		svm_set_intercept(svm, INTERCEPT_MONITOR);
+		svm_set_intercept(svm, INTERCEPT_MWAIT);
+	}
+
+	if (kvm_pause_in_guest(vcpu)) {
+		svm_clr_intercept(svm, INTERCEPT_PAUSE);
+	} else {
+		control->pause_filter_count = pause_filter_count;
+		if (pause_filter_thresh)
+			control->pause_filter_thresh = pause_filter_thresh;
+	}
+}
+
 static void svm_vm_destroy(struct kvm *kvm)
 {
 	avic_vm_destroy(kvm);
@@ -4732,7 +4759,10 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
 	.complete_emulated_msr = svm_complete_emulated_msr,
 
 	.vcpu_deliver_sipi_vector = svm_vcpu_deliver_sipi_vector,
+
 	.vcpu_get_apicv_inhibit_reasons = avic_vcpu_get_apicv_inhibit_reasons,
+
+	.update_disabled_exits = svm_update_disabled_exits,
 };
 
 /*
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index f24c9a357f70..2d000638cc9b 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7716,6 +7716,41 @@ static bool vmx_check_apicv_inhibit_reasons(enum kvm_apicv_inhibit reason)
 	return supported & BIT(reason);
 }
 
+static void vmx_update_disabled_exits(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+	if (kvm_hlt_in_guest(vcpu))
+		exec_controls_clearbit(vmx, CPU_BASED_HLT_EXITING);
+	else
+		exec_controls_setbit(vmx, CPU_BASED_HLT_EXITING);
+
+	if (kvm_mwait_in_guest(vcpu))
+		exec_controls_clearbit(vmx, CPU_BASED_MWAIT_EXITING |
+			CPU_BASED_MONITOR_EXITING);
+	else
+		exec_controls_setbit(vmx, CPU_BASED_MWAIT_EXITING |
+			CPU_BASED_MONITOR_EXITING);
+
+	if (!kvm_pause_in_guest(vcpu)) {
+		vmcs_write32(PLE_GAP, ple_gap);
+		vmx->ple_window = ple_window;
+		vmx->ple_window_dirty = true;
+	}
+
+	if (kvm_cstate_in_guest(vcpu)) {
+		vmx_disable_intercept_for_msr(vcpu, MSR_CORE_C1_RES, MSR_TYPE_R);
+		vmx_disable_intercept_for_msr(vcpu, MSR_CORE_C3_RESIDENCY, MSR_TYPE_R);
+		vmx_disable_intercept_for_msr(vcpu, MSR_CORE_C6_RESIDENCY, MSR_TYPE_R);
+		vmx_disable_intercept_for_msr(vcpu, MSR_CORE_C7_RESIDENCY, MSR_TYPE_R);
+	} else {
+		vmx_enable_intercept_for_msr(vcpu, MSR_CORE_C1_RES, MSR_TYPE_R);
+		vmx_enable_intercept_for_msr(vcpu, MSR_CORE_C3_RESIDENCY, MSR_TYPE_R);
+		vmx_enable_intercept_for_msr(vcpu, MSR_CORE_C6_RESIDENCY, MSR_TYPE_R);
+		vmx_enable_intercept_for_msr(vcpu, MSR_CORE_C7_RESIDENCY, MSR_TYPE_R);
+	}
+}
+
 static struct kvm_x86_ops vmx_x86_ops __initdata = {
 	.name = "kvm_intel",
 
@@ -7849,6 +7884,8 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
 	.complete_emulated_msr = kvm_complete_insn_gp,
 
 	.vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector,
+
+	.update_disabled_exits = vmx_update_disabled_exits,
 };
 
 static unsigned int vmx_handle_intel_pt_intr(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fe114e319a89..6165f0b046ed 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5331,6 +5331,13 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
 		if (vcpu->arch.pv_cpuid.enforce)
 			kvm_update_pv_runtime(vcpu);
 
+		return 0;
+	case KVM_CAP_X86_DISABLE_EXITS:
+		if (cap->args[0] & ~kvm_get_allowed_disable_exits())
+			return -EINVAL;
+
+		kvm_ioctl_disable_exits(vcpu->arch, cap->args[0]);
+		kvm_make_request(KVM_REQ_DISABLE_EXITS, vcpu);
 		return 0;
 	default:
 		return -EINVAL;
@@ -5980,6 +5987,8 @@ int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_event,
 int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 			    struct kvm_enable_cap *cap)
 {
+	struct kvm_vcpu *vcpu;
+	unsigned long i;
 	int r;
 
 	if (cap->flags)
@@ -6036,14 +6045,17 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 			break;
 
 		mutex_lock(&kvm->lock);
-		if (kvm->created_vcpus)
-			goto disable_exits_unlock;
+		if (kvm->created_vcpus) {
+			kvm_for_each_vcpu(i, vcpu, kvm) {
+				kvm_ioctl_disable_exits(vcpu->arch, cap->args[0]);
+				kvm_make_request(KVM_REQ_DISABLE_EXITS, vcpu);
+			}
+		}
+		mutex_unlock(&kvm->lock);
 
 		kvm_ioctl_disable_exits(kvm->arch, cap->args[0]);
 
 		r = 0;
-disable_exits_unlock:
-		mutex_unlock(&kvm->lock);
 		break;
 	case KVM_CAP_MSR_PLATFORM_INFO:
 		kvm->arch.guest_can_read_msr_platform_info = cap->args[0];
@@ -10175,6 +10187,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 
 		if (kvm_check_request(KVM_REQ_UPDATE_CPU_DIRTY_LOGGING, vcpu))
 			static_call(kvm_x86_update_cpu_dirty_logging)(vcpu);
+
+		if (kvm_check_request(KVM_REQ_DISABLE_EXITS, vcpu))
+			static_call(kvm_x86_update_disabled_exits)(vcpu);
 	}
 
 	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win ||
-- 
2.32.0



* [RFC PATCH v4 6/7] KVM: x86: Add a new guest_debug flag forcing exit to userspace
  2022-06-22  0:49 [RFC PATCH v4 0/7] KVM: x86: add per-vCPU exits disable capability Kechen Lu
                   ` (4 preceding siblings ...)
  2022-06-22  0:49 ` [RFC PATCH v4 5/7] KVM: x86: add vCPU scoped toggling for " Kechen Lu
@ 2022-06-22  0:49 ` Kechen Lu
  2022-07-20 17:06   ` Sean Christopherson
  2022-06-22  0:49 ` [RFC PATCH v4 7/7] KVM: selftests: Add tests for VM and vCPU cap KVM_CAP_X86_DISABLE_EXITS Kechen Lu
  6 siblings, 1 reply; 18+ messages in thread
From: Kechen Lu @ 2022-06-22  0:49 UTC (permalink / raw)
  To: kvm, pbonzini
  Cc: seanjc, chao.gao, vkuznets, somduttar, kechenl, linux-kernel

For debug and test purposes, there is a need to explicitly force
instruction-triggered exits to be trapped to userspace. Simply adding a
new flag to the guest_debug interface achieves this.

This patch also fills the userspace-accessible field
vcpu->run->hw.hardware_exit_reason so that userspace can determine the
originally triggered VM-exit.
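
For illustration (not part of this patch), a minimal userspace sketch
of the new flag, mirroring the selftest added later in this series;
vcpu_fd is assumed to be a vCPU fd:

	struct kvm_guest_debug debug;

	memset(&debug, 0, sizeof(debug));
	debug.control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_EXIT_USERSPACE;
	/*
	 * Trapped-instruction exits now return to userspace, with the
	 * original exit reason in vcpu->run->hw.hardware_exit_reason.
	 */
	ioctl(vcpu_fd, KVM_SET_GUEST_DEBUG, &debug);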

Signed-off-by: Kechen Lu <kechenl@nvidia.com>
---
 arch/x86/kvm/svm/svm.c         | 2 ++
 arch/x86/kvm/vmx/vmx.c         | 1 +
 arch/x86/kvm/x86.c             | 2 ++
 include/uapi/linux/kvm.h       | 1 +
 tools/include/uapi/linux/kvm.h | 1 +
 5 files changed, 7 insertions(+)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 7b3d64b3b901..e7ced6c3fbea 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3259,6 +3259,8 @@ int svm_invoke_exit_handler(struct kvm_vcpu *vcpu, u64 exit_code)
 	if (!svm_check_exit_valid(exit_code))
 		return svm_handle_invalid_exit(vcpu, exit_code);
 
+	vcpu->run->hw.hardware_exit_reason = exit_code;
+
 #ifdef CONFIG_RETPOLINE
 	if (exit_code == SVM_EXIT_MSR)
 		return msr_interception(vcpu);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 2d000638cc9b..c32c20c4aa4d 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6151,6 +6151,7 @@ static int __vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
 
 	if (exit_reason.basic >= kvm_vmx_max_exit_handlers)
 		goto unexpected_vmexit;
+	vcpu->run->hw.hardware_exit_reason = exit_reason.basic;
 #ifdef CONFIG_RETPOLINE
 	if (exit_reason.basic == EXIT_REASON_MSR_WRITE)
 		return kvm_emulate_wrmsr(vcpu);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6165f0b046ed..91384a56ae0a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8349,6 +8349,8 @@ int kvm_skip_emulated_instruction(struct kvm_vcpu *vcpu)
 	 */
 	if (unlikely(rflags & X86_EFLAGS_TF))
 		r = kvm_vcpu_do_singlestep(vcpu);
+	r &= !(vcpu->guest_debug & KVM_GUESTDBG_EXIT_USERSPACE);
+
 	return r;
 }
 EXPORT_SYMBOL_GPL(kvm_skip_emulated_instruction);
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index f2e76e436be5..23c335a6a285 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -777,6 +777,7 @@ struct kvm_s390_irq_state {
 
 #define KVM_GUESTDBG_ENABLE		0x00000001
 #define KVM_GUESTDBG_SINGLESTEP		0x00000002
+#define KVM_GUESTDBG_EXIT_USERSPACE	0x00000004
 
 struct kvm_guest_debug {
 	__u32 control;
diff --git a/tools/include/uapi/linux/kvm.h b/tools/include/uapi/linux/kvm.h
index 6a184d260c7f..373b4a2b7fe9 100644
--- a/tools/include/uapi/linux/kvm.h
+++ b/tools/include/uapi/linux/kvm.h
@@ -773,6 +773,7 @@ struct kvm_s390_irq_state {
 
 #define KVM_GUESTDBG_ENABLE		0x00000001
 #define KVM_GUESTDBG_SINGLESTEP		0x00000002
+#define KVM_GUESTDBG_EXIT_USERSPACE	0x00000004
 
 struct kvm_guest_debug {
 	__u32 control;
-- 
2.32.0



* [RFC PATCH v4 7/7] KVM: selftests: Add tests for VM and vCPU cap KVM_CAP_X86_DISABLE_EXITS
  2022-06-22  0:49 [RFC PATCH v4 0/7] KVM: x86: add per-vCPU exits disable capability Kechen Lu
                   ` (5 preceding siblings ...)
  2022-06-22  0:49 ` [RFC PATCH v4 6/7] KVM: x86: Add a new guest_debug flag forcing exit to userspace Kechen Lu
@ 2022-06-22  0:49 ` Kechen Lu
  2022-06-22  6:44   ` Huang, Shaoqin
  6 siblings, 1 reply; 18+ messages in thread
From: Kechen Lu @ 2022-06-22  0:49 UTC (permalink / raw)
  To: kvm, pbonzini
  Cc: seanjc, chao.gao, vkuznets, somduttar, kechenl, linux-kernel

Add tests for the KVM cap KVM_CAP_X86_DISABLE_EXITS, verifying that
overriding flags at both VM and vCPU scope works as expected.

Suggested-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Kechen Lu <kechenl@nvidia.com>
---
 tools/testing/selftests/kvm/.gitignore        |   1 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/include/x86_64/svm_util.h   |   1 +
 .../selftests/kvm/x86_64/disable_exits_test.c | 145 ++++++++++++++++++
 4 files changed, 148 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/x86_64/disable_exits_test.c

diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore
index 4509a3a7eeae..2b50170db9b2 100644
--- a/tools/testing/selftests/kvm/.gitignore
+++ b/tools/testing/selftests/kvm/.gitignore
@@ -15,6 +15,7 @@
 /x86_64/cpuid_test
 /x86_64/cr4_cpuid_sync_test
 /x86_64/debug_regs
+/x86_64/disable_exits_test
 /x86_64/evmcs_test
 /x86_64/emulator_error_test
 /x86_64/fix_hypercall_test
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 22423c871ed6..de11d1f95700 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -115,6 +115,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/xen_shinfo_test
 TEST_GEN_PROGS_x86_64 += x86_64/xen_vmcall_test
 TEST_GEN_PROGS_x86_64 += x86_64/sev_migrate_tests
 TEST_GEN_PROGS_x86_64 += x86_64/amx_test
+TEST_GEN_PROGS_x86_64 += x86_64/disable_exits_test
 TEST_GEN_PROGS_x86_64 += access_tracking_perf_test
 TEST_GEN_PROGS_x86_64 += demand_paging_test
 TEST_GEN_PROGS_x86_64 += dirty_log_test
diff --git a/tools/testing/selftests/kvm/include/x86_64/svm_util.h b/tools/testing/selftests/kvm/include/x86_64/svm_util.h
index a25aabd8f5e7..d8cad1cff578 100644
--- a/tools/testing/selftests/kvm/include/x86_64/svm_util.h
+++ b/tools/testing/selftests/kvm/include/x86_64/svm_util.h
@@ -17,6 +17,7 @@
 #define CPUID_SVM		BIT_ULL(CPUID_SVM_BIT)
 
 #define SVM_EXIT_MSR		0x07c
+#define SVM_EXIT_HLT		0x078
 #define SVM_EXIT_VMMCALL	0x081
 
 struct svm_test_data {
diff --git a/tools/testing/selftests/kvm/x86_64/disable_exits_test.c b/tools/testing/selftests/kvm/x86_64/disable_exits_test.c
new file mode 100644
index 000000000000..2811b07e8885
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/disable_exits_test.c
@@ -0,0 +1,145 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Test per-VM and per-vCPU disable exits cap
+ *
+ */
+
+#define _GNU_SOURCE /* for program_invocation_short_name */
+#include <sys/ioctl.h>
+
+#include "test_util.h"
+#include "kvm_util.h"
+#include "svm_util.h"
+#include "vmx.h"
+#include "processor.h"
+
+#define VCPU_ID_1 0
+#define VCPU_ID_2 1
+
+static void guest_code_exits(void) {
+	asm volatile("sti; hlt; cli");
+}
+
+/* Set debug control for trapped instruction exiting to userspace */
+static void vcpu_set_debug_exit_userspace(struct kvm_vm *vm, int vcpu_id) {
+	struct kvm_guest_debug debug;
+	memset(&debug, 0, sizeof(debug));
+	debug.control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_EXIT_USERSPACE;
+	vcpu_set_guest_debug(vm, vcpu_id, &debug);
+}
+
+static void test_vm_cap_disable_exits(void) {
+	struct kvm_enable_cap cap = {
+		.cap = KVM_CAP_X86_DISABLE_EXITS,
+		.args[0] = KVM_X86_DISABLE_EXITS_HLT|KVM_X86_DISABLE_EXITS_OVERRIDE,
+	};
+	struct kvm_vm *vm;
+	struct kvm_run *run;
+
+	/* Create VM */
+	vm = vm_create_without_vcpus(VM_MODE_DEFAULT, DEFAULT_GUEST_PHY_PAGES);
+
+	/* Test Case #1
+	 * Default without disabling HLT exits in VM scope
+	 */
+	vm_vcpu_add_default(vm, VCPU_ID_1, (void *)guest_code_exits);
+	vcpu_set_debug_exit_userspace(vm, VCPU_ID_1);
+	run = vcpu_state(vm, VCPU_ID_1);
+	vcpu_run(vm, VCPU_ID_1);
+	/* Exit reason should be HLT */
+	if (is_amd_cpu())
+		TEST_ASSERT(run->hw.hardware_exit_reason == SVM_EXIT_HLT,
+			"Got exit_reason other than HLT: 0x%llx\n",
+			run->hw.hardware_exit_reason);
+	else
+		TEST_ASSERT(run->hw.hardware_exit_reason == EXIT_REASON_HLT,
+			"Got exit_reason other than HLT: 0x%llx\n",
+			run->hw.hardware_exit_reason);
+
+	/* Test Case #2
+	 * Disabling HLT exits in VM scope
+	 */
+	vm_vcpu_add_default(vm, VCPU_ID_2, (void *)guest_code_exits);
+	vcpu_set_debug_exit_userspace(vm, VCPU_ID_2);
+	run = vcpu_state(vm, VCPU_ID_2);
+	/* Set VM scoped cap arg
+	 * KVM_X86_DISABLE_EXITS_HLT|KVM_X86_DISABLE_EXITS_OVERRIDE
+	 * after vCPUs creation so requiring override flag
+	 */
+	TEST_ASSERT(!vm_enable_cap(vm, &cap), "Failed to set KVM_CAP_X86_DISABLE_EXITS");
+	vcpu_run(vm, VCPU_ID_2);
+	/* Exit reason should not be HLT, would finish the guest
+	 * running and exit (e.g. SVM_EXIT_SHUTDOWN)
+	 */
+	if (is_amd_cpu())
+		TEST_ASSERT(run->hw.hardware_exit_reason != SVM_EXIT_HLT,
+			"Got exit_reason as HLT: 0x%llx\n",
+			run->hw.hardware_exit_reason);
+	else
+		TEST_ASSERT(run->hw.hardware_exit_reason != EXIT_REASON_HLT,
+			"Got exit_reason as HLT: 0x%llx\n",
+			run->hw.hardware_exit_reason);
+
+	kvm_vm_free(vm);
+}
+
+static void test_vcpu_cap_disable_exits(void) {
+	struct kvm_enable_cap cap = {
+		.cap = KVM_CAP_X86_DISABLE_EXITS,
+		.args[0] = KVM_X86_DISABLE_EXITS_HLT|KVM_X86_DISABLE_EXITS_OVERRIDE,
+	};
+	struct kvm_vm *vm;
+	struct kvm_run *run;
+
+	/* Create VM */
+	vm = vm_create_without_vcpus(VM_MODE_DEFAULT, DEFAULT_GUEST_PHY_PAGES);
+	vm_vcpu_add_default(vm, VCPU_ID_1, (void *)guest_code_exits);
+	vcpu_set_debug_exit_userspace(vm, VCPU_ID_1);
+	vm_vcpu_add_default(vm, VCPU_ID_2, (void *)guest_code_exits);
+	vcpu_set_debug_exit_userspace(vm, VCPU_ID_2);
+	/* Set vCPU 2 scoped cap arg
+	 * KVM_X86_DISABLE_EXITS_HLT|KVM_X86_DISABLE_EXITS_OVERRIDE
+	 */
+	TEST_ASSERT(!vcpu_enable_cap(vm, VCPU_ID_2, &cap), "Failed to set KVM_CAP_X86_DISABLE_EXITS");
+
+	/* Test Case #3
+	 * Default without disabling HLT exits in this vCPU 1
+	 */
+	run = vcpu_state(vm, VCPU_ID_1);
+	vcpu_run(vm, VCPU_ID_1);
+	/* Exit reason should be HLT */
+	if (is_amd_cpu())
+		TEST_ASSERT(run->hw.hardware_exit_reason == SVM_EXIT_HLT,
+			"Got exit_reason other than HLT: 0x%llx\n",
+			run->hw.hardware_exit_reason);
+	else
+		TEST_ASSERT(run->hw.hardware_exit_reason == EXIT_REASON_HLT,
+			"Got exit_reason other than HLT: 0x%llx\n",
+			run->hw.hardware_exit_reason);
+
+	/* Test Case #4
+	 * Disabling HLT exits in vCPU 2
+	 */
+	run = vcpu_state(vm, VCPU_ID_2);
+	vcpu_run(vm, VCPU_ID_2);
+	/* Exit reason should not be HLT, would finish the guest
+	 * running and exit (e.g. SVM_EXIT_SHUTDOWN)
+	 */
+	if (is_amd_cpu())
+		TEST_ASSERT(run->hw.hardware_exit_reason != SVM_EXIT_HLT,
+			"Got exit_reason as HLT: 0x%llx\n",
+			run->hw.hardware_exit_reason);
+	else
+		TEST_ASSERT(run->hw.hardware_exit_reason != EXIT_REASON_HLT,
+			"Got exit_reason as HLT: 0x%llx\n",
+			run->hw.hardware_exit_reason);
+
+	kvm_vm_free(vm);
+}
+
+int main(int argc, char *argv[])
+{
+	test_vm_cap_disable_exits();
+	test_vcpu_cap_disable_exits();
+	return 0;
+}
-- 
2.32.0



* Re: [RFC PATCH v4 7/7] KVM: selftests: Add tests for VM and vCPU cap KVM_CAP_X86_DISABLE_EXITS
  2022-06-22  0:49 ` [RFC PATCH v4 7/7] KVM: selftests: Add tests for VM and vCPU cap KVM_CAP_X86_DISABLE_EXITS Kechen Lu
@ 2022-06-22  6:44   ` Huang, Shaoqin
  2022-06-22 23:30     ` Kechen Lu
  0 siblings, 1 reply; 18+ messages in thread
From: Huang, Shaoqin @ 2022-06-22  6:44 UTC (permalink / raw)
  To: Kechen Lu, kvm, pbonzini
  Cc: seanjc, chao.gao, vkuznets, somduttar, linux-kernel



On 6/22/2022 8:49 AM, Kechen Lu wrote:
> Add tests for the KVM cap KVM_CAP_X86_DISABLE_EXITS, verifying that
> overriding flags at both VM and vCPU scope works as expected.
> 
> Suggested-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Kechen Lu <kechenl@nvidia.com>
> ---
>   tools/testing/selftests/kvm/.gitignore        |   1 +
>   tools/testing/selftests/kvm/Makefile          |   1 +
>   .../selftests/kvm/include/x86_64/svm_util.h   |   1 +
>   .../selftests/kvm/x86_64/disable_exits_test.c | 145 ++++++++++++++++++
>   4 files changed, 148 insertions(+)
>   create mode 100644 tools/testing/selftests/kvm/x86_64/disable_exits_test.c
> 
> diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore
> index 4509a3a7eeae..2b50170db9b2 100644
> --- a/tools/testing/selftests/kvm/.gitignore
> +++ b/tools/testing/selftests/kvm/.gitignore
> @@ -15,6 +15,7 @@
>   /x86_64/cpuid_test
>   /x86_64/cr4_cpuid_sync_test
>   /x86_64/debug_regs
> +/x86_64/disable_exits_test
>   /x86_64/evmcs_test
>   /x86_64/emulator_error_test
>   /x86_64/fix_hypercall_test
> diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
> index 22423c871ed6..de11d1f95700 100644
> --- a/tools/testing/selftests/kvm/Makefile
> +++ b/tools/testing/selftests/kvm/Makefile
> @@ -115,6 +115,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/xen_shinfo_test
>   TEST_GEN_PROGS_x86_64 += x86_64/xen_vmcall_test
>   TEST_GEN_PROGS_x86_64 += x86_64/sev_migrate_tests
>   TEST_GEN_PROGS_x86_64 += x86_64/amx_test
> +TEST_GEN_PROGS_x86_64 += x86_64/disable_exits_test
>   TEST_GEN_PROGS_x86_64 += access_tracking_perf_test
>   TEST_GEN_PROGS_x86_64 += demand_paging_test
>   TEST_GEN_PROGS_x86_64 += dirty_log_test
> diff --git a/tools/testing/selftests/kvm/include/x86_64/svm_util.h b/tools/testing/selftests/kvm/include/x86_64/svm_util.h
> index a25aabd8f5e7..d8cad1cff578 100644
> --- a/tools/testing/selftests/kvm/include/x86_64/svm_util.h
> +++ b/tools/testing/selftests/kvm/include/x86_64/svm_util.h
> @@ -17,6 +17,7 @@
>   #define CPUID_SVM		BIT_ULL(CPUID_SVM_BIT)
>   
>   #define SVM_EXIT_MSR		0x07c
> +#define SVM_EXIT_HLT		0x078

Someone else has already added SVM_EXIT_HLT in kvm/queue, so you may
not need to add it here.

>   #define SVM_EXIT_VMMCALL	0x081
>   
>   struct svm_test_data {
> diff --git a/tools/testing/selftests/kvm/x86_64/disable_exits_test.c b/tools/testing/selftests/kvm/x86_64/disable_exits_test.c
> new file mode 100644
> index 000000000000..2811b07e8885
> --- /dev/null
> +++ b/tools/testing/selftests/kvm/x86_64/disable_exits_test.c
> @@ -0,0 +1,145 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Test per-VM and per-vCPU disable exits cap
> + *
> + */
> +
> +#define _GNU_SOURCE /* for program_invocation_short_name */
> +#include <sys/ioctl.h>
> +
> +#include "test_util.h"
> +#include "kvm_util.h"
> +#include "svm_util.h"
> +#include "vmx.h"
> +#include "processor.h"
> +
> +#define VCPU_ID_1 0
> +#define VCPU_ID_2 1
> +
> +static void guest_code_exits(void) {
> +	asm volatile("sti; hlt; cli");
> +}
> +
> +/* Set debug control for trapped instruction exiting to userspace */
> +static void vcpu_set_debug_exit_userspace(struct kvm_vm *vm, int vcpu_id) {

nit: you should make the code style consistent, please use the format:
function()
{

}

> +	struct kvm_guest_debug debug;
> +	memset(&debug, 0, sizeof(debug));
> +	debug.control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_EXIT_USERSPACE;
> +	vcpu_set_guest_debug(vm, vcpu_id, &debug);
> +}
> +
> +static void test_vm_cap_disable_exits(void) {
> +	struct kvm_enable_cap cap = {
> +		.cap = KVM_CAP_X86_DISABLE_EXITS,
> +		.args[0] = KVM_X86_DISABLE_EXITS_HLT|KVM_X86_DISABLE_EXITS_OVERRIDE,
						    ^
			nit: a space is much more clear here

> +	};
> +	struct kvm_vm *vm;
> +	struct kvm_run *run;
> +
> +	/* Create VM */
> +	vm = vm_create_without_vcpus(VM_MODE_DEFAULT, DEFAULT_GUEST_PHY_PAGES);
> +
> +	/* Test Case #1
> +	 * Default without disabling HLT exits in VM scope
> +	 */
> +	vm_vcpu_add_default(vm, VCPU_ID_1, (void *)guest_code_exits);
> +	vcpu_set_debug_exit_userspace(vm, VCPU_ID_1);
> +	run = vcpu_state(vm, VCPU_ID_1);
> +	vcpu_run(vm, VCPU_ID_1);
> +	/* Exit reason should be HLT */
> +	if (is_amd_cpu())
> +		TEST_ASSERT(run->hw.hardware_exit_reason == SVM_EXIT_HLT,
> +			"Got exit_reason other than HLT: 0x%llx\n",
> +			run->hw.hardware_exit_reason);
> +	else
> +		TEST_ASSERT(run->hw.hardware_exit_reason == EXIT_REASON_HLT,
> +			"Got exit_reason other than HLT: 0x%llx\n",
> +			run->hw.hardware_exit_reason);
> +
> +	/* Test Case #2
> +	 * Disabling HLT exits in VM scope
> +	 */
> +	vm_vcpu_add_default(vm, VCPU_ID_2, (void *)guest_code_exits);
> +	vcpu_set_debug_exit_userspace(vm, VCPU_ID_2);
> +	run = vcpu_state(vm, VCPU_ID_2);

I think you can add more vCPUs here to make sure that, after disabling
HLT exits at VM scope, no vCPU exits due to HLT.

> +	/* Set VM scoped cap arg
> +	 * KVM_X86_DISABLE_EXITS_HLT|KVM_X86_DISABLE_EXITS_OVERRIDE
> +	 * after vCPUs creation so requiring override flag
> +	 */
> +	TEST_ASSERT(!vm_enable_cap(vm, &cap), "Failed to set KVM_CAP_X86_DISABLE_EXITS");
> +	vcpu_run(vm, VCPU_ID_2);
> +	/* Exit reason should not be HLT, would finish the guest
> +	 * running and exit (e.g. SVM_EXIT_SHUTDOWN)
> +	 */
> +	if (is_amd_cpu())
> +		TEST_ASSERT(run->hw.hardware_exit_reason != SVM_EXIT_HLT,
> +			"Got exit_reason as HLT: 0x%llx\n",
> +			run->hw.hardware_exit_reason);
> +	else
> +		TEST_ASSERT(run->hw.hardware_exit_reason != EXIT_REASON_HLT,
> +			"Got exit_reason as HLT: 0x%llx\n",
> +			run->hw.hardware_exit_reason);
> +
> +	kvm_vm_free(vm);
> +}
> +
> +static void test_vcpu_cap_disable_exits(void) {
> +	struct kvm_enable_cap cap = {
> +		.cap = KVM_CAP_X86_DISABLE_EXITS,
> +		.args[0] = KVM_X86_DISABLE_EXITS_HLT|KVM_X86_DISABLE_EXITS_OVERRIDE,
> +	};
> +	struct kvm_vm *vm;
> +	struct kvm_run *run;
> +
> +	/* Create VM */
> +	vm = vm_create_without_vcpus(VM_MODE_DEFAULT, DEFAULT_GUEST_PHY_PAGES);
> +	vm_vcpu_add_default(vm, VCPU_ID_1, (void *)guest_code_exits);
> +	vcpu_set_debug_exit_userspace(vm, VCPU_ID_1);
> +	vm_vcpu_add_default(vm, VCPU_ID_2, (void *)guest_code_exits);
> +	vcpu_set_debug_exit_userspace(vm, VCPU_ID_2);
> +	/* Set vCPU 2 scoped cap arg
> +	 * KVM_X86_DISABLE_EXITS_HLT|KVM_X86_DISABLE_EXITS_OVERRIDE
> +	 */
> +	TEST_ASSERT(!vcpu_enable_cap(vm, VCPU_ID_2, &cap), "Failed to set KVM_CAP_X86_DISABLE_EXITS");
> +
> +	/* Test Case #3
> +	 * Default without disabling HLT exits in this vCPU 1
> +	 */
> +	run = vcpu_state(vm, VCPU_ID_1);
> +	vcpu_run(vm, VCPU_ID_1);
> +	/* Exit reason should be HLT */
> +	if (is_amd_cpu())
> +		TEST_ASSERT(run->hw.hardware_exit_reason == SVM_EXIT_HLT,
> +			"Got exit_reason other than HLT: 0x%llx\n",
> +			run->hw.hardware_exit_reason);
> +	else
> +		TEST_ASSERT(run->hw.hardware_exit_reason == EXIT_REASON_HLT,
> +			"Got exit_reason other than HLT: 0x%llx\n",
> +			run->hw.hardware_exit_reason);
> +
> +	/* Test Case #4
> +	 * Disabling HLT exits in vCPU 2
> +	 */
> +	run = vcpu_state(vm, VCPU_ID_2);
> +	vcpu_run(vm, VCPU_ID_2);
> +	/* Exit reason should not be HLT, would finish the guest
> +	 * running and exit (e.g. SVM_EXIT_SHUTDOWN)
> +	 */
> +	if (is_amd_cpu())
> +		TEST_ASSERT(run->hw.hardware_exit_reason != SVM_EXIT_HLT,
> +			"Got exit_reason as HLT: 0x%llx\n",
> +			run->hw.hardware_exit_reason);
> +	else
> +		TEST_ASSERT(run->hw.hardware_exit_reason != EXIT_REASON_HLT,
> +			"Got exit_reason as HLT: 0x%llx\n",
> +			run->hw.hardware_exit_reason);
> +
> +	kvm_vm_free(vm);
> +}
> +
> +int main(int argc, char *argv[])
> +{
> +	test_vm_cap_disable_exits();
> +	test_vcpu_cap_disable_exits();
> +	return 0;
> +}


* RE: [RFC PATCH v4 7/7] KVM: selftests: Add tests for VM and vCPU cap KVM_CAP_X86_DISABLE_EXITS
  2022-06-22  6:44   ` Huang, Shaoqin
@ 2022-06-22 23:30     ` Kechen Lu
  0 siblings, 0 replies; 18+ messages in thread
From: Kechen Lu @ 2022-06-22 23:30 UTC (permalink / raw)
  To: Huang, Shaoqin, kvm, pbonzini
  Cc: seanjc, chao.gao, vkuznets, Somdutta Roy, linux-kernel



> -----Original Message-----
> From: Huang, Shaoqin <shaoqin.huang@intel.com>
> Sent: Tuesday, June 21, 2022 11:44 PM
> To: Kechen Lu <kechenl@nvidia.com>; kvm@vger.kernel.org;
> pbonzini@redhat.com
> Cc: seanjc@google.com; chao.gao@intel.com; vkuznets@redhat.com;
> Somdutta Roy <somduttar@nvidia.com>; linux-kernel@vger.kernel.org
> Subject: Re: [RFC PATCH v4 7/7] KVM: selftests: Add tests for VM and vCPU
> cap KVM_CAP_X86_DISABLE_EXITS
> 
> On 6/22/2022 8:49 AM, Kechen Lu wrote:
> > Add tests verifying that the KVM_CAP_X86_DISABLE_EXITS overriding
> > flags work as expected in both VM and vCPU scope.
> >
> > Suggested-by: Chao Gao <chao.gao@intel.com>
> > Signed-off-by: Kechen Lu <kechenl@nvidia.com>
> > ---
> >   tools/testing/selftests/kvm/.gitignore        |   1 +
> >   tools/testing/selftests/kvm/Makefile          |   1 +
> >   .../selftests/kvm/include/x86_64/svm_util.h   |   1 +
> >   .../selftests/kvm/x86_64/disable_exits_test.c | 145 ++++++++++++++++++
> >   4 files changed, 148 insertions(+)
> >   create mode 100644 tools/testing/selftests/kvm/x86_64/disable_exits_test.c
> >
> > diff --git a/tools/testing/selftests/kvm/.gitignore
> > b/tools/testing/selftests/kvm/.gitignore
> > index 4509a3a7eeae..2b50170db9b2 100644
> > --- a/tools/testing/selftests/kvm/.gitignore
> > +++ b/tools/testing/selftests/kvm/.gitignore
> > @@ -15,6 +15,7 @@
> >   /x86_64/cpuid_test
> >   /x86_64/cr4_cpuid_sync_test
> >   /x86_64/debug_regs
> > +/x86_64/disable_exits_test
> >   /x86_64/evmcs_test
> >   /x86_64/emulator_error_test
> >   /x86_64/fix_hypercall_test
> > diff --git a/tools/testing/selftests/kvm/Makefile
> > b/tools/testing/selftests/kvm/Makefile
> > index 22423c871ed6..de11d1f95700 100644
> > --- a/tools/testing/selftests/kvm/Makefile
> > +++ b/tools/testing/selftests/kvm/Makefile
> > @@ -115,6 +115,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/xen_shinfo_test
> >   TEST_GEN_PROGS_x86_64 += x86_64/xen_vmcall_test
> >   TEST_GEN_PROGS_x86_64 += x86_64/sev_migrate_tests
> >   TEST_GEN_PROGS_x86_64 += x86_64/amx_test
> > +TEST_GEN_PROGS_x86_64 += x86_64/disable_exits_test
> >   TEST_GEN_PROGS_x86_64 += access_tracking_perf_test
> >   TEST_GEN_PROGS_x86_64 += demand_paging_test
> >   TEST_GEN_PROGS_x86_64 += dirty_log_test
> > diff --git a/tools/testing/selftests/kvm/include/x86_64/svm_util.h b/tools/testing/selftests/kvm/include/x86_64/svm_util.h
> > index a25aabd8f5e7..d8cad1cff578 100644
> > --- a/tools/testing/selftests/kvm/include/x86_64/svm_util.h
> > +++ b/tools/testing/selftests/kvm/include/x86_64/svm_util.h
> > @@ -17,6 +17,7 @@
> >   #define CPUID_SVM           BIT_ULL(CPUID_SVM_BIT)
> >
> >   #define SVM_EXIT_MSR                0x07c
> > +#define SVM_EXIT_HLT         0x078
> 
> Someone else has already added SVM_EXIT_HLT in kvm/queue, so you may
> not need to add it here.
> 

Ack. Thanks!

> >   #define SVM_EXIT_VMMCALL    0x081
> >
> >   struct svm_test_data {
> > diff --git a/tools/testing/selftests/kvm/x86_64/disable_exits_test.c
> > b/tools/testing/selftests/kvm/x86_64/disable_exits_test.c
> > new file mode 100644
> > index 000000000000..2811b07e8885
> > --- /dev/null
> > +++ b/tools/testing/selftests/kvm/x86_64/disable_exits_test.c
> > @@ -0,0 +1,145 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * Test per-VM and per-vCPU disable exits cap
> > + *
> > + */
> > +
> > +#define _GNU_SOURCE /* for program_invocation_short_name */
> > +#include <sys/ioctl.h>
> > +
> > +#include "test_util.h"
> > +#include "kvm_util.h"
> > +#include "svm_util.h"
> > +#include "vmx.h"
> > +#include "processor.h"
> > +
> > +#define VCPU_ID_1 0
> > +#define VCPU_ID_2 1
> > +
> > +static void guest_code_exits(void) {
> > +     asm volatile("sti; hlt; cli");
> > +}
> > +
> > +/* Set debug control for trapped instruction exiting to userspace */
> > +static void vcpu_set_debug_exit_userspace(struct kvm_vm *vm, int vcpu_id) {
> 
> nit: you should make the code style consistent; please use the format:
> function()
> {
> 
> }
> 

Noted.

> > +     struct kvm_guest_debug debug;
> > +     memset(&debug, 0, sizeof(debug));
> > +     debug.control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_EXIT_USERSPACE;
> > +     vcpu_set_guest_debug(vm, vcpu_id, &debug);
> > +}
> > +
> > +static void test_vm_cap_disable_exits(void) {
> > +     struct kvm_enable_cap cap = {
> > +             .cap = KVM_CAP_X86_DISABLE_EXITS,
> > +             .args[0] = KVM_X86_DISABLE_EXITS_HLT|KVM_X86_DISABLE_EXITS_OVERRIDE,
>
> nit: a space around the "|" is much more clear here
> 

Noted.

> > +     };
> > +     struct kvm_vm *vm;
> > +     struct kvm_run *run;
> > +
> > +     /* Create VM */
> > +     vm = vm_create_without_vcpus(VM_MODE_DEFAULT, DEFAULT_GUEST_PHY_PAGES);
> > +
> > +     /* Test Case #1
> > +      * Default without disabling HLT exits in VM scope
> > +      */
> > +     vm_vcpu_add_default(vm, VCPU_ID_1, (void *)guest_code_exits);
> > +     vcpu_set_debug_exit_userspace(vm, VCPU_ID_1);
> > +     run = vcpu_state(vm, VCPU_ID_1);
> > +     vcpu_run(vm, VCPU_ID_1);
> > +     /* Exit reason should be HLT */
> > +     if (is_amd_cpu())
> > +             TEST_ASSERT(run->hw.hardware_exit_reason == SVM_EXIT_HLT,
> > +                     "Got exit_reason other than HLT: 0x%llx\n",
> > +                     run->hw.hardware_exit_reason);
> > +     else
> > +             TEST_ASSERT(run->hw.hardware_exit_reason == EXIT_REASON_HLT,
> > +                     "Got exit_reason other than HLT: 0x%llx\n",
> > +                     run->hw.hardware_exit_reason);
> > +
> > +     /* Test Case #2
> > +      * Disabling HLT exits in VM scope
> > +      */
> > +     vm_vcpu_add_default(vm, VCPU_ID_2, (void *)guest_code_exits);
> > +     vcpu_set_debug_exit_userspace(vm, VCPU_ID_2);
> > +     run = vcpu_state(vm, VCPU_ID_2);
> 
> I think you can add more vCPUs here to make sure that after disabling HLT
> exits in VM scope, no vCPU will exit due to HLT.
> 

Makes sense. Will refine the case design. Thanks.
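
A possible shape for that, as a rough sketch reusing the helpers already in
this test (NR_TEST_VCPUS is a made-up constant, untested):

	for (int i = 0; i < NR_TEST_VCPUS; i++) {
		vm_vcpu_add_default(vm, i, (void *)guest_code_exits);
		vcpu_set_debug_exit_userspace(vm, i);
	}
	/* ...then disable HLT exits at VM scope, run each vCPU, and
	 * assert that none of them exits with SVM_EXIT_HLT or
	 * EXIT_REASON_HLT.
	 */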

BR,
Kechen

> > +     /* Set VM scoped cap arg
> > +      * KVM_X86_DISABLE_EXITS_HLT|KVM_X86_DISABLE_EXITS_OVERRIDE
> > +      * after vCPUs creation so requiring override flag
> > +      */
> > +     TEST_ASSERT(!vm_enable_cap(vm, &cap), "Failed to set KVM_CAP_X86_DISABLE_EXITS");
> > +     vcpu_run(vm, VCPU_ID_2);
> > +     /* Exit reason should not be HLT; the guest would finish
> > +      * running and exit (e.g. SVM_EXIT_SHUTDOWN)
> > +      */
> > +     if (is_amd_cpu())
> > +             TEST_ASSERT(run->hw.hardware_exit_reason != SVM_EXIT_HLT,
> > +                     "Got exit_reason as HLT: 0x%llx\n",
> > +                     run->hw.hardware_exit_reason);
> > +     else
> > +             TEST_ASSERT(run->hw.hardware_exit_reason != EXIT_REASON_HLT,
> > +                     "Got exit_reason as HLT: 0x%llx\n",
> > +                     run->hw.hardware_exit_reason);
> > +
> > +     kvm_vm_free(vm);
> > +}
> > +
> > +static void test_vcpu_cap_disable_exits(void) {
> > +     struct kvm_enable_cap cap = {
> > +             .cap = KVM_CAP_X86_DISABLE_EXITS,
> > +             .args[0] = KVM_X86_DISABLE_EXITS_HLT|KVM_X86_DISABLE_EXITS_OVERRIDE,
> > +     };
> > +     struct kvm_vm *vm;
> > +     struct kvm_run *run;
> > +
> > +     /* Create VM */
> > +     vm = vm_create_without_vcpus(VM_MODE_DEFAULT, DEFAULT_GUEST_PHY_PAGES);
> > +     vm_vcpu_add_default(vm, VCPU_ID_1, (void *)guest_code_exits);
> > +     vcpu_set_debug_exit_userspace(vm, VCPU_ID_1);
> > +     vm_vcpu_add_default(vm, VCPU_ID_2, (void *)guest_code_exits);
> > +     vcpu_set_debug_exit_userspace(vm, VCPU_ID_2);
> > +     /* Set vCPU 2 scoped cap arg
> > +      * KVM_X86_DISABLE_EXITS_HLT|KVM_X86_DISABLE_EXITS_OVERRIDE
> > +      */
> > +     TEST_ASSERT(!vcpu_enable_cap(vm, VCPU_ID_2, &cap), "Failed to set KVM_CAP_X86_DISABLE_EXITS");
> > +
> > +     /* Test Case #3
> > +      * Default without disabling HLT exits in this vCPU 1
> > +      */
> > +     run = vcpu_state(vm, VCPU_ID_1);
> > +     vcpu_run(vm, VCPU_ID_1);
> > +     /* Exit reason should be HLT */
> > +     if (is_amd_cpu())
> > +             TEST_ASSERT(run->hw.hardware_exit_reason == SVM_EXIT_HLT,
> > +                     "Got exit_reason other than HLT: 0x%llx\n",
> > +                     run->hw.hardware_exit_reason);
> > +     else
> > +             TEST_ASSERT(run->hw.hardware_exit_reason == EXIT_REASON_HLT,
> > +                     "Got exit_reason other than HLT: 0x%llx\n",
> > +                     run->hw.hardware_exit_reason);
> > +
> > +     /* Test Case #4
> > +      * Disabling HLT exits in vCPU 2
> > +      */
> > +     run = vcpu_state(vm, VCPU_ID_2);
> > +     vcpu_run(vm, VCPU_ID_2);
> > +     /* Exit reason should not be HLT; the guest would finish
> > +      * running and exit (e.g. SVM_EXIT_SHUTDOWN)
> > +      */
> > +     if (is_amd_cpu())
> > +             TEST_ASSERT(run->hw.hardware_exit_reason != SVM_EXIT_HLT,
> > +                     "Got exit_reason as HLT: 0x%llx\n",
> > +                     run->hw.hardware_exit_reason);
> > +     else
> > +             TEST_ASSERT(run->hw.hardware_exit_reason != EXIT_REASON_HLT,
> > +                     "Got exit_reason as HLT: 0x%llx\n",
> > +                     run->hw.hardware_exit_reason);
> > +
> > +     kvm_vm_free(vm);
> > +}
> > +
> > +int main(int argc, char *argv[])
> > +{
> > +     test_vm_cap_disable_exits();
> > +     test_vcpu_cap_disable_exits();
> > +     return 0;
> > +}

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v4 6/7] KVM: x86: Add a new guest_debug flag forcing exit to userspace
  2022-06-22  0:49 ` [RFC PATCH v4 6/7] KVM: x86: Add a new guest_debug flag forcing exit to userspace Kechen Lu
@ 2022-07-20 17:06   ` Sean Christopherson
  2022-07-20 19:11     ` Kechen Lu
  0 siblings, 1 reply; 18+ messages in thread
From: Sean Christopherson @ 2022-07-20 17:06 UTC (permalink / raw)
  To: Kechen Lu; +Cc: kvm, pbonzini, chao.gao, vkuznets, somduttar, linux-kernel

On Tue, Jun 21, 2022, Kechen Lu wrote:
> For debug and test purposes, there is a need to explicitly allow
> instruction-triggered exits to be trapped to userspace. Simply adding
> a new flag to the guest_debug interface achieves this.
> 
> This patch also fills the userspace-accessible field
> vcpu->run->hw.hardware_exit_reason so userspace can determine the
> originally triggered VM-exit.

This patch belongs in a different series; AFAICT there are no dependencies between
this and allowing per-vCPU disabling of exits.  Allowing userspace to exit on
"every" instruction exit is going to be much more controversial, largely because
it will be difficult for KVM to provide a consistent, robust ABI.  E.g. should
KVM exit to userspace if an intercepted instruction is encountered by the emulator?

TL;DR: drop this patch from the next version.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v4 3/7] KVM: x86: Reject disabling of MWAIT interception when not allowed
  2022-06-22  0:49 ` [RFC PATCH v4 3/7] KVM: x86: Reject disabling of MWAIT interception when not allowed Kechen Lu
@ 2022-07-20 17:53   ` Sean Christopherson
  2022-07-20 18:37     ` Kechen Lu
  0 siblings, 1 reply; 18+ messages in thread
From: Sean Christopherson @ 2022-07-20 17:53 UTC (permalink / raw)
  To: Kechen Lu; +Cc: kvm, pbonzini, chao.gao, vkuznets, somduttar, linux-kernel

On Tue, Jun 21, 2022, Kechen Lu wrote:
> From: Sean Christopherson <seanjc@google.com>
> 
> Reject KVM_CAP_X86_DISABLE_EXITS if userspace attempts to disable MWAIT
> exits and KVM previously reported (via KVM_CHECK_EXTENSION) that MWAIT is
> not allowed in guest, e.g. because it's not supported or the CPU doesn't
> have an always-running APIC timer.
> 
> Fixes: 4d5422cea3b6 ("KVM: X86: Provide a capability to disable MWAIT intercepts")
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> Co-developed-by: Kechen Lu <kechenl@nvidia.com>

Needs your SOB.

> Suggested-by: Chao Gao <chao.gao@intel.com>

For code review feedback of this nature, adding Suggested-by isn't appropriate.
Suggested-by is for when the idea of the patch itself was suggested by someone,
whereas Chao's feedback was a purely mechanical change.

> ---
>  arch/x86/kvm/x86.c | 20 +++++++++++++-------
>  1 file changed, 13 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index b419b258ed90..6ec01362a7d8 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4199,6 +4199,16 @@ static inline bool kvm_can_mwait_in_guest(void)
>  		boot_cpu_has(X86_FEATURE_ARAT);
>  }
>  
> +static u64 kvm_get_allowed_disable_exits(void)
> +{
> +	u64 r = KVM_X86_DISABLE_VALID_EXITS;

In v3 I "voted" to keep the switch to KVM_X86_DISABLE_VALID_EXITS in the next
patch[*], but seeing the result I 100% agree it's better to handle it here since
the "enable" patch previously used KVM_X86_DISABLE_VALID_EXITS.

[*] https://lore.kernel.org/all/Ytg428sleo7uMRQt@google.com

> +
> +	if(!kvm_can_mwait_in_guest())

Space after the "if".
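
I.e. with both of those fixed up, presumably something like this (the tail of
the hunk is truncated above, so the MWAIT-clearing body is a guess, untested):

	static u64 kvm_get_allowed_disable_exits(void)
	{
		u64 r = KVM_X86_DISABLE_VALID_EXITS;

		/* Don't let userspace disable MWAIT exits if MWAIT isn't
		 * allowed in the guest to begin with.
		 */
		if (!kvm_can_mwait_in_guest())
			r &= ~KVM_X86_DISABLE_EXITS_MWAIT;

		return r;
	}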

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [RFC PATCH v4 3/7] KVM: x86: Reject disabling of MWAIT interception when not allowed
  2022-07-20 17:53   ` Sean Christopherson
@ 2022-07-20 18:37     ` Kechen Lu
  0 siblings, 0 replies; 18+ messages in thread
From: Kechen Lu @ 2022-07-20 18:37 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, pbonzini, chao.gao, vkuznets, Somdutta Roy, linux-kernel



> -----Original Message-----
> From: Sean Christopherson <seanjc@google.com>
> Sent: Wednesday, July 20, 2022 10:54 AM
> To: Kechen Lu <kechenl@nvidia.com>
> Cc: kvm@vger.kernel.org; pbonzini@redhat.com; chao.gao@intel.com;
> vkuznets@redhat.com; Somdutta Roy <somduttar@nvidia.com>; linux-
> kernel@vger.kernel.org
> Subject: Re: [RFC PATCH v4 3/7] KVM: x86: Reject disabling of MWAIT
> interception when not allowed
> 
> On Tue, Jun 21, 2022, Kechen Lu wrote:
> > From: Sean Christopherson <seanjc@google.com>
> >
> > Reject KVM_CAP_X86_DISABLE_EXITS if userspace attempts to disable
> > MWAIT exits and KVM previously reported (via KVM_CHECK_EXTENSION) that
> > MWAIT is not allowed in guest, e.g. because it's not supported or the
> > CPU doesn't have an always-running APIC timer.
> >
> > Fixes: 4d5422cea3b6 ("KVM: X86: Provide a capability to disable MWAIT
> > intercepts")
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > Co-developed-by: Kechen Lu <kechenl@nvidia.com>
> 
> Needs your SOB.
>
 
Ack!

> > Suggested-by: Chao Gao <chao.gao@intel.com>
> 
> For code review feedback of this nature, adding Suggested-by isn't
> appropriate.
> Suggested-by is for when the idea of the patch itself was suggested by
> someone, whereas Chao's feedback was a purely mechanical change.
> 

Sure I see.

> > ---
> >  arch/x86/kvm/x86.c | 20 +++++++++++++-------
> >  1 file changed, 13 insertions(+), 7 deletions(-)
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index b419b258ed90..6ec01362a7d8 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -4199,6 +4199,16 @@ static inline bool kvm_can_mwait_in_guest(void)
> >               boot_cpu_has(X86_FEATURE_ARAT);
> > }
> >
> > +static u64 kvm_get_allowed_disable_exits(void)
> > +{
> > +     u64 r = KVM_X86_DISABLE_VALID_EXITS;
> 
> In v3 I "voted" to keep the switch to KVM_X86_DISABLE_VALID_EXITS in the
> next patch[*], but seeing the result I 100% agree it's better to handle it here
> since the "enable" patch previously used KVM_X86_DISABLE_VALID_EXITS.
> 

Yes, I agree, handling here makes sense.

> [*] https://lore.kernel.org/all/Ytg428sleo7uMRQt@google.com
> 
> > +
> > +     if(!kvm_can_mwait_in_guest())
> 
> Space after the "if".

Ack!

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v4 5/7] KVM: x86: add vCPU scoped toggling for disabled exits
  2022-06-22  0:49 ` [RFC PATCH v4 5/7] KVM: x86: add vCPU scoped toggling for " Kechen Lu
@ 2022-07-20 18:41   ` Sean Christopherson
  2022-07-20 19:04     ` Kechen Lu
  0 siblings, 1 reply; 18+ messages in thread
From: Sean Christopherson @ 2022-07-20 18:41 UTC (permalink / raw)
  To: Kechen Lu; +Cc: kvm, pbonzini, chao.gao, vkuznets, somduttar, linux-kernel

On Tue, Jun 21, 2022, Kechen Lu wrote:
> @@ -5980,6 +5987,8 @@ int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_event,
>  int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>  			    struct kvm_enable_cap *cap)
>  {
> +	struct kvm_vcpu *vcpu;
> +	unsigned long i;
>  	int r;
>  
>  	if (cap->flags)
> @@ -6036,14 +6045,17 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>  			break;
>  
>  		mutex_lock(&kvm->lock);
> -		if (kvm->created_vcpus)
> -			goto disable_exits_unlock;
> +		if (kvm->created_vcpus) {

I retract my comment about using a request, I got ahead of myself.

Don't update vCPUs, the whole point of adding the !kvm->created_vcpus check was
to avoid having to update vCPUs when the per-VM behavior changed.

In other words, keep the restriction and drop the request.
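
I.e. roughly back to the previous shape of this hunk (sketch reconstructed
from the '-' lines, untested):

		mutex_lock(&kvm->lock);
		if (kvm->created_vcpus)
			goto disable_exits_unlock;

		kvm_ioctl_disable_exits(kvm->arch, cap->args[0]);
		r = 0;
disable_exits_unlock:
		mutex_unlock(&kvm->lock);
		break;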

> +			kvm_for_each_vcpu(i, vcpu, kvm) {
> +				kvm_ioctl_disable_exits(vcpu->arch, cap->args[0]);
> +				kvm_make_request(KVM_REQ_DISABLE_EXITS, vcpu);
> +			}
> +		}
> +		mutex_unlock(&kvm->lock);
>  
>  		kvm_ioctl_disable_exits(kvm->arch, cap->args[0]);
>  
>  		r = 0;
> -disable_exits_unlock:
> -		mutex_unlock(&kvm->lock);
>  		break;
>  	case KVM_CAP_MSR_PLATFORM_INFO:
>  		kvm->arch.guest_can_read_msr_platform_info = cap->args[0];
> @@ -10175,6 +10187,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  
>  		if (kvm_check_request(KVM_REQ_UPDATE_CPU_DIRTY_LOGGING, vcpu))
>  			static_call(kvm_x86_update_cpu_dirty_logging)(vcpu);
> +
> +		if (kvm_check_request(KVM_REQ_DISABLE_EXITS, vcpu))
> +			static_call(kvm_x86_update_disabled_exits)(vcpu);
>  	}
>  
>  	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win ||
> -- 
> 2.32.0
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [RFC PATCH v4 5/7] KVM: x86: add vCPU scoped toggling for disabled exits
  2022-07-20 18:41   ` Sean Christopherson
@ 2022-07-20 19:04     ` Kechen Lu
  2022-07-20 19:30       ` Sean Christopherson
  0 siblings, 1 reply; 18+ messages in thread
From: Kechen Lu @ 2022-07-20 19:04 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, pbonzini, chao.gao, vkuznets, Somdutta Roy, linux-kernel



> -----Original Message-----
> From: Sean Christopherson <seanjc@google.com>
> Sent: Wednesday, July 20, 2022 11:42 AM
> To: Kechen Lu <kechenl@nvidia.com>
> Cc: kvm@vger.kernel.org; pbonzini@redhat.com; chao.gao@intel.com;
> vkuznets@redhat.com; Somdutta Roy <somduttar@nvidia.com>; linux-
> kernel@vger.kernel.org
> Subject: Re: [RFC PATCH v4 5/7] KVM: x86: add vCPU scoped toggling for
> disabled exits
> 
> On Tue, Jun 21, 2022, Kechen Lu wrote:
> > @@ -5980,6 +5987,8 @@ int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_event,
> >  int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
> >                           struct kvm_enable_cap *cap)
> >  {
> > +     struct kvm_vcpu *vcpu;
> > +     unsigned long i;
> >       int r;
> >
> >       if (cap->flags)
> > @@ -6036,14 +6045,17 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
> >                       break;
> >
> >               mutex_lock(&kvm->lock);
> > -             if (kvm->created_vcpus)
> > -                     goto disable_exits_unlock;
> > +             if (kvm->created_vcpus) {
> 
> I retract my comment about using a request, I got ahead of myself.
> 
> Don't update vCPUs, the whole point of adding the !kvm->created_vcpus
> check was to avoid having to update vCPUs when the per-VM behavior
> changed.
> 
> In other words, keep the restriction and drop the request.
> 

I see. If we keep the restriction here and do not update vCPUs when kvm->created_vcpus is true, the per-VM and per-vCPU assumptions would be different here? Not sure if I understand right:
For per-VM, we assume the per-VM cap can only be enabled before vCPU creation. For per-vCPU cap enabling, we are able to toggle the disabled exits at runtime.

If I understand correctly, this also makes sense though.
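
In userspace terms, if I read it right, the split would look roughly like this
(just a sketch; the flag usage mirrors the patch 7 selftest, and vm_fd/vcpu_fd
are the usual KVM fds):

	struct kvm_enable_cap cap = {
		.cap = KVM_CAP_X86_DISABLE_EXITS,
		.args[0] = KVM_X86_DISABLE_EXITS_HLT,
	};

	/* per-VM: only accepted before any KVM_CREATE_VCPU */
	ioctl(vm_fd, KVM_ENABLE_CAP, &cap);

	/* per-vCPU: may be toggled at runtime, overriding the VM setting */
	cap.args[0] |= KVM_X86_DISABLE_EXITS_OVERRIDE;
	ioctl(vcpu_fd, KVM_ENABLE_CAP, &cap);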

BR,
Kechen

> > +                     kvm_for_each_vcpu(i, vcpu, kvm) {
> > +                             kvm_ioctl_disable_exits(vcpu->arch, cap->args[0]);
> > +                             kvm_make_request(KVM_REQ_DISABLE_EXITS, vcpu);
> > +                     }
> > +             }
> > +             mutex_unlock(&kvm->lock);
> >
> >               kvm_ioctl_disable_exits(kvm->arch, cap->args[0]);
> >
> >               r = 0;
> > -disable_exits_unlock:
> > -             mutex_unlock(&kvm->lock);
> >               break;
> >       case KVM_CAP_MSR_PLATFORM_INFO:
> >               kvm->arch.guest_can_read_msr_platform_info = cap->args[0];
> > @@ -10175,6 +10187,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> >
> >               if (kvm_check_request(KVM_REQ_UPDATE_CPU_DIRTY_LOGGING, vcpu))
> >                       static_call(kvm_x86_update_cpu_dirty_logging)(vcpu);
> > +
> > +             if (kvm_check_request(KVM_REQ_DISABLE_EXITS, vcpu))
> > +                     static_call(kvm_x86_update_disabled_exits)(vcpu);
> >       }
> >
> >       if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win ||
> > --
> > 2.32.0
> >

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [RFC PATCH v4 6/7] KVM: x86: Add a new guest_debug flag forcing exit to userspace
  2022-07-20 17:06   ` Sean Christopherson
@ 2022-07-20 19:11     ` Kechen Lu
  0 siblings, 0 replies; 18+ messages in thread
From: Kechen Lu @ 2022-07-20 19:11 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, pbonzini, chao.gao, vkuznets, Somdutta Roy, linux-kernel



> -----Original Message-----
> From: Sean Christopherson <seanjc@google.com>
> Sent: Wednesday, July 20, 2022 10:06 AM
> To: Kechen Lu <kechenl@nvidia.com>
> Cc: kvm@vger.kernel.org; pbonzini@redhat.com; chao.gao@intel.com;
> vkuznets@redhat.com; Somdutta Roy <somduttar@nvidia.com>; linux-
> kernel@vger.kernel.org
> Subject: Re: [RFC PATCH v4 6/7] KVM: x86: Add a new guest_debug flag
> forcing exit to userspace
> 
> On Tue, Jun 21, 2022, Kechen Lu wrote:
> > For debug and test purposes, there is a need to explicitly allow
> > instruction-triggered exits to be trapped to userspace. Simply adding
> > a new flag to the guest_debug interface achieves this.
> >
> > This patch also fills the userspace-accessible field
> > vcpu->run->hw.hardware_exit_reason so userspace can determine the
> > originally triggered VM-exit.
> 
> This patch belongs in a different series; AFAICT there are no dependencies
> between this and allowing per-vCPU disabling of exits.  Allowing userspace to
> exit on "every" instruction exit is going to be much more controversial,
> largely because it will be difficult for KVM to provide a consistent, robust ABI.
> E.g. should KVM exit to userspace if an intercepted instruction is
> encountered by the emulator?
> 
> TL;DR: drop this patch from the next version.

Ack. I introduced this patch as a prerequisite for patch 7, which implements the selftests for KVM_CAP_X86_DISABLE_EXITS. But yeah, it's not good practice; I will try to think of a better way to implement the disabled-exits testing.

BR,
Kechen

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v4 5/7] KVM: x86: add vCPU scoped toggling for disabled exits
  2022-07-20 19:04     ` Kechen Lu
@ 2022-07-20 19:30       ` Sean Christopherson
  2022-07-20 20:23         ` Kechen Lu
  0 siblings, 1 reply; 18+ messages in thread
From: Sean Christopherson @ 2022-07-20 19:30 UTC (permalink / raw)
  To: Kechen Lu; +Cc: kvm, pbonzini, chao.gao, vkuznets, Somdutta Roy, linux-kernel

On Wed, Jul 20, 2022, Kechen Lu wrote:
> > > @@ -6036,14 +6045,17 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
> > >                       break;
> > >
> > >               mutex_lock(&kvm->lock);
> > > -             if (kvm->created_vcpus)
> > > -                     goto disable_exits_unlock;
> > > +             if (kvm->created_vcpus) {
> > 
> > I retract my comment about using a request, I got ahead of myself.
> > 
> > Don't update vCPUs, the whole point of adding the !kvm->created_vcpus
> > check was to avoid having to update vCPUs when the per-VM behavior
> > changed.
> > 
> > In other words, keep the restriction and drop the request.
> > 
> 
> I see. If we keep the restriction here and do not update vCPUs when
> kvm->created_vcpus is true, the per-VM and per-vCPU assumptions would be
> different here? Not sure if I understand right:
> For per-VM, we assume the per-VM cap can only be enabled before vCPU creation.
> For per-vCPU cap enabling, we are able to toggle the disabled exits at runtime.

Yep.  The main reason being that there's no use case for changing per-VM settings
after vCPUs are created.  I.e. we could lift the restriction in the future if a
use case pops up, but until then, keep things simple.

> If I understand correctly, this also makes sense though.

Paging this all back in...

There are two (sane) options for defining KVM's ABI:

  1) KVM combines the per-VM and per-vCPU settings
  2) The per-vCPU settings override the per-VM settings

This series implements (2).

For (1), KVM would need to recheck the per-VM state during the per-vCPU update,
e.g. instead of simply modifying the per-vCPU flags, the vCPU-scoped handler
for KVM_CAP_X86_DISABLE_EXITS would need to merge the incoming settings with the
existing kvm->arch.xxx_in_guest flags.

I like (2) because it's simpler to implement and document (merging state is always
messy) and is more flexible.  E.g. with (1), the only way to have per-vCPU settings
is for userspace to NOT set the per-VM disables and then set disables on a per-vCPU
basis.  Whereas with (2), userspace can set (or not) the per-VM disables and then
override as needed.
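
Roughly, in pseudocode (field names are illustrative, not necessarily the
actual kvm->arch/vcpu->arch layout):

	/* (1) combine: the vCPU update has to merge in the per-VM state */
	vcpu->arch.hlt_in_guest = kvm->arch.hlt_in_guest ||
				  !!(args & KVM_X86_DISABLE_EXITS_HLT);

	/* (2) override: the vCPU setting replaces whatever was inherited */
	vcpu->arch.hlt_in_guest = !!(args & KVM_X86_DISABLE_EXITS_HLT);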

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [RFC PATCH v4 5/7] KVM: x86: add vCPU scoped toggling for disabled exits
  2022-07-20 19:30       ` Sean Christopherson
@ 2022-07-20 20:23         ` Kechen Lu
  0 siblings, 0 replies; 18+ messages in thread
From: Kechen Lu @ 2022-07-20 20:23 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, pbonzini, chao.gao, vkuznets, Somdutta Roy, linux-kernel



> -----Original Message-----
> From: Sean Christopherson <seanjc@google.com>
> Sent: Wednesday, July 20, 2022 12:30 PM
> To: Kechen Lu <kechenl@nvidia.com>
> Cc: kvm@vger.kernel.org; pbonzini@redhat.com; chao.gao@intel.com;
> vkuznets@redhat.com; Somdutta Roy <somduttar@nvidia.com>; linux-
> kernel@vger.kernel.org
> Subject: Re: [RFC PATCH v4 5/7] KVM: x86: add vCPU scoped toggling for
> disabled exits
> 
> On Wed, Jul 20, 2022, Kechen Lu wrote:
> > > > @@ -6036,14 +6045,17 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
> > > >                       break;
> > > >
> > > >               mutex_lock(&kvm->lock);
> > > > -             if (kvm->created_vcpus)
> > > > -                     goto disable_exits_unlock;
> > > > +             if (kvm->created_vcpus) {
> > >
> > > I retract my comment about using a request, I got ahead of myself.
> > >
> > > Don't update vCPUs, the whole point of adding the
> > > !kvm->created_vcpus check was to avoid having to update vCPUs when
> > > the per-VM behavior changed.
> > >
> > > In other words, keep the restriction and drop the request.
> > >
> >
> > I see. If we keep the restriction here and do not update vCPUs when
> > kvm->created_vcpus is true, the per-VM and per-vCPU assumptions would be
> > different here? Not sure if I understand right:
> > For per-VM, we assume the per-VM cap can only be enabled before vCPU creation.
> > For per-vCPU cap enabling, we are able to toggle the disabled exits at runtime.
> 
> Yep.  The main reason being that there's no use case for changing per-VM
> settings after vCPUs are created.  I.e. we could lift the restriction in the future
> if a use case pops up, but until then, keep things simple.
> 
> > If I understand correctly, this also makes sense though.
> 
> Paging this all back in...
> 
> There are two (sane) options for defining KVM's ABI:
> 
>   1) KVM combines the per-VM and per-vCPU settings
>   2) The per-vCPU settings override the per-VM settings
> 
> This series implements (2).
> 
> For (1), KVM would need to recheck the per-VM state during the per-vCPU
> update, e.g. instead of simply modifying the per-vCPU flags, the vCPU-scoped
> handler for KVM_CAP_X86_DISABLE_EXITS would need to merge the
> incoming settings with the existing kvm->arch.xxx_in_guest flags.
> 
> I like (2) because it's simpler to implement and document (merging state is
> always
> messy) and is more flexible.  E.g. with (1), the only way to have per-vCPU
> settings is for userspace to NOT set the per-VM disables and then set
> disables on a per-vCPU basis.  Whereas with (2), userspace can set (or not)
> the per-VM disables and then override as needed.

Gotcha. Makes sense to me. Thanks for the elaboration!

BR,
Kechen

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2022-07-20 20:23 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-22  0:49 [RFC PATCH v4 0/7] KVM: x86: add per-vCPU exits disable capability Kechen Lu
2022-06-22  0:49 ` [RFC PATCH v4 1/7] KVM: x86: only allow exits disable before vCPUs created Kechen Lu
2022-06-22  0:49 ` [RFC PATCH v4 2/7] KVM: x86: Move *_in_guest power management flags to vCPU scope Kechen Lu
2022-06-22  0:49 ` [RFC PATCH v4 3/7] KVM: x86: Reject disabling of MWAIT interception when not allowed Kechen Lu
2022-07-20 17:53   ` Sean Christopherson
2022-07-20 18:37     ` Kechen Lu
2022-06-22  0:49 ` [RFC PATCH v4 4/7] KVM: x86: Let userspace re-enable previously disabled exits Kechen Lu
2022-06-22  0:49 ` [RFC PATCH v4 5/7] KVM: x86: add vCPU scoped toggling for " Kechen Lu
2022-07-20 18:41   ` Sean Christopherson
2022-07-20 19:04     ` Kechen Lu
2022-07-20 19:30       ` Sean Christopherson
2022-07-20 20:23         ` Kechen Lu
2022-06-22  0:49 ` [RFC PATCH v4 6/7] KVM: x86: Add a new guest_debug flag forcing exit to userspace Kechen Lu
2022-07-20 17:06   ` Sean Christopherson
2022-07-20 19:11     ` Kechen Lu
2022-06-22  0:49 ` [RFC PATCH v4 7/7] KVM: selftests: Add tests for VM and vCPU cap KVM_CAP_X86_DISABLE_EXITS Kechen Lu
2022-06-22  6:44   ` Huang, Shaoqin
2022-06-22 23:30     ` Kechen Lu
