* [PATCHv5 0/8] Virtual NMI feature
@ 2022-10-27  8:38 Santosh Shukla
  2022-10-27  8:38 ` [PATCHv5 1/8] x86/cpu: Add CPUID feature bit for VNMI Santosh Shukla
                   ` (8 more replies)
  0 siblings, 9 replies; 16+ messages in thread
From: Santosh Shukla @ 2022-10-27  8:38 UTC (permalink / raw)
  To: pbonzini, seanjc
  Cc: kvm, jmattson, joro, linux-kernel, mail, mlevitsk,
	thomas.lendacky, vkuznets

VNMI Spec is at [1].

Change History:

v5 (6.1-rc2)
01,02,06 - Renamed s/X86_FEATURE_V_NMI/X86_FEATURE_AMD_VNMI (Jim Mattson)

v4 (v6.0-rc3):
https://lore.kernel.org/all/20220829100850.1474-1-santosh.shukla@amd.com/

v3 (rebased on eb555cb5b794f):
https://lore.kernel.org/all/20220810061226.1286-1-santosh.shukla@amd.com/

v2:
https://lore.kernel.org/lkml/20220709134230.2397-7-santosh.shukla@amd.com/T/#m4bf8a131748688fed00ab0fefdcac209a169e202

v1:
https://lore.kernel.org/all/20220602142620.3196-1-santosh.shukla@amd.com/

Description:
Currently, NMIs are delivered to the guest using the Event Injection
mechanism [2]. The Event Injection mechanism does not block the delivery
of subsequent NMIs, so the hypervisor needs to track NMI delivery and
its completion (by intercepting IRET) before sending a new NMI.

Virtual NMI (VNMI) allows the hypervisor to inject an NMI into the guest
without using the Event Injection mechanism, so the hypervisor no longer
needs to track the guest's NMI state or intercept IRET. To achieve that,
the VNMI feature provides virtualized NMI and NMI_MASK capability bits in
VMCB intr_control -
V_NMI(11) - Indicates whether a virtual NMI is pending in the guest.
V_NMI_MASK(12) - Indicates whether a virtual NMI is masked in the guest.
V_NMI_ENABLE(26) - Enables the NMI virtualization feature for the guest.

When the hypervisor wants to inject an NMI, it sets the V_NMI bit. The
processor then clears V_NMI and sets V_NMI_MASK, which indicates that the
guest is handling the NMI. Once the guest has handled the NMI, the
processor clears V_NMI_MASK on successful completion of the IRET
instruction, or if a VMEXIT occurs while the virtual NMI is being
delivered.

If NMI virtualization is enabled and the NMI_INTERCEPT bit is unset,
then the hardware will exit with the #INVALID exit reason.

To enable the VNMI capability, the hypervisor needs to set the
V_NMI_ENABLE bit to 1.

The presence of this feature is indicated via the CPUID function
0x8000000A_EDX[25].

Testing -
* Used qemu's `inject_nmi` command for testing.
* Tested with and without AVIC.
* Tested with kvm-unit-tests.
* Tested with vGIF enabled and disabled.
* Tested in nested environments:
  - L1+L2 using vnmi
  - L1 using vnmi and L2 not


Thanks,
Santosh
[1] https://www.amd.com/en/support/tech-docs/amd64-architecture-programmers-manual-volumes-1-5
(Ch-15.21.10 - NMI Virtualization)

[2] https://www.amd.com/en/support/tech-docs/amd64-architecture-programmers-manual-volumes-1-5
(Ch-15.20 - Event Injection)


Santosh Shukla (8):
  x86/cpu: Add CPUID feature bit for VNMI
  KVM: SVM: Add VNMI bit definition
  KVM: SVM: Add VNMI support in get/set_nmi_mask
  KVM: SVM: Report NMI not allowed when Guest busy handling VNMI
  KVM: SVM: Add VNMI support in inject_nmi
  KVM: nSVM: implement nested VNMI
  KVM: nSVM: emulate VMEXIT_INVALID case for nested VNMI
  KVM: SVM: Enable VNMI feature

 arch/x86/include/asm/cpufeatures.h |  1 +
 arch/x86/include/asm/svm.h         |  7 +++
 arch/x86/kvm/svm/nested.c          | 32 ++++++++++++++
 arch/x86/kvm/svm/svm.c             | 44 ++++++++++++++++++-
 arch/x86/kvm/svm/svm.h             | 68 ++++++++++++++++++++++++++++++
 5 files changed, 151 insertions(+), 1 deletion(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCHv5 1/8] x86/cpu: Add CPUID feature bit for VNMI
  2022-10-27  8:38 [PATCHv5 0/8] Virtual NMI feature Santosh Shukla
@ 2022-10-27  8:38 ` Santosh Shukla
  2022-10-27  8:38 ` [PATCHv5 2/8] KVM: SVM: Add VNMI bit definition Santosh Shukla
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Santosh Shukla @ 2022-10-27  8:38 UTC (permalink / raw)
  To: pbonzini, seanjc
  Cc: kvm, jmattson, joro, linux-kernel, mail, mlevitsk,
	thomas.lendacky, vkuznets

The VNMI feature allows the hypervisor to inject an NMI into the guest
without using the Event Injection mechanism. The benefit of VNMI over
Event Injection is that it does not require tracking the guest's NMI
state or intercepting IRET for NMI completion. VNMI achieves that by
exposing 3 capability bits in VMCB intr_cntrl which help virtualize NMI
injection and NMI masking.

The presence of this feature is indicated via the CPUID function
0x8000000A_EDX[25].

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
---
v5:
 - Rename s/X86_FEATURE_V_NMI/X86_FEATURE_AMD_VNMI

 arch/x86/include/asm/cpufeatures.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index b71f4f2ecdd5..23bd69848d27 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -356,6 +356,7 @@
 #define X86_FEATURE_VGIF		(15*32+16) /* Virtual GIF */
 #define X86_FEATURE_X2AVIC		(15*32+18) /* Virtual x2apic */
 #define X86_FEATURE_V_SPEC_CTRL		(15*32+20) /* Virtual SPEC_CTRL */
+#define X86_FEATURE_AMD_VNMI		(15*32+25) /* Virtual NMI */
 #define X86_FEATURE_SVME_ADDR_CHK	(15*32+28) /* "" SVME addr check */
 
 /* Intel-defined CPU features, CPUID level 0x00000007:0 (ECX), word 16 */
-- 
2.25.1



* [PATCHv5 2/8] KVM: SVM: Add VNMI bit definition
  2022-10-27  8:38 [PATCHv5 0/8] Virtual NMI feature Santosh Shukla
  2022-10-27  8:38 ` [PATCHv5 1/8] x86/cpu: Add CPUID feature bit for VNMI Santosh Shukla
@ 2022-10-27  8:38 ` Santosh Shukla
  2022-10-27  8:38 ` [PATCHv5 3/8] KVM: SVM: Add VNMI support in get/set_nmi_mask Santosh Shukla
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Santosh Shukla @ 2022-10-27  8:38 UTC (permalink / raw)
  To: pbonzini, seanjc
  Cc: kvm, jmattson, joro, linux-kernel, mail, mlevitsk,
	thomas.lendacky, vkuznets

VNMI exposes 3 capability bits (V_NMI, V_NMI_MASK, and V_NMI_ENABLE) to
virtualize NMI and NMI_MASK. These capability bits are part of
VMCB::intr_ctrl -
V_NMI(11) - Indicates whether a virtual NMI is pending in the guest.
V_NMI_MASK(12) - Indicates whether a virtual NMI is masked in the guest.
V_NMI_ENABLE(26) - Enables the NMI virtualization feature for the guest.

When the hypervisor wants to inject an NMI, it sets the V_NMI bit. The
processor then clears V_NMI and sets V_NMI_MASK, which indicates that
the guest is handling the NMI. Once the guest has handled the NMI, the
processor clears V_NMI_MASK on successful completion of the IRET
instruction, or if a VMEXIT occurs while the virtual NMI is being
delivered.

To enable the VNMI capability, the hypervisor needs to set the
V_NMI_ENABLE bit to 1.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
---
v5:
 - Renamed s/X86_FEATURE_V_NMI/X86_FEATURE_AMD_VNMI

 arch/x86/include/asm/svm.h | 7 +++++++
 arch/x86/kvm/svm/svm.c     | 6 ++++++
 2 files changed, 13 insertions(+)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 0361626841bc..73bf97e04fe3 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -198,6 +198,13 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
 #define X2APIC_MODE_SHIFT 30
 #define X2APIC_MODE_MASK (1 << X2APIC_MODE_SHIFT)
 
+#define V_NMI_PENDING_SHIFT 11
+#define V_NMI_PENDING (1 << V_NMI_PENDING_SHIFT)
+#define V_NMI_MASK_SHIFT 12
+#define V_NMI_MASK (1 << V_NMI_MASK_SHIFT)
+#define V_NMI_ENABLE_SHIFT 26
+#define V_NMI_ENABLE (1 << V_NMI_ENABLE_SHIFT)
+
 #define LBR_CTL_ENABLE_MASK BIT_ULL(0)
 #define VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK BIT_ULL(1)
 
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 58f0077d9357..b759f650fe2d 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -229,6 +229,8 @@ module_param(dump_invalid_vmcb, bool, 0644);
 bool intercept_smi = true;
 module_param(intercept_smi, bool, 0444);
 
+bool vnmi = true;
+module_param(vnmi, bool, 0444);
 
 static bool svm_gp_erratum_intercept = true;
 
@@ -5054,6 +5056,10 @@ static __init int svm_hardware_setup(void)
 		svm_x86_ops.vcpu_get_apicv_inhibit_reasons = NULL;
 	}
 
+	vnmi = vnmi && boot_cpu_has(X86_FEATURE_AMD_VNMI);
+	if (vnmi)
+		pr_info("Virtual NMI enabled\n");
+
 	if (vls) {
 		if (!npt_enabled ||
 		    !boot_cpu_has(X86_FEATURE_V_VMSAVE_VMLOAD) ||
-- 
2.25.1



* [PATCHv5 3/8] KVM: SVM: Add VNMI support in get/set_nmi_mask
  2022-10-27  8:38 [PATCHv5 0/8] Virtual NMI feature Santosh Shukla
  2022-10-27  8:38 ` [PATCHv5 1/8] x86/cpu: Add CPUID feature bit for VNMI Santosh Shukla
  2022-10-27  8:38 ` [PATCHv5 2/8] KVM: SVM: Add VNMI bit definition Santosh Shukla
@ 2022-10-27  8:38 ` Santosh Shukla
  2022-10-27  8:38 ` [PATCHv5 4/8] KVM: SVM: Report NMI not allowed when Guest busy handling VNMI Santosh Shukla
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Santosh Shukla @ 2022-10-27  8:38 UTC (permalink / raw)
  To: pbonzini, seanjc
  Cc: kvm, jmattson, joro, linux-kernel, mail, mlevitsk,
	thomas.lendacky, vkuznets

VMCB intr_ctrl bit 12 (V_NMI_MASK) is set by the processor when it is
handling an NMI in the guest and is cleared after the NMI is handled.
Treat V_NMI_MASK as read-only in the hypervisor, except for the SMM
case, where the hypervisor needs to set V_NMI_MASK before entering SMM
and clear it after leaving SMM.

Add an API (get_vnmi_vmcb) to return the correct vmcb for L1 or L2, and
APIs (set_vnmi_mask/clear_vnmi_mask) to set and clear the mask.

Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
---
v3:
* Handle SMM case
* Added set/clear_vnmi_mask() API.

v2:
- Added get_vnmi_vmcb API to return vmcb for l1 and l2.
- Use get_vnmi_vmcb to get correct vmcb in func -
  is_vnmi_enabled/_mask_set()
- removed vnmi check from is_vnmi_enabled() func.

 arch/x86/kvm/svm/svm.c | 17 +++++++++++++-
 arch/x86/kvm/svm/svm.h | 52 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 68 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index b759f650fe2d..cb2eb5581d71 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3620,13 +3620,28 @@ static int svm_nmi_allowed(struct kvm_vcpu *vcpu, bool for_injection)
 
 static bool svm_get_nmi_mask(struct kvm_vcpu *vcpu)
 {
-	return !!(vcpu->arch.hflags & HF_NMI_MASK);
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	if (is_vnmi_enabled(svm))
+		return is_vnmi_mask_set(svm);
+	else
+		return !!(vcpu->arch.hflags & HF_NMI_MASK);
 }
 
 static void svm_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 
+	if (is_vnmi_enabled(svm)) {
+		if (is_smm(vcpu)) {
+			if (masked)
+				set_vnmi_mask(svm);
+			else
+				clear_vnmi_mask(svm);
+		}
+		return;
+	}
+
 	if (masked) {
 		vcpu->arch.hflags |= HF_NMI_MASK;
 		if (!sev_es_guest(vcpu->kvm))
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 6a7686bf6900..cc98ec7bd119 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -35,6 +35,7 @@ extern u32 msrpm_offsets[MSRPM_OFFSETS] __read_mostly;
 extern bool npt_enabled;
 extern int vgif;
 extern bool intercept_smi;
+extern bool vnmi;
 
 enum avic_modes {
 	AVIC_MODE_NONE = 0,
@@ -532,6 +533,57 @@ static inline bool is_x2apic_msrpm_offset(u32 offset)
 	       (msr < (APIC_BASE_MSR + 0x100));
 }
 
+static inline struct vmcb *get_vnmi_vmcb(struct vcpu_svm *svm)
+{
+	if (!vnmi)
+		return NULL;
+
+	if (is_guest_mode(&svm->vcpu))
+		return svm->nested.vmcb02.ptr;
+	else
+		return svm->vmcb01.ptr;
+}
+
+static inline bool is_vnmi_enabled(struct vcpu_svm *svm)
+{
+	struct vmcb *vmcb = get_vnmi_vmcb(svm);
+
+	if (vmcb)
+		return !!(vmcb->control.int_ctl & V_NMI_ENABLE);
+	else
+		return false;
+}
+
+static inline bool is_vnmi_mask_set(struct vcpu_svm *svm)
+{
+	struct vmcb *vmcb = get_vnmi_vmcb(svm);
+
+	if (vmcb)
+		return !!(vmcb->control.int_ctl & V_NMI_MASK);
+	else
+		return false;
+}
+
+static inline void set_vnmi_mask(struct vcpu_svm *svm)
+{
+	struct vmcb *vmcb = get_vnmi_vmcb(svm);
+
+	if (vmcb)
+		vmcb->control.int_ctl |= V_NMI_MASK;
+	else
+		svm->vcpu.arch.hflags |= HF_GIF_MASK;
+}
+
+static inline void clear_vnmi_mask(struct vcpu_svm *svm)
+{
+	struct vmcb *vmcb = get_vnmi_vmcb(svm);
+
+	if (vmcb)
+		vmcb->control.int_ctl &= ~V_NMI_MASK;
+	else
+		svm->vcpu.arch.hflags &= ~HF_GIF_MASK;
+}
+
 /* svm.c */
 #define MSR_INVALID				0xffffffffU
 
-- 
2.25.1



* [PATCHv5 4/8] KVM: SVM: Report NMI not allowed when Guest busy handling VNMI
  2022-10-27  8:38 [PATCHv5 0/8] Virtual NMI feature Santosh Shukla
                   ` (2 preceding siblings ...)
  2022-10-27  8:38 ` [PATCHv5 3/8] KVM: SVM: Add VNMI support in get/set_nmi_mask Santosh Shukla
@ 2022-10-27  8:38 ` Santosh Shukla
  2022-10-27  8:38 ` [PATCHv5 5/8] KVM: SVM: Add VNMI support in inject_nmi Santosh Shukla
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Santosh Shukla @ 2022-10-27  8:38 UTC (permalink / raw)
  To: pbonzini, seanjc
  Cc: kvm, jmattson, joro, linux-kernel, mail, mlevitsk,
	thomas.lendacky, vkuznets

In the VNMI case, report that NMI injection is not allowed when
V_NMI_PENDING is set, which means a virtual NMI is already pending for
the guest to process while the guest is busy handling the current
virtual NMI. The guest will first finish handling the current virtual
NMI and then take the pending one without a vmexit.

Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
---
v3:
- Added is_vnmi_pending_set API so to check the vnmi pending state.
- Replaced is_vnmi_mask_set check with is_vnmi_pending_set.

v2:
- Moved vnmi check after is_guest_mode() in func _nmi_blocked().
- Removed the is_vnmi_mask_set check from _enable_nmi_window(),
  as it was a redundant check.

 arch/x86/kvm/svm/svm.c |  6 ++++++
 arch/x86/kvm/svm/svm.h | 10 ++++++++++
 2 files changed, 16 insertions(+)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index cb2eb5581d71..38fa51e0b70f 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3597,6 +3597,9 @@ bool svm_nmi_blocked(struct kvm_vcpu *vcpu)
 	if (is_guest_mode(vcpu) && nested_exit_on_nmi(svm))
 		return false;
 
+	if (is_vnmi_enabled(svm) && is_vnmi_pending_set(svm))
+		return true;
+
 	ret = (vmcb->control.int_state & SVM_INTERRUPT_SHADOW_MASK) ||
 	      (vcpu->arch.hflags & HF_NMI_MASK);
 
@@ -3733,6 +3736,9 @@ static void svm_enable_nmi_window(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 
+	if (is_vnmi_enabled(svm) && is_vnmi_pending_set(svm))
+		return;
+
 	if ((vcpu->arch.hflags & (HF_NMI_MASK | HF_IRET_MASK)) == HF_NMI_MASK)
 		return; /* IRET will cause a vm exit */
 
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index cc98ec7bd119..7857a89d0ec8 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -584,6 +584,16 @@ static inline void clear_vnmi_mask(struct vcpu_svm *svm)
 		svm->vcpu.arch.hflags &= ~HF_GIF_MASK;
 }
 
+static inline bool is_vnmi_pending_set(struct vcpu_svm *svm)
+{
+	struct vmcb *vmcb = get_vnmi_vmcb(svm);
+
+	if (vmcb)
+		return !!(vmcb->control.int_ctl & V_NMI_PENDING);
+	else
+		return false;
+}
+
 /* svm.c */
 #define MSR_INVALID				0xffffffffU
 
-- 
2.25.1



* [PATCHv5 5/8] KVM: SVM: Add VNMI support in inject_nmi
  2022-10-27  8:38 [PATCHv5 0/8] Virtual NMI feature Santosh Shukla
                   ` (3 preceding siblings ...)
  2022-10-27  8:38 ` [PATCHv5 4/8] KVM: SVM: Report NMI not allowed when Guest busy handling VNMI Santosh Shukla
@ 2022-10-27  8:38 ` Santosh Shukla
  2022-10-27  8:38 ` [PATCHv5 6/8] KVM: nSVM: implement nested VNMI Santosh Shukla
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Santosh Shukla @ 2022-10-27  8:38 UTC (permalink / raw)
  To: pbonzini, seanjc
  Cc: kvm, jmattson, joro, linux-kernel, mail, mlevitsk,
	thomas.lendacky, vkuznets

Inject the NMI by setting V_NMI in the VMCB interrupt control. The
processor will clear V_NMI to acknowledge that processing has started
and will keep V_NMI_MASK set until it is done processing the NMI event.

Also handle the nmi_l1_to_l2 case: when it is true, the NMI to be
injected originally comes from L1's VMCB12 EVENTINJ field, so add a
check for that case.

Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
---
v4:
- Added `nmi_l1_to_l2` check, discussion thread
https://lore.kernel.org/all/bf9e8a9c-5172-b61a-be6e-87a919442fbd@maciej.szmigiero.name/

v3:
- Removed WARN_ON check.

v2:
- Added WARN_ON check for vnmi pending.
- use `get_vnmi_vmcb` to get correct vmcb so to inject vnmi.

 arch/x86/kvm/svm/svm.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 38fa51e0b70f..cf05a41779f3 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3478,7 +3478,14 @@ static void pre_svm_run(struct kvm_vcpu *vcpu)
 static void svm_inject_nmi(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
+	struct vmcb *vmcb = NULL;
 
+	if (is_vnmi_enabled(svm) && !svm->nmi_l1_to_l2) {
+		vmcb = get_vnmi_vmcb(svm);
+		vmcb->control.int_ctl |= V_NMI_PENDING;
+		++vcpu->stat.nmi_injections;
+		return;
+	}
 	svm->vmcb->control.event_inj = SVM_EVTINJ_VALID | SVM_EVTINJ_TYPE_NMI;
 
 	if (svm->nmi_l1_to_l2)
-- 
2.25.1



* [PATCHv5 6/8] KVM: nSVM: implement nested VNMI
  2022-10-27  8:38 [PATCHv5 0/8] Virtual NMI feature Santosh Shukla
                   ` (4 preceding siblings ...)
  2022-10-27  8:38 ` [PATCHv5 5/8] KVM: SVM: Add VNMI support in inject_nmi Santosh Shukla
@ 2022-10-27  8:38 ` Santosh Shukla
  2022-10-27  8:38 ` [PATCHv5 7/8] KVM: nSVM: emulate VMEXIT_INVALID case for " Santosh Shukla
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Santosh Shukla @ 2022-10-27  8:38 UTC (permalink / raw)
  To: pbonzini, seanjc
  Cc: kvm, jmattson, joro, linux-kernel, mail, mlevitsk,
	thomas.lendacky, vkuznets

Supporting nested VNMI requires saving and restoring the VNMI bits
during nested entry and exit.
When both L1 and L2 use VNMI, copy the VNMI bits from vmcb12 to vmcb02
during entry and vice versa during exit.
When L1 uses VNMI and L2 does not, copy the VNMI bits from vmcb01 to
vmcb02 during entry and vice versa during exit.

Tested with kvm-unit-tests and nested guest scenarios.

Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
---
v5:
 - Renamed s/X86_FEATURE_V_NMI/X86_FEATURE_AMD_VNMI
v3:
- Added code to save and restore VNMI bit for L1 using vnmi and L2
  doesn't for the entry and exit case.

v2:
- Save the V_NMI_PENDING/MASK state in vmcb12 on vmexit.
- Tested nested environment - L1 injecting vnmi to L2.
- Tested vnmi in kvm-unit-test.


 arch/x86/kvm/svm/nested.c | 27 +++++++++++++++++++++++++++
 arch/x86/kvm/svm/svm.c    |  5 +++++
 arch/x86/kvm/svm/svm.h    |  6 ++++++
 3 files changed, 38 insertions(+)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 4c620999d230..d1c22c89427e 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -640,6 +640,11 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
 	else
 		int_ctl_vmcb01_bits |= (V_GIF_MASK | V_GIF_ENABLE_MASK);
 
+	if (nested_vnmi_enabled(svm))
+		int_ctl_vmcb12_bits |= (V_NMI_PENDING | V_NMI_ENABLE | V_NMI_MASK);
+	else
+		int_ctl_vmcb01_bits |= (V_NMI_PENDING | V_NMI_ENABLE | V_NMI_MASK);
+
 	/* Copied from vmcb01.  msrpm_base can be overwritten later.  */
 	vmcb02->control.nested_ctl = vmcb01->control.nested_ctl;
 	vmcb02->control.iopm_base_pa = vmcb01->control.iopm_base_pa;
@@ -992,6 +997,28 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
 	vmcb12->control.event_inj         = svm->nested.ctl.event_inj;
 	vmcb12->control.event_inj_err     = svm->nested.ctl.event_inj_err;
 
+	if (nested_vnmi_enabled(svm)) {
+		if (vmcb02->control.int_ctl & V_NMI_MASK)
+			vmcb12->control.int_ctl |= V_NMI_MASK;
+		else
+			vmcb12->control.int_ctl &= ~V_NMI_MASK;
+
+		if (vmcb02->control.int_ctl & V_NMI_PENDING)
+			vmcb12->control.int_ctl |= V_NMI_PENDING;
+		else
+			vmcb12->control.int_ctl &= ~V_NMI_PENDING;
+	} else {
+		if (vmcb02->control.int_ctl & V_NMI_MASK)
+			vmcb01->control.int_ctl |= V_NMI_MASK;
+		else
+			vmcb01->control.int_ctl &= ~V_NMI_MASK;
+
+		if (vmcb02->control.int_ctl & V_NMI_PENDING)
+			vmcb01->control.int_ctl |= V_NMI_PENDING;
+		else
+			vmcb01->control.int_ctl &= ~V_NMI_PENDING;
+	}
+
 	if (!kvm_pause_in_guest(vcpu->kvm)) {
 		vmcb01->control.pause_filter_count = vmcb02->control.pause_filter_count;
 		vmcb_mark_dirty(vmcb01, VMCB_INTERCEPTS);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index cf05a41779f3..d8edd40e48e1 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4216,6 +4216,8 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 
 	svm->vgif_enabled = vgif && guest_cpuid_has(vcpu, X86_FEATURE_VGIF);
 
+	svm->vnmi_enabled = vnmi && guest_cpuid_has(vcpu, X86_FEATURE_AMD_VNMI);
+
 	svm_recalc_instruction_intercepts(vcpu, svm);
 
 	/* For sev guests, the memory encryption bit is not reserved in CR3.  */
@@ -4958,6 +4960,9 @@ static __init void svm_set_cpu_caps(void)
 		if (vgif)
 			kvm_cpu_cap_set(X86_FEATURE_VGIF);
 
+		if (vnmi)
+			kvm_cpu_cap_set(X86_FEATURE_AMD_VNMI);
+
 		/* Nested VM can receive #VMEXIT instead of triggering #GP */
 		kvm_cpu_cap_set(X86_FEATURE_SVME_ADDR_CHK);
 	}
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 7857a89d0ec8..4d6245ef5b6c 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -253,6 +253,7 @@ struct vcpu_svm {
 	bool pause_filter_enabled         : 1;
 	bool pause_threshold_enabled      : 1;
 	bool vgif_enabled                 : 1;
+	bool vnmi_enabled                 : 1;
 
 	u32 ldr_reg;
 	u32 dfr_reg;
@@ -533,6 +534,11 @@ static inline bool is_x2apic_msrpm_offset(u32 offset)
 	       (msr < (APIC_BASE_MSR + 0x100));
 }
 
+static inline bool nested_vnmi_enabled(struct vcpu_svm *svm)
+{
+	return svm->vnmi_enabled && (svm->nested.ctl.int_ctl & V_NMI_ENABLE);
+}
+
 static inline struct vmcb *get_vnmi_vmcb(struct vcpu_svm *svm)
 {
 	if (!vnmi)
-- 
2.25.1



* [PATCHv5 7/8] KVM: nSVM: emulate VMEXIT_INVALID case for nested VNMI
  2022-10-27  8:38 [PATCHv5 0/8] Virtual NMI feature Santosh Shukla
                   ` (5 preceding siblings ...)
  2022-10-27  8:38 ` [PATCHv5 6/8] KVM: nSVM: implement nested VNMI Santosh Shukla
@ 2022-10-27  8:38 ` Santosh Shukla
  2022-10-27  8:38 ` [PATCHv5 8/8] KVM: SVM: Enable VNMI feature Santosh Shukla
  2022-11-14  8:02 ` [PATCHv5 0/8] Virtual NMI feature Santosh Shukla
  8 siblings, 0 replies; 16+ messages in thread
From: Santosh Shukla @ 2022-10-27  8:38 UTC (permalink / raw)
  To: pbonzini, seanjc
  Cc: kvm, jmattson, joro, linux-kernel, mail, mlevitsk,
	thomas.lendacky, vkuznets

If NMI virtualization is enabled and the NMI_INTERCEPT bit is unset,
then the next vm entry will exit with the #INVALID exit reason.

In order to emulate the above VMEXIT(#INVALID) scenario for the nested
environment, extend the checks in __nested_vmcb_check_controls to cover
the V_NMI_ENABLE and NMI_INTERCEPT bits.

Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
---
 arch/x86/kvm/svm/nested.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index d1c22c89427e..e94ec6dee661 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -274,6 +274,11 @@ static bool __nested_vmcb_check_controls(struct kvm_vcpu *vcpu,
 	if (CC(!nested_svm_check_tlb_ctl(vcpu, control->tlb_ctl)))
 		return false;
 
+	if (CC((control->int_ctl & V_NMI_ENABLE) &&
+		!vmcb12_is_intercept(control, INTERCEPT_NMI))) {
+		return false;
+	}
+
 	return true;
 }
 
-- 
2.25.1



* [PATCHv5 8/8] KVM: SVM: Enable VNMI feature
  2022-10-27  8:38 [PATCHv5 0/8] Virtual NMI feature Santosh Shukla
                   ` (6 preceding siblings ...)
  2022-10-27  8:38 ` [PATCHv5 7/8] KVM: nSVM: emulate VMEXIT_INVALID case for " Santosh Shukla
@ 2022-10-27  8:38 ` Santosh Shukla
  2022-11-14  8:02 ` [PATCHv5 0/8] Virtual NMI feature Santosh Shukla
  8 siblings, 0 replies; 16+ messages in thread
From: Santosh Shukla @ 2022-10-27  8:38 UTC (permalink / raw)
  To: pbonzini, seanjc
  Cc: kvm, jmattson, joro, linux-kernel, mail, mlevitsk,
	thomas.lendacky, vkuznets

Enable the NMI virtualization (V_NMI_ENABLE) in the VMCB interrupt
control when the vnmi module parameter is set.

Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
---
 arch/x86/kvm/svm/svm.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index d8edd40e48e1..9bfee8c4e95d 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1307,6 +1307,9 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
 	if (kvm_vcpu_apicv_active(vcpu))
 		avic_init_vmcb(svm, vmcb);
 
+	if (vnmi)
+		svm->vmcb->control.int_ctl |= V_NMI_ENABLE;
+
 	if (vgif) {
 		svm_clr_intercept(svm, INTERCEPT_STGI);
 		svm_clr_intercept(svm, INTERCEPT_CLGI);
-- 
2.25.1



* Re: [PATCHv5 0/8] Virtual NMI feature
  2022-10-27  8:38 [PATCHv5 0/8] Virtual NMI feature Santosh Shukla
                   ` (7 preceding siblings ...)
  2022-10-27  8:38 ` [PATCHv5 8/8] KVM: SVM: Enable VNMI feature Santosh Shukla
@ 2022-11-14  8:02 ` Santosh Shukla
  2022-11-14 14:31   ` Maxim Levitsky
  8 siblings, 1 reply; 16+ messages in thread
From: Santosh Shukla @ 2022-11-14  8:02 UTC (permalink / raw)
  To: pbonzini, seanjc, jmattson
  Cc: kvm, joro, linux-kernel, mail, mlevitsk, thomas.lendacky, vkuznets



On 10/27/2022 2:08 PM, Santosh Shukla wrote:
> VNMI Spec is at [1].
> 
> Change History:
> 
> v5 (6.1-rc2)
> 01,02,06 - Renamed s/X86_FEATURE_V_NMI/X86_FEATURE_AMD_VNMI (Jim Mattson)
>

Gentle reminder.

Thanks,
Santosh



* Re: [PATCHv5 0/8] Virtual NMI feature
  2022-11-14  8:02 ` [PATCHv5 0/8] Virtual NMI feature Santosh Shukla
@ 2022-11-14 14:31   ` Maxim Levitsky
  2022-11-16  5:40     ` Santosh Shukla
  0 siblings, 1 reply; 16+ messages in thread
From: Maxim Levitsky @ 2022-11-14 14:31 UTC (permalink / raw)
  To: Santosh Shukla, pbonzini, seanjc, jmattson
  Cc: kvm, joro, linux-kernel, mail, thomas.lendacky, vkuznets

On Mon, 2022-11-14 at 13:32 +0530, Santosh Shukla wrote:
> 
> 
> On 10/27/2022 2:08 PM, Santosh Shukla wrote:
> > VNMI Spec is at [1].
> > 
> > Change History:
> > 
> > v5 (6.1-rc2)
> > 01,02,06 - Renamed s/X86_FEATURE_V_NMI/X86_FEATURE_AMD_VNMI (Jim Mattson)
> > 
> 
> Gentle reminder.
> 
> Thanks,
> Santosh
> 

I started reviewing it today and I think there are still a few issues,
and the biggest one is that if an NMI arrives while a vNMI injection
is pending, the current code just drops such an NMI.

We had a discussion about this, like forcing an immediate vm exit
in this case and such, but I have a simpler idea:

In this case we can just open the NMI window in the good old way
by intercepting IRET, STGI, and/or RSM (which is intercepted anyway),

and only if we already *just* intercepted IRET, only then drop
the new NMI instead of single stepping over it, based on the reasoning
that it is a 3rd NMI (one is almost done being serviced (its IRET is
executing), one is pending injection, and we want to inject another one).

Does this sound good to you? It won't work for SEV-ES as it looks
like it doesn't intercept IRET, but it might be a reasonable tradeoff
for SEV-ES guests to accept that we can't inject an NMI if one is
already pending injection.

Best regards,
	Maxim Levitsky



* Re: [PATCHv5 0/8] Virtual NMI feature
  2022-11-14 14:31   ` Maxim Levitsky
@ 2022-11-16  5:40     ` Santosh Shukla
  2022-11-16  9:21       ` Maxim Levitsky
  0 siblings, 1 reply; 16+ messages in thread
From: Santosh Shukla @ 2022-11-16  5:40 UTC (permalink / raw)
  To: Maxim Levitsky, pbonzini, seanjc, jmattson
  Cc: kvm, joro, linux-kernel, mail, thomas.lendacky, vkuznets, santosh.shukla

Hi Maxim,

On 11/14/2022 8:01 PM, Maxim Levitsky wrote:
> On Mon, 2022-11-14 at 13:32 +0530, Santosh Shukla wrote:
>>
>>
>> On 10/27/2022 2:08 PM, Santosh Shukla wrote:
>>> VNMI Spec is at [1].
>>>
>>> Change History:
>>>
>>> v5 (6.1-rc2)
>>> 01,02,06 - Renamed s/X86_FEATURE_V_NMI/X86_FEATURE_AMD_VNMI (Jim Mattson)
>>>
>>
>> Gentle reminder.
>>
>> Thanks,
>> Santosh
>>
> 
> I started reviewing it today and I think there are still few issues,
> and the biggest one is that if a NMI arrives while vNMI injection
> is pending, current code just drops such NMI.
> 
> We had a discussion about this, like forcing immeditate vm exit

I believe we discussed the above case in [1], i.e. HW can handle a
second (pending) virtual NMI while the guest is processing the first
virtual NMI without a vmexit.
Is it the same scenario you are mentioning, or a different one?

[1] https://lore.kernel.org/lkml/1782cdbb-8274-8c3d-fa98-29147f1e5d1e@amd.com/

Thanks,
Santosh

> in this case and such but I have a simplier idea:
> 
> In this case we can just open the NMI window in the good old way
> by intercepting IRET, STGI, and or RSM (which is intercepted anyway),
> 
> and only if we already *just* intercepted IRET, only then just drop 
> the new NMI instead of single stepping over it based on reasoning that
> its 3rd NMI (one is almost done the servicing (its IRET is executing),
> one is pending injection, and we want to inject another one.
> 
> Does this sound good to you? It won't work for SEV-ES as it looks
> like it doesn't intercept IRET, but it might be a reasonable tradeof
> for SEV-ES guests to accept that we can't inject a NMI if one is
> already pending injection.
> 
> Best regards,
> 	Maxim Levitsky
> 



* Re: [PATCHv5 0/8] Virtual NMI feature
  2022-11-16  5:40     ` Santosh Shukla
@ 2022-11-16  9:21       ` Maxim Levitsky
  2022-11-16 15:44         ` Santosh Shukla
  0 siblings, 1 reply; 16+ messages in thread
From: Maxim Levitsky @ 2022-11-16  9:21 UTC (permalink / raw)
  To: Santosh Shukla, pbonzini, seanjc, jmattson
  Cc: kvm, joro, linux-kernel, mail, thomas.lendacky, vkuznets

On Wed, 2022-11-16 at 11:10 +0530, Santosh Shukla wrote:
> Hi Maxim,
> 
> On 11/14/2022 8:01 PM, Maxim Levitsky wrote:
> > On Mon, 2022-11-14 at 13:32 +0530, Santosh Shukla wrote:
> > > 
> > > 
> > > On 10/27/2022 2:08 PM, Santosh Shukla wrote:
> > > > VNMI Spec is at [1].
> > > > 
> > > > Change History:
> > > > 
> > > > v5 (6.1-rc2)
> > > > 01,02,06 - Renamed s/X86_FEATURE_V_NMI/X86_FEATURE_AMD_VNMI (Jim Mattson)
> > > > 
> > > 
> > > Gentle reminder.
> > > 
> > > Thanks,
> > > Santosh
> > > 
> > 
> > I started reviewing it today and I think there are still few issues,
> > and the biggest one is that if a NMI arrives while vNMI injection
> > is pending, current code just drops such NMI.
> > 
> > We had a discussion about this, like forcing immeditate vm exit
> 
> I believe, We discussed above case in [1] i.e.. HW can handle
> the second (/pending)virtual NMI while the guest processing first virtual NMI w/o vmexit.
> is it same scenario or different one that you are mentioning?
> 
> [1] https://lore.kernel.org/lkml/1782cdbb-8274-8c3d-fa98-29147f1e5d1e@amd.com/

You misunderstood the issue.

Hardware can handle the case when an NMI is in service (that is, V_NMI_MASK is set) and another one is injected
(V_NMI_PENDING can be set),

but it is not possible to handle the case when an NMI is already injected (V_NMI_PENDING set)
and KVM wants to inject another one before the first one has gone into service (that is, V_NMI_MASK is not set
yet).

Also, the same can happen when NMIs are blocked in SMM: since V_NMI_MASK is set despite no NMI being in service,
we will be able to inject only one NMI by setting V_NMI_PENDING.

I think I was able to solve all these issues, and today I will post a modified version of your patch series,
which should cover all these cases and includes some nice refactoring as well.


Best regards,
	Maxim Levitsky


> 
> Thanks,
> Santosh
> 
> > in this case and such, but I have a simpler idea:
> > 
> > In this case we can just open the NMI window in the good old way
> > by intercepting IRET, STGI, and/or RSM (which is intercepted anyway),
> > 
> > and only if we have already *just* intercepted IRET, only then drop
> > the new NMI instead of single stepping over it, based on the reasoning that
> > it's the 3rd NMI (one is almost done being serviced (its IRET is executing),
> > one is pending injection, and we want to inject another one).
> > 
> > Does this sound good to you? It won't work for SEV-ES as it looks
> > like it doesn't intercept IRET, but it might be a reasonable tradeoff
> > for SEV-ES guests to accept that we can't inject an NMI if one is
> > already pending injection.
> > 
> > Best regards,
> >         Maxim Levitsky
> > 
> 



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCHv5 0/8] Virtual NMI feature
  2022-11-16  9:21       ` Maxim Levitsky
@ 2022-11-16 15:44         ` Santosh Shukla
  2022-11-16 15:55           ` Tom Lendacky
  0 siblings, 1 reply; 16+ messages in thread
From: Santosh Shukla @ 2022-11-16 15:44 UTC (permalink / raw)
  To: Maxim Levitsky, pbonzini, seanjc, jmattson
  Cc: kvm, joro, linux-kernel, mail, thomas.lendacky, vkuznets

Hello Maxim,

On 11/16/2022 2:51 PM, Maxim Levitsky wrote:
> On Wed, 2022-11-16 at 11:10 +0530, Santosh Shukla wrote:
>> Hi Maxim,
>>
>> On 11/14/2022 8:01 PM, Maxim Levitsky wrote:
>>> On Mon, 2022-11-14 at 13:32 +0530, Santosh Shukla wrote:
>>>>
>>>>
>>>> On 10/27/2022 2:08 PM, Santosh Shukla wrote:
>>>>> VNMI Spec is at [1].
>>>>>
>>>>> Change History:
>>>>>
>>>>> v5 (6.1-rc2)
>>>>> 01,02,06 - Renamed s/X86_FEATURE_V_NMI/X86_FEATURE_AMD_VNMI (Jim Mattson)
>>>>>
>>>>
>>>> Gentle reminder.
>>>>
>>>> Thanks,
>>>> Santosh
>>>>
>>>
>>> I started reviewing it today and I think there are still a few issues,
>>> and the biggest one is that if an NMI arrives while vNMI injection
>>> is pending, the current code just drops such an NMI.
>>>
>>> We had a discussion about this, like forcing an immediate vm exit
>>
>> I believe we discussed the above case in [1], i.e. HW can handle
>> the second (pending) virtual NMI while the guest is processing the first virtual NMI w/o a vmexit.
>> Is it the same scenario or a different one that you are mentioning?
>>
>> [1] https://lore.kernel.org/lkml/1782cdbb-8274-8c3d-fa98-29147f1e5d1e@amd.com/
>> 
> You misunderstood the issue.
> 
> Hardware can handle the case when an NMI is in service (that is, V_NMI_MASK is set) and another one is injected 
> (V_NMI_PENDING can be set),
> 
> but it is not possible to handle the case when an NMI is already injected (V_NMI_PENDING set)
> and KVM wants to inject another one before the first one has gone into service (that is, V_NMI_MASK is not set
> yet).
> 

In this case, HW will collapse the NMI.

Note that the HW will take the pending NMI at the boundary of the IRET instruction: it will check
V_NMI_PENDING and, if it is set, the HW will *take* the NMI,
clearing the V_NMI_PENDING bit and setting V_NMI_MASK, w/o a VMEXIT!
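The transitions described here can be sketched as a toy state machine. This is purely illustrative: the struct and function names below are hypothetical, not the real VMCB layout, hardware behavior, or KVM code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the two VMCB int_ctl bits discussed in this thread. */
struct vnmi_state {
	bool v_nmi_pending;	/* V_NMI_PENDING: virtual NMI awaiting delivery */
	bool v_nmi_mask;	/* V_NMI_MASK: guest is currently handling an NMI */
};

/* Hypervisor posts an NMI by setting V_NMI_PENDING.  If one is already
 * pending, the new NMI is collapsed into it, as on bare metal. */
static bool vnmi_post(struct vnmi_state *s)
{
	if (s->v_nmi_pending)
		return false;	/* collapsed */
	s->v_nmi_pending = true;
	return true;
}

/* Processor takes the pending NMI into service: PENDING -> MASK. */
static void vnmi_take(struct vnmi_state *s)
{
	if (s->v_nmi_pending) {
		s->v_nmi_pending = false;
		s->v_nmi_mask = true;
	}
}

/* At the IRET boundary the processor unmasks and, if another NMI is
 * pending, immediately takes it -- all without a VMEXIT. */
static void vnmi_iret(struct vnmi_state *s)
{
	s->v_nmi_mask = false;
	vnmi_take(s);
}
```

In this model, posting while V_NMI_PENDING is already set collapses the NMI, and vnmi_iret() shows why the re-check at the IRET boundary delivers the second NMI without a VMEXIT.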


> Also, the same can happen when NMIs are blocked in SMM: since V_NMI_MASK is set despite no NMI being in service,
> we will be able to inject only one NMI by setting V_NMI_PENDING.
>

Ditto, HW will collapse the NMI.

Thanks,
Santosh
 
> I think I was able to solve all these issues, and today I will post a modified version of your patch series,
> which should cover all these cases and includes some nice refactoring as well.
> 
> 
> Best regards,
> 	Maxim Levitsky
> 
> 
>>
>> Thanks,
>> Santosh
>>
>>> in this case and such, but I have a simpler idea:
>>>
>>> In this case we can just open the NMI window in the good old way
>>> by intercepting IRET, STGI, and/or RSM (which is intercepted anyway),
>>>
>>> and only if we have already *just* intercepted IRET, only then drop
>>> the new NMI instead of single stepping over it, based on the reasoning that
>>> it's the 3rd NMI (one is almost done being serviced (its IRET is executing),
>>> one is pending injection, and we want to inject another one).
>>>
>>> Does this sound good to you? It won't work for SEV-ES as it looks
>>> like it doesn't intercept IRET, but it might be a reasonable tradeoff
>>> for SEV-ES guests to accept that we can't inject an NMI if one is
>>> already pending injection.
>>>
>>> Best regards,
>>>         Maxim Levitsky
>>>
>>
> 
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCHv5 0/8] Virtual NMI feature
  2022-11-16 15:44         ` Santosh Shukla
@ 2022-11-16 15:55           ` Tom Lendacky
  2022-11-16 16:59             ` Sean Christopherson
  0 siblings, 1 reply; 16+ messages in thread
From: Tom Lendacky @ 2022-11-16 15:55 UTC (permalink / raw)
  To: Santosh Shukla, Maxim Levitsky, pbonzini, seanjc, jmattson
  Cc: kvm, joro, linux-kernel, mail, vkuznets

On 11/16/22 09:44, Santosh Shukla wrote:
> Hello Maxim,
> 
> On 11/16/2022 2:51 PM, Maxim Levitsky wrote:
>> On Wed, 2022-11-16 at 11:10 +0530, Santosh Shukla wrote:
>>> Hi Maxim,
>>>
>>> On 11/14/2022 8:01 PM, Maxim Levitsky wrote:
>>>> On Mon, 2022-11-14 at 13:32 +0530, Santosh Shukla wrote:
>>>>>
>>>>>
>>>>> On 10/27/2022 2:08 PM, Santosh Shukla wrote:
>>>>>> VNMI Spec is at [1].
>>>>>>
>>>>>> Change History:
>>>>>>
>>>>>> v5 (6.1-rc2)
>>>>>> 01,02,06 - Renamed s/X86_FEATURE_V_NMI/X86_FEATURE_AMD_VNMI (Jim Mattson)
>>>>>>
>>>>>
>>>>> Gentle reminder.
>>>>>
>>>>> Thanks,
>>>>> Santosh
>>>>>
>>>>
>>>> I started reviewing it today and I think there are still a few issues,
>>>> and the biggest one is that if an NMI arrives while vNMI injection
>>>> is pending, the current code just drops such an NMI.
>>>>
>>>> We had a discussion about this, like forcing an immediate vm exit
>>>
>>> I believe we discussed the above case in [1], i.e. HW can handle
>>> the second (pending) virtual NMI while the guest is processing the first virtual NMI w/o a vmexit.
>>> Is it the same scenario or a different one that you are mentioning?
>>>
>>> [1] https://lore.kernel.org/lkml/1782cdbb-8274-8c3d-fa98-29147f1e5d1e@amd.com/
>>>
>> You misunderstood the issue.
>>
>> Hardware can handle the case when an NMI is in service (that is, V_NMI_MASK is set) and another one is injected
>> (V_NMI_PENDING can be set),
>>
>> but it is not possible to handle the case when an NMI is already injected (V_NMI_PENDING set)
>> and KVM wants to inject another one before the first one has gone into service (that is, V_NMI_MASK is not set
>> yet).
>>
> 
> In this case, HW will collapse the NMI.
> 
> Note that the HW will take the pending NMI at the boundary of the IRET instruction: it will check
> V_NMI_PENDING and, if it is set, the HW will *take* the NMI,
> clearing the V_NMI_PENDING bit and setting V_NMI_MASK, w/o a VMEXIT!
> 
> 
>> Also, the same can happen when NMIs are blocked in SMM: since V_NMI_MASK is set despite no NMI being in service,
>> we will be able to inject only one NMI by setting V_NMI_PENDING.
>>
> 
> Ditto, HW will collapse the NMI.

Note, this is how bare-metal NMIs are also handled. Multiple NMIs are 
collapsed into a single NMI if they are received while an NMI is currently 
being processed.

Thanks,
Tom

> 
> Thanks,
> Santosh
>   
>> I think I was able to solve all these issues, and today I will post a modified version of your patch series,
>> which should cover all these cases and includes some nice refactoring as well.
>>
>>
>> Best regards,
>> 	Maxim Levitsky
>>
>>
>>>
>>> Thanks,
>>> Santosh
>>>
>>>> in this case and such, but I have a simpler idea:
>>>>
>>>> In this case we can just open the NMI window in the good old way
>>>> by intercepting IRET, STGI, and/or RSM (which is intercepted anyway),
>>>>
>>>> and only if we have already *just* intercepted IRET, only then drop
>>>> the new NMI instead of single stepping over it, based on the reasoning that
>>>> it's the 3rd NMI (one is almost done being serviced (its IRET is executing),
>>>> one is pending injection, and we want to inject another one).
>>>>
>>>> Does this sound good to you? It won't work for SEV-ES as it looks
>>>> like it doesn't intercept IRET, but it might be a reasonable tradeoff
>>>> for SEV-ES guests to accept that we can't inject an NMI if one is
>>>> already pending injection.
>>>>
>>>> Best regards,
>>>>          Maxim Levitsky
>>>>
>>>
>>
>>
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCHv5 0/8] Virtual NMI feature
  2022-11-16 15:55           ` Tom Lendacky
@ 2022-11-16 16:59             ` Sean Christopherson
  0 siblings, 0 replies; 16+ messages in thread
From: Sean Christopherson @ 2022-11-16 16:59 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Santosh Shukla, Maxim Levitsky, pbonzini, jmattson, kvm, joro,
	linux-kernel, mail, vkuznets

On Wed, Nov 16, 2022, Tom Lendacky wrote:
> On 11/16/22 09:44, Santosh Shukla wrote:
> > Hello Maxim,
> > 
> > On 11/16/2022 2:51 PM, Maxim Levitsky wrote:
> > > On Wed, 2022-11-16 at 11:10 +0530, Santosh Shukla wrote:
> > > > Hi Maxim,
> > > > 
> > > > On 11/14/2022 8:01 PM, Maxim Levitsky wrote:
> > > > > On Mon, 2022-11-14 at 13:32 +0530, Santosh Shukla wrote:
> > > > > I started reviewing it today and I think there are still a few issues,
> > > > > and the biggest one is that if an NMI arrives while vNMI injection
> > > > > is pending, the current code just drops such an NMI.

I don't think it gets dropped, just improperly delayed.

> > > > > We had a discussion about this, like forcing an immediate vm exit
> > > > 
> > > > I believe we discussed the above case in [1], i.e. HW can handle
> > > > the second (pending) virtual NMI while the guest is processing the first virtual NMI w/o a vmexit.
> > > > Is it the same scenario or a different one that you are mentioning?
> > > > 
> > > > [1] https://lore.kernel.org/lkml/1782cdbb-8274-8c3d-fa98-29147f1e5d1e@amd.com/
> > > >
> > > You misunderstood the issue.
> > > 
> > > Hardware can handle the case when an NMI is in service (that is, V_NMI_MASK
> > > is set) and another one is injected (V_NMI_PENDING can be set),
> > > 
> > > but it is not possible to handle the case when an NMI is already injected
> > > (V_NMI_PENDING set) and KVM wants to inject another one before the
> > > first one has gone into service (that is, V_NMI_MASK is not set yet).
> > 
> > In this case, HW will collapse the NMI.

Yes, but practically speaking two NMIs can't arrive at the exact same instant on
bare metal.  One NMI will always arrive first and get vectored, and then the second
NMI will arrive and be pended.  In a virtual environment, two NMIs that are sent
far apart can arrive together, e.g. if the vCPU is scheduled out for an extended
period of time.  KVM mitigates this side effect by allowing two NMIs to be pending.

The problem here isn't that the second NMI is lost, it's that KVM can't get control
when the first NMI completes (IRETs).  KVM can pend both NMIs and queue the first
for injection/vectoring (set V_NMI_PENDING), but AIUI there is no mechanism (and no
code) to force a VM-Exit on the IRET so that KVM can inject the second NMI.

Santosh's response in v2[*] suggested that hardware would allow KVM to "post" an
NMI while the vCPU is running, but I don't see any code in this series to actually
do that.  svm_inject_nmi() is only called from the vCPU's run loop, i.e. requires
a VM-Exit.

[*] https://lore.kernel.org/all/1782cdbb-8274-8c3d-fa98-29147f1e5d1e@amd.com
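The fallback Maxim proposed earlier in the thread (fall back to IRET/STGI/RSM intercepts when a second NMI arrives while one is already pending injection, and drop only a 3rd) could be sketched as follows. All names here are hypothetical helpers for illustration, not actual KVM code.

```c
#include <stdbool.h>

/* Hypothetical per-vCPU view; not the real KVM structures. */
struct vcpu_model {
	bool v_nmi_pending;	/* an NMI is already queued via V_NMI_PENDING */
	bool v_nmi_mask;	/* guest is in NMI service */
	bool iret_intercepted;	/* an NMI window is already open on IRET */
};

enum nmi_action { NMI_POST, NMI_OPEN_WINDOW, NMI_DROP };

/* Decide what to do with a newly arriving NMI. */
static enum nmi_action route_nmi(const struct vcpu_model *v)
{
	if (!v->v_nmi_pending)
		return NMI_POST;	/* hardware can take this one directly */
	if (!v->iret_intercepted)
		return NMI_OPEN_WINDOW;	/* fall back to IRET/STGI/RSM intercepts */
	return NMI_DROP;	/* 3rd NMI: one finishing service, one pending */
}
```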

> > Note that the HW will take the pending NMI at the boundary of the IRET instruction: it will check
> > V_NMI_PENDING and, if it is set, the HW will *take* the NMI,
> > clearing the V_NMI_PENDING bit and setting V_NMI_MASK, w/o a VMEXIT!
> > 
> > 
> > > Also, the same can happen when NMIs are blocked in SMM: since V_NMI_MASK is
> > > set despite no NMI being in service, we will be able to inject only one NMI by
> > > setting V_NMI_PENDING.

I believe this one is a non-issue.  Like bare metal, KVM only allows one NMI to
be pending if NMIs are blocked.

  static void process_nmi(struct kvm_vcpu *vcpu)
  {
	unsigned limit = 2;

	/*
	 * x86 is limited to one NMI running, and one NMI pending after it.
	 * If an NMI is already in progress, limit further NMIs to just one.
	 * Otherwise, allow two (and we'll inject the first one immediately).
	 */
	if (static_call(kvm_x86_get_nmi_mask)(vcpu) || vcpu->arch.nmi_injected)
		limit = 1;

	vcpu->arch.nmi_pending += atomic_xchg(&vcpu->arch.nmi_queued, 0);
	vcpu->arch.nmi_pending = min(vcpu->arch.nmi_pending, limit);
	kvm_make_request(KVM_REQ_EVENT, vcpu);
  }
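The capping behavior in the quoted process_nmi() can be modeled with a simplified stand-in (the real code operates on struct kvm_vcpu with atomics and static calls; the function below is only a sketch of the limit logic):

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of the limit logic in process_nmi() above. */
static unsigned int process_nmi_limit(bool nmi_masked, bool nmi_injected,
				      unsigned int nmi_pending,
				      unsigned int nmi_queued)
{
	unsigned int limit = 2;

	/* x86 allows one NMI running plus at most one pending behind it.
	 * If an NMI is masked (in service) or already injected, only one
	 * more may pend; extras collapse. */
	if (nmi_masked || nmi_injected)
		limit = 1;

	nmi_pending += nmi_queued;
	return nmi_pending < limit ? nmi_pending : limit;
}
```

This directly reflects the point above: with NMIs blocked (e.g. in SMM), only one NMI stays pending no matter how many were queued.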


> > Ditto, HW will collapse the NMI.
> 
> Note, this is how bare-metal NMIs are also handled. Multiple NMIs are
> collapsed into a single NMI if they are received while an NMI is currently
> being processed.


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2022-11-16 16:59 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-27  8:38 [PATCHv5 0/8] Virtual NMI feature Santosh Shukla
2022-10-27  8:38 ` [PATCHv5 1/8] x86/cpu: Add CPUID feature bit for VNMI Santosh Shukla
2022-10-27  8:38 ` [PATCHv5 2/8] KVM: SVM: Add VNMI bit definition Santosh Shukla
2022-10-27  8:38 ` [PATCHv5 3/8] KVM: SVM: Add VNMI support in get/set_nmi_mask Santosh Shukla
2022-10-27  8:38 ` [PATCHv5 4/8] KVM: SVM: Report NMI not allowed when Guest busy handling VNMI Santosh Shukla
2022-10-27  8:38 ` [PATCHv5 5/8] KVM: SVM: Add VNMI support in inject_nmi Santosh Shukla
2022-10-27  8:38 ` [PATCHv5 6/8] KVM: nSVM: implement nested VNMI Santosh Shukla
2022-10-27  8:38 ` [PATCHv5 7/8] KVM: nSVM: emulate VMEXIT_INVALID case for " Santosh Shukla
2022-10-27  8:38 ` [PATCHv5 8/8] KVM: SVM: Enable VNMI feature Santosh Shukla
2022-11-14  8:02 ` [PATCHv5 0/8] Virtual NMI feature Santosh Shukla
2022-11-14 14:31   ` Maxim Levitsky
2022-11-16  5:40     ` Santosh Shukla
2022-11-16  9:21       ` Maxim Levitsky
2022-11-16 15:44         ` Santosh Shukla
2022-11-16 15:55           ` Tom Lendacky
2022-11-16 16:59             ` Sean Christopherson
