* [PATCH v2 0/5] Handle monitor trap flag during instruction emulation
From: Oliver Upton @ 2020-01-28  9:27 UTC
  To: kvm
  Cc: Paolo Bonzini, Jim Mattson, Peter Shier, Sean Christopherson,
	Oliver Upton

v1: http://lore.kernel.org/r/20200113221053.22053-1-oupton@google.com

v1 => v2:
 - Don't split the #DB delivery by vendors. Unconditionally injecting
   #DB payloads into the 'pending debug exceptions' field will cause KVM
   to get stuck in a loop. Per the SDM, when hardware injects an event
   resulting from this field's value, it is checked against the
   exception interception bitmap.
 - Address Sean's comments by injecting the VM-exit into L1 from
   vmx_check_nested_events().
 - Added a fix for nested INIT VM-exits + the 'pending debug exceptions'
   field, an issue noticed while implementing v2.
 - Drop Peter + Jim's Reviewed-by tags, as the patch set has changed
   since v1.

KVM already provides guests the ability to use the 'monitor trap flag'
VM-execution control. Support for this flag is provided by the fact that
KVM unconditionally forwards MTF VM-exits to the guest (if requested),
as KVM doesn't utilize MTF. While this provides support during hardware
instruction execution, it is insufficient for instruction emulation.

Should L0 emulate an instruction on behalf of L2, L0 should also
synthesize an MTF VM-exit into L1, should the control be set.
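
For reference, a minimal sketch of the L1-side usage being supported here
(standard VMX field names; resume_l2() and handle_mtf() are hypothetical
stand-ins for the VMRESUME path and an L1 exit handler):

	u32 ctls = vmcs_read32(CPU_BASED_VM_EXEC_CONTROL);

	/* Enable MTF for L2; a VM-exit is expected at the next L2
	 * instruction boundary (or event delivery). */
	vmcs_write32(CPU_BASED_VM_EXEC_CONTROL,
		     ctls | CPU_BASED_MONITOR_TRAP_FLAG);
	resume_l2();

	if (vmcs_read32(VM_EXIT_REASON) == EXIT_REASON_MONITOR_TRAP_FLAG)
		handle_mtf();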

The first patch corrects a nuanced difference between the definitions of
the #DB exception payload field and the DR6 register. Mask off bit 12,
which is defined in the 'pending debug exceptions' field but reserved in
DR6, when applying the payload to DR6, since the payload field is defined
to be compatible with the aforementioned VMCS field.
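
Concretely, a condensed sketch of the resulting #DB conversion done in
kvm_deliver_exception_payload() with this series applied (dr6 and
payload are unsigned long; the constants are the existing DR6_*,
DR_TRAP_BITS, and BIT() definitions):

	dr6 &= ~DR_TRAP_BITS;		/* #DB may clear bits 0-3 */
	dr6 |= DR6_RTM;			/* set unless the payload clears it */
	dr6 |= payload;
	dr6 ^= payload & DR6_RTM;	/* DR6.RTM has inverted polarity */
	dr6 &= ~BIT(12);		/* valid in the VMCS field, reserved in DR6 */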

The second patch sets the 'pending debug exceptions' VMCS field when
delivering an INIT signal VM-exit to L1, as described in the SDM. This
patch also introduces helpers for setting the 'pending debug exceptions'
VMCS field.

The third patch reworks KVM's handling of exception payloads with regard
to API compatibility. Rather than immediately delivering the payload
when userspace hasn't opted in, defer the payload and deliver it
immediately before completing a KVM_GET_VCPU_EVENTS. This maintains API
compatibility whilst correcting #DB behavior with regard to higher
priority VM-exit events.
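
From userspace's perspective nothing changes unless the capability is
enabled. A rough sketch (assumes vm_fd/vcpu_fd are already open; error
handling omitted):

	struct kvm_enable_cap cap = { .cap = KVM_CAP_EXCEPTION_PAYLOAD };
	struct kvm_vcpu_events events;

	/* Without the cap, any pending payload has been folded into
	 * CR2/DR6 by the time the ioctl returns, as before. */
	ioctl(vcpu_fd, KVM_GET_VCPU_EVENTS, &events);

	/* With the cap enabled on the VM, the payload is reported
	 * separately instead. */
	ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
	ioctl(vcpu_fd, KVM_GET_VCPU_EVENTS, &events);
	if (events.exception_has_payload)
		printf("payload: 0x%llx\n", events.exception_payload);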

The fourth patch implements MTF for emulated instructions. Identify
whether an MTF is due on an instruction boundary from
kvm_vcpu_do_singlestep(), but only deliver the VM-exit from
vmx_check_nested_events() to respect its priority relative to other
VM-exits. Since this augments the nested state, introduce a new flag for
(de)serialization.
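
Condensed from patch 4, the resulting priority ordering in
vmx_check_nested_events() looks roughly like this (the -EBUSY
block_nested_events handling is elided for brevity):

	/* Non-debug-trap exceptions take priority over MTF. */
	if (vcpu->arch.exception.pending &&
	    !nested_vmx_check_pending_dbg(vcpu) &&
	    nested_vmx_check_exception(vcpu, &exit_qual)) {
		nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
		return 0;
	}

	/* MTF outranks debug traps; stash any pending #DB payload in the
	 * 'pending debug exceptions' field rather than losing it. */
	if (mtf_pending) {
		if (nested_vmx_check_pending_dbg(vcpu))
			nested_vmx_set_pending_dbg(vcpu);
		nested_vmx_vmexit(vcpu, EXIT_REASON_MONITOR_TRAP_FLAG, 0, 0);
		return 0;
	}

	/* Debug-trap #DBs are delivered last. */
	if (vcpu->arch.exception.pending &&
	    nested_vmx_check_pending_dbg(vcpu) &&
	    nested_vmx_check_exception(vcpu, &exit_qual)) {
		nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
		return 0;
	}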

The last patch adds tests to kvm-unit-tests to assert the correctness of
MTF under several conditions (concurrent #DB trap, #DB fault, etc.).
These tests pass under virtualization with this series, as well as on
bare metal.

Oliver Upton (4):
  KVM: x86: Mask off reserved bit from #DB exception payload
  KVM: nVMX: Handle pending #DB when injecting INIT VM-exit
  KVM: x86: Deliver exception payload on KVM_GET_VCPU_EVENTS
  KVM: nVMX: Emulate MTF when performing instruction emulation

 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/include/uapi/asm/kvm.h |  1 +
 arch/x86/kvm/svm.c              |  1 +
 arch/x86/kvm/vmx/nested.c       | 60 ++++++++++++++++++++++++++++++++-
 arch/x86/kvm/vmx/nested.h       |  5 +++
 arch/x86/kvm/vmx/vmx.c          | 22 ++++++++++++
 arch/x86/kvm/vmx/vmx.h          |  3 ++
 arch/x86/kvm/x86.c              | 52 +++++++++++++++++-----------
 8 files changed, 125 insertions(+), 20 deletions(-)

-- 
2.25.0.341.g760bfbb309-goog


* [PATCH v2 1/5] KVM: x86: Mask off reserved bit from #DB exception payload
From: Oliver Upton @ 2020-01-28  9:27 UTC
  To: kvm
  Cc: Paolo Bonzini, Jim Mattson, Peter Shier, Sean Christopherson,
	Oliver Upton

KVM defines the #DB payload as compatible with the 'pending debug
exceptions' field under VMX, not DR6. Mask off bit 12 when applying the
payload to DR6, as it is reserved on DR6 but not the 'pending debug
exceptions' field.

Fixes: f10c729ff965 ("kvm: vmx: Defer setting of DR6 until #DB delivery")
Signed-off-by: Oliver Upton <oupton@google.com>
---
 arch/x86/kvm/x86.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7e118883d8f1..7a341c0c978a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -437,6 +437,14 @@ void kvm_deliver_exception_payload(struct kvm_vcpu *vcpu)
 		 * for #DB exceptions under VMX.
 		 */
 		vcpu->arch.dr6 ^= payload & DR6_RTM;
+
+		/*
+		 * The #DB payload is defined as compatible with the 'pending
+		 * debug exceptions' field under VMX, not DR6. While bit 12 is
+		 * defined in the 'pending debug exceptions' field (enabled
+		 * breakpoint), it is reserved and must be zero in DR6.
+		 */
+		vcpu->arch.dr6 &= ~BIT(12);
 		break;
 	case PF_VECTOR:
 		vcpu->arch.cr2 = payload;
-- 
2.25.0.341.g760bfbb309-goog


* [PATCH v2 2/5] KVM: nVMX: Handle pending #DB when injecting INIT VM-exit
From: Oliver Upton @ 2020-01-28  9:27 UTC
  To: kvm
  Cc: Paolo Bonzini, Jim Mattson, Peter Shier, Sean Christopherson,
	Oliver Upton

SDM 27.3.4 states that the 'pending debug exceptions' VMCS field will
be populated if a VM-exit caused by an INIT signal takes priority over a
debug-trap. Emulate this behavior when synthesizing an INIT signal
VM-exit into L1.

Fixes: 558b8d50dbff ("KVM: x86: Fix INIT signal handling in various CPU states")
Signed-off-by: Oliver Upton <oupton@google.com>
---
 arch/x86/kvm/vmx/nested.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 95b3f4306ac2..aba16599ca69 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3572,6 +3572,27 @@ static void nested_vmx_inject_exception_vmexit(struct kvm_vcpu *vcpu,
 	nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI, intr_info, exit_qual);
 }
 
+static inline bool nested_vmx_check_pending_dbg(struct kvm_vcpu *vcpu)
+{
+	return vcpu->arch.exception.nr == DB_VECTOR &&
+			vcpu->arch.exception.pending &&
+			vcpu->arch.exception.has_payload;
+}
+
+/*
+ * If a higher priority VM-exit is delivered before a debug-trap, hardware will
+ * set the 'pending debug exceptions' field appropriately for reinjection on the
+ * next VM-entry.
+ */
+static void nested_vmx_set_pending_dbg(struct kvm_vcpu *vcpu)
+{
+	vmcs_writel(GUEST_PENDING_DBG_EXCEPTIONS, vcpu->arch.exception.payload);
+	vcpu->arch.exception.has_payload = false;
+	vcpu->arch.exception.payload = 0;
+	vcpu->arch.exception.pending = false;
+	vcpu->arch.exception.injected = true;
+}
+
 static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -3584,6 +3605,8 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
 		test_bit(KVM_APIC_INIT, &apic->pending_events)) {
 		if (block_nested_events)
 			return -EBUSY;
+		if (nested_vmx_check_pending_dbg(vcpu))
+			nested_vmx_set_pending_dbg(vcpu);
 		clear_bit(KVM_APIC_INIT, &apic->pending_events);
 		nested_vmx_vmexit(vcpu, EXIT_REASON_INIT_SIGNAL, 0, 0);
 		return 0;
-- 
2.25.0.341.g760bfbb309-goog


* [PATCH v2 3/5] KVM: x86: Deliver exception payload on KVM_GET_VCPU_EVENTS
From: Oliver Upton @ 2020-01-28  9:27 UTC
  To: kvm
  Cc: Paolo Bonzini, Jim Mattson, Peter Shier, Sean Christopherson,
	Oliver Upton

KVM doesn't utilize exception payloads by default, as this behavior
diverges from the expectations of the userspace API. However, this
constraint only applies if KVM is servicing a KVM_GET_VCPU_EVENTS ioctl
before delivering the exception.

Use exception payloads unconditionally if the vcpu is in guest mode.
Deliver the exception payload before completing a KVM_GET_VCPU_EVENTS
to ensure API compatibility.

Signed-off-by: Oliver Upton <oupton@google.com>
---
 arch/x86/kvm/x86.c | 29 ++++++++++++++++-------------
 1 file changed, 16 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7a341c0c978a..9f080101618c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -497,19 +497,7 @@ static void kvm_multiple_exception(struct kvm_vcpu *vcpu,
 		vcpu->arch.exception.error_code = error_code;
 		vcpu->arch.exception.has_payload = has_payload;
 		vcpu->arch.exception.payload = payload;
-		/*
-		 * In guest mode, payload delivery should be deferred,
-		 * so that the L1 hypervisor can intercept #PF before
-		 * CR2 is modified (or intercept #DB before DR6 is
-		 * modified under nVMX).  However, for ABI
-		 * compatibility with KVM_GET_VCPU_EVENTS and
-		 * KVM_SET_VCPU_EVENTS, we can't delay payload
-		 * delivery unless userspace has enabled this
-		 * functionality via the per-VM capability,
-		 * KVM_CAP_EXCEPTION_PAYLOAD.
-		 */
-		if (!vcpu->kvm->arch.exception_payload_enabled ||
-		    !is_guest_mode(vcpu))
+		if (!is_guest_mode(vcpu))
 			kvm_deliver_exception_payload(vcpu);
 		return;
 	}
@@ -3790,6 +3778,21 @@ static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct kvm_vcpu *vcpu,
 {
 	process_nmi(vcpu);
 
+	/*
+	 * In guest mode, payload delivery should be deferred,
+	 * so that the L1 hypervisor can intercept #PF before
+	 * CR2 is modified (or intercept #DB before DR6 is
+	 * modified under nVMX).  However, for ABI
+	 * compatibility with KVM_GET_VCPU_EVENTS and
+	 * KVM_SET_VCPU_EVENTS, we can't delay payload
+	 * delivery unless userspace has enabled this
+	 * functionality via the per-VM capability,
+	 * KVM_CAP_EXCEPTION_PAYLOAD.
+	 */
+	if (!vcpu->kvm->arch.exception_payload_enabled &&
+	    vcpu->arch.exception.pending && vcpu->arch.exception.has_payload)
+		kvm_deliver_exception_payload(vcpu);
+
 	/*
 	 * The API doesn't provide the instruction length for software
 	 * exceptions, so don't report them. As long as the guest RIP
-- 
2.25.0.341.g760bfbb309-goog


* [PATCH v2 4/5] KVM: nVMX: Emulate MTF when performing instruction emulation
From: Oliver Upton @ 2020-01-28  9:27 UTC
  To: kvm
  Cc: Paolo Bonzini, Jim Mattson, Peter Shier, Sean Christopherson,
	Oliver Upton

Since commit 5f3d45e7f282 ("kvm/x86: add support for
MONITOR_TRAP_FLAG"), KVM has allowed an L1 guest to use the monitor trap
flag processor-based execution control for its L2 guest. KVM simply
forwards any MTF VM-exits to the L1 guest, which works for normal
instruction execution.

However, when KVM needs to emulate an instruction on behalf of an L2
guest, the monitor trap flag is not emulated. Add the necessary logic to
kvm_skip_emulated_instruction() to synthesize an MTF VM-exit to L1 upon
instruction emulation for L2.

Fixes: 5f3d45e7f282 ("kvm/x86: add support for MONITOR_TRAP_FLAG")
Signed-off-by: Oliver Upton <oupton@google.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/include/uapi/asm/kvm.h |  1 +
 arch/x86/kvm/svm.c              |  1 +
 arch/x86/kvm/vmx/nested.c       | 37 ++++++++++++++++++++++++++++++++-
 arch/x86/kvm/vmx/nested.h       |  5 +++++
 arch/x86/kvm/vmx/vmx.c          | 22 ++++++++++++++++++++
 arch/x86/kvm/vmx/vmx.h          |  3 +++
 arch/x86/kvm/x86.c              | 15 +++++++------
 8 files changed, 78 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 69e31dbdfdc2..e1061ebc1b4b 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1103,6 +1103,7 @@ struct kvm_x86_ops {
 	int (*handle_exit)(struct kvm_vcpu *vcpu,
 		enum exit_fastpath_completion exit_fastpath);
 	int (*skip_emulated_instruction)(struct kvm_vcpu *vcpu);
+	void (*do_singlestep)(struct kvm_vcpu *vcpu);
 	void (*set_interrupt_shadow)(struct kvm_vcpu *vcpu, int mask);
 	u32 (*get_interrupt_shadow)(struct kvm_vcpu *vcpu);
 	void (*patch_hypercall)(struct kvm_vcpu *vcpu,
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 503d3f42da16..3f3f780c8c65 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -390,6 +390,7 @@ struct kvm_sync_regs {
 #define KVM_STATE_NESTED_GUEST_MODE	0x00000001
 #define KVM_STATE_NESTED_RUN_PENDING	0x00000002
 #define KVM_STATE_NESTED_EVMCS		0x00000004
+#define KVM_STATE_NESTED_MTF_PENDING	0x00000008
 
 #define KVM_STATE_NESTED_SMM_GUEST_MODE	0x00000001
 #define KVM_STATE_NESTED_SMM_VMXON	0x00000002
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 9dbb990c319a..3653e230d3d5 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -7316,6 +7316,7 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
 	.run = svm_vcpu_run,
 	.handle_exit = handle_exit,
 	.skip_emulated_instruction = skip_emulated_instruction,
+	.do_singlestep = NULL,
 	.set_interrupt_shadow = svm_set_interrupt_shadow,
 	.get_interrupt_shadow = svm_get_interrupt_shadow,
 	.patch_hypercall = svm_patch_hypercall,
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index aba16599ca69..0de71b207b2a 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3599,8 +3599,15 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
 	unsigned long exit_qual;
 	bool block_nested_events =
 	    vmx->nested.nested_run_pending || kvm_event_needs_reinjection(vcpu);
+	bool mtf_pending = vmx->nested.mtf_pending;
 	struct kvm_lapic *apic = vcpu->arch.apic;
 
+	/*
+	 * Clear the MTF state. If a higher priority VM-exit is delivered first,
+	 * this state is discarded.
+	 */
+	vmx->nested.mtf_pending = false;
+
 	if (lapic_in_kernel(vcpu) &&
 		test_bit(KVM_APIC_INIT, &apic->pending_events)) {
 		if (block_nested_events)
@@ -3612,8 +3619,30 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
 		return 0;
 	}
 
+	/*
+	 * Process non-debug exceptions first before MTF.
+	 */
 	if (vcpu->arch.exception.pending &&
-		nested_vmx_check_exception(vcpu, &exit_qual)) {
+	    !nested_vmx_check_pending_dbg(vcpu) &&
+	    nested_vmx_check_exception(vcpu, &exit_qual)) {
+		if (block_nested_events)
+			return -EBUSY;
+		nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
+		return 0;
+	}
+
+	if (mtf_pending) {
+		if (block_nested_events)
+			return -EBUSY;
+		if (nested_vmx_check_pending_dbg(vcpu))
+			nested_vmx_set_pending_dbg(vcpu);
+		nested_vmx_vmexit(vcpu, EXIT_REASON_MONITOR_TRAP_FLAG, 0, 0);
+		return 0;
+	}
+
+	if (vcpu->arch.exception.pending &&
+	    nested_vmx_check_pending_dbg(vcpu) &&
+	    nested_vmx_check_exception(vcpu, &exit_qual)) {
 		if (block_nested_events)
 			return -EBUSY;
 		nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
@@ -5705,6 +5734,9 @@ static int vmx_get_nested_state(struct kvm_vcpu *vcpu,
 
 			if (vmx->nested.nested_run_pending)
 				kvm_state.flags |= KVM_STATE_NESTED_RUN_PENDING;
+
+			if (vmx->nested.mtf_pending)
+				kvm_state.flags |= KVM_STATE_NESTED_MTF_PENDING;
 		}
 	}
 
@@ -5885,6 +5917,9 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
 	vmx->nested.nested_run_pending =
 		!!(kvm_state->flags & KVM_STATE_NESTED_RUN_PENDING);
 
+	vmx->nested.mtf_pending =
+		!!(kvm_state->flags & KVM_STATE_NESTED_MTF_PENDING);
+
 	ret = -EINVAL;
 	if (nested_cpu_has_shadow_vmcs(vmcs12) &&
 	    vmcs12->vmcs_link_pointer != -1ull) {
diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
index fc874d4ead0f..e12461776151 100644
--- a/arch/x86/kvm/vmx/nested.h
+++ b/arch/x86/kvm/vmx/nested.h
@@ -175,6 +175,11 @@ static inline bool nested_cpu_has_virtual_nmis(struct vmcs12 *vmcs12)
 	return vmcs12->pin_based_vm_exec_control & PIN_BASED_VIRTUAL_NMIS;
 }
 
+static inline int nested_cpu_has_mtf(struct vmcs12 *vmcs12)
+{
+	return nested_cpu_has(vmcs12, CPU_BASED_MONITOR_TRAP_FLAG);
+}
+
 static inline int nested_cpu_has_ept(struct vmcs12 *vmcs12)
 {
 	return nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENABLE_EPT);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 802ba97ac7f2..5735d1a1af05 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1601,6 +1601,27 @@ static int skip_emulated_instruction(struct kvm_vcpu *vcpu)
 	return 1;
 }
 
+static void vmx_do_singlestep(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_vmx *vmx;
+
+	if (!(is_guest_mode(vcpu) &&
+	      nested_cpu_has_mtf(get_vmcs12(vcpu))))
+		return;
+
+	vmx = to_vmx(vcpu);
+
+	/*
+	 * Per the SDM, MTF takes priority over debug-trap exceptions besides
+	 * T-bit traps. As instruction emulation is completed (i.e. at the end
+	 * of an instruction boundary), any #DB exception pending delivery must
+	 * be a debug-trap. Record the pending MTF state to be delivered in
+	 * vmx_check_nested_events().
+	 */
+	vmx->nested.mtf_pending = (!vcpu->arch.exception.pending ||
+				   vcpu->arch.exception.nr == DB_VECTOR);
+}
+
 static void vmx_clear_hlt(struct kvm_vcpu *vcpu)
 {
 	/*
@@ -7797,6 +7818,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
 	.run = vmx_vcpu_run,
 	.handle_exit = vmx_handle_exit,
 	.skip_emulated_instruction = skip_emulated_instruction,
+	.do_singlestep = vmx_do_singlestep,
 	.set_interrupt_shadow = vmx_set_interrupt_shadow,
 	.get_interrupt_shadow = vmx_get_interrupt_shadow,
 	.patch_hypercall = vmx_patch_hypercall,
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index a4f7f737c5d4..401e9ca23779 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -150,6 +150,9 @@ struct nested_vmx {
 	/* L2 must run next, and mustn't decide to exit to L1. */
 	bool nested_run_pending;
 
+	/* Pending MTF VM-exit into L1.  */
+	bool mtf_pending;
+
 	struct loaded_vmcs vmcs02;
 
 	/*
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9f080101618c..e5c859f9b3bf 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6626,10 +6626,15 @@ static int kvm_vcpu_check_hw_bp(unsigned long addr, u32 type, u32 dr7,
 	return dr6;
 }
 
-static int kvm_vcpu_do_singlestep(struct kvm_vcpu *vcpu)
+static int kvm_vcpu_do_singlestep(struct kvm_vcpu *vcpu, bool tf)
 {
 	struct kvm_run *kvm_run = vcpu->run;
 
+	if (kvm_x86_ops->do_singlestep)
+		kvm_x86_ops->do_singlestep(vcpu);
+	if (!tf)
+		return 1;
+
 	if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) {
 		kvm_run->debug.arch.dr6 = DR6_BS | DR6_FIXED_1 | DR6_RTM;
 		kvm_run->debug.arch.pc = vcpu->arch.singlestep_rip;
@@ -6658,9 +6663,7 @@ int kvm_skip_emulated_instruction(struct kvm_vcpu *vcpu)
 	 * processor will not generate this exception after the instruction
 	 * that sets the TF flag".
 	 */
-	if (unlikely(rflags & X86_EFLAGS_TF))
-		r = kvm_vcpu_do_singlestep(vcpu);
-	return r;
+	return kvm_vcpu_do_singlestep(vcpu, rflags & X86_EFLAGS_TF);
 }
 EXPORT_SYMBOL_GPL(kvm_skip_emulated_instruction);
 
@@ -6876,8 +6879,8 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 		if (!ctxt->have_exception ||
 		    exception_type(ctxt->exception.vector) == EXCPT_TRAP) {
 			kvm_rip_write(vcpu, ctxt->eip);
-			if (r && ctxt->tf)
-				r = kvm_vcpu_do_singlestep(vcpu);
+			if (r)
+				r = kvm_vcpu_do_singlestep(vcpu, ctxt->tf);
 			__kvm_set_rflags(vcpu, ctxt->eflags);
 		}
 
-- 
2.25.0.341.g760bfbb309-goog


* [kvm-unit-tests PATCH v2 5/5] x86: VMX: Add tests for monitor trap flag
From: Oliver Upton @ 2020-01-28  9:27 UTC
  To: kvm
  Cc: Paolo Bonzini, Jim Mattson, Peter Shier, Sean Christopherson,
	Oliver Upton

Test that MTF VM-exits into the host are synthesized when the 'monitor
trap flag' processor-based VM-execution control is set, under various
conditions.

Expect an MTF VM-exit if instruction execution produces no events other
than MTF. Should instruction execution produce a concurrent debug-trap
and MTF event, expect an MTF VM-exit with the 'pending debug exceptions'
VMCS field set. Expect an MTF VM-exit to follow event delivery should
instruction execution generate a higher-priority event, such as a
general-protection fault. Lastly, expect an MTF VM-exit to follow
delivery of a debug-trap software exception (INT1/INT3/INTO/INT n).

Signed-off-by: Oliver Upton <oupton@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
---
 x86/vmx.h       |   1 +
 x86/vmx_tests.c | 157 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 158 insertions(+)

diff --git a/x86/vmx.h b/x86/vmx.h
index 6214400f2b53..6adf0916564b 100644
--- a/x86/vmx.h
+++ b/x86/vmx.h
@@ -399,6 +399,7 @@ enum Ctrl0 {
 	CPU_NMI_WINDOW		= 1ul << 22,
 	CPU_IO			= 1ul << 24,
 	CPU_IO_BITMAP		= 1ul << 25,
+	CPU_MTF			= 1ul << 27,
 	CPU_MSR_BITMAP		= 1ul << 28,
 	CPU_MONITOR		= 1ul << 29,
 	CPU_PAUSE		= 1ul << 30,
diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
index b31c360c5f3c..0e2c2f8a7d34 100644
--- a/x86/vmx_tests.c
+++ b/x86/vmx_tests.c
@@ -4970,6 +4970,162 @@ static void test_vmx_preemption_timer(void)
 	vmcs_write(EXI_CONTROLS, saved_exit);
 }
 
+extern unsigned char test_mtf1;
+extern unsigned char test_mtf2;
+extern unsigned char test_mtf3;
+
+__attribute__((noclone)) static void test_mtf_guest(void)
+{
+	asm ("vmcall;\n\t"
+	     "out %al, $0x80;\n\t"
+	     "test_mtf1:\n\t"
+	     "vmcall;\n\t"
+	     "out %al, $0x80;\n\t"
+	     "test_mtf2:\n\t"
+	     /*
+	      * Prepare for the 'MOV CR3' test. Attempt to induce a
+	      * general-protection fault by moving a non-canonical address into
+	      * CR3. The 'MOV CR3' instruction does not take an imm64 operand,
+	      * so we must MOV the desired value into a register first.
+	      *
+	      * MOV RAX is done before the VMCALL such that MTF is only enabled
+	      * for the instruction under test.
+	      */
+	     "mov $0x8000000000000000, %rax;\n\t"
+	     "vmcall;\n\t"
+	     "mov %rax, %cr3;\n\t"
+	     "test_mtf3:\n\t"
+	     "vmcall;\n\t"
+	     /*
+	      * ICEBP/INT1 instruction. Though the instruction is now
+	      * documented, don't rely on assemblers enumerating the
+	      * instruction. Resort to hand assembly.
+	      */
+	     ".byte 0xf1;\n\t");
+}
+
+static void test_mtf_gp_handler(struct ex_regs *regs)
+{
+	regs->rip = (unsigned long) &test_mtf3;
+}
+
+static void test_mtf_db_handler(struct ex_regs *regs)
+{
+}
+
+static void enable_mtf(void)
+{
+	u32 ctrl0 = vmcs_read(CPU_EXEC_CTRL0);
+
+	vmcs_write(CPU_EXEC_CTRL0, ctrl0 | CPU_MTF);
+}
+
+static void disable_mtf(void)
+{
+	u32 ctrl0 = vmcs_read(CPU_EXEC_CTRL0);
+
+	vmcs_write(CPU_EXEC_CTRL0, ctrl0 & ~CPU_MTF);
+}
+
+static void enable_tf(void)
+{
+	unsigned long rflags = vmcs_read(GUEST_RFLAGS);
+
+	vmcs_write(GUEST_RFLAGS, rflags | X86_EFLAGS_TF);
+}
+
+static void disable_tf(void)
+{
+	unsigned long rflags = vmcs_read(GUEST_RFLAGS);
+
+	vmcs_write(GUEST_RFLAGS, rflags & ~X86_EFLAGS_TF);
+}
+
+static void report_mtf(const char *insn_name, unsigned long exp_rip)
+{
+	unsigned long rip = vmcs_read(GUEST_RIP);
+
+	assert_exit_reason(VMX_MTF);
+	report(rip == exp_rip, "MTF VM-exit after %s instruction. RIP: 0x%lx (expected 0x%lx)",
+	       insn_name, rip, exp_rip);
+}
+
+static void vmx_mtf_test(void)
+{
+	unsigned long pending_dbg;
+	handler old_gp, old_db;
+
+	if (!(ctrl_cpu_rev[0].clr & CPU_MTF)) {
+		printf("CPU does not support the 'monitor trap flag' processor-based VM-execution control.\n");
+		return;
+	}
+
+	test_set_guest(test_mtf_guest);
+
+	/* Expect an MTF VM-exit after OUT instruction */
+	enter_guest();
+	skip_exit_vmcall();
+
+	enable_mtf();
+	enter_guest();
+	report_mtf("OUT", (unsigned long) &test_mtf1);
+	disable_mtf();
+
+	/*
+	 * Concurrent #DB trap and MTF on instruction boundary. Expect MTF
+	 * VM-exit with populated 'pending debug exceptions' VMCS field.
+	 */
+	enter_guest();
+	skip_exit_vmcall();
+
+	enable_mtf();
+	enable_tf();
+
+	enter_guest();
+	report_mtf("OUT", (unsigned long) &test_mtf2);
+	pending_dbg = vmcs_read(GUEST_PENDING_DEBUG);
+	report(pending_dbg & DR_STEP,
+               "'pending debug exceptions' field after MTF VM-exit: 0x%lx (expected 0x%lx)",
+	       pending_dbg, (unsigned long) DR_STEP);
+
+	disable_mtf();
+	disable_tf();
+	vmcs_write(GUEST_PENDING_DEBUG, 0);
+
+	/*
+	 * #GP exception takes priority over MTF. Expect MTF VM-exit with RIP
+	 * advanced to first instruction of #GP handler.
+	 */
+	enter_guest();
+	skip_exit_vmcall();
+
+	old_gp = handle_exception(GP_VECTOR, test_mtf_gp_handler);
+
+	enable_mtf();
+	enter_guest();
+	report_mtf("MOV CR3", (unsigned long) get_idt_addr(&boot_idt[GP_VECTOR]));
+	disable_mtf();
+
+	/*
+	 * Concurrent MTF and privileged software exception (i.e. ICEBP/INT1).
+	 * MTF should follow the delivery of #DB trap, though the SDM doesn't
+	 * provide clear indication of the relative priority.
+	 */
+	enter_guest();
+	skip_exit_vmcall();
+
+	handle_exception(GP_VECTOR, old_gp);
+	old_db = handle_exception(DB_VECTOR, test_mtf_db_handler);
+
+	enable_mtf();
+	enter_guest();
+	report_mtf("INT1", (unsigned long) get_idt_addr(&boot_idt[DB_VECTOR]));
+	disable_mtf();
+
+	enter_guest();
+	handle_exception(DB_VECTOR, old_db);
+}
+
 /*
  * Tests for VM-execution control fields
  */
@@ -9505,5 +9661,6 @@ struct vmx_test vmx_tests[] = {
 	TEST(atomic_switch_max_msrs_test),
 	TEST(atomic_switch_overflow_msrs_test),
 	TEST(rdtsc_vmexit_diff_test),
+	TEST(vmx_mtf_test),
 	{ NULL, NULL, NULL, NULL, NULL, {0} },
 };
-- 
2.25.0.341.g760bfbb309-goog


* Re: [PATCH v2 0/5] Handle monitor trap flag during instruction emulation
From: Oliver Upton @ 2020-01-28  9:39 UTC
  To: kvm; +Cc: Paolo Bonzini, Jim Mattson, Peter Shier, Sean Christopherson

On Tue, Jan 28, 2020 at 01:27:10AM -0800, Oliver Upton wrote:
> [...]

Based on tip of kvm/next as of Jan 28:

f1b0bc14f2db ("KVM: x86: Use a typedef for fastop functions")


* Re: [PATCH v2 2/5] KVM: nVMX: Handle pending #DB when injecting INIT VM-exit
From: Sean Christopherson @ 2020-02-03 19:13 UTC
  To: Oliver Upton; +Cc: kvm, Paolo Bonzini, Jim Mattson, Peter Shier

On Tue, Jan 28, 2020 at 01:27:12AM -0800, Oliver Upton wrote:
> SDM 27.3.4 states that the 'pending debug exceptions' VMCS field will
> be populated if a VM-exit caused by an INIT signal takes priority over a
> debug-trap. Emulate this behavior when synthesizing an INIT signal
> VM-exit into L1.
> 
> Fixes: 558b8d50dbff ("KVM: x86: Fix INIT signal handling in various CPU states")
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
>  arch/x86/kvm/vmx/nested.c | 23 +++++++++++++++++++++++
>  1 file changed, 23 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 95b3f4306ac2..aba16599ca69 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -3572,6 +3572,27 @@ static void nested_vmx_inject_exception_vmexit(struct kvm_vcpu *vcpu,
>  	nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI, intr_info, exit_qual);
>  }
>  
> +static inline bool nested_vmx_check_pending_dbg(struct kvm_vcpu *vcpu)

Really dislike the name, partially because the code checks @has_payload and
partially because its counterpart, nested_vmx_set_pending_dbg(), "sets"
completely different state than this checks.

Checking has_payload may also be wrong, e.g. wouldn't it make sense to
update GUEST_PENDING_DBG_EXCEPTIONS, even if we crush it with '0'?

> +{
> +	return vcpu->arch.exception.nr == DB_VECTOR &&
> +			vcpu->arch.exception.pending &&
> +			vcpu->arch.exception.has_payload;
> +}
> +
> +/*
> + * If a higher priority VM-exit is delivered before a debug-trap, hardware will
> + * set the 'pending debug exceptions' field appropriately for reinjection on the
> + * next VM-entry.
> + */
> +static void nested_vmx_set_pending_dbg(struct kvm_vcpu *vcpu)
> +{
> +	vmcs_writel(GUEST_PENDING_DBG_EXCEPTIONS, vcpu->arch.exception.payload);
> +	vcpu->arch.exception.has_payload = false;
> +	vcpu->arch.exception.payload = 0;
> +	vcpu->arch.exception.pending = false;
> +	vcpu->arch.exception.injected = true;

This looks wrong.  The #DB hasn't been injected, KVM is simply emulating
the side effect of the VMCS field being updated.  E.g. KVM will have
different architecturally visible behavior depending on @has_payload.

It's not KVM's responsibility to update the pending exceptions, e.g. if
L1 wants to emulate INIT for L2 then it needs to purge the VMCS fields that
are affected by INIT.

> +}
> +
>  static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
>  {
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
> @@ -3584,6 +3605,8 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
>  		test_bit(KVM_APIC_INIT, &apic->pending_events)) {
>  		if (block_nested_events)
>  			return -EBUSY;
> +		if (nested_vmx_check_pending_dbg(vcpu))
> +			nested_vmx_set_pending_dbg(vcpu);

I'd strongly prefer to open code the whole thing.  Alternatively, if there
are other events that will need similar logic, add a single helper to take
care of everything.

E.g.

		if (vcpu->arch.exception.pending &&
		    vcpu->arch.exception.nr == DB_VECTOR &&
		    vcpu->arch.exception.has_payload)
			vmcs_writel(GUEST_PENDING_DBG_EXCEPTIONS,
				    vcpu->arch.exception.payload);

or

                if (vcpu->arch.exception.pending &&
                    vcpu->arch.exception.nr == DB_VECTOR)
                        vmcs_writel(GUEST_PENDING_DBG_EXCEPTIONS,
                                    vcpu->arch.exception.payload);

or
		nested_vmx_update_guest_pending_dbg();

>  		clear_bit(KVM_APIC_INIT, &apic->pending_events);
>  		nested_vmx_vmexit(vcpu, EXIT_REASON_INIT_SIGNAL, 0, 0);
>  		return 0;

* Re: [PATCH v2 3/5] KVM: x86: Deliver exception payload on KVM_GET_VCPU_EVENTS
From: Sean Christopherson @ 2020-02-03 19:48 UTC
  To: Oliver Upton; +Cc: kvm, Paolo Bonzini, Jim Mattson, Peter Shier

On Tue, Jan 28, 2020 at 01:27:13AM -0800, Oliver Upton wrote:
> KVM doesn't utilize exception payloads by default, as this behavior
> diverges from the expectations of the userspace API. However, this
> constraint only applies if KVM is servicing a KVM_GET_VCPU_EVENTS ioctl
> before delivering the exception.
> 
> Use exception payloads unconditionally if the vcpu is in guest mode.

This sentence is super confusing.  It doesn't align with the code, which
is clearly handling "not in guest mode".  And KVM already uses payloads
unconditionally; it's only the deferring behavior that is changing.

> Deliver the exception payload before completing a KVM_GET_VCPU_EVENTS
> to ensure API compatibility.
> 
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
>  arch/x86/kvm/x86.c | 29 ++++++++++++++++-------------
>  1 file changed, 16 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 7a341c0c978a..9f080101618c 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -497,19 +497,7 @@ static void kvm_multiple_exception(struct kvm_vcpu *vcpu,
>  		vcpu->arch.exception.error_code = error_code;
>  		vcpu->arch.exception.has_payload = has_payload;
>  		vcpu->arch.exception.payload = payload;
> -		/*
> -		 * In guest mode, payload delivery should be deferred,
> -		 * so that the L1 hypervisor can intercept #PF before
> -		 * CR2 is modified (or intercept #DB before DR6 is
> -		 * modified under nVMX).  However, for ABI
> -		 * compatibility with KVM_GET_VCPU_EVENTS and
> -		 * KVM_SET_VCPU_EVENTS, we can't delay payload
> -		 * delivery unless userspace has enabled this
> -		 * functionality via the per-VM capability,
> -		 * KVM_CAP_EXCEPTION_PAYLOAD.
> -		 */
> -		if (!vcpu->kvm->arch.exception_payload_enabled ||
> -		    !is_guest_mode(vcpu))
> +		if (!is_guest_mode(vcpu))
>  			kvm_deliver_exception_payload(vcpu);
>  		return;
>  	}
> @@ -3790,6 +3778,21 @@ static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct kvm_vcpu *vcpu,
>  {
>  	process_nmi(vcpu);
>  
> +	/*
> +	 * In guest mode, payload delivery should be deferred,
> +	 * so that the L1 hypervisor can intercept #PF before
> +	 * CR2 is modified (or intercept #DB before DR6 is
> +	 * modified under nVMX).  However, for ABI
> +	 * compatibility with KVM_GET_VCPU_EVENTS and
> +	 * KVM_SET_VCPU_EVENTS, we can't delay payload
> +	 * delivery unless userspace has enabled this
> +	 * functionality via the per-VM capability,
> +	 * KVM_CAP_EXCEPTION_PAYLOAD.
> +	 */

This comment needs to be rewritten.  It makes sense in the context of
kvm_multiple_exception(), here it's just confusing.

> +	if (!vcpu->kvm->arch.exception_payload_enabled &&
> +	    vcpu->arch.exception.pending && vcpu->arch.exception.has_payload)
> +		kvm_deliver_exception_payload(vcpu);

Crushing CR2/DR6 just because userspace is retrieving info can't possibly
be correct.  If it somehow is correct then this needs a big fat comment.

What is the ABI compatibility issue?  E.g. can KVM simply hide the payload
info on KVM_GET_VCPU_EVENT and then drop it on KVM_SET_VCPU_EVENTS?

> +
>  	/*
>  	 * The API doesn't provide the instruction length for software
>  	 * exceptions, so don't report them. As long as the guest RIP

* Re: [PATCH v2 4/5] KVM: nVMX: Emulate MTF when performing instruction emulation
From: Sean Christopherson @ 2020-02-03 22:58 UTC
  To: Oliver Upton; +Cc: kvm, Paolo Bonzini, Jim Mattson, Peter Shier

On Tue, Jan 28, 2020 at 01:27:14AM -0800, Oliver Upton wrote:
> Since commit 5f3d45e7f282 ("kvm/x86: add support for
> MONITOR_TRAP_FLAG"), KVM has allowed an L1 guest to use the monitor trap
> flag processor-based execution control for its L2 guest. KVM simply
> forwards any MTF VM-exits to the L1 guest, which works for normal
> instruction execution.
> 
> However, when KVM needs to emulate an instruction on the behalf of an L2
> guest, the monitor trap flag is not emulated. Add the necessary logic to
> kvm_skip_emulated_instruction() to synthesize an MTF VM-exit to L1 upon
> instruction emulation for L2.
> 
> Fixes: 5f3d45e7f282 ("kvm/x86: add support for MONITOR_TRAP_FLAG")
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
> [...]
> +	/*
> +	 * Process non-debug exceptions first before MTF.
> +	 */
>  	if (vcpu->arch.exception.pending &&
> -		nested_vmx_check_exception(vcpu, &exit_qual)) {
> +	    !nested_vmx_check_pending_dbg(vcpu) &&

Oof.  So @has_payload is set %true only by single-step #DB and #PF, and by
#DBs injected from userspace.  Now I understand where the "pending_dbg"
name comes from.

The part that gets really confusing is that there's "pending" from KVM's
perspective, which can be any kind of #DB, e.g. DR7.GD and icebrk, and
pending from an architectural perspective, which is single-step #DB and
data #DBs, the latter of which isn't manually emulated by KVM (I think?).

Not sure if there's a better name than check_pending_dbg(), but either way
I think a function comment is in order.

> +	    nested_vmx_check_exception(vcpu, &exit_qual)) {
> +		if (block_nested_events)
> +			return -EBUSY;
> +		nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
> +		return 0;
> +	}
> [...]
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 802ba97ac7f2..5735d1a1af05 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1601,6 +1601,27 @@ static int skip_emulated_instruction(struct kvm_vcpu *vcpu)
>  	return 1;
>  }
>  
> +static void vmx_do_singlestep(struct kvm_vcpu *vcpu)
> +{
> +	struct vcpu_vmx *vmx;
> +
> +	if (!(is_guest_mode(vcpu) &&
> +	      nested_cpu_has_mtf(get_vmcs12(vcpu))))

Haven't followed all the paths, but does nested.mtf_pending need to be
cleared here in case L1 disabled MTF?

> +		return;
> +
> +	vmx = to_vmx(vcpu);
> +
> +	/*
> +	 * Per the SDM, MTF takes priority over debug-trap exceptions besides
> +	 * T-bit traps. As instruction emulation is completed (i.e. at the end
> +	 * of an instruction boundary), any #DB exception pending delivery must
> +	 * be a debug-trap. Record the pending MTF state to be delivered in
> +	 * vmx_check_nested_events().
> +	 */
> +	vmx->nested.mtf_pending = (!vcpu->arch.exception.pending ||
> +				   vcpu->arch.exception.nr == DB_VECTOR);
> +}
> [...]
> @@ -6658,9 +6663,7 @@ int kvm_skip_emulated_instruction(struct kvm_vcpu *vcpu)
>  	 * processor will not generate this exception after the instruction
>  	 * that sets the TF flag".
>  	 */
> -	if (unlikely(rflags & X86_EFLAGS_TF))
> -		r = kvm_vcpu_do_singlestep(vcpu);
> -	return r;
> +	return kvm_vcpu_do_singlestep(vcpu, rflags & X86_EFLAGS_TF);

The extra retpoline, i.e. ->do_singlestep(), can be avoided by handling
the kvm_skip_emulated_instruction() purely in VMX:

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 802ba97ac7f2..4e6373caea53 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1601,6 +1601,20 @@ static int skip_emulated_instruction(struct kvm_vcpu *vcpu)
        return 1;
 }

+static int vmx_skip_emulated_instruction(struct kvm_vcpu *vcpu)
+{
+       if (is_guest_mode(vcpu)) {
+               if (nested_cpu_has_mtf(get_vmcs12(vcpu)) &&
+                   (!vcpu->arch.exception.pending ||
+                    vcpu->arch.exception.nr == DB_VECTOR))
+                       to_vmx(vcpu)->nested.mtf_pending = true;
+               else
+                       to_vmx(vcpu)->nested.mtf_pending = false;
+       }
+
+       return skip_emulated_instruction(vcpu);
+}
+
 static void vmx_clear_hlt(struct kvm_vcpu *vcpu)
 {
        /*
@@ -7796,7 +7810,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {

        .run = vmx_vcpu_run,
        .handle_exit = vmx_handle_exit,
-       .skip_emulated_instruction = skip_emulated_instruction,
+       .skip_emulated_instruction = vmx_skip_emulated_instruction,
        .set_interrupt_shadow = vmx_set_interrupt_shadow,
        .get_interrupt_shadow = vmx_get_interrupt_shadow,
        .patch_hypercall = vmx_patch_hypercall,

>  }
>  EXPORT_SYMBOL_GPL(kvm_skip_emulated_instruction);
>  
> @@ -6876,8 +6879,8 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
>  		if (!ctxt->have_exception ||
>  		    exception_type(ctxt->exception.vector) == EXCPT_TRAP) {
>  			kvm_rip_write(vcpu, ctxt->eip);
> -			if (r && ctxt->tf)
> -				r = kvm_vcpu_do_singlestep(vcpu);
> +			if (r)
> +				r = kvm_vcpu_do_singlestep(vcpu, ctxt->tf);

Hrm.  I like the current code of calling do_singlestep() iff EFLAGS.TF=1.
The non-emulator case can be handled purely in VMX, per above.  Maybe do
something similar for the emulator with a generic "emulated instruction"
hook?  Not sure what to call it...

			kvm_x86_ops->update_emulated_instruction(vcpu);

>  			__kvm_set_rflags(vcpu, ctxt->eflags);
>  		}

* Re: [PATCH v2 2/5] KVM: nVMX: Handle pending #DB when injecting INIT VM-exit
From: Sean Christopherson @ 2020-02-03 23:00 UTC
  To: Oliver Upton; +Cc: kvm, Paolo Bonzini, Jim Mattson, Peter Shier

On Mon, Feb 03, 2020 at 11:13:30AM -0800, Sean Christopherson wrote:
> On Tue, Jan 28, 2020 at 01:27:12AM -0800, Oliver Upton wrote:
> > [...]
> > +static inline bool nested_vmx_check_pending_dbg(struct kvm_vcpu *vcpu)
> 
> Really dislike the name, partially because the code checks @has_payload and
> partially because the part, nested_vmx_set_pending_dbg() "sets" completely
> different state than this checks.
> 
> Checking has_payload may also be wrong, e.g. wouldn't it make sense to
> update GUEST_PENDING_DBG_EXCEPTIONS, even if we crush it with '0'?
> 
> > +{
> > +	return vcpu->arch.exception.nr == DB_VECTOR &&
> > +			vcpu->arch.exception.pending &&
> > +			vcpu->arch.exception.has_payload;
> > +}
> > +
> > +/*
> > + * If a higher priority VM-exit is delivered before a debug-trap, hardware will
> > + * set the 'pending debug exceptions' field appropriately for reinjection on the
> > + * next VM-entry.
> > + */
> > +static void nested_vmx_set_pending_dbg(struct kvm_vcpu *vcpu)
> > +{
> > +	vmcs_writel(GUEST_PENDING_DBG_EXCEPTIONS, vcpu->arch.exception.payload);
> > +	vcpu->arch.exception.has_payload = false;
> > +	vcpu->arch.exception.payload = 0;
> > +	vcpu->arch.exception.pending = false;
> > +	vcpu->arch.exception.injected = true;
> 
> This looks wrong.  The #DB hasn't been injected, KVM is simply emulating
> the side effect of the VMCS field being updated.  E.g. KVM will have
> different architecturally visible behavior depending on @has_payload.

My head is spinning trying to work through the #DB/MTF interactions.  I
think this ends up being a moot point because prepare_vmcs12() will purge
the pending exceptions.  If it is a moot point, then I'd prefer to not do
the explicit arch.exception updates so as to keep this similar to other
exceptions.

* Re: [PATCH v2 4/5] KVM: nVMX: Emulate MTF when performing instruction emulation
From: Oliver Upton @ 2020-02-06 10:42 UTC
  To: Sean Christopherson; +Cc: kvm, Paolo Bonzini, Jim Mattson, Peter Shier

On Mon, Feb 03, 2020 at 02:58:14PM -0800, Sean Christopherson wrote:
> On Tue, Jan 28, 2020 at 01:27:14AM -0800, Oliver Upton wrote:
> > Since commit 5f3d45e7f282 ("kvm/x86: add support for
> > MONITOR_TRAP_FLAG"), KVM has allowed an L1 guest to use the monitor trap
> > flag processor-based execution control for its L2 guest. KVM simply
> > forwards any MTF VM-exits to the L1 guest, which works for normal
> > instruction execution.
> > 
> > However, when KVM needs to emulate an instruction on the behalf of an L2
> > guest, the monitor trap flag is not emulated. Add the necessary logic to
> > kvm_skip_emulated_instruction() to synthesize an MTF VM-exit to L1 upon
> > instruction emulation for L2.
> > 
> > Fixes: 5f3d45e7f282 ("kvm/x86: add support for MONITOR_TRAP_FLAG")
> > Signed-off-by: Oliver Upton <oupton@google.com>
> > ---
> >  arch/x86/include/asm/kvm_host.h |  1 +
> >  arch/x86/include/uapi/asm/kvm.h |  1 +
> >  arch/x86/kvm/svm.c              |  1 +
> >  arch/x86/kvm/vmx/nested.c       | 37 ++++++++++++++++++++++++++++++++-
> >  arch/x86/kvm/vmx/nested.h       |  5 +++++
> >  arch/x86/kvm/vmx/vmx.c          | 22 ++++++++++++++++++++
> >  arch/x86/kvm/vmx/vmx.h          |  3 +++
> >  arch/x86/kvm/x86.c              | 15 +++++++------
> >  8 files changed, 78 insertions(+), 7 deletions(-)
> > 
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index 69e31dbdfdc2..e1061ebc1b4b 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -1103,6 +1103,7 @@ struct kvm_x86_ops {
> >  	int (*handle_exit)(struct kvm_vcpu *vcpu,
> >  		enum exit_fastpath_completion exit_fastpath);
> >  	int (*skip_emulated_instruction)(struct kvm_vcpu *vcpu);
> > +	void (*do_singlestep)(struct kvm_vcpu *vcpu);
> >  	void (*set_interrupt_shadow)(struct kvm_vcpu *vcpu, int mask);
> >  	u32 (*get_interrupt_shadow)(struct kvm_vcpu *vcpu);
> >  	void (*patch_hypercall)(struct kvm_vcpu *vcpu,
> > diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> > index 503d3f42da16..3f3f780c8c65 100644
> > --- a/arch/x86/include/uapi/asm/kvm.h
> > +++ b/arch/x86/include/uapi/asm/kvm.h
> > @@ -390,6 +390,7 @@ struct kvm_sync_regs {
> >  #define KVM_STATE_NESTED_GUEST_MODE	0x00000001
> >  #define KVM_STATE_NESTED_RUN_PENDING	0x00000002
> >  #define KVM_STATE_NESTED_EVMCS		0x00000004
> > +#define KVM_STATE_NESTED_MTF_PENDING	0x00000008
> >  
> >  #define KVM_STATE_NESTED_SMM_GUEST_MODE	0x00000001
> >  #define KVM_STATE_NESTED_SMM_VMXON	0x00000002
> > diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> > index 9dbb990c319a..3653e230d3d5 100644
> > --- a/arch/x86/kvm/svm.c
> > +++ b/arch/x86/kvm/svm.c
> > @@ -7316,6 +7316,7 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
> >  	.run = svm_vcpu_run,
> >  	.handle_exit = handle_exit,
> >  	.skip_emulated_instruction = skip_emulated_instruction,
> > +	.do_singlestep = NULL,
> >  	.set_interrupt_shadow = svm_set_interrupt_shadow,
> >  	.get_interrupt_shadow = svm_get_interrupt_shadow,
> >  	.patch_hypercall = svm_patch_hypercall,
> > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > index aba16599ca69..0de71b207b2a 100644
> > --- a/arch/x86/kvm/vmx/nested.c
> > +++ b/arch/x86/kvm/vmx/nested.c
> > @@ -3599,8 +3599,15 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
> >  	unsigned long exit_qual;
> >  	bool block_nested_events =
> >  	    vmx->nested.nested_run_pending || kvm_event_needs_reinjection(vcpu);
> > +	bool mtf_pending = vmx->nested.mtf_pending;
> >  	struct kvm_lapic *apic = vcpu->arch.apic;
> >  
> > +	/*
> > +	 * Clear the MTF state. If a higher priority VM-exit is delivered first,
> > +	 * this state is discarded.
> > +	 */
> > +	vmx->nested.mtf_pending = false;
> > +
> >  	if (lapic_in_kernel(vcpu) &&
> >  		test_bit(KVM_APIC_INIT, &apic->pending_events)) {
> >  		if (block_nested_events)
> > @@ -3612,8 +3619,30 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
> >  		return 0;
> >  	}
> >  
> > +	/*
> > +	 * Process exceptions that are not debug-traps before MTF.
> > +	 */
> >  	if (vcpu->arch.exception.pending &&
> > -		nested_vmx_check_exception(vcpu, &exit_qual)) {
> > +	    !nested_vmx_check_pending_dbg(vcpu) &&
> 
> Oof.  So @has_payload is set %true only by single-step #DB and #PF, and by
> #DBs injected from userspace.  Now I understand where the "pending_dbg"
> name comes from.
> 
> The part that gets really confusing is that there's "pending" from KVM's
> perspective, which can be any kind of #DB, e.g. DR7.GD and icebrk, and
> pending from an architectural perspective, which is single-step #DB and
> data #DBs, the latter of which isn't manually emulated by KVM (I think?).
> 
> Not sure if there's a better name than check_pending_dbg(), but either way
> I think a function comment is in order.
>

Completely agree. To somewhat disambiguate the overload of "pending",
I'll add a descriptive comment.
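
Something along these lines, wording subject to change:

/*
 * Returns true if a #DB that carries a payload, i.e. a debug-trap whose
 * payload is compatible with the 'pending debug exceptions' field, is
 * pending delivery.  Note that "pending" here refers to KVM's
 * queued-exception state, not the architectural notion of a pending
 * debug exception.
 */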

> > +	    nested_vmx_check_exception(vcpu, &exit_qual)) {
> > +		if (block_nested_events)
> > +			return -EBUSY;
> > +		nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
> > +		return 0;
> > +	}
> > +
> > +	if (mtf_pending) {
> > +		if (block_nested_events)
> > +			return -EBUSY;
> > +		if (nested_vmx_check_pending_dbg(vcpu))
> > +			nested_vmx_set_pending_dbg(vcpu);
> > +		nested_vmx_vmexit(vcpu, EXIT_REASON_MONITOR_TRAP_FLAG, 0, 0);
> > +		return 0;
> > +	}
> > +
> > +	if (vcpu->arch.exception.pending &&
> > +	    nested_vmx_check_pending_dbg(vcpu) &&
> > +	    nested_vmx_check_exception(vcpu, &exit_qual)) {
> >  		if (block_nested_events)
> >  			return -EBUSY;
> >  		nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
> > @@ -5705,6 +5734,9 @@ static int vmx_get_nested_state(struct kvm_vcpu *vcpu,
> >  
> >  			if (vmx->nested.nested_run_pending)
> >  				kvm_state.flags |= KVM_STATE_NESTED_RUN_PENDING;
> > +
> > +			if (vmx->nested.mtf_pending)
> > +				kvm_state.flags |= KVM_STATE_NESTED_MTF_PENDING;
> >  		}
> >  	}
> >  
> > @@ -5885,6 +5917,9 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
> >  	vmx->nested.nested_run_pending =
> >  		!!(kvm_state->flags & KVM_STATE_NESTED_RUN_PENDING);
> >  
> > +	vmx->nested.mtf_pending =
> > +		!!(kvm_state->flags & KVM_STATE_NESTED_MTF_PENDING);
> > +
> >  	ret = -EINVAL;
> >  	if (nested_cpu_has_shadow_vmcs(vmcs12) &&
> >  	    vmcs12->vmcs_link_pointer != -1ull) {
> > diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
> > index fc874d4ead0f..e12461776151 100644
> > --- a/arch/x86/kvm/vmx/nested.h
> > +++ b/arch/x86/kvm/vmx/nested.h
> > @@ -175,6 +175,11 @@ static inline bool nested_cpu_has_virtual_nmis(struct vmcs12 *vmcs12)
> >  	return vmcs12->pin_based_vm_exec_control & PIN_BASED_VIRTUAL_NMIS;
> >  }
> >  
> > +static inline int nested_cpu_has_mtf(struct vmcs12 *vmcs12)
> > +{
> > +	return nested_cpu_has(vmcs12, CPU_BASED_MONITOR_TRAP_FLAG);
> > +}
> > +
> >  static inline int nested_cpu_has_ept(struct vmcs12 *vmcs12)
> >  {
> >  	return nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENABLE_EPT);
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 802ba97ac7f2..5735d1a1af05 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -1601,6 +1601,27 @@ static int skip_emulated_instruction(struct kvm_vcpu *vcpu)
> >  	return 1;
> >  }
> >  
> > +static void vmx_do_singlestep(struct kvm_vcpu *vcpu)
> > +{
> > +	struct vcpu_vmx *vmx;
> > +
> > +	if (!(is_guest_mode(vcpu) &&
> > +	      nested_cpu_has_mtf(get_vmcs12(vcpu))))
> 
> Haven't followed all the paths, but does nested.mtf_pending need to be
> cleared here in case L1 disabled MTF?

Hmm. I'm unsure of a path that would allow the mtf_pending flag to be
raised but not lowered in vmx_check_nested_events(). Regardless, I can
lower it here should L1 change its MTF control.

> > +		return;
> > +
> > +	vmx = to_vmx(vcpu);
> > +
> > +	/*
> > +	 * Per the SDM, MTF takes priority over debug-trap exceptions besides
> > +	 * T-bit traps. As instruction emulation is completed (i.e. at the end
> > +	 * of an instruction boundary), any #DB exception pending delivery must
> > +	 * be a debug-trap. Record the pending MTF state to be delivered in
> > +	 * vmx_check_nested_events().
> > +	 */
> > +	vmx->nested.mtf_pending = (!vcpu->arch.exception.pending ||
> > +				   vcpu->arch.exception.nr == DB_VECTOR);
> > +}
> > +
> >  static void vmx_clear_hlt(struct kvm_vcpu *vcpu)
> >  {
> >  	/*
> > @@ -7797,6 +7818,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
> >  	.run = vmx_vcpu_run,
> >  	.handle_exit = vmx_handle_exit,
> >  	.skip_emulated_instruction = skip_emulated_instruction,
> > +	.do_singlestep = vmx_do_singlestep,
> >  	.set_interrupt_shadow = vmx_set_interrupt_shadow,
> >  	.get_interrupt_shadow = vmx_get_interrupt_shadow,
> >  	.patch_hypercall = vmx_patch_hypercall,
> > diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> > index a4f7f737c5d4..401e9ca23779 100644
> > --- a/arch/x86/kvm/vmx/vmx.h
> > +++ b/arch/x86/kvm/vmx/vmx.h
> > @@ -150,6 +150,9 @@ struct nested_vmx {
> >  	/* L2 must run next, and mustn't decide to exit to L1. */
> >  	bool nested_run_pending;
> >  
> > +	/* Pending MTF VM-exit into L1.  */
> > +	bool mtf_pending;
> > +
> >  	struct loaded_vmcs vmcs02;
> >  
> >  	/*
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 9f080101618c..e5c859f9b3bf 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -6626,10 +6626,15 @@ static int kvm_vcpu_check_hw_bp(unsigned long addr, u32 type, u32 dr7,
> >  	return dr6;
> >  }
> >  
> > -static int kvm_vcpu_do_singlestep(struct kvm_vcpu *vcpu)
> > +static int kvm_vcpu_do_singlestep(struct kvm_vcpu *vcpu, bool tf)
> >  {
> >  	struct kvm_run *kvm_run = vcpu->run;
> >  
> > +	if (kvm_x86_ops->do_singlestep)
> > +		kvm_x86_ops->do_singlestep(vcpu);
> > +	if (!tf)
> > +		return 1;
> > +
> >  	if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) {
> >  		kvm_run->debug.arch.dr6 = DR6_BS | DR6_FIXED_1 | DR6_RTM;
> >  		kvm_run->debug.arch.pc = vcpu->arch.singlestep_rip;
> > @@ -6658,9 +6663,7 @@ int kvm_skip_emulated_instruction(struct kvm_vcpu *vcpu)
> >  	 * processor will not generate this exception after the instruction
> >  	 * that sets the TF flag".
> >  	 */
> > -	if (unlikely(rflags & X86_EFLAGS_TF))
> > -		r = kvm_vcpu_do_singlestep(vcpu);
> > -	return r;
> > +	return kvm_vcpu_do_singlestep(vcpu, rflags & X86_EFLAGS_TF);
> 
> The extra retpoline, i.e. ->do_singlestep(), can be avoided by handling
> kvm_skip_emulated_instruction() purely in VMX:
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 802ba97ac7f2..4e6373caea53 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1601,6 +1601,20 @@ static int skip_emulated_instruction(struct kvm_vcpu *vcpu)
>         return 1;
>  }
> 
> +static int vmx_skip_emulated_instruction(struct kvm_vcpu *vcpu)
> +{
> +       if (is_guest_mode(vcpu)) {
> +               if (nested_cpu_has_mtf(get_vmcs12(vcpu)) &&
> +                   (!vcpu->arch.exception.pending ||
> +                    vcpu->arch.exception.nr == DB_VECTOR))
> +                       to_vmx(vcpu)->nested.mtf_pending = true;
> +               else
> +                       to_vmx(vcpu)->nested.mtf_pending = false;
> +       }
> +
> +       return skip_emulated_instruction(vcpu);
> +}
> +
>  static void vmx_clear_hlt(struct kvm_vcpu *vcpu)
>  {
>         /*
> @@ -7796,7 +7810,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
> 
>         .run = vmx_vcpu_run,
>         .handle_exit = vmx_handle_exit,
> -       .skip_emulated_instruction = skip_emulated_instruction,
> +       .skip_emulated_instruction = vmx_skip_emulated_instruction,
>         .set_interrupt_shadow = vmx_set_interrupt_shadow,
>         .get_interrupt_shadow = vmx_get_interrupt_shadow,
>         .patch_hypercall = vmx_patch_hypercall,
> 
> >  }

Awesome, thank you for the suggestion. I'll work this in :)

> >  EXPORT_SYMBOL_GPL(kvm_skip_emulated_instruction);
> >  
> > @@ -6876,8 +6879,8 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
> >  		if (!ctxt->have_exception ||
> >  		    exception_type(ctxt->exception.vector) == EXCPT_TRAP) {
> >  			kvm_rip_write(vcpu, ctxt->eip);
> > -			if (r && ctxt->tf)
> > -				r = kvm_vcpu_do_singlestep(vcpu);
> > +			if (r)
> > +				r = kvm_vcpu_do_singlestep(vcpu, ctxt->tf);
> 
> Hrm.  I like that the current code calls do_singlestep() iff EFLAGS.TF=1.
> The non-emulator case can be handled purely in VMX, per above.  Maybe do
> something similar for the emulator with a generic "emulated instruction"
> hook?  Not sure what to call it...
> 
> 			kvm_x86_ops->update_emulated_instruction(vcpu);

Yeah, agreed. I'll add in a hook for the emulator to do this rather than
reworking do_singlestep().
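
Roughly the following, building on your vmx_skip_emulated_instruction()
sketch above (final form may differ):

static void vmx_update_emulated_instruction(struct kvm_vcpu *vcpu)
{
	struct vcpu_vmx *vmx = to_vmx(vcpu);

	if (!is_guest_mode(vcpu))
		return;

	/*
	 * At an instruction boundary, any #DB still pending delivery must
	 * be a debug-trap, which MTF takes priority over.  Record the
	 * pending MTF state here; the VM-exit itself is delivered from
	 * vmx_check_nested_events() to respect relative priorities.
	 */
	vmx->nested.mtf_pending = nested_cpu_has_mtf(get_vmcs12(vcpu)) &&
				  (!vcpu->arch.exception.pending ||
				   vcpu->arch.exception.nr == DB_VECTOR);
}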

> >  			__kvm_set_rflags(vcpu, ctxt->eflags);
> >  		}
> >  
> > -- 
> > 2.25.0.341.g760bfbb309-goog

Thanks Sean!

* Re: [PATCH v2 4/5] KVM: nVMX: Emulate MTF when performing instruction emulation
  2020-01-28  9:27 ` [PATCH v2 4/5] KVM: nVMX: Emulate MTF when performing instruction emulation Oliver Upton
  2020-02-03 22:58   ` Sean Christopherson
@ 2021-08-13  0:23   ` Jim Mattson
  2021-08-13 16:35     ` Sean Christopherson
  1 sibling, 1 reply; 15+ messages in thread
From: Jim Mattson @ 2021-08-13  0:23 UTC (permalink / raw)
  To: Oliver Upton; +Cc: kvm, Paolo Bonzini, Peter Shier, Sean Christopherson

On Tue, Jan 28, 2020 at 1:27 AM Oliver Upton <oupton@google.com> wrote:
>
> Since commit 5f3d45e7f282 ("kvm/x86: add support for
> MONITOR_TRAP_FLAG"), KVM has allowed an L1 guest to use the monitor trap
> flag processor-based execution control for its L2 guest. KVM simply
> forwards any MTF VM-exits to the L1 guest, which works for normal
> instruction execution.
>
> However, when KVM needs to emulate an instruction on behalf of an L2
> guest, the monitor trap flag is not emulated. Add the necessary logic to
> kvm_skip_emulated_instruction() to synthesize an MTF VM-exit to L1 upon
> instruction emulation for L2.
>
> Fixes: 5f3d45e7f282 ("kvm/x86: add support for MONITOR_TRAP_FLAG")
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
>  arch/x86/include/asm/kvm_host.h |  1 +
>  arch/x86/include/uapi/asm/kvm.h |  1 +
>  arch/x86/kvm/svm.c              |  1 +
>  arch/x86/kvm/vmx/nested.c       | 37 ++++++++++++++++++++++++++++++++-
>  arch/x86/kvm/vmx/nested.h       |  5 +++++
>  arch/x86/kvm/vmx/vmx.c          | 22 ++++++++++++++++++++
>  arch/x86/kvm/vmx/vmx.h          |  3 +++
>  arch/x86/kvm/x86.c              | 15 +++++++------
>  8 files changed, 78 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 69e31dbdfdc2..e1061ebc1b4b 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1103,6 +1103,7 @@ struct kvm_x86_ops {
>         int (*handle_exit)(struct kvm_vcpu *vcpu,
>                 enum exit_fastpath_completion exit_fastpath);
>         int (*skip_emulated_instruction)(struct kvm_vcpu *vcpu);
> +       void (*do_singlestep)(struct kvm_vcpu *vcpu);
>         void (*set_interrupt_shadow)(struct kvm_vcpu *vcpu, int mask);
>         u32 (*get_interrupt_shadow)(struct kvm_vcpu *vcpu);
>         void (*patch_hypercall)(struct kvm_vcpu *vcpu,
> diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> index 503d3f42da16..3f3f780c8c65 100644
> --- a/arch/x86/include/uapi/asm/kvm.h
> +++ b/arch/x86/include/uapi/asm/kvm.h
> @@ -390,6 +390,7 @@ struct kvm_sync_regs {
>  #define KVM_STATE_NESTED_GUEST_MODE    0x00000001
>  #define KVM_STATE_NESTED_RUN_PENDING   0x00000002
>  #define KVM_STATE_NESTED_EVMCS         0x00000004
> +#define KVM_STATE_NESTED_MTF_PENDING   0x00000008

Maybe I don't understand the distinction, but shouldn't this new flag
have a KVM_STATE_NESTED_VMX prefix and live with
KVM_STATE_VMX_PREEMPTION_TIMER_DEADLINE, below?

>
>  #define KVM_STATE_NESTED_SMM_GUEST_MODE        0x00000001
>  #define KVM_STATE_NESTED_SMM_VMXON     0x00000002

* Re: [PATCH v2 4/5] KVM: nVMX: Emulate MTF when performing instruction emulation
  2021-08-13  0:23   ` Jim Mattson
@ 2021-08-13 16:35     ` Sean Christopherson
  2021-08-13 17:03       ` Jim Mattson
  0 siblings, 1 reply; 15+ messages in thread
From: Sean Christopherson @ 2021-08-13 16:35 UTC (permalink / raw)
  To: Jim Mattson; +Cc: Oliver Upton, kvm, Paolo Bonzini, Peter Shier

On Thu, Aug 12, 2021, Jim Mattson wrote:
> On Tue, Jan 28, 2020 at 1:27 AM Oliver Upton <oupton@google.com> wrote:
> >
> > Since commit 5f3d45e7f282 ("kvm/x86: add support for
> > MONITOR_TRAP_FLAG"), KVM has allowed an L1 guest to use the monitor trap
> > flag processor-based execution control for its L2 guest. KVM simply
> > forwards any MTF VM-exits to the L1 guest, which works for normal
> > instruction execution.
> >
> > However, when KVM needs to emulate an instruction on behalf of an L2
> > guest, the monitor trap flag is not emulated. Add the necessary logic to
> > kvm_skip_emulated_instruction() to synthesize an MTF VM-exit to L1 upon
> > instruction emulation for L2.
> >
> > Fixes: 5f3d45e7f282 ("kvm/x86: add support for MONITOR_TRAP_FLAG")
> > Signed-off-by: Oliver Upton <oupton@google.com>
> > ---

...

> > diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> > index 503d3f42da16..3f3f780c8c65 100644
> > --- a/arch/x86/include/uapi/asm/kvm.h
> > +++ b/arch/x86/include/uapi/asm/kvm.h
> > @@ -390,6 +390,7 @@ struct kvm_sync_regs {
> >  #define KVM_STATE_NESTED_GUEST_MODE    0x00000001
> >  #define KVM_STATE_NESTED_RUN_PENDING   0x00000002
> >  #define KVM_STATE_NESTED_EVMCS         0x00000004
> > +#define KVM_STATE_NESTED_MTF_PENDING   0x00000008
> 
> Maybe I don't understand the distinction, but shouldn't this new flag
> have a KVM_STATE_NESTED_VMX prefix and live with
> KVM_STATE_VMX_PREEMPTION_TIMER_DEADLINE, below?

> That does seem to be the case; it seems highly unlikely SVM will add MTF.  And
> SVM's KVM_STATE_NESTED_GIF_SET should have been SVM-specific, but
> kvm_svm_nested_state_hdr doesn't even reserve a flags field :-/
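
For reference, the relevant uapi headers look roughly like this today
(trimmed, see arch/x86/include/uapi/asm/kvm.h for the real thing):

struct kvm_vmx_nested_state_hdr {
	__u64 vmxon_pa;
	__u64 vmcs12_pa;

	struct {
		__u16 flags;
	} smm;

	__u16 pad;

	__u32 flags;	/* VMX-specific, e.g. the preemption timer flag */
	__u64 preemption_timer_deadline;
};

struct kvm_svm_nested_state_hdr {
	__u64 vmcb_pa;	/* no flags field to speak of */
};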

> >  #define KVM_STATE_NESTED_SMM_GUEST_MODE        0x00000001
> >  #define KVM_STATE_NESTED_SMM_VMXON     0x00000002

* Re: [PATCH v2 4/5] KVM: nVMX: Emulate MTF when performing instruction emulation
  2021-08-13 16:35     ` Sean Christopherson
@ 2021-08-13 17:03       ` Jim Mattson
  0 siblings, 0 replies; 15+ messages in thread
From: Jim Mattson @ 2021-08-13 17:03 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Oliver Upton, kvm, Paolo Bonzini, Peter Shier

On Fri, Aug 13, 2021 at 9:35 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Thu, Aug 12, 2021, Jim Mattson wrote:
> > On Tue, Jan 28, 2020 at 1:27 AM Oliver Upton <oupton@google.com> wrote:
> > >
> > > Since commit 5f3d45e7f282 ("kvm/x86: add support for
> > > MONITOR_TRAP_FLAG"), KVM has allowed an L1 guest to use the monitor trap
> > > flag processor-based execution control for its L2 guest. KVM simply
> > > forwards any MTF VM-exits to the L1 guest, which works for normal
> > > instruction execution.
> > >
> > > However, when KVM needs to emulate an instruction on behalf of an L2
> > > guest, the monitor trap flag is not emulated. Add the necessary logic to
> > > kvm_skip_emulated_instruction() to synthesize an MTF VM-exit to L1 upon
> > > instruction emulation for L2.
> > >
> > > Fixes: 5f3d45e7f282 ("kvm/x86: add support for MONITOR_TRAP_FLAG")
> > > Signed-off-by: Oliver Upton <oupton@google.com>
> > > ---
>
> ...
>
> > > diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> > > index 503d3f42da16..3f3f780c8c65 100644
> > > --- a/arch/x86/include/uapi/asm/kvm.h
> > > +++ b/arch/x86/include/uapi/asm/kvm.h
> > > @@ -390,6 +390,7 @@ struct kvm_sync_regs {
> > >  #define KVM_STATE_NESTED_GUEST_MODE    0x00000001
> > >  #define KVM_STATE_NESTED_RUN_PENDING   0x00000002
> > >  #define KVM_STATE_NESTED_EVMCS         0x00000004
> > > +#define KVM_STATE_NESTED_MTF_PENDING   0x00000008
> >
> > Maybe I don't understand the distinction, but shouldn't this new flag
> > have a KVM_STATE_NESTED_VMX prefix and live with
> > KVM_STATE_VMX_PREEMPTION_TIMER_DEADLINE, below?
>
> That does seem to be the case, seems highly unlikely SVM will add MTF.  And SVM's
> KVM_STATE_NESTED_GIF_SET should have been SVM specific, but kvm_svm_nested_state_hdr
> doesn't even reserve a flags field :-/

APIs are hard.

Thread overview: 15+ messages
2020-01-28  9:27 [PATCH v2 0/5] Handle monitor trap flag during instruction emulation Oliver Upton
2020-01-28  9:27 ` [PATCH v2 1/5] KVM: x86: Mask off reserved bit from #DB exception payload Oliver Upton
2020-01-28  9:27 ` [PATCH v2 2/5] KVM: nVMX: Handle pending #DB when injecting INIT VM-exit Oliver Upton
2020-02-03 19:13   ` Sean Christopherson
2020-02-03 23:00     ` Sean Christopherson
2020-01-28  9:27 ` [PATCH v2 3/5] KVM: x86: Deliver exception payload on KVM_GET_VCPU_EVENTS Oliver Upton
2020-02-03 19:48   ` Sean Christopherson
2020-01-28  9:27 ` [PATCH v2 4/5] KVM: nVMX: Emulate MTF when performing instruction emulation Oliver Upton
2020-02-03 22:58   ` Sean Christopherson
2020-02-06 10:42     ` Oliver Upton
2021-08-13  0:23   ` Jim Mattson
2021-08-13 16:35     ` Sean Christopherson
2021-08-13 17:03       ` Jim Mattson
2020-01-28  9:27 ` [kvm-unit-tests PATCH v2 5/5] x86: VMX: Add tests for monitor trap flag Oliver Upton
2020-01-28  9:39 ` [PATCH v2 0/5] Handle monitor trap flag during instruction emulation Oliver Upton
