* [PATCH v3 0/5] Handle monitor trap flag during instruction emulation
From: Oliver Upton @ 2020-02-07 10:36 UTC
  To: kvm
  Cc: Paolo Bonzini, Jim Mattson, Peter Shier, Sean Christopherson,
	Oliver Upton

v1: http://lore.kernel.org/r/20200113221053.22053-1-oupton@google.com
v2: http://lore.kernel.org/r/20200128092715.69429-1-oupton@google.com

v1 => v2:
 - Don't split the #DB delivery by vendors. Unconditionally injecting
   #DB payloads into the 'pending debug exceptions' field will cause KVM
   to get stuck in a loop. Per the SDM, when hardware injects an event
   resulting from this field's value, it is checked against the
   exception interception bitmap.
 - Address Sean's comments by injecting the VM-exit into L1 from
   vmx_check_nested_events().
 - Added fix for nested INIT VM-exits + 'pending debug exceptions' field
   as it was noticed in implementing v2.
 - Drop Peter + Jim's Reviewed-by tags, as the patch set has changed
   since v1.

v2 => v3:
 - Merge the check/set_pending_dbg helpers into a single helper,
   vmx_update_pending_dbg(). Add clarifying comment to this helper.
 - Rewrite commit message, descriptive comment for change in 3/5 to
   explicitly describe the reason for mutating payload delivery
   behavior.
 - Undo the changes to kvm_vcpu_do_singlestep(). Instead, add a new hook
   to call for 'full' instruction emulation + 'fast' emulation.

KVM already provides guests the ability to use the 'monitor trap flag'
VM-execution control. Support for this flag is provided by the fact that
KVM unconditionally forwards MTF VM-exits to the guest (if requested),
as KVM doesn't utilize MTF. While this provides support during hardware
instruction execution, it is insufficient for instruction emulation.

Should L0 emulate an instruction on behalf of L2, L0 should also
synthesize an MTF VM-exit into L1 if the control is set.
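
Conceptually (a hedged sketch, not code from this series;
queue_mtf_vmexit_to_l1() is a hypothetical stand-in for the machinery
the fourth patch adds):

	/*
	 * At the instruction boundary after L0 completes emulation of
	 * an L2 instruction, honor L1's 'monitor trap flag' setting.
	 * The helper below is hypothetical; the fourth patch realizes
	 * this via vmx_update_emulated_instruction() and
	 * vmx_check_nested_events().
	 */
	if (is_guest_mode(vcpu) && nested_cpu_has_mtf(get_vmcs12(vcpu)))
		queue_mtf_vmexit_to_l1(vcpu);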

The first patch corrects a nuanced difference between the definitions
of the #DB exception payload field and the DR6 register. Since the
payload field is defined as compatible with the 'pending debug
exceptions' VMCS field, mask off bit 12 (defined in that VMCS field but
reserved in DR6) when applying the payload to DR6.
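
Concretely, the fix amounts to the following (a simplified sketch; the
series' actual code also accounts for the inverted polarity of
DR6.RTM):

	/*
	 * The payload is laid out like the 'pending debug exceptions'
	 * VMCS field. Bit 12 ('enabled breakpoint') exists only in
	 * that field and is reserved (must be zero) in DR6, so clear
	 * it when folding the payload into DR6.
	 */
	vcpu->arch.dr6 |= payload;
	vcpu->arch.dr6 &= ~BIT(12);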

The second patch sets the 'pending debug exceptions' VMCS field when
delivering an INIT signal VM-exit to L1, as described in the SDM. This
patch also introduces helpers for setting the 'pending debug exceptions'
VMCS field.

The third patch massages KVM's handling of exception payloads with
regard to API compatibility. Rather than immediately delivering the
payload when userspace hasn't opted in, opportunistically defer the
payload and deliver it before completing a KVM_GET_VCPU_EVENTS ioctl.
This maintains API compatibility whilst correcting #DB behavior with
regard to higher-priority VM-exit events.
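
For context, a minimal userspace sketch of the opt-in (vm_fd is an
open KVM VM file descriptor; includes and error handling are
simplified, and the snippet is illustrative rather than part of the
series):

	/* Opt the VM in to deferred exception payloads. */
	struct kvm_enable_cap cap = {
		.cap = KVM_CAP_EXCEPTION_PAYLOAD,
		.args[0] = 1,	/* enable */
	};

	if (ioctl(vm_fd, KVM_ENABLE_CAP, &cap))
		err(1, "KVM_ENABLE_CAP(KVM_CAP_EXCEPTION_PAYLOAD)");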

The fourth patch introduces the MTF implementation for emulated
instructions. Identify whether an MTF VM-exit is due at the instruction
boundary once emulation completes (via the new
update_emulated_instruction() hook), but only deliver this VM-exit from
vmx_check_nested_events() to respect its priority relative to other
VM-exits. Since this augments the nested state, introduce a new flag
for (de)serialization.
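
To sketch how a VMM observes the new flag through KVM_GET_NESTED_STATE
(a hedged illustration; the 16 KiB buffer size is an arbitrary
assumption, and includes/error handling are elided):

	struct kvm_nested_state *state = calloc(1, 16384);

	/* 'size' tells KVM how large the caller's buffer is. */
	state->size = 16384;
	if (!ioctl(vcpu_fd, KVM_GET_NESTED_STATE, state) &&
	    (state->flags & KVM_STATE_NESTED_MTF_PENDING))
		printf("MTF VM-exit into L1 is pending delivery\n");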

The last patch adds tests to kvm-unit-tests to assert the correctness
of MTF under several conditions (concurrent #DB trap, #DB fault, etc.).
These tests pass under virtualization with this series applied, as well
as on bare metal.
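
For reference, the new test can be run on its own with something like
'./x86/run x86/vmx.flat -cpu host,+vmx -append vmx_mtf_test' (the
runner path and QEMU flags are assumptions that vary by kvm-unit-tests
checkout and host).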

Based on commit 2c2787938512 ("KVM: selftests: Stop memslot creation in
KVM internal memslot region").

Oliver Upton (4):
  KVM: x86: Mask off reserved bit from #DB exception payload
  KVM: nVMX: Handle pending #DB when injecting INIT VM-exit
  KVM: x86: Deliver exception payload on KVM_GET_VCPU_EVENTS
  KVM: nVMX: Emulate MTF when performing instruction emulation

 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/include/uapi/asm/kvm.h |  1 +
 arch/x86/kvm/svm.c              |  1 +
 arch/x86/kvm/vmx/nested.c       | 54 ++++++++++++++++++++++++++++++++-
 arch/x86/kvm/vmx/nested.h       |  5 +++
 arch/x86/kvm/vmx/vmx.c          | 37 +++++++++++++++++++++-
 arch/x86/kvm/vmx/vmx.h          |  3 ++
 arch/x86/kvm/x86.c              | 39 ++++++++++++++++--------
 8 files changed, 126 insertions(+), 15 deletions(-)

-- 
2.25.0.341.g760bfbb309-goog



* [PATCH v3 1/5] KVM: x86: Mask off reserved bit from #DB exception payload
From: Oliver Upton @ 2020-02-07 10:36 UTC
  To: kvm
  Cc: Paolo Bonzini, Jim Mattson, Peter Shier, Sean Christopherson,
	Oliver Upton

KVM defines the #DB payload as compatible with the 'pending debug
exceptions' field under VMX, not DR6. Mask off bit 12 when applying the
payload to DR6, as it is reserved on DR6 but not the 'pending debug
exceptions' field.

Fixes: f10c729ff965 ("kvm: vmx: Defer setting of DR6 until #DB delivery")
Signed-off-by: Oliver Upton <oupton@google.com>
---
 arch/x86/kvm/x86.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fbabb2f06273..95b753dab207 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -438,6 +438,14 @@ void kvm_deliver_exception_payload(struct kvm_vcpu *vcpu)
 		 * for #DB exceptions under VMX.
 		 */
 		vcpu->arch.dr6 ^= payload & DR6_RTM;
+
+		/*
+		 * The #DB payload is defined as compatible with the 'pending
+		 * debug exceptions' field under VMX, not DR6. While bit 12 is
+		 * defined in the 'pending debug exceptions' field (enabled
+		 * breakpoint), it is reserved and must be zero in DR6.
+		 */
+		vcpu->arch.dr6 &= ~BIT(12);
 		break;
 	case PF_VECTOR:
 		vcpu->arch.cr2 = payload;
-- 
2.25.0.341.g760bfbb309-goog



* [PATCH v3 2/5] KVM: nVMX: Handle pending #DB when injecting INIT VM-exit
From: Oliver Upton @ 2020-02-07 10:36 UTC
  To: kvm
  Cc: Paolo Bonzini, Jim Mattson, Peter Shier, Sean Christopherson,
	Oliver Upton

SDM 27.3.4 states that the 'pending debug exceptions' VMCS field will
be populated if a VM-exit caused by an INIT signal takes priority over a
debug-trap. Emulate this behavior when synthesizing an INIT signal
VM-exit into L1.

Fixes: 4b9852f4f389 ("KVM: x86: Fix INIT signal handling in various CPU states")
Signed-off-by: Oliver Upton <oupton@google.com>
---
 arch/x86/kvm/vmx/nested.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 657c2eda357c..1586aaae3a6f 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3575,6 +3575,33 @@ static void nested_vmx_inject_exception_vmexit(struct kvm_vcpu *vcpu,
 	nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI, intr_info, exit_qual);
 }
 
+/*
+ * Returns true if a debug trap is pending delivery.
+ *
+ * In KVM, debug traps bear an exception payload. As such, the class of a #DB
+ * exception may be inferred from the presence of an exception payload.
+ */
+static inline bool vmx_pending_dbg_trap(struct kvm_vcpu *vcpu)
+{
+	return vcpu->arch.exception.pending &&
+			vcpu->arch.exception.nr == DB_VECTOR &&
+			vcpu->arch.exception.payload;
+}
+
+/*
+ * Certain VM-exits set the 'pending debug exceptions' field to indicate a
+ * recognized #DB (data or single-step) that has yet to be delivered. Since KVM
+ * represents these debug traps with a payload that is said to be compatible
+ * with the 'pending debug exceptions' field, write the payload to the VMCS
+ * field if a VM-exit is delivered before the debug trap.
+ */
+static void nested_vmx_update_pending_dbg(struct kvm_vcpu *vcpu)
+{
+	if (vmx_pending_dbg_trap(vcpu))
+		vmcs_writel(GUEST_PENDING_DBG_EXCEPTIONS,
+			    vcpu->arch.exception.payload);
+}
+
 static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -3587,6 +3614,7 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
 		test_bit(KVM_APIC_INIT, &apic->pending_events)) {
 		if (block_nested_events)
 			return -EBUSY;
+		nested_vmx_update_pending_dbg(vcpu);
 		clear_bit(KVM_APIC_INIT, &apic->pending_events);
 		nested_vmx_vmexit(vcpu, EXIT_REASON_INIT_SIGNAL, 0, 0);
 		return 0;
-- 
2.25.0.341.g760bfbb309-goog



* [PATCH v3 3/5] KVM: x86: Deliver exception payload on KVM_GET_VCPU_EVENTS
From: Oliver Upton @ 2020-02-07 10:36 UTC
  To: kvm
  Cc: Paolo Bonzini, Jim Mattson, Peter Shier, Sean Christopherson,
	Oliver Upton

KVM allows the deferral of exception payloads when a vCPU is in guest
mode to allow the L1 hypervisor to intercept certain events (#PF, #DB)
before register state has been modified. However, this behavior is
incompatible with the KVM_{GET,SET}_VCPU_EVENTS ABI, as userspace
expects register state to have been immediately modified. Userspace may
opt in to the payload deferral behavior with the
KVM_CAP_EXCEPTION_PAYLOAD per-VM capability. As such,
kvm_multiple_exception() will immediately manipulate guest registers if
the capability hasn't been requested.

Since the deferral is only necessary if a userspace ioctl were to be
serviced at the same time as a payload-bearing exception is recognized,
this
behavior can be relaxed. Instead, opportunistically defer the payload
from kvm_multiple_exception() and deliver the payload before completing
a KVM_GET_VCPU_EVENTS ioctl.

Signed-off-by: Oliver Upton <oupton@google.com>
---
 arch/x86/kvm/x86.c | 29 ++++++++++++++++-------------
 1 file changed, 16 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 95b753dab207..4d3310df1758 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -498,19 +498,7 @@ static void kvm_multiple_exception(struct kvm_vcpu *vcpu,
 		vcpu->arch.exception.error_code = error_code;
 		vcpu->arch.exception.has_payload = has_payload;
 		vcpu->arch.exception.payload = payload;
-		/*
-		 * In guest mode, payload delivery should be deferred,
-		 * so that the L1 hypervisor can intercept #PF before
-		 * CR2 is modified (or intercept #DB before DR6 is
-		 * modified under nVMX).  However, for ABI
-		 * compatibility with KVM_GET_VCPU_EVENTS and
-		 * KVM_SET_VCPU_EVENTS, we can't delay payload
-		 * delivery unless userspace has enabled this
-		 * functionality via the per-VM capability,
-		 * KVM_CAP_EXCEPTION_PAYLOAD.
-		 */
-		if (!vcpu->kvm->arch.exception_payload_enabled ||
-		    !is_guest_mode(vcpu))
+		if (!is_guest_mode(vcpu))
 			kvm_deliver_exception_payload(vcpu);
 		return;
 	}
@@ -3803,6 +3791,21 @@ static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct kvm_vcpu *vcpu,
 {
 	process_nmi(vcpu);
 
+	/*
+	 * In guest mode, payload delivery should be deferred,
+	 * so that the L1 hypervisor can intercept #PF before
+	 * CR2 is modified (or intercept #DB before DR6 is
+	 * modified under nVMX). However, unless the per-VM capability,
+	 * KVM_CAP_EXCEPTION_PAYLOAD, is set, the payload must not remain
+	 * deferred across a KVM_GET_VCPU_EVENTS ioctl. Since KVM now
+	 * opportunistically defers the exception payload, deliver it
+	 * here if the capability hasn't been requested, before
+	 * processing KVM_GET_VCPU_EVENTS.
+	 */
+	if (!vcpu->kvm->arch.exception_payload_enabled &&
+	    vcpu->arch.exception.pending && vcpu->arch.exception.has_payload)
+		kvm_deliver_exception_payload(vcpu);
+
 	/*
 	 * The API doesn't provide the instruction length for software
 	 * exceptions, so don't report them. As long as the guest RIP
-- 
2.25.0.341.g760bfbb309-goog



* [PATCH v3 4/5] KVM: nVMX: Emulate MTF when performing instruction emulation
From: Oliver Upton @ 2020-02-07 10:36 UTC
  To: kvm
  Cc: Paolo Bonzini, Jim Mattson, Peter Shier, Sean Christopherson,
	Oliver Upton

Since commit 5f3d45e7f282 ("kvm/x86: add support for
MONITOR_TRAP_FLAG"), KVM has allowed an L1 guest to use the monitor trap
flag processor-based execution control for its L2 guest. KVM simply
forwards any MTF VM-exits to the L1 guest, which works for normal
instruction execution.

However, when KVM needs to emulate an instruction on behalf of an L2
guest, the monitor trap flag is not emulated. Add the necessary logic to
kvm_skip_emulated_instruction() to synthesize an MTF VM-exit to L1 upon
instruction emulation for L2.

Fixes: 5f3d45e7f282 ("kvm/x86: add support for MONITOR_TRAP_FLAG")
Signed-off-by: Oliver Upton <oupton@google.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/include/uapi/asm/kvm.h |  1 +
 arch/x86/kvm/svm.c              |  1 +
 arch/x86/kvm/vmx/nested.c       | 35 ++++++++++++++++++++++++++++++-
 arch/x86/kvm/vmx/nested.h       |  5 +++++
 arch/x86/kvm/vmx/vmx.c          | 37 ++++++++++++++++++++++++++++++++-
 arch/x86/kvm/vmx/vmx.h          |  3 +++
 arch/x86/kvm/x86.c              |  2 ++
 8 files changed, 83 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4dffbc10d3f8..b930c548674c 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1112,6 +1112,7 @@ struct kvm_x86_ops {
 	int (*handle_exit)(struct kvm_vcpu *vcpu,
 		enum exit_fastpath_completion exit_fastpath);
 	int (*skip_emulated_instruction)(struct kvm_vcpu *vcpu);
+	void (*update_emulated_instruction)(struct kvm_vcpu *vcpu);
 	void (*set_interrupt_shadow)(struct kvm_vcpu *vcpu, int mask);
 	u32 (*get_interrupt_shadow)(struct kvm_vcpu *vcpu);
 	void (*patch_hypercall)(struct kvm_vcpu *vcpu,
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 503d3f42da16..3f3f780c8c65 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -390,6 +390,7 @@ struct kvm_sync_regs {
 #define KVM_STATE_NESTED_GUEST_MODE	0x00000001
 #define KVM_STATE_NESTED_RUN_PENDING	0x00000002
 #define KVM_STATE_NESTED_EVMCS		0x00000004
+#define KVM_STATE_NESTED_MTF_PENDING	0x00000008
 
 #define KVM_STATE_NESTED_SMM_GUEST_MODE	0x00000001
 #define KVM_STATE_NESTED_SMM_VMXON	0x00000002
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index a3e32d61d60c..fa923b5f8158 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -7433,6 +7433,7 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
 	.run = svm_vcpu_run,
 	.handle_exit = handle_exit,
 	.skip_emulated_instruction = skip_emulated_instruction,
+	.update_emulated_instruction = NULL,
 	.set_interrupt_shadow = svm_set_interrupt_shadow,
 	.get_interrupt_shadow = svm_get_interrupt_shadow,
 	.patch_hypercall = svm_patch_hypercall,
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 1586aaae3a6f..93441821b24e 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3608,8 +3608,15 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
 	unsigned long exit_qual;
 	bool block_nested_events =
 	    vmx->nested.nested_run_pending || kvm_event_needs_reinjection(vcpu);
+	bool mtf_pending = vmx->nested.mtf_pending;
 	struct kvm_lapic *apic = vcpu->arch.apic;
 
+	/*
+	 * Clear the MTF state. If a higher priority VM-exit is delivered first,
+	 * this state is discarded.
+	 */
+	vmx->nested.mtf_pending = false;
+
 	if (lapic_in_kernel(vcpu) &&
 		test_bit(KVM_APIC_INIT, &apic->pending_events)) {
 		if (block_nested_events)
@@ -3620,8 +3627,28 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
 		return 0;
 	}
 
+	/*
+	 * Process any exceptions that are not debug traps before MTF.
+	 */
+	if (vcpu->arch.exception.pending &&
+	    !vmx_pending_dbg_trap(vcpu) &&
+	    nested_vmx_check_exception(vcpu, &exit_qual)) {
+		if (block_nested_events)
+			return -EBUSY;
+		nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
+		return 0;
+	}
+
+	if (mtf_pending) {
+		if (block_nested_events)
+			return -EBUSY;
+		nested_vmx_update_pending_dbg(vcpu);
+		nested_vmx_vmexit(vcpu, EXIT_REASON_MONITOR_TRAP_FLAG, 0, 0);
+		return 0;
+	}
+
 	if (vcpu->arch.exception.pending &&
-		nested_vmx_check_exception(vcpu, &exit_qual)) {
+	    nested_vmx_check_exception(vcpu, &exit_qual)) {
 		if (block_nested_events)
 			return -EBUSY;
 		nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
@@ -5711,6 +5738,9 @@ static int vmx_get_nested_state(struct kvm_vcpu *vcpu,
 
 			if (vmx->nested.nested_run_pending)
 				kvm_state.flags |= KVM_STATE_NESTED_RUN_PENDING;
+
+			if (vmx->nested.mtf_pending)
+				kvm_state.flags |= KVM_STATE_NESTED_MTF_PENDING;
 		}
 	}
 
@@ -5891,6 +5921,9 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
 	vmx->nested.nested_run_pending =
 		!!(kvm_state->flags & KVM_STATE_NESTED_RUN_PENDING);
 
+	vmx->nested.mtf_pending =
+		!!(kvm_state->flags & KVM_STATE_NESTED_MTF_PENDING);
+
 	ret = -EINVAL;
 	if (nested_cpu_has_shadow_vmcs(vmcs12) &&
 	    vmcs12->vmcs_link_pointer != -1ull) {
diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
index fc874d4ead0f..e12461776151 100644
--- a/arch/x86/kvm/vmx/nested.h
+++ b/arch/x86/kvm/vmx/nested.h
@@ -175,6 +175,11 @@ static inline bool nested_cpu_has_virtual_nmis(struct vmcs12 *vmcs12)
 	return vmcs12->pin_based_vm_exec_control & PIN_BASED_VIRTUAL_NMIS;
 }
 
+static inline int nested_cpu_has_mtf(struct vmcs12 *vmcs12)
+{
+	return nested_cpu_has(vmcs12, CPU_BASED_MONITOR_TRAP_FLAG);
+}
+
 static inline int nested_cpu_has_ept(struct vmcs12 *vmcs12)
 {
 	return nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENABLE_EPT);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 9a6664886f2e..90102830eeb3 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1599,6 +1599,40 @@ static int skip_emulated_instruction(struct kvm_vcpu *vcpu)
 	return 1;
 }
 
+
+/*
+ * Recognizes a pending MTF VM-exit and records the nested state for later
+ * delivery.
+ */
+static void vmx_update_emulated_instruction(struct kvm_vcpu *vcpu)
+{
+	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+	if (!is_guest_mode(vcpu))
+		return;
+
+	/*
+	 * Per the SDM, MTF takes priority over debug-trap exceptions besides
+	 * T-bit traps. As instruction emulation is completed (i.e. at the
+	 * instruction boundary), any #DB exception pending delivery must be a
+	 * debug-trap. Record the pending MTF state to be delivered in
+	 * vmx_check_nested_events().
+	 */
+	if (nested_cpu_has_mtf(vmcs12) &&
+	    (!vcpu->arch.exception.pending ||
+	     vcpu->arch.exception.nr == DB_VECTOR))
+		vmx->nested.mtf_pending = true;
+	else
+		vmx->nested.mtf_pending = false;
+}
+
+static int vmx_skip_emulated_instruction(struct kvm_vcpu *vcpu)
+{
+	vmx_update_emulated_instruction(vcpu);
+	return skip_emulated_instruction(vcpu);
+}
+
 static void vmx_clear_hlt(struct kvm_vcpu *vcpu)
 {
 	/*
@@ -7783,7 +7817,8 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
 
 	.run = vmx_vcpu_run,
 	.handle_exit = vmx_handle_exit,
-	.skip_emulated_instruction = skip_emulated_instruction,
+	.skip_emulated_instruction = vmx_skip_emulated_instruction,
+	.update_emulated_instruction = vmx_update_emulated_instruction,
 	.set_interrupt_shadow = vmx_set_interrupt_shadow,
 	.get_interrupt_shadow = vmx_get_interrupt_shadow,
 	.patch_hypercall = vmx_patch_hypercall,
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 7f42cf3dcd70..e64da06c7009 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -150,6 +150,9 @@ struct nested_vmx {
 	/* L2 must run next, and mustn't decide to exit to L1. */
 	bool nested_run_pending;
 
+	/* Pending MTF VM-exit into L1.  */
+	bool mtf_pending;
+
 	struct loaded_vmcs vmcs02;
 
 	/*
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4d3310df1758..7dbe48072c54 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6891,6 +6891,8 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 			kvm_rip_write(vcpu, ctxt->eip);
 			if (r && ctxt->tf)
 				r = kvm_vcpu_do_singlestep(vcpu);
+			if (kvm_x86_ops->update_emulated_instruction)
+				kvm_x86_ops->update_emulated_instruction(vcpu);
 			__kvm_set_rflags(vcpu, ctxt->eflags);
 		}
 
-- 
2.25.0.341.g760bfbb309-goog



* [kvm-unit-tests PATCH v3 5/5] x86: VMX: Add tests for monitor trap flag
From: Oliver Upton @ 2020-02-07 10:36 UTC
  To: kvm
  Cc: Paolo Bonzini, Jim Mattson, Peter Shier, Sean Christopherson,
	Oliver Upton

Test to verify that MTF VM-exits into the host are synthesized when the
'monitor trap flag' processor-based VM-execution control is set under
various conditions.

Expect an MTF VM-exit if instruction execution produces no events other
than MTF. Should instruction execution produce a concurrent debug-trap
and MTF event, expect an MTF VM-exit with the 'pending debug exceptions'
VMCS field set. Expect an MTF VM-exit to follow event delivery should
instruction execution generate a higher-priority event, such as a
general-protection fault. Lastly, expect an MTF VM-exit to follow
delivery of a debug-trap software exception (INT1/INT3/INTO/INT n).

Signed-off-by: Oliver Upton <oupton@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
---
 x86/vmx.h       |   1 +
 x86/vmx_tests.c | 157 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 158 insertions(+)

diff --git a/x86/vmx.h b/x86/vmx.h
index 6214400f2b53..6adf0916564b 100644
--- a/x86/vmx.h
+++ b/x86/vmx.h
@@ -399,6 +399,7 @@ enum Ctrl0 {
 	CPU_NMI_WINDOW		= 1ul << 22,
 	CPU_IO			= 1ul << 24,
 	CPU_IO_BITMAP		= 1ul << 25,
+	CPU_MTF			= 1ul << 27,
 	CPU_MSR_BITMAP		= 1ul << 28,
 	CPU_MONITOR		= 1ul << 29,
 	CPU_PAUSE		= 1ul << 30,
diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
index b31c360c5f3c..0e2c2f8a7d34 100644
--- a/x86/vmx_tests.c
+++ b/x86/vmx_tests.c
@@ -4970,6 +4970,162 @@ static void test_vmx_preemption_timer(void)
 	vmcs_write(EXI_CONTROLS, saved_exit);
 }
 
+extern unsigned char test_mtf1;
+extern unsigned char test_mtf2;
+extern unsigned char test_mtf3;
+
+__attribute__((noclone)) static void test_mtf_guest(void)
+{
+	asm ("vmcall;\n\t"
+	     "out %al, $0x80;\n\t"
+	     "test_mtf1:\n\t"
+	     "vmcall;\n\t"
+	     "out %al, $0x80;\n\t"
+	     "test_mtf2:\n\t"
+	     /*
+	      * Prepare for the 'MOV CR3' test. Attempt to induce a
+	      * general-protection fault by moving a non-canonical address into
+	      * CR3. The 'MOV CR3' instruction does not take an imm64 operand,
+	      * so we must MOV the desired value into a register first.
+	      *
+	      * MOV RAX is done before the VMCALL such that MTF is only enabled
+	      * for the instruction under test.
+	      */
+	     "mov $0x8000000000000000, %rax;\n\t"
+	     "vmcall;\n\t"
+	     "mov %rax, %cr3;\n\t"
+	     "test_mtf3:\n\t"
+	     "vmcall;\n\t"
+	     /*
+	      * ICEBP/INT1 instruction. Though the instruction is now
+	      * documented, don't rely on assemblers enumerating the
+	      * instruction. Resort to hand assembly.
+	      */
+	     ".byte 0xf1;\n\t");
+}
+
+static void test_mtf_gp_handler(struct ex_regs *regs)
+{
+	regs->rip = (unsigned long) &test_mtf3;
+}
+
+static void test_mtf_db_handler(struct ex_regs *regs)
+{
+}
+
+static void enable_mtf(void)
+{
+	u32 ctrl0 = vmcs_read(CPU_EXEC_CTRL0);
+
+	vmcs_write(CPU_EXEC_CTRL0, ctrl0 | CPU_MTF);
+}
+
+static void disable_mtf(void)
+{
+	u32 ctrl0 = vmcs_read(CPU_EXEC_CTRL0);
+
+	vmcs_write(CPU_EXEC_CTRL0, ctrl0 & ~CPU_MTF);
+}
+
+static void enable_tf(void)
+{
+	unsigned long rflags = vmcs_read(GUEST_RFLAGS);
+
+	vmcs_write(GUEST_RFLAGS, rflags | X86_EFLAGS_TF);
+}
+
+static void disable_tf(void)
+{
+	unsigned long rflags = vmcs_read(GUEST_RFLAGS);
+
+	vmcs_write(GUEST_RFLAGS, rflags & ~X86_EFLAGS_TF);
+}
+
+static void report_mtf(const char *insn_name, unsigned long exp_rip)
+{
+	unsigned long rip = vmcs_read(GUEST_RIP);
+
+	assert_exit_reason(VMX_MTF);
+	report(rip == exp_rip, "MTF VM-exit after %s instruction. RIP: 0x%lx (expected 0x%lx)",
+	       insn_name, rip, exp_rip);
+}
+
+static void vmx_mtf_test(void)
+{
+	unsigned long pending_dbg;
+	handler old_gp, old_db;
+
+	if (!(ctrl_cpu_rev[0].clr & CPU_MTF)) {
+		printf("CPU does not support the 'monitor trap flag' processor-based VM-execution control.\n");
+		return;
+	}
+
+	test_set_guest(test_mtf_guest);
+
+	/* Expect an MTF VM-exit after OUT instruction */
+	enter_guest();
+	skip_exit_vmcall();
+
+	enable_mtf();
+	enter_guest();
+	report_mtf("OUT", (unsigned long) &test_mtf1);
+	disable_mtf();
+
+	/*
+	 * Concurrent #DB trap and MTF on instruction boundary. Expect MTF
+	 * VM-exit with populated 'pending debug exceptions' VMCS field.
+	 */
+	enter_guest();
+	skip_exit_vmcall();
+
+	enable_mtf();
+	enable_tf();
+
+	enter_guest();
+	report_mtf("OUT", (unsigned long) &test_mtf2);
+	pending_dbg = vmcs_read(GUEST_PENDING_DEBUG);
+	report(pending_dbg & DR_STEP,
+	       "'pending debug exceptions' field after MTF VM-exit: 0x%lx (expected 0x%lx)",
+	       pending_dbg, (unsigned long) DR_STEP);
+
+	disable_mtf();
+	disable_tf();
+	vmcs_write(GUEST_PENDING_DEBUG, 0);
+
+	/*
+	 * #GP exception takes priority over MTF. Expect MTF VM-exit with RIP
+	 * advanced to first instruction of #GP handler.
+	 */
+	enter_guest();
+	skip_exit_vmcall();
+
+	old_gp = handle_exception(GP_VECTOR, test_mtf_gp_handler);
+
+	enable_mtf();
+	enter_guest();
+	report_mtf("MOV CR3", (unsigned long) get_idt_addr(&boot_idt[GP_VECTOR]));
+	disable_mtf();
+
+	/*
+	 * Concurrent MTF and privileged software exception (i.e. ICEBP/INT1).
+	 * MTF should follow the delivery of #DB trap, though the SDM doesn't
+	 * provide clear indication of the relative priority.
+	 */
+	enter_guest();
+	skip_exit_vmcall();
+
+	handle_exception(GP_VECTOR, old_gp);
+	old_db = handle_exception(DB_VECTOR, test_mtf_db_handler);
+
+	enable_mtf();
+	enter_guest();
+	report_mtf("INT1", (unsigned long) get_idt_addr(&boot_idt[DB_VECTOR]));
+	disable_mtf();
+
+	enter_guest();
+	handle_exception(DB_VECTOR, old_db);
+}
+
 /*
  * Tests for VM-execution control fields
  */
@@ -9505,5 +9661,6 @@ struct vmx_test vmx_tests[] = {
 	TEST(atomic_switch_max_msrs_test),
 	TEST(atomic_switch_overflow_msrs_test),
 	TEST(rdtsc_vmexit_diff_test),
+	TEST(vmx_mtf_test),
 	{ NULL, NULL, NULL, NULL, NULL, {0} },
 };
-- 
2.25.0.341.g760bfbb309-goog



* Re: [PATCH v3 0/5] Handle monitor trap flag during instruction emulation
From: Paolo Bonzini @ 2020-02-12 11:34 UTC
  To: Oliver Upton, kvm; +Cc: Jim Mattson, Peter Shier, Sean Christopherson

On 07/02/20 11:36, Oliver Upton wrote:
> [...]

Queued (for 5.6-rc2), thanks.

Paolo



* Re: [kvm-unit-tests PATCH v3 5/5] x86: VMX: Add tests for monitor trap flag
From: Oliver Upton @ 2020-02-25  0:09 UTC
  To: kvm list; +Cc: Paolo Bonzini, Jim Mattson, Peter Shier, Sean Christopherson

On Fri, Feb 7, 2020 at 2:36 AM Oliver Upton <oupton@google.com> wrote:
> [...]

Friendly ping :) (Just on this patch, 1-4 have already been applied)

--
Thanks,
Oliver


* Re: [kvm-unit-tests PATCH v3 5/5] x86: VMX: Add tests for monitor trap flag
From: Paolo Bonzini @ 2020-02-25  7:36 UTC
  To: Oliver Upton, kvm list; +Cc: Jim Mattson, Peter Shier, Sean Christopherson

On 25/02/20 01:09, Oliver Upton wrote:
> On Fri, Feb 7, 2020 at 2:36 AM Oliver Upton <oupton@google.com> wrote:
>> [...]
> 
> Friendly ping :) (Just on this patch, 1-4 have already been applied)

Done, thanks.  I only had to push, since I had applied it already to
test the KVM changes.

Paolo



* Re: [kvm-unit-tests PATCH v3 5/5] x86: VMX: Add tests for monitor trap flag
From: Oliver Upton @ 2020-02-25  7:52 UTC
  To: Paolo Bonzini; +Cc: kvm list, Jim Mattson, Peter Shier, Sean Christopherson

On Mon, Feb 24, 2020 at 11:36 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 25/02/20 01:09, Oliver Upton wrote:
> > On Fri, Feb 7, 2020 at 2:36 AM Oliver Upton <oupton@google.com> wrote:
> >> [...]
> >
> > Friendly ping :) (Just on this patch, 1-4 have already been applied)
>
> Done, thanks.  I only had to push, since I had applied it already to
> test the KVM changes.
>
> Paolo
>

Thanks Paolo :)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [kvm-unit-tests PATCH v3 5/5] x86: VMX: Add tests for monitor trap flag
  2020-02-25  0:09   ` Oliver Upton
  2020-02-25  7:36     ` Paolo Bonzini
@ 2020-02-26  0:13     ` Krish Sadhukhan
  2020-02-26  0:55       ` Oliver Upton
  1 sibling, 1 reply; 15+ messages in thread
From: Krish Sadhukhan @ 2020-02-26  0:13 UTC (permalink / raw)
  To: Oliver Upton, kvm list
  Cc: Paolo Bonzini, Jim Mattson, Peter Shier, Sean Christopherson



On 02/24/2020 04:09 PM, Oliver Upton wrote:
> On Fri, Feb 7, 2020 at 2:36 AM Oliver Upton <oupton@google.com> wrote:
>> Test to verify that MTF VM-exits into host are synthesized when the
>> 'monitor trap flag' processor-based VM-execution control is set under
>> various conditions.
>>
>> Expect an MTF VM-exit if instruction execution produces no events other
>> than MTF. Should instruction execution produce a concurrent debug-trap
>> and MTF event, expect an MTF VM-exit with the 'pending debug exceptions'
>> VMCS field set. Expect an MTF VM-exit to follow event delivery should
>> instruction execution generate a higher-priority event, such as a
>> general-protection fault. Lastly, expect an MTF VM-exit to follow
>> delivery of a debug-trap software exception (INT1/INT3/INTO/INT n).
>>
>> Signed-off-by: Oliver Upton <oupton@google.com>
>> Reviewed-by: Jim Mattson <jmattson@google.com>
>> ---
>> [... diffstat and diff snipped; quoted in full earlier in the thread ...]
>>
> Friendly ping :) (Just on this patch, 1-4 have already been applied)
>
> --
> Thanks,
> Oliver
I know Paolo has already applied this patch, but I wanted to send out
my comments anyway as I was reviewing it...

     1. The commit header should have "nVMX" instead of "VMX". That's 
important for searching git history.
     2. It would be better for readability if the commit message lists 
the test scenarios as numbered items or bullet points instead of putting 
them all in the same paragraph.
     3. Section 25.5.2 in the SDM has more scenarios for MTF VM-exits,
but the test covers only a few of them, such as #DB and #GP (a sketch
of one uncovered case is below).
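
For instance, 25.5.2 also pends an MTF VM-exit when VM entry itself
injects a vectored event. A rough sketch of such a case, reusing the
helpers from this patch -- the interruption-information constant names
here are assumptions, not checked against vmx.h:

    /*
     * Sketch only: per SDM 25.5.2, if VM entry injects a vectored
     * event while 'monitor trap flag' is 1, an MTF VM-exit is pending
     * on the instruction boundary before the first instruction of the
     * handler. INTR_INFO_VALID_MASK and INTR_TYPE_HARD_EXCEPTION are
     * assumed names for the VM-entry interruption-info encoding.
     */
    enable_mtf();
    vmcs_write(ENT_INTR_INFO,
               INTR_INFO_VALID_MASK | INTR_TYPE_HARD_EXCEPTION | UD_VECTOR);
    enter_guest();
    report_mtf("injected #UD",
               (unsigned long) get_idt_addr(&boot_idt[UD_VECTOR]));
    disable_mtf();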

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [kvm-unit-tests PATCH v3 5/5] x86: VMX: Add tests for monitor trap flag
  2020-02-26  0:13     ` Krish Sadhukhan
@ 2020-02-26  0:55       ` Oliver Upton
  0 siblings, 0 replies; 15+ messages in thread
From: Oliver Upton @ 2020-02-26  0:55 UTC (permalink / raw)
  To: Krish Sadhukhan
  Cc: kvm list, Paolo Bonzini, Jim Mattson, Peter Shier, Sean Christopherson

On Tue, Feb 25, 2020 at 4:13 PM Krish Sadhukhan
<krish.sadhukhan@oracle.com> wrote:
>
>
>
> On 02/24/2020 04:09 PM, Oliver Upton wrote:
> > On Fri, Feb 7, 2020 at 2:36 AM Oliver Upton <oupton@google.com> wrote:
> >> [... patch snipped; quoted in full earlier in the thread ...]
> > Friendly ping :) (Just on this patch, 1-4 have already been applied)
> >
> > --
> > Thanks,
> > Oliver
> I know Paolo has already applied this patch, but I wanted to send out
> my comments anyway as I was reviewing it...
>
>      1. The commit header should have "nVMX" instead of "VMX". That's
> important for searching git history.

"VMX" was intentional, as these tests need not imply any use of nested
virtualization. They're simply tests for an implementation of VMX, be
it KVM nVMX or Intel hardware VMX. In fact, I booted kvm-unit-tests on
bare-metal before sending out to ensure the series behaved
identically.

>      2. It would be better for readability if the commit message lists
> the test scenarios as numbered items or bullet points instead of putting
> them all in the same paragraph.

Sounds good, I'll conform to the suggested style in future contributions.

>      3. Section 25.5.2 in the SDM has more scenarios for MTF VM-exits,
> but the test covers only a few of them, such as #DB and #GP.

Yes, this test covers a limited set of scenarios for MTF. The cases
were chosen to exercise the emulation added to KVM nVMX as part of the
series, rather than to exercise hardware-generated MTF VM-exits that
are unconditionally forwarded to L1. That said, I'll concede that such
contrived scenarios don't quite align with my response to (1), where I
claimed the tests carry no implication of nested virtualization.

Thanks Krish :)

--
Best,
Oliver

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 0/5] Handle monitor trap flag during instruction emulation
  2020-02-12 11:34 ` [PATCH v3 0/5] Handle monitor trap flag during instruction emulation Paolo Bonzini
@ 2020-02-27  0:22   ` Oliver Upton
  2020-02-27  6:37     ` Paolo Bonzini
  0 siblings, 1 reply; 15+ messages in thread
From: Oliver Upton @ 2020-02-27  0:22 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm list, Jim Mattson, Peter Shier, Sean Christopherson

On Wed, Feb 12, 2020 at 3:34 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 07/02/20 11:36, Oliver Upton wrote:
> > [... cover letter snipped; quoted in full at the start of the thread ...]
>
> Queued (for 5.6-rc2), thanks.
>
> Paolo
>

Are there any strong opinions about how the newly introduced nested
state should be handled across live migrations? When applying this
patch set internally, I realized live migration would be broken in the
case of a kernel rollback (i.e. a kernel with this patch set emits the
new nested state, and a kernel without it receives that state and
refuses it).

The easy fix is to only turn on the feature once it is rollback-proof,
but I wonder if there is any room for improvement on this topic...
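
For concreteness: the new flag lands in the 'flags' field of struct
kvm_nested_state, so userspace can at least detect the situation before
migrating. A minimal sketch, assuming the flag name added in patch 4/5
and a vcpu fd already opened by the VMM; the buffer size here is a
guess:

    #include <linux/kvm.h>
    #include <sys/ioctl.h>

    /* Probe the saved nested state for the MTF-pending flag. */
    static int needs_mtf_aware_kernel(int vcpu_fd)
    {
            struct {
                    struct kvm_nested_state state;
                    char data[8192];        /* assumed to be large enough */
            } buf = { .state.size = sizeof(buf) };

            if (ioctl(vcpu_fd, KVM_GET_NESTED_STATE, &buf) < 0)
                    return -1;

            return !!(buf.state.flags & KVM_STATE_NESTED_MTF_PENDING);
    }

A VMM could then refuse to migrate such a vCPU to a pre-series kernel
up front, rather than failing at KVM_SET_NESTED_STATE on the target.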

--
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 0/5] Handle monitor trap flag during instruction emulation
  2020-02-27  0:22   ` Oliver Upton
@ 2020-02-27  6:37     ` Paolo Bonzini
  2020-02-27  9:11       ` Oliver Upton
  0 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2020-02-27  6:37 UTC (permalink / raw)
  To: Oliver Upton; +Cc: kvm list, Jim Mattson, Peter Shier, Sean Christopherson

On 27/02/20 01:22, Oliver Upton wrote:
> Are there any strong opinions about how the newly introduced nested
> state should be handled across live migrations? When applying this
> patch set internally, I realized live migration would be broken in the
> case of a kernel rollback (i.e. a kernel with this patch set emits the
> new nested state, and a kernel without it receives that state and
> refuses it).

Only if you use MTF + emulation.  In this case it's a pure bugfix, so
it's okay to break backwards migration.  If it's really a new feature,
it should support KVM_ENABLE_CAP to enable it.
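
The opt-in from userspace would look roughly like the sketch below; the
capability name is made up for illustration, only the KVM_ENABLE_CAP
plumbing is real:

    #include <linux/kvm.h>
    #include <sys/ioctl.h>

    static int enable_mtf_emulation(int vm_fd)
    {
            /* KVM_CAP_EXAMPLE_MTF_EMULATION is a hypothetical name. */
            struct kvm_enable_cap cap = {
                    .cap = KVM_CAP_EXAMPLE_MTF_EMULATION,
            };

            /* Old userspace simply never calls this. */
            return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
    }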

Paolo


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 0/5] Handle monitor trap flag during instruction emulation
  2020-02-27  6:37     ` Paolo Bonzini
@ 2020-02-27  9:11       ` Oliver Upton
  0 siblings, 0 replies; 15+ messages in thread
From: Oliver Upton @ 2020-02-27  9:11 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm list, Jim Mattson, Peter Shier, Sean Christopherson

On Wed, Feb 26, 2020 at 10:38 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 27/02/20 01:22, Oliver Upton wrote:
> > [... snipped ...]
>
> Only if you use MTF + emulation.  In this case it's a pure bugfix, so
> it's okay to break backwards migration.  If it's really a new feature,
> it should support KVM_ENABLE_CAP to enable it.
>
> Paolo
>

True. I suppose I've conflated the pure bugfix here with the fact that
MTF is new to us.

Thanks Paolo!

--
Best,
Oliver


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2020-02-27  9:12 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-07 10:36 [PATCH v3 0/5] Handle monitor trap flag during instruction emulation Oliver Upton
2020-02-07 10:36 ` [PATCH v3 1/5] KVM: x86: Mask off reserved bit from #DB exception payload Oliver Upton
2020-02-07 10:36 ` [PATCH v3 2/5] KVM: nVMX: Handle pending #DB when injecting INIT VM-exit Oliver Upton
2020-02-07 10:36 ` [PATCH v3 3/5] KVM: x86: Deliver exception payload on KVM_GET_VCPU_EVENTS Oliver Upton
2020-02-07 10:36 ` [PATCH v3 4/5] KVM: nVMX: Emulate MTF when performing instruction emulation Oliver Upton
2020-02-07 10:36 ` [kvm-unit-tests PATCH v3 5/5] x86: VMX: Add tests for monitor trap flag Oliver Upton
2020-02-25  0:09   ` Oliver Upton
2020-02-25  7:36     ` Paolo Bonzini
2020-02-25  7:52       ` Oliver Upton
2020-02-26  0:13     ` Krish Sadhukhan
2020-02-26  0:55       ` Oliver Upton
2020-02-12 11:34 ` [PATCH v3 0/5] Handle monitor trap flag during instruction emulation Paolo Bonzini
2020-02-27  0:22   ` Oliver Upton
2020-02-27  6:37     ` Paolo Bonzini
2020-02-27  9:11       ` Oliver Upton

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.