kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/4] Add logical CPU to KVM_EXIT_FAIL_ENTRY info
@ 2020-06-01 22:24 Jim Mattson
  2020-06-01 22:24 ` [PATCH v3 1/4] kvm: svm: Prefer vcpu->cpu to raw_smp_processor_id() Jim Mattson
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Jim Mattson @ 2020-06-01 22:24 UTC (permalink / raw)
  To: kvm, Paolo Bonzini; +Cc: Liran Alon, Oliver Upton, Peter Shier, Jim Mattson

It's been about 6 months since v2. Sorry for the delay. Initially, this
was a single patch to add information to KVM_EXIT_FAIL_ENTRY to help
identify a defective CPU. It has gotten a little more complicated,
since Peter Shier pointed out that the vCPU thread may have migrated
between the time of failure and the KVM exit. Fortunately, the SEV folks
started to make the necessary information available with "last_cpu," but
only on AMD and only with SEV. The current version expands upon that by
making "last_cpu" available in all configurations on AMD and Intel.

v2: Use vcpu->cpu rather than raw_smp_processor_id() (Liran).
v3: Record the last logical processor to run the vCPU thread (Peter).
    Add the "last CPU" information to KVM_EXIT_INTERNAL_ERROR exits as
    well as KVM_EXIT_FAIL_ENTRY [except for "EMULATION" errors].
    (Liran & Paolo).

Jim Mattson (4):
  kvm: svm: Prefer vcpu->cpu to raw_smp_processor_id()
  kvm: svm: Always set svm->last_cpu on VMRUN
  kvm: vmx: Add last_cpu to struct vcpu_vmx
  kvm: x86: Add "last CPU" to some KVM_EXIT information

 Documentation/virt/kvm/api.rst |  1 +
 arch/x86/kvm/svm/sev.c         |  1 -
 arch/x86/kvm/svm/svm.c         | 14 +++++++-------
 arch/x86/kvm/vmx/vmx.c         | 11 +++++++++--
 arch/x86/kvm/vmx/vmx.h         |  3 +++
 arch/x86/kvm/x86.c             |  1 +
 include/uapi/linux/kvm.h       |  2 ++
 7 files changed, 23 insertions(+), 10 deletions(-)

-- 
2.27.0.rc2.251.g90737beb825-goog


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v3 1/4] kvm: svm: Prefer vcpu->cpu to raw_smp_processor_id()
  2020-06-01 22:24 [PATCH v3 0/4] Add logical CPU to KVM_EXIT_FAIL_ENTRY info Jim Mattson
@ 2020-06-01 22:24 ` Jim Mattson
  2020-06-01 22:24 ` [PATCH v3 2/4] kvm: svm: Always set svm->last_cpu on VMRUN Jim Mattson
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 13+ messages in thread
From: Jim Mattson @ 2020-06-01 22:24 UTC (permalink / raw)
  To: kvm, Paolo Bonzini; +Cc: Liran Alon, Oliver Upton, Peter Shier, Jim Mattson

The current logical processor id is cached in vcpu->cpu. Use it
instead of raw_smp_processor_id() when a kvm_vcpu struct is available.

Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
---
 arch/x86/kvm/svm/svm.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 9e333b91ff78..f0dd481be435 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2990,21 +2990,18 @@ static int handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
 
 static void reload_tss(struct kvm_vcpu *vcpu)
 {
-	int cpu = raw_smp_processor_id();
+	struct svm_cpu_data *sd = per_cpu(svm_data, vcpu->cpu);
 
-	struct svm_cpu_data *sd = per_cpu(svm_data, cpu);
 	sd->tss_desc->type = 9; /* available 32/64-bit TSS */
 	load_TR_desc();
 }
 
 static void pre_svm_run(struct vcpu_svm *svm)
 {
-	int cpu = raw_smp_processor_id();
-
-	struct svm_cpu_data *sd = per_cpu(svm_data, cpu);
+	struct svm_cpu_data *sd = per_cpu(svm_data, svm->vcpu.cpu);
 
 	if (sev_guest(svm->vcpu.kvm))
-		return pre_sev_run(svm, cpu);
+		return pre_sev_run(svm, svm->vcpu.cpu);
 
 	/* FIXME: handle wraparound of asid_generation */
 	if (svm->asid_generation != sd->asid_generation)
-- 
2.27.0.rc2.251.g90737beb825-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 2/4] kvm: svm: Always set svm->last_cpu on VMRUN
  2020-06-01 22:24 [PATCH v3 0/4] Add logical CPU to KVM_EXIT_FAIL_ENTRY info Jim Mattson
  2020-06-01 22:24 ` [PATCH v3 1/4] kvm: svm: Prefer vcpu->cpu to raw_smp_processor_id() Jim Mattson
@ 2020-06-01 22:24 ` Jim Mattson
  2020-06-01 22:24 ` [PATCH v3 3/4] kvm: vmx: Add last_cpu to struct vcpu_vmx Jim Mattson
  2020-06-01 22:24 ` [PATCH v3 4/4] kvm: x86: Add "last CPU" to some KVM_EXIT information Jim Mattson
  3 siblings, 0 replies; 13+ messages in thread
From: Jim Mattson @ 2020-06-01 22:24 UTC (permalink / raw)
  To: kvm, Paolo Bonzini; +Cc: Liran Alon, Oliver Upton, Peter Shier, Jim Mattson

Previously, this field was only set when using SEV. Set it for all
vCPU configurations, so that it can be communicated to userspace for
diagnosing potential hardware errors.

Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
---
 arch/x86/kvm/svm/sev.c | 1 -
 arch/x86/kvm/svm/svm.c | 1 +
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 89f7f3aebd31..aa61d5d1e7f3 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -1184,7 +1184,6 @@ void pre_sev_run(struct vcpu_svm *svm, int cpu)
 	    svm->last_cpu == cpu)
 		return;
 
-	svm->last_cpu = cpu;
 	sd->sev_vmcbs[asid] = svm->vmcb;
 	svm->vmcb->control.tlb_ctl = TLB_CONTROL_FLUSH_ASID;
 	mark_dirty(svm->vmcb, VMCB_ASID);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index f0dd481be435..442dbb763639 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3394,6 +3394,7 @@ static fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
 	 */
 	x86_spec_ctrl_set_guest(svm->spec_ctrl, svm->virt_spec_ctrl);
 
+	svm->last_cpu = vcpu->cpu;
 	__svm_vcpu_run(svm->vmcb_pa, (unsigned long *)&svm->vcpu.arch.regs);
 
 #ifdef CONFIG_X86_64
-- 
2.27.0.rc2.251.g90737beb825-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 3/4] kvm: vmx: Add last_cpu to struct vcpu_vmx
  2020-06-01 22:24 [PATCH v3 0/4] Add logical CPU to KVM_EXIT_FAIL_ENTRY info Jim Mattson
  2020-06-01 22:24 ` [PATCH v3 1/4] kvm: svm: Prefer vcpu->cpu to raw_smp_processor_id() Jim Mattson
  2020-06-01 22:24 ` [PATCH v3 2/4] kvm: svm: Always set svm->last_cpu on VMRUN Jim Mattson
@ 2020-06-01 22:24 ` Jim Mattson
  2020-06-02  1:21   ` Sean Christopherson
  2020-06-01 22:24 ` [PATCH v3 4/4] kvm: x86: Add "last CPU" to some KVM_EXIT information Jim Mattson
  3 siblings, 1 reply; 13+ messages in thread
From: Jim Mattson @ 2020-06-01 22:24 UTC (permalink / raw)
  To: kvm, Paolo Bonzini; +Cc: Liran Alon, Oliver Upton, Peter Shier, Jim Mattson

As we already do in svm, record the last logical processor on which a
vCPU has run, so that it can be communicated to userspace for
potential hardware errors.

Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 1 +
 arch/x86/kvm/vmx/vmx.h | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 170cc76a581f..42856970d3b8 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6730,6 +6730,7 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
 	if (vcpu->arch.cr2 != read_cr2())
 		write_cr2(vcpu->arch.cr2);
 
+	vmx->last_cpu = vcpu->cpu;
 	vmx->fail = __vmx_vcpu_run(vmx, (unsigned long *)&vcpu->arch.regs,
 				   vmx->loaded_vmcs->launched);
 
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 672c28f17e49..8a1e833cf4fb 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -302,6 +302,9 @@ struct vcpu_vmx {
 	u64 ept_pointer;
 
 	struct pt_desc pt_desc;
+
+	/* which host CPU was used for running this vcpu */
+	unsigned int last_cpu;
 };
 
 enum ept_pointers_status {
-- 
2.27.0.rc2.251.g90737beb825-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 4/4] kvm: x86: Add "last CPU" to some KVM_EXIT information
  2020-06-01 22:24 [PATCH v3 0/4] Add logical CPU to KVM_EXIT_FAIL_ENTRY info Jim Mattson
                   ` (2 preceding siblings ...)
  2020-06-01 22:24 ` [PATCH v3 3/4] kvm: vmx: Add last_cpu to struct vcpu_vmx Jim Mattson
@ 2020-06-01 22:24 ` Jim Mattson
  3 siblings, 0 replies; 13+ messages in thread
From: Jim Mattson @ 2020-06-01 22:24 UTC (permalink / raw)
  To: kvm, Paolo Bonzini; +Cc: Liran Alon, Oliver Upton, Peter Shier, Jim Mattson

More often than not, a failed VM-entry in an x86 production
environment is induced by a defective CPU. To help identify the bad
hardware, include the id of the last logical CPU to run a vCPU in the
information provided to userspace on a KVM exit for failed VM-entry or
for KVM internal errors not associated with emulation. The presence of
this additional information is indicated by a new capability,
KVM_CAP_LAST_CPU.

Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
---
 Documentation/virt/kvm/api.rst |  1 +
 arch/x86/kvm/svm/svm.c         |  4 +++-
 arch/x86/kvm/vmx/vmx.c         | 10 ++++++++--
 arch/x86/kvm/x86.c             |  1 +
 include/uapi/linux/kvm.h       |  2 ++
 5 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index d280af5345df..17db8b68c165 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -4792,6 +4792,7 @@ hardware_exit_reason.
 		/* KVM_EXIT_FAIL_ENTRY */
 		struct {
 			__u64 hardware_entry_failure_reason;
+			__u32 cpu; /* if KVM_LAST_CPU */
 		} fail_entry;
 
 If exit_reason is KVM_EXIT_FAIL_ENTRY, the vcpu could not be run due
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 442dbb763639..938be4172bab 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2945,6 +2945,7 @@ static int handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
 		kvm_run->exit_reason = KVM_EXIT_FAIL_ENTRY;
 		kvm_run->fail_entry.hardware_entry_failure_reason
 			= svm->vmcb->control.exit_code;
+		kvm_run->fail_entry.cpu = svm->last_cpu;
 		dump_vmcb(vcpu);
 		return 0;
 	}
@@ -2968,8 +2969,9 @@ static int handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
 		vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
 		vcpu->run->internal.suberror =
 			KVM_INTERNAL_ERROR_UNEXPECTED_EXIT_REASON;
-		vcpu->run->internal.ndata = 1;
+		vcpu->run->internal.ndata = 2;
 		vcpu->run->internal.data[0] = exit_code;
+		vcpu->run->internal.data[1] = svm->last_cpu;
 		return 0;
 	}
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 42856970d3b8..da5490b94704 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4758,10 +4758,11 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu)
 	    !(is_page_fault(intr_info) && !(error_code & PFERR_RSVD_MASK))) {
 		vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
 		vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_SIMUL_EX;
-		vcpu->run->internal.ndata = 3;
+		vcpu->run->internal.ndata = 4;
 		vcpu->run->internal.data[0] = vect_info;
 		vcpu->run->internal.data[1] = intr_info;
 		vcpu->run->internal.data[2] = error_code;
+		vcpu->run->internal.data[3] = vmx->last_cpu;
 		return 0;
 	}
 
@@ -5983,6 +5984,7 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
 		vcpu->run->exit_reason = KVM_EXIT_FAIL_ENTRY;
 		vcpu->run->fail_entry.hardware_entry_failure_reason
 			= exit_reason;
+		vcpu->run->fail_entry.cpu = vmx->last_cpu;
 		return 0;
 	}
 
@@ -5991,6 +5993,7 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
 		vcpu->run->exit_reason = KVM_EXIT_FAIL_ENTRY;
 		vcpu->run->fail_entry.hardware_entry_failure_reason
 			= vmcs_read32(VM_INSTRUCTION_ERROR);
+		vcpu->run->fail_entry.cpu = vmx->last_cpu;
 		return 0;
 	}
 
@@ -6017,6 +6020,8 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
 			vcpu->run->internal.data[3] =
 				vmcs_read64(GUEST_PHYSICAL_ADDRESS);
 		}
+		vcpu->run->internal.data[vcpu->run->internal.ndata++] =
+			vmx->last_cpu;
 		return 0;
 	}
 
@@ -6072,8 +6077,9 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
 	vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
 	vcpu->run->internal.suberror =
 			KVM_INTERNAL_ERROR_UNEXPECTED_EXIT_REASON;
-	vcpu->run->internal.ndata = 1;
+	vcpu->run->internal.ndata = 2;
 	vcpu->run->internal.data[0] = exit_reason;
+	vcpu->run->internal.data[1] = vmx->last_cpu;
 	return 0;
 }
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9e41b5135340..20c420a45847 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3472,6 +3472,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_MSR_PLATFORM_INFO:
 	case KVM_CAP_EXCEPTION_PAYLOAD:
 	case KVM_CAP_SET_GUEST_DEBUG:
+	case KVM_CAP_LAST_CPU:
 		r = 1;
 		break;
 	case KVM_CAP_SYNC_REGS:
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 6721eb563eda..3edbd44d85bf 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -289,6 +289,7 @@ struct kvm_run {
 		/* KVM_EXIT_FAIL_ENTRY */
 		struct {
 			__u64 hardware_entry_failure_reason;
+			__u32 cpu;
 		} fail_entry;
 		/* KVM_EXIT_EXCEPTION */
 		struct {
@@ -1031,6 +1032,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_PPC_SECURE_GUEST 181
 #define KVM_CAP_HALT_POLL 182
 #define KVM_CAP_ASYNC_PF_INT 183
+#define KVM_CAP_LAST_CPU 184
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.27.0.rc2.251.g90737beb825-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 3/4] kvm: vmx: Add last_cpu to struct vcpu_vmx
  2020-06-01 22:24 ` [PATCH v3 3/4] kvm: vmx: Add last_cpu to struct vcpu_vmx Jim Mattson
@ 2020-06-02  1:21   ` Sean Christopherson
  2020-06-02 17:33     ` Jim Mattson
  0 siblings, 1 reply; 13+ messages in thread
From: Sean Christopherson @ 2020-06-02  1:21 UTC (permalink / raw)
  To: Jim Mattson; +Cc: kvm, Paolo Bonzini, Liran Alon, Oliver Upton, Peter Shier

On Mon, Jun 01, 2020 at 03:24:15PM -0700, Jim Mattson wrote:
> As we already do in svm, record the last logical processor on which a
> vCPU has run, so that it can be communicated to userspace for
> potential hardware errors.
> 
> Signed-off-by: Jim Mattson <jmattson@google.com>
> Reviewed-by: Oliver Upton <oupton@google.com>
> Reviewed-by: Peter Shier <pshier@google.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 1 +
>  arch/x86/kvm/vmx/vmx.h | 3 +++
>  2 files changed, 4 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 170cc76a581f..42856970d3b8 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6730,6 +6730,7 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
>  	if (vcpu->arch.cr2 != read_cr2())
>  		write_cr2(vcpu->arch.cr2);
>  
> +	vmx->last_cpu = vcpu->cpu;

This is redundant in the EXIT_FASTPATH_REENTER_GUEST case.  Setting it
before reenter_guest is technically wrong if emulation_required is true, but
that doesn't seem like it'd be an issue in practice.

>  	vmx->fail = __vmx_vcpu_run(vmx, (unsigned long *)&vcpu->arch.regs,
>  				   vmx->loaded_vmcs->launched);
>  
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 672c28f17e49..8a1e833cf4fb 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -302,6 +302,9 @@ struct vcpu_vmx {
>  	u64 ept_pointer;
>  
>  	struct pt_desc pt_desc;
> +
> +	/* which host CPU was used for running this vcpu */
> +	unsigned int last_cpu;

Why not put this in struct kvm_vcpu_arch?  I'd also vote to name it
last_run_cpu, as last_cpu is super misleading.

And if it's in arch, what about setting it vcpu_enter_guest?

>  };
>  
>  enum ept_pointers_status {
> -- 
> 2.27.0.rc2.251.g90737beb825-goog
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 3/4] kvm: vmx: Add last_cpu to struct vcpu_vmx
  2020-06-02  1:21   ` Sean Christopherson
@ 2020-06-02 17:33     ` Jim Mattson
  2020-06-03  2:24       ` Sean Christopherson
  0 siblings, 1 reply; 13+ messages in thread
From: Jim Mattson @ 2020-06-02 17:33 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm list, Paolo Bonzini, Liran Alon, Oliver Upton, Peter Shier

On Mon, Jun 1, 2020 at 6:21 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> On Mon, Jun 01, 2020 at 03:24:15PM -0700, Jim Mattson wrote:
> > As we already do in svm, record the last logical processor on which a
> > vCPU has run, so that it can be communicated to userspace for
> > potential hardware errors.
> >
> > Signed-off-by: Jim Mattson <jmattson@google.com>
> > Reviewed-by: Oliver Upton <oupton@google.com>
> > Reviewed-by: Peter Shier <pshier@google.com>
> > ---
> >  arch/x86/kvm/vmx/vmx.c | 1 +
> >  arch/x86/kvm/vmx/vmx.h | 3 +++
> >  2 files changed, 4 insertions(+)
> >
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 170cc76a581f..42856970d3b8 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -6730,6 +6730,7 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
> >       if (vcpu->arch.cr2 != read_cr2())
> >               write_cr2(vcpu->arch.cr2);
> >
> > +     vmx->last_cpu = vcpu->cpu;
>
> This is redundant in the EXIT_FASTPATH_REENTER_GUEST case.  Setting it
> before reenter_guest is technically wrong if emulation_required is true, but
> that doesn't seem like it'd be an issue in practice.

I really would like to capture the last logical processor to execute
VMLAUNCH/VMRESUME (or VMRUN on the AMD side) on behalf of this vCPU.

> >       vmx->fail = __vmx_vcpu_run(vmx, (unsigned long *)&vcpu->arch.regs,
> >                                  vmx->loaded_vmcs->launched);
> >
> > diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> > index 672c28f17e49..8a1e833cf4fb 100644
> > --- a/arch/x86/kvm/vmx/vmx.h
> > +++ b/arch/x86/kvm/vmx/vmx.h
> > @@ -302,6 +302,9 @@ struct vcpu_vmx {
> >       u64 ept_pointer;
> >
> >       struct pt_desc pt_desc;
> > +
> > +     /* which host CPU was used for running this vcpu */
> > +     unsigned int last_cpu;
>
> Why not put this in struct kvm_vcpu_arch?  I'd also vote to name it
> last_run_cpu, as last_cpu is super misleading.

I think last_run_cpu may also be misleading, since in the cases of
interest, nothing actually 'ran.' Maybe last_attempted_vmentry_cpu?

> And if it's in arch, what about setting it vcpu_enter_guest?

As you point out above, this isn't entirely accurate. (But that's the
way we roll in kvm, isn't it? :-)

> >  };
> >
> >  enum ept_pointers_status {
> > --
> > 2.27.0.rc2.251.g90737beb825-goog
> >

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 3/4] kvm: vmx: Add last_cpu to struct vcpu_vmx
  2020-06-02 17:33     ` Jim Mattson
@ 2020-06-03  2:24       ` Sean Christopherson
  2020-06-03 20:18         ` Jim Mattson
  0 siblings, 1 reply; 13+ messages in thread
From: Sean Christopherson @ 2020-06-03  2:24 UTC (permalink / raw)
  To: Jim Mattson
  Cc: kvm list, Paolo Bonzini, Liran Alon, Oliver Upton, Peter Shier

On Tue, Jun 02, 2020 at 10:33:51AM -0700, Jim Mattson wrote:
> On Mon, Jun 1, 2020 at 6:21 PM Sean Christopherson
> <sean.j.christopherson@intel.com> wrote:
> >
> > On Mon, Jun 01, 2020 at 03:24:15PM -0700, Jim Mattson wrote:
> > > As we already do in svm, record the last logical processor on which a
> > > vCPU has run, so that it can be communicated to userspace for
> > > potential hardware errors.
> > >
> > > Signed-off-by: Jim Mattson <jmattson@google.com>
> > > Reviewed-by: Oliver Upton <oupton@google.com>
> > > Reviewed-by: Peter Shier <pshier@google.com>
> > > ---
> > >  arch/x86/kvm/vmx/vmx.c | 1 +
> > >  arch/x86/kvm/vmx/vmx.h | 3 +++
> > >  2 files changed, 4 insertions(+)
> > >
> > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > > index 170cc76a581f..42856970d3b8 100644
> > > --- a/arch/x86/kvm/vmx/vmx.c
> > > +++ b/arch/x86/kvm/vmx/vmx.c
> > > @@ -6730,6 +6730,7 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
> > >       if (vcpu->arch.cr2 != read_cr2())
> > >               write_cr2(vcpu->arch.cr2);
> > >
> > > +     vmx->last_cpu = vcpu->cpu;
> >
> > This is redundant in the EXIT_FASTPATH_REENTER_GUEST case.  Setting it
> > before reenter_guest is technically wrong if emulation_required is true, but
> > that doesn't seem like it'd be an issue in practice.
> 
> I really would like to capture the last logical processor to execute
> VMLAUNCH/VMRESUME (or VMRUN on the AMD side) on behalf of this vCPU.

Does it matter though?  The flows that consume the variable are all directly
in the VM-Exit path.

> > >       vmx->fail = __vmx_vcpu_run(vmx, (unsigned long *)&vcpu->arch.regs,
> > >                                  vmx->loaded_vmcs->launched);
> > >
> > > diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> > > index 672c28f17e49..8a1e833cf4fb 100644
> > > --- a/arch/x86/kvm/vmx/vmx.h
> > > +++ b/arch/x86/kvm/vmx/vmx.h
> > > @@ -302,6 +302,9 @@ struct vcpu_vmx {
> > >       u64 ept_pointer;
> > >
> > >       struct pt_desc pt_desc;
> > > +
> > > +     /* which host CPU was used for running this vcpu */
> > > +     unsigned int last_cpu;
> >
> > Why not put this in struct kvm_vcpu_arch?  I'd also vote to name it
> > last_run_cpu, as last_cpu is super misleading.
> 
> I think last_run_cpu may also be misleading, since in the cases of
> interest, nothing actually 'ran.' Maybe last_attempted_vmentry_cpu?

Ya, that thought crossed my mind as well.

> > And if it's in arch, what about setting it vcpu_enter_guest?
> 
> As you point out above, this isn't entirely accurate. (But that's the
> way we roll in kvm, isn't it? :-)

As an alternative to storing the last run/attempted CPU, what about moving
the "bad VM-Exit" detection into handle_exit_irqoff, or maybe a new hook
that is called after IRQs are enabled but before preemption is enabled, e.g.
detect_bad_exit or something?  All of the paths in patch 4/4 can easily be
moved out of handle_exit.  VMX would require a little bit of refacotring for
it's "no handler" check, but that should be minor.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 3/4] kvm: vmx: Add last_cpu to struct vcpu_vmx
  2020-06-03  2:24       ` Sean Christopherson
@ 2020-06-03 20:18         ` Jim Mattson
  2020-06-04 18:46           ` Sean Christopherson
  0 siblings, 1 reply; 13+ messages in thread
From: Jim Mattson @ 2020-06-03 20:18 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm list, Paolo Bonzini, Liran Alon, Oliver Upton, Peter Shier

On Tue, Jun 2, 2020 at 7:24 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> On Tue, Jun 02, 2020 at 10:33:51AM -0700, Jim Mattson wrote:
> > On Mon, Jun 1, 2020 at 6:21 PM Sean Christopherson
> > <sean.j.christopherson@intel.com> wrote:
> > >
> > > On Mon, Jun 01, 2020 at 03:24:15PM -0700, Jim Mattson wrote:
> > > > As we already do in svm, record the last logical processor on which a
> > > > vCPU has run, so that it can be communicated to userspace for
> > > > potential hardware errors.
> > > >
> > > > Signed-off-by: Jim Mattson <jmattson@google.com>
> > > > Reviewed-by: Oliver Upton <oupton@google.com>
> > > > Reviewed-by: Peter Shier <pshier@google.com>
> > > > ---
> > > >  arch/x86/kvm/vmx/vmx.c | 1 +
> > > >  arch/x86/kvm/vmx/vmx.h | 3 +++
> > > >  2 files changed, 4 insertions(+)
> > > >
> > > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > > > index 170cc76a581f..42856970d3b8 100644
> > > > --- a/arch/x86/kvm/vmx/vmx.c
> > > > +++ b/arch/x86/kvm/vmx/vmx.c
> > > > @@ -6730,6 +6730,7 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
> > > >       if (vcpu->arch.cr2 != read_cr2())
> > > >               write_cr2(vcpu->arch.cr2);
> > > >
> > > > +     vmx->last_cpu = vcpu->cpu;
> > >
> > > This is redundant in the EXIT_FASTPATH_REENTER_GUEST case.  Setting it
> > > before reenter_guest is technically wrong if emulation_required is true, but
> > > that doesn't seem like it'd be an issue in practice.
> >
> > I really would like to capture the last logical processor to execute
> > VMLAUNCH/VMRESUME (or VMRUN on the AMD side) on behalf of this vCPU.
>
> Does it matter though?  The flows that consume the variable are all directly
> in the VM-Exit path.
>
> > > >       vmx->fail = __vmx_vcpu_run(vmx, (unsigned long *)&vcpu->arch.regs,
> > > >                                  vmx->loaded_vmcs->launched);
> > > >
> > > > diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> > > > index 672c28f17e49..8a1e833cf4fb 100644
> > > > --- a/arch/x86/kvm/vmx/vmx.h
> > > > +++ b/arch/x86/kvm/vmx/vmx.h
> > > > @@ -302,6 +302,9 @@ struct vcpu_vmx {
> > > >       u64 ept_pointer;
> > > >
> > > >       struct pt_desc pt_desc;
> > > > +
> > > > +     /* which host CPU was used for running this vcpu */
> > > > +     unsigned int last_cpu;
> > >
> > > Why not put this in struct kvm_vcpu_arch?  I'd also vote to name it
> > > last_run_cpu, as last_cpu is super misleading.
> >
> > I think last_run_cpu may also be misleading, since in the cases of
> > interest, nothing actually 'ran.' Maybe last_attempted_vmentry_cpu?
>
> Ya, that thought crossed my mind as well.
>
> > > And if it's in arch, what about setting it vcpu_enter_guest?
> >
> > As you point out above, this isn't entirely accurate. (But that's the
> > way we roll in kvm, isn't it? :-)
>
> As an alternative to storing the last run/attempted CPU, what about moving
> the "bad VM-Exit" detection into handle_exit_irqoff, or maybe a new hook
> that is called after IRQs are enabled but before preemption is enabled, e.g.
> detect_bad_exit or something?  All of the paths in patch 4/4 can easily be
> moved out of handle_exit.  VMX would require a little bit of refacotring for
> it's "no handler" check, but that should be minor.

Given the alternatives, I'm willing to compromise my principles wrt
emulation_required. :-) I'll send out v4 soon.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 3/4] kvm: vmx: Add last_cpu to struct vcpu_vmx
  2020-06-03 20:18         ` Jim Mattson
@ 2020-06-04 18:46           ` Sean Christopherson
  2020-06-04 19:00             ` Jim Mattson
  0 siblings, 1 reply; 13+ messages in thread
From: Sean Christopherson @ 2020-06-04 18:46 UTC (permalink / raw)
  To: Jim Mattson
  Cc: kvm list, Paolo Bonzini, Liran Alon, Oliver Upton, Peter Shier

On Wed, Jun 03, 2020 at 01:18:31PM -0700, Jim Mattson wrote:
> On Tue, Jun 2, 2020 at 7:24 PM Sean Christopherson
> <sean.j.christopherson@intel.com> wrote:
> > As an alternative to storing the last run/attempted CPU, what about moving
> > the "bad VM-Exit" detection into handle_exit_irqoff, or maybe a new hook
> > that is called after IRQs are enabled but before preemption is enabled, e.g.
> > detect_bad_exit or something?  All of the paths in patch 4/4 can easily be
> > moved out of handle_exit.  VMX would require a little bit of refacotring for
> > it's "no handler" check, but that should be minor.
> 
> Given the alternatives, I'm willing to compromise my principles wrt
> emulation_required. :-) I'll send out v4 soon.

What do you dislike about the alternative approach?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 3/4] kvm: vmx: Add last_cpu to struct vcpu_vmx
  2020-06-04 18:46           ` Sean Christopherson
@ 2020-06-04 19:00             ` Jim Mattson
  2020-06-04 19:26               ` Sean Christopherson
  0 siblings, 1 reply; 13+ messages in thread
From: Jim Mattson @ 2020-06-04 19:00 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm list, Paolo Bonzini, Liran Alon, Oliver Upton, Peter Shier

On Thu, Jun 4, 2020 at 11:47 AM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> On Wed, Jun 03, 2020 at 01:18:31PM -0700, Jim Mattson wrote:
> > On Tue, Jun 2, 2020 at 7:24 PM Sean Christopherson
> > <sean.j.christopherson@intel.com> wrote:
> > > As an alternative to storing the last run/attempted CPU, what about moving
> > > the "bad VM-Exit" detection into handle_exit_irqoff, or maybe a new hook
> > > that is called after IRQs are enabled but before preemption is enabled, e.g.
> > > detect_bad_exit or something?  All of the paths in patch 4/4 can easily be
> > > moved out of handle_exit.  VMX would require a little bit of refacotring for
> > > it's "no handler" check, but that should be minor.
> >
> > Given the alternatives, I'm willing to compromise my principles wrt
> > emulation_required. :-) I'll send out v4 soon.
>
> What do you dislike about the alternative approach?

Mainly, I wanted to stash this in a common location so that I could
print it out in our local version of dump_vmcs(). Ideally, we'd like
to be able to identify the bad part(s) just from the kernel logs.
That, and I wouldn't have been as comfortable with the refactoring
without a lot more testing.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 3/4] kvm: vmx: Add last_cpu to struct vcpu_vmx
  2020-06-04 19:00             ` Jim Mattson
@ 2020-06-04 19:26               ` Sean Christopherson
  2020-06-04 20:54                 ` Jim Mattson
  0 siblings, 1 reply; 13+ messages in thread
From: Sean Christopherson @ 2020-06-04 19:26 UTC (permalink / raw)
  To: Jim Mattson
  Cc: kvm list, Paolo Bonzini, Liran Alon, Oliver Upton, Peter Shier

On Thu, Jun 04, 2020 at 12:00:33PM -0700, Jim Mattson wrote:
> On Thu, Jun 4, 2020 at 11:47 AM Sean Christopherson
> <sean.j.christopherson@intel.com> wrote:
> >
> > On Wed, Jun 03, 2020 at 01:18:31PM -0700, Jim Mattson wrote:
> > > On Tue, Jun 2, 2020 at 7:24 PM Sean Christopherson
> > > <sean.j.christopherson@intel.com> wrote:
> > > > As an alternative to storing the last run/attempted CPU, what about moving
> > > > the "bad VM-Exit" detection into handle_exit_irqoff, or maybe a new hook
> > > > that is called after IRQs are enabled but before preemption is enabled, e.g.
> > > > detect_bad_exit or something?  All of the paths in patch 4/4 can easily be
> > > > moved out of handle_exit.  VMX would require a little bit of refacotring for
> > > > it's "no handler" check, but that should be minor.
> > >
> > > Given the alternatives, I'm willing to compromise my principles wrt
> > > emulation_required. :-) I'll send out v4 soon.
> >
> > What do you dislike about the alternative approach?
> 
> Mainly, I wanted to stash this in a common location so that I could
> print it out in our local version of dump_vmcs(). Ideally, we'd like
> to be able to identify the bad part(s) just from the kernel logs.

But this would also move dump_vmcs() to before preemption is enabled, i.e.
your version could read the CPU directly.

And actually, if we're talking about ferreting out hardware issues, you
really do want this happening before preemption is enabled so that the VMCS
dump comes from the failing CPU.  If the vCPU is migrated, the VMCS will be
dumped after a VMCLEAR->VMPTRLD, i.e. will be written to memory and pulled
back into the VMCS cache on a different CPU, and will also have been written
to by the new CPU to update host state.  Odds are that wouldn't affect the
dump in a meaningful way, but never say never.

Tangentially related, what about adding an option to do VMCLEAR at the end
of dump_vmcs(), followed by a dump of raw memory?  It'd be useless for
debugging software issues, but might be potentially useful/interesting for
triaging hardware problems.

> That, and I wouldn't have been as comfortable with the refactoring
> without a lot more testing.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 3/4] kvm: vmx: Add last_cpu to struct vcpu_vmx
  2020-06-04 19:26               ` Sean Christopherson
@ 2020-06-04 20:54                 ` Jim Mattson
  0 siblings, 0 replies; 13+ messages in thread
From: Jim Mattson @ 2020-06-04 20:54 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm list, Paolo Bonzini, Liran Alon, Oliver Upton, Peter Shier

On Thu, Jun 4, 2020 at 12:26 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> On Thu, Jun 04, 2020 at 12:00:33PM -0700, Jim Mattson wrote:
> > On Thu, Jun 4, 2020 at 11:47 AM Sean Christopherson
> > <sean.j.christopherson@intel.com> wrote:
> > >
> > > On Wed, Jun 03, 2020 at 01:18:31PM -0700, Jim Mattson wrote:
> > > > On Tue, Jun 2, 2020 at 7:24 PM Sean Christopherson
> > > > <sean.j.christopherson@intel.com> wrote:
> > > > > As an alternative to storing the last run/attempted CPU, what about moving
> > > > > the "bad VM-Exit" detection into handle_exit_irqoff, or maybe a new hook
> > > > > that is called after IRQs are enabled but before preemption is enabled, e.g.
> > > > > detect_bad_exit or something?  All of the paths in patch 4/4 can easily be
> > > > > moved out of handle_exit.  VMX would require a little bit of refacotring for
> > > > > it's "no handler" check, but that should be minor.
> > > >
> > > > Given the alternatives, I'm willing to compromise my principles wrt
> > > > emulation_required. :-) I'll send out v4 soon.
> > >
> > > What do you dislike about the alternative approach?
> >
> > Mainly, I wanted to stash this in a common location so that I could
> > print it out in our local version of dump_vmcs(). Ideally, we'd like
> > to be able to identify the bad part(s) just from the kernel logs.
>
> But this would also move dump_vmcs() to before preemption is enabled, i.e.
> your version could read the CPU directly.

If it backports easily. The bigger the change, the less likely that is.

> And actually, if we're talking about ferreting out hardware issues, you
> really do want this happening before preemption is enabled so that the VMCS
> dump comes from the failing CPU.  If the vCPU is migrated, the VMCS will be
> dumped after a VMCLEAR->VMPTRLD, i.e. will be written to memory and pulled
> back into the VMCS cache on a different CPU, and will also have been written
> to by the new CPU to update host state.  Odds are that wouldn't affect the
> dump in a meaningful way, but never say never.

True.

> Tangentially related, what about adding an option to do VMCLEAR at the end
> of dump_vmcs(), followed by a dump of raw memory?  It'd be useless for
> debugging software issues, but might be potentially useful/interesting for
> triaging hardware problems.

Our dump_vmcs() dumps all vmreadable fields, which should be pretty
close to what we can get from a raw memory dump. We do have additional
instrumentation to aid in determining the layout of the VMCS in
memory, but it is too stupid to figure out how access rights are
stored. Maybe it could be beefed up a little, and we could at least
verify that VMCLEAR dumps the same thing to physical memory that we
get from the individual VMREADs.

> > That, and I wouldn't have been as comfortable with the refactoring
> > without a lot more testing.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-06-04 20:54 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-01 22:24 [PATCH v3 0/4] Add logical CPU to KVM_EXIT_FAIL_ENTRY info Jim Mattson
2020-06-01 22:24 ` [PATCH v3 1/4] kvm: svm: Prefer vcpu->cpu to raw_smp_processor_id() Jim Mattson
2020-06-01 22:24 ` [PATCH v3 2/4] kvm: svm: Always set svm->last_cpu on VMRUN Jim Mattson
2020-06-01 22:24 ` [PATCH v3 3/4] kvm: vmx: Add last_cpu to struct vcpu_vmx Jim Mattson
2020-06-02  1:21   ` Sean Christopherson
2020-06-02 17:33     ` Jim Mattson
2020-06-03  2:24       ` Sean Christopherson
2020-06-03 20:18         ` Jim Mattson
2020-06-04 18:46           ` Sean Christopherson
2020-06-04 19:00             ` Jim Mattson
2020-06-04 19:26               ` Sean Christopherson
2020-06-04 20:54                 ` Jim Mattson
2020-06-01 22:24 ` [PATCH v3 4/4] kvm: x86: Add "last CPU" to some KVM_EXIT information Jim Mattson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).