* [PATCH v2 0/3] KVM: VMX: Fix for kexec VMCLEAR and VMXON cleanup
From: Sean Christopherson @ 2020-03-21 19:37 UTC
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel

Patch 1 fixes a theoretical bug where a crashdump NMI that arrives
while KVM is messing with the percpu VMCS list would result in one or more
VMCSes not being cleared, potentially causing memory corruption in the new
kexec'd kernel.

Patch 2 is cleanup that's made possible by patch 1.

Patch 3 isn't directly related, but it conflicts with the crash cleanup
changes, both from a code and a semantics perspective.  Without the crash
cleanup, IMO hardware_enable() should do crash_disable_local_vmclear()
if VMXON fails, i.e. clean up after itself.  But hardware_disable()
doesn't even do crash_disable_local_vmclear() (which is what got me
looking at that code in the first place).  Basing the VMXON change on top
of the crash cleanup avoids the debate entirely.

v2:
  - Inverted the code flow, i.e. move code from loaded_vmcs_init() to
    __loaded_vmcs_clear().  Trying to share loaded_vmcs_init() with
    alloc_loaded_vmcs() was taking more code than it saved. [Paolo]


Gory details on the crashdump bug:

I verified my analysis of the NMI bug by simulating what would happen if
an NMI arrived in the middle of list_add() and list_del().  The below
output matches expectations, e.g. nothing hangs, the entry being added
doesn't show up, and the entry being deleted _does_ show up.

[    8.205898] KVM: testing NMI in list_add()
[    8.205898] KVM: testing NMI in list_del()
[    8.205899] KVM: found e3
[    8.205899] KVM: found e2
[    8.205899] KVM: found e1
[    8.205900] KVM: found e3
[    8.205900] KVM: found e1

static void vmx_test_list(struct list_head *list, struct list_head *e1,
                          struct list_head *e2, struct list_head *e3)
{
        struct list_head *tmp;

        list_for_each(tmp, list) {
                if (tmp == e1)
                        pr_warn("KVM: found e1\n");
                else if (tmp == e2)
                        pr_warn("KVM: found e2\n");
                else if (tmp == e3)
                        pr_warn("KVM: found e3\n");
                else
                        pr_warn("KVM: kaboom\n");
        }
}

static int __init vmx_init(void)
{
        LIST_HEAD(list);
        LIST_HEAD(e1);
        LIST_HEAD(e2);
        LIST_HEAD(e3);

        pr_warn("KVM: testing NMI in list_add()\n");

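        /*
         * Simulate an NMI arriving after each of the three intermediate
         * stores of list_add(&e1, &list).  The final, publishing store,
         * WRITE_ONCE(list.next, &e1), never executes, so e1 must not be
         * visible to the forward iteration in vmx_test_list().
         */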
        list.next->prev = &e1;
        vmx_test_list(&list, &e1, &e2, &e3);

        e1.next = list.next;
        vmx_test_list(&list, &e1, &e2, &e3);

        e1.prev = &list;
        vmx_test_list(&list, &e1, &e2, &e3);

        INIT_LIST_HEAD(&list);
        INIT_LIST_HEAD(&e1);

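        /* Build a proper list; forward order from the head is e3, e2, e1. */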
        list_add(&e1, &list);
        list_add(&e2, &list);
        list_add(&e3, &list);

        pr_warn("KVM: testing NMI in list_del()\n");

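        /*
         * Simulate an NMI arriving mid-list_del(): only ->prev pointers go
         * stale, and forward iteration never consumes ->prev, so the walk
         * stays sane and a partially deleted entry is still visited.
         */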
        e3.prev = &e1;
        vmx_test_list(&list, &e1, &e2, &e3);

        list_del(&e2);
        list.prev = &e1;
        vmx_test_list(&list, &e1, &e2, &e3);
        return 0;
}

Sean Christopherson (3):
  KVM: VMX: Always VMCLEAR in-use VMCSes during crash with kexec support
  KVM: VMX: Fold loaded_vmcs_init() into alloc_loaded_vmcs()
  KVM: VMX: Gracefully handle faults on VMXON

 arch/x86/kvm/vmx/vmx.c | 103 ++++++++++++++++-------------------------
 arch/x86/kvm/vmx/vmx.h |   1 -
 2 files changed, 40 insertions(+), 64 deletions(-)

-- 
2.24.1



* [PATCH v2 1/3] KVM: VMX: Always VMCLEAR in-use VMCSes during crash with kexec support
From: Sean Christopherson @ 2020-03-21 19:37 UTC
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel

VMCLEAR all in-use VMCSes during a crash, even if kdump's NMI shootdown
interrupted a KVM update of the percpu in-use VMCS list.

Because NMIs are not blocked by disabling IRQs, it's possible that
crash_vmclear_local_loaded_vmcss() could be called while the percpu list
of VMCSes is being modified, e.g. in the middle of list_add() in
vmx_vcpu_load_vmcs().  This potential corner case was called out in the
original commit[*], but the analysis of its impact was wrong.

Skipping the VMCLEARs is wrong because it all but guarantees that a
loaded, and therefore cached, VMCS will live across kexec and corrupt
memory in the new kernel.  Corruption will occur because the CPU's VMCS
cache is non-coherent, i.e. not snooped, and so the writeback of VMCS
memory on its eviction will overwrite random memory in the new kernel.
The VMCS will live because the NMI shootdown also disables VMX, i.e. the
in-progress VMCLEAR will #UD, and existing Intel CPUs do not flush the
VMCS cache on VMXOFF.

Furthermore, interrupting list_add() and list_del() is safe due to
crash_vmclear_local_loaded_vmcss() using forward iteration.  list_add()
ensures the new entry is not visible to forward iteration unless the
entire add completes, via WRITE_ONCE(prev->next, new).  A bad "prev"
pointer could be observed if the NMI shootdown interrupted list_del() or
list_add(), but list_for_each_entry() does not consume ->prev.
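
For reference, a minimal sketch of the ordering in question, mirroring
__list_add() from include/linux/list.h (abridged; the debug sanity
check is omitted):

static inline void __list_add(struct list_head *new,
			      struct list_head *prev,
			      struct list_head *next)
{
	next->prev = new;		/* not consumed by forward iteration */
	new->next = next;		/* "new" still unreachable from head */
	new->prev = prev;
	WRITE_ONCE(prev->next, new);	/* publication point */
}

An NMI arriving before the final WRITE_ONCE() never observes "new"; an
NMI arriving after it walks a fully formed entry.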

In addition to removing the temporary disabling of VMCLEAR, open code
loaded_vmcs_init() in __loaded_vmcs_clear() and reorder VMCLEAR so that
the VMCS is deleted from the list only after it's been VMCLEAR'd.
Deleting the VMCS before VMCLEAR would allow a race where the NMI
shootdown could arrive between list_del() and vmcs_clear() and thus
neither flow would execute a successful VMCLEAR.  Alternatively, more
code could be moved into loaded_vmcs_init(), but that gets rather silly
as the only other user, alloc_loaded_vmcs(), doesn't need the smp_wmb()
and would need to work around the list_del().

Update the smp_*() comments related to the list manipulation, and
opportunistically reword them to improve clarity.

[*] https://patchwork.kernel.org/patch/1675731/#3720461

Fixes: 8f536b7697a0 ("KVM: VMX: provide the vmclear function and a bitmap to support VMCLEAR in kdump")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 67 ++++++++++--------------------------------
 1 file changed, 16 insertions(+), 51 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 07299a957d4a..efaca09455bf 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -663,43 +663,15 @@ void loaded_vmcs_init(struct loaded_vmcs *loaded_vmcs)
 }
 
 #ifdef CONFIG_KEXEC_CORE
-/*
- * This bitmap is used to indicate whether the vmclear
- * operation is enabled on all cpus. All disabled by
- * default.
- */
-static cpumask_t crash_vmclear_enabled_bitmap = CPU_MASK_NONE;
-
-static inline void crash_enable_local_vmclear(int cpu)
-{
-	cpumask_set_cpu(cpu, &crash_vmclear_enabled_bitmap);
-}
-
-static inline void crash_disable_local_vmclear(int cpu)
-{
-	cpumask_clear_cpu(cpu, &crash_vmclear_enabled_bitmap);
-}
-
-static inline int crash_local_vmclear_enabled(int cpu)
-{
-	return cpumask_test_cpu(cpu, &crash_vmclear_enabled_bitmap);
-}
-
 static void crash_vmclear_local_loaded_vmcss(void)
 {
 	int cpu = raw_smp_processor_id();
 	struct loaded_vmcs *v;
 
-	if (!crash_local_vmclear_enabled(cpu))
-		return;
-
 	list_for_each_entry(v, &per_cpu(loaded_vmcss_on_cpu, cpu),
 			    loaded_vmcss_on_cpu_link)
 		vmcs_clear(v->vmcs);
 }
-#else
-static inline void crash_enable_local_vmclear(int cpu) { }
-static inline void crash_disable_local_vmclear(int cpu) { }
 #endif /* CONFIG_KEXEC_CORE */
 
 static void __loaded_vmcs_clear(void *arg)
@@ -711,19 +683,24 @@ static void __loaded_vmcs_clear(void *arg)
 		return; /* vcpu migration can race with cpu offline */
 	if (per_cpu(current_vmcs, cpu) == loaded_vmcs->vmcs)
 		per_cpu(current_vmcs, cpu) = NULL;
-	crash_disable_local_vmclear(cpu);
+
+	vmcs_clear(loaded_vmcs->vmcs);
+	if (loaded_vmcs->shadow_vmcs && loaded_vmcs->launched)
+		vmcs_clear(loaded_vmcs->shadow_vmcs);
+
 	list_del(&loaded_vmcs->loaded_vmcss_on_cpu_link);
 
 	/*
-	 * we should ensure updating loaded_vmcs->loaded_vmcss_on_cpu_link
-	 * is before setting loaded_vmcs->vcpu to -1 which is done in
-	 * loaded_vmcs_init. Otherwise, other cpu can see vcpu = -1 fist
-	 * then adds the vmcs into percpu list before it is deleted.
+	 * Ensure all writes to loaded_vmcs, including deleting it from its
+	 * current percpu list, complete before setting loaded_vmcs->vcpu to
+	 * -1, otherwise a different cpu can see vcpu == -1 first and add
+	 * loaded_vmcs to its percpu list before it's deleted from this cpu's
+	 * list. Pairs with the smp_rmb() in vmx_vcpu_load_vmcs().
 	 */
 	smp_wmb();
 
-	loaded_vmcs_init(loaded_vmcs);
-	crash_enable_local_vmclear(cpu);
+	loaded_vmcs->cpu = -1;
+	loaded_vmcs->launched = 0;
 }
 
 void loaded_vmcs_clear(struct loaded_vmcs *loaded_vmcs)
@@ -1342,18 +1319,17 @@ void vmx_vcpu_load_vmcs(struct kvm_vcpu *vcpu, int cpu)
 	if (!already_loaded) {
 		loaded_vmcs_clear(vmx->loaded_vmcs);
 		local_irq_disable();
-		crash_disable_local_vmclear(cpu);
 
 		/*
-		 * Read loaded_vmcs->cpu should be before fetching
-		 * loaded_vmcs->loaded_vmcss_on_cpu_link.
-		 * See the comments in __loaded_vmcs_clear().
+		 * Ensure loaded_vmcs->cpu is read before adding loaded_vmcs to
+		 * this cpu's percpu list, otherwise it may not yet be deleted
+		 * from its previous cpu's percpu list.  Pairs with the
+		 * smp_wmb() in __loaded_vmcs_clear().
 		 */
 		smp_rmb();
 
 		list_add(&vmx->loaded_vmcs->loaded_vmcss_on_cpu_link,
 			 &per_cpu(loaded_vmcss_on_cpu, cpu));
-		crash_enable_local_vmclear(cpu);
 		local_irq_enable();
 	}
 
@@ -2279,17 +2255,6 @@ static int hardware_enable(void)
 	INIT_LIST_HEAD(&per_cpu(blocked_vcpu_on_cpu, cpu));
 	spin_lock_init(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));
 
-	/*
-	 * Now we can enable the vmclear operation in kdump
-	 * since the loaded_vmcss_on_cpu list on this cpu
-	 * has been initialized.
-	 *
-	 * Though the cpu is not in VMX operation now, there
-	 * is no problem to enable the vmclear operation
-	 * for the loaded_vmcss_on_cpu list is empty!
-	 */
-	crash_enable_local_vmclear(cpu);
-
 	kvm_cpu_vmxon(phys_addr);
 	if (enable_ept)
 		ept_sync_global();
-- 
2.24.1



* [PATCH v2 2/3] KVM: VMX: Fold loaded_vmcs_init() into alloc_loaded_vmcs()
From: Sean Christopherson @ 2020-03-21 19:37 UTC
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel

Subsume loaded_vmcs_init() into alloc_loaded_vmcs(), its only remaining
caller, and drop the VMCLEAR on the shadow VMCS, which is guaranteed to
be NULL.  loaded_vmcs_init() was previously used by loaded_vmcs_clear(),
but loaded_vmcs_clear() also subsumed loaded_vmcs_init() to properly
handle smp_wmb() with respect to VMCLEAR.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 14 ++++----------
 arch/x86/kvm/vmx/vmx.h |  1 -
 2 files changed, 4 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index efaca09455bf..07634caa560d 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -653,15 +653,6 @@ static int vmx_set_guest_msr(struct vcpu_vmx *vmx, struct shared_msr_entry *msr,
 	return ret;
 }
 
-void loaded_vmcs_init(struct loaded_vmcs *loaded_vmcs)
-{
-	vmcs_clear(loaded_vmcs->vmcs);
-	if (loaded_vmcs->shadow_vmcs && loaded_vmcs->launched)
-		vmcs_clear(loaded_vmcs->shadow_vmcs);
-	loaded_vmcs->cpu = -1;
-	loaded_vmcs->launched = 0;
-}
-
 #ifdef CONFIG_KEXEC_CORE
 static void crash_vmclear_local_loaded_vmcss(void)
 {
@@ -2555,9 +2546,12 @@ int alloc_loaded_vmcs(struct loaded_vmcs *loaded_vmcs)
 	if (!loaded_vmcs->vmcs)
 		return -ENOMEM;
 
+	vmcs_clear(loaded_vmcs->vmcs);
+
 	loaded_vmcs->shadow_vmcs = NULL;
 	loaded_vmcs->hv_timer_soft_disabled = false;
-	loaded_vmcs_init(loaded_vmcs);
+	loaded_vmcs->cpu = -1;
+	loaded_vmcs->launched = 0;
 
 	if (cpu_has_vmx_msr_bitmap()) {
 		loaded_vmcs->msr_bitmap = (unsigned long *)
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index be93d597306c..79d38f41ef7a 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -492,7 +492,6 @@ struct vmcs *alloc_vmcs_cpu(bool shadow, int cpu, gfp_t flags);
 void free_vmcs(struct vmcs *vmcs);
 int alloc_loaded_vmcs(struct loaded_vmcs *loaded_vmcs);
 void free_loaded_vmcs(struct loaded_vmcs *loaded_vmcs);
-void loaded_vmcs_init(struct loaded_vmcs *loaded_vmcs);
 void loaded_vmcs_clear(struct loaded_vmcs *loaded_vmcs);
 
 static inline struct vmcs *alloc_vmcs(bool shadow)
-- 
2.24.1



* [PATCH v2 3/3] KVM: VMX: Gracefully handle faults on VMXON
From: Sean Christopherson @ 2020-03-21 19:37 UTC
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel

Gracefully handle faults on VMXON, e.g. #GP due to VMX being disabled by
BIOS, instead of letting the fault crash the system.  Now that KVM uses
cpufeatures to query support instead of reading MSR_IA32_FEAT_CTL
directly, it's possible for a bug in a different subsystem to cause KVM
to incorrectly attempt VMXON[*].  Crashing the system is especially
annoying if the system is configured such that hardware_enable() will
be triggered during boot.

Opportunistically rename @addr to @vmxon_pointer and use a named param
to reference it in the inline assembly.

Print 0xdeadbeef in the ultra-"rare" case that reading MSR_IA32_FEAT_CTL
also faults.

[*] https://lkml.kernel.org/r/20200226231615.13664-1-sean.j.christopherson@intel.com
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 07634caa560d..3aba51d782e2 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2218,18 +2218,33 @@ static __init int vmx_disabled_by_bios(void)
 	       !boot_cpu_has(X86_FEATURE_VMX);
 }
 
-static void kvm_cpu_vmxon(u64 addr)
+static int kvm_cpu_vmxon(u64 vmxon_pointer)
 {
+	u64 msr;
+
 	cr4_set_bits(X86_CR4_VMXE);
 	intel_pt_handle_vmx(1);
 
-	asm volatile ("vmxon %0" : : "m"(addr));
+	asm_volatile_goto("1: vmxon %[vmxon_pointer]\n\t"
+			  _ASM_EXTABLE(1b, %l[fault])
+			  : : [vmxon_pointer] "m"(vmxon_pointer)
+			  : : fault);
+	return 0;
+
+fault:
+	WARN_ONCE(1, "VMXON faulted, MSR_IA32_FEAT_CTL (0x3a) = 0x%llx\n",
+		  rdmsrl_safe(MSR_IA32_FEAT_CTL, &msr) ? 0xdeadbeef : msr);
+	intel_pt_handle_vmx(0);
+	cr4_clear_bits(X86_CR4_VMXE);
+
+	return -EFAULT;
 }
 
 static int hardware_enable(void)
 {
 	int cpu = raw_smp_processor_id();
 	u64 phys_addr = __pa(per_cpu(vmxarea, cpu));
+	int r;
 
 	if (cr4_read_shadow() & X86_CR4_VMXE)
 		return -EBUSY;
@@ -2246,7 +2261,10 @@ static int hardware_enable(void)
 	INIT_LIST_HEAD(&per_cpu(blocked_vcpu_on_cpu, cpu));
 	spin_lock_init(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));
 
-	kvm_cpu_vmxon(phys_addr);
+	r = kvm_cpu_vmxon(phys_addr);
+	if (r)
+		return r;
+
 	if (enable_ept)
 		ept_sync_global();
 
-- 
2.24.1



* Re: [PATCH v2 2/3] KVM: VMX: Fold loaded_vmcs_init() into alloc_loaded_vmcs()
From: Vitaly Kuznetsov @ 2020-03-22 13:08 UTC
  To: Sean Christopherson
  Cc: Paolo Bonzini, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm, linux-kernel

Sean Christopherson <sean.j.christopherson@intel.com> writes:

> Subsume loaded_vmcs_init() into alloc_loaded_vmcs(), its only remaining
> caller, and drop the VMCLEAR on the shadow VMCS, which is guaranteed to
> be NULL.  loaded_vmcs_init() was previously used by loaded_vmcs_clear(),
> but loaded_vmcs_clear() also subsumed loaded_vmcs_init() to properly
> handle smp_wmb() with respect to VMCLEAR.
>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> [...]

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

-- 
Vitaly



* Re: [PATCH v2 3/3] KVM: VMX: Gracefully handle faults on VMXON
From: Vitaly Kuznetsov @ 2020-03-22 13:37 UTC
  To: Sean Christopherson
  Cc: Paolo Bonzini, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm, linux-kernel

Sean Christopherson <sean.j.christopherson@intel.com> writes:

> Gracefully handle faults on VMXON, e.g. #GP due to VMX being disabled by
> BIOS, instead of letting the fault crash the system.  Now that KVM uses
> cpufeatures to query support instead of reading MSR_IA32_FEAT_CTL
> directly, it's possible for a bug in a different subsystem to cause KVM
> to incorrectly attempt VMXON[*].  Crashing the system is especially
> annoying if the system is configured such that hardware_enable() will
> be triggered during boot.
>
> Opportunistically rename @addr to @vmxon_pointer and use a named param
> to reference it in the inline assembly.
>
> Print 0xdeadbeef in the ultra-"rare" case that reading MSR_IA32_FEAT_CTL
> also faults.
>
> [*] https://lkml.kernel.org/r/20200226231615.13664-1-sean.j.christopherson@intel.com
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 24 +++++++++++++++++++++---
>  1 file changed, 21 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 07634caa560d..3aba51d782e2 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2218,18 +2218,33 @@ static __init int vmx_disabled_by_bios(void)
>  	       !boot_cpu_has(X86_FEATURE_VMX);
>  }
>  
> -static void kvm_cpu_vmxon(u64 addr)
> +static int kvm_cpu_vmxon(u64 vmxon_pointer)
>  {
> +	u64 msr;
> +
>  	cr4_set_bits(X86_CR4_VMXE);
>  	intel_pt_handle_vmx(1);
>  
> -	asm volatile ("vmxon %0" : : "m"(addr));
> +	asm_volatile_goto("1: vmxon %[vmxon_pointer]\n\t"
> +			  _ASM_EXTABLE(1b, %l[fault])
> +			  : : [vmxon_pointer] "m"(vmxon_pointer)
> +			  : : fault);
> +	return 0;
> +
> +fault:
> +	WARN_ONCE(1, "VMXON faulted, MSR_IA32_FEAT_CTL (0x3a) = 0x%llx\n",
> +		  rdmsrl_safe(MSR_IA32_FEAT_CTL, &msr) ? 0xdeadbeef : msr);

We seem to be acting under the assumption that the fault is (likely)
caused by VMX being disabled, but afaics the fault can also be caused
by passing a bogus pointer (though that would be a KVM bug, of course).

> +	intel_pt_handle_vmx(0);
> +	cr4_clear_bits(X86_CR4_VMXE);
> +
> +	return -EFAULT;
>  }
>  
> [...]

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

-- 
Vitaly



* Re: [PATCH v2 0/3] KVM: VMX: Fix for kexec VMCLEAR and VMXON cleanup
From: Baoquan He @ 2020-04-07 11:01 UTC
  To: Paolo Bonzini, Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, kexec, dzickus, dyoung

On 03/21/20 at 12:37pm, Sean Christopherson wrote:
> Patch 1 fixes a theoretical bug where a crashdump NMI that arrives
> while KVM is messing with the percpu VMCS list would result in one or more
> VMCSes not being cleared, potentially causing memory corruption in the new
> kexec'd kernel.

I am wondering if this theoretical bug really exists. Red Hat's CKI
reported that crash dumping fails on the commit below. I reserved an
Intel machine and can reproduce the failure every time.

Commit: 5364abc57993 - Merge tag 'arc-5.7-rc1' of

From the failure trace, it's the KVM VMX changes that caused this. I
reverted them and the failure disappeared.

4f6ea0a87608 KVM: VMX: Gracefully handle faults on VMXON
d260f9ef50c7 KVM: VMX: Fold loaded_vmcs_init() into alloc_loaded_vmcs()
31603d4fc2bb KVM: VMX: Always VMCLEAR in-use VMCSes during crash with kexec support

The trace is here. 

[  132.476387] sysrq: Trigger a crash 
[  132.479829] Kernel panic - not syncing: sysrq triggered crash 
[  132.480817] CPU: 4 PID: 1975 Comm: runtest.sh Kdump: loaded Not tainted 5.6.0-5364abc.cki #1 
[  132.480817] Hardware name: Dell Inc. Precision R7610/, BIOS A08 11/21/2014 
[  132.480817] Call Trace: 
[  132.480817]  dump_stack+0x66/0x90 
[  132.480817]  panic+0x101/0x2e3 
[  132.480817]  ? printk+0x58/0x6f 
[  132.480817]  sysrq_handle_crash+0x11/0x20 
[  132.480817]  __handle_sysrq.cold+0x48/0x104 
[  132.480817]  write_sysrq_trigger+0x24/0x40 
[  132.480817]  proc_reg_write+0x3c/0x60 
[  132.480817]  vfs_write+0xb6/0x1a0 
[  132.480817]  ksys_write+0x5f/0xe0 
[  132.480817]  do_syscall_64+0x5b/0x1c0 
[  132.480817]  entry_SYSCALL_64_after_hwframe+0x44/0xa9 
[  132.480817] RIP: 0033:0x7f205575d4b7 
[  132.480817] Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24 
[  132.480817] RSP: 002b:00007ffe6ab44a38 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 
[  132.480817] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f205575d4b7 
[  132.480817] RDX: 0000000000000002 RSI: 000055e2876e1da0 RDI: 0000000000000001 
[  132.480817] RBP: 000055e2876e1da0 R08: 000000000000000a R09: 0000000000000001 
[  132.480817] R10: 000055e2879bf930 R11: 0000000000000246 R12: 0000000000000002 
[  132.480817] R13: 00007f205582e500 R14: 0000000000000002 R15: 00007f205582e700 
[  132.480817] BUG: unable to handle page fault for address: ffffffffffffffc8 
[  132.480817] #PF: supervisor read access in kernel mode 
[  132.480817] #PF: error_code(0x0000) - not-present page 
[  132.480817] PGD 14460f067 P4D 14460f067 PUD 144611067 PMD 0  
[  132.480817] Oops: 0000 [#12] SMP PTI 
[  132.480817] CPU: 4 PID: 1975 Comm: runtest.sh Kdump: loaded Tainted: G      D           5.6.0-5364abc.cki #1 
[  132.480817] Hardware name: Dell Inc. Precision R7610/, BIOS A08 11/21/2014 
[  132.480817] RIP: 0010:crash_vmclear_local_loaded_vmcss+0x57/0xd0 [kvm_intel] 
[  132.480817] Code: c7 c5 40 e0 02 00 4a 8b 04 f5 80 69 42 85 48 8b 14 28 48 01 e8 48 39 c2 74 4d 4c 8d 6a c8 bb 00 00 00 80 49 c7 c4 00 00 00 80 <49> 8b 7d 00 48 89 fe 48 01 de 72 5c 4c 89 e0 48 2b 05 13 a8 88 c4 
[  132.480817] RSP: 0018:ffffa85d435dfcb0 EFLAGS: 00010007 
[  132.480817] RAX: ffff9c82ebb2e040 RBX: 0000000080000000 RCX: 000000000000080f 
[  132.480817] RDX: 0000000000000000 RSI: 00000000000000ff RDI: 000000000000080f 
[  132.480817] RBP: 000000000002e040 R08: 000000717702c90c R09: 0000000000000004 
[  132.480817] R10: 0000000000000009 R11: 0000000000000000 R12: ffffffff80000000 
[  132.480817] R13: ffffffffffffffc8 R14: 0000000000000004 R15: 0000000000000000 
[  132.480817] FS:  00007f2055668740(0000) GS:ffff9c82ebb00000(0000) knlGS:0000000000000000 
[  132.480817] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 
[  132.480817] CR2: ffffffffffffffc8 CR3: 0000000151904003 CR4: 00000000000606e0 
[  132.480817] Call Trace: 
[  132.480817]  native_machine_crash_shutdown+0x45/0x190 
[  132.480817]  __crash_kexec+0x61/0x130 
[  132.480817]  ? __crash_kexec+0x9a/0x130 
[  132.480817]  ? panic+0x11d/0x2e3 
[  132.480817]  ? printk+0x58/0x6f 
[  132.480817]  ? sysrq_handle_crash+0x11/0x20 
[  132.480817]  ? __handle_sysrq.cold+0x48/0x104 
[  132.480817]  ? write_sysrq_trigger+0x24/0x40 
[  132.480817]  ? proc_reg_write+0x3c/0x60 
[  132.480817]  ? vfs_write+0xb6/0x1a0 
[  132.480817]  ? ksys_write+0x5f/0xe0 
[  132.480817]  ? do_syscall_64+0x5b/0x1c0 
[  132.480817]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9 

> [...]



* Re: [PATCH v2 0/3] KVM: VMX: Fix for kexec VMCLEAR and VMXON cleanup
From: Vitaly Kuznetsov @ 2020-04-07 12:04 UTC
  To: Baoquan He
  Cc: Wanpeng Li, Jim Mattson, Joerg Roedel, kvm, linux-kernel, kexec,
	dzickus, dyoung, Paolo Bonzini, Sean Christopherson

Baoquan He <bhe@redhat.com> writes:

>
> The trace is here. 
>
> [  132.480817] RIP: 0010:crash_vmclear_local_loaded_vmcss+0x57/0xd0 [kvm_intel] 

This is a known bug,

https://lore.kernel.org/kvm/20200401081348.1345307-1-vkuznets@redhat.com/

-- 
Vitaly



* Re: [PATCH v2 0/3] KVM: VMX: Fix for kexec VMCLEAR and VMXON cleanup
From: Baoquan He @ 2020-04-08 15:18 UTC
  To: Vitaly Kuznetsov
  Cc: Wanpeng Li, Jim Mattson, Joerg Roedel, kvm, linux-kernel, kexec,
	dzickus, dyoung, Paolo Bonzini, Sean Christopherson

On 04/07/20 at 02:04pm, Vitaly Kuznetsov wrote:
> Baoquan He <bhe@redhat.com> writes:
> 
> >
> > The trace is here. 
> >
> > [  132.480817] RIP: 0010:crash_vmclear_local_loaded_vmcss+0x57/0xd0 [kvm_intel] 
> 
> This is a known bug,
> 
> https://lore.kernel.org/kvm/20200401081348.1345307-1-vkuznets@redhat.com/

Thanks for telling me, Vitaly.

I tested your patch; it works.

One thing I noticed is a warning message when your patch is applied. When
I reverted this patchset, I didn't see this message. I haven't looked into
the details of the network core code or the KVM VMX code, so maybe it's
not relevant.


[ 3708.629234] Type was not set for devlink port.
[ 3708.629258] WARNING: CPU: 3 PID: 60 at net/core/devlink.c:7164 devlink_port_type_warn+0x11/0x20
[ 3708.632328] Modules linked in: rfkill sunrpc intel_powerclamp coretemp kvm_intel kvm irqbypass intel_cstate iTCO_wdt hpwdt intel_uncore gpio_ich iTCO_vendor_support pcspkr ipmi_ssif hpilo lpc_ich ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter pcc_cpufreq i7core_edac ip_tables xfs libcrc32c radeon i2c_algo_bit drm_kms_helper cec ttm crc32c_intel serio_raw drm ata_generic pata_acpi mlx4_core bnx2 hpsa scsi_transport_sas
[ 3708.640782] CPU: 3 PID: 60 Comm: kworker/3:1 Kdump: loaded Tainted: G          I       5.6.0+ #1
[ 3708.642715] Hardware name: HP ProLiant DL380 G6, BIOS P62 08/16/2015
[ 3708.644222] Workqueue: events devlink_port_type_warn
[ 3708.645349] RIP: 0010:devlink_port_type_warn+0x11/0x20
[ 3708.646535] Code: 0f 84 58 ff ff ff 0f 0b e9 51 ff ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 90 66 66 66 66 90 48 c7 c7 c8 78 41 9d e8 21 c4 7f ff <0f> 0b c3 66 66 2e 0f 1f 84 00 00 00 00 00 90 66 66 66 66 90 ba 20
[ 3708.650924] RSP: 0018:ffffadd540307e78 EFLAGS: 00010286
[ 3708.652257] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 3708.653889] RDX: 0000000000000002 RSI: ffffffff9df3e3c2 RDI: 0000000000000246
[ 3708.655601] RBP: ffff9e0f8c1a7350 R08: 0000035f7b873491 R09: 0000000000000022
[ 3708.657300] R10: ffffadd540307cc0 R11: 0000000000000000 R12: ffff9e0f974ea880
[ 3708.658952] R13: ffff9e0f974f0700 R14: ffff9e0f958a4d80 R15: ffff9e0f8c1a7358
[ 3708.660645] FS:  0000000000000000(0000) GS:ffff9e0f974c0000(0000) knlGS:0000000000000000
[ 3708.662472] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3708.663719] CR2: 00007f2651a02fa8 CR3: 00000000ac60a000 CR4: 00000000000006e0
[ 3708.665330] Call Trace:
[ 3708.665840]  process_one_work+0x1b4/0x380
[ 3708.666841]  worker_thread+0x50/0x3c0
[ 3708.667751]  kthread+0xf9/0x130
[ 3708.668495]  ? process_one_work+0x380/0x380
[ 3708.669536]  ? kthread_park+0x90/0x90
[ 3708.670402]  ret_from_fork+0x35/0x40
[ 3708.671265] ---[ end trace 0fbe622d402fe0cf ]---
[ 3708.672351] ------------[ cut here ]------------
[ 3708.673507] Type was not set for devlink port.
[ 3708.673520] WARNING: CPU: 3 PID: 60 at net/core/devlink.c:7164 devlink_port_type_warn+0x11/0x20
[ 3708.676731] Modules linked in: rfkill sunrpc intel_powerclamp coretemp kvm_intel kvm irqbypass intel_cstate iTCO_wdt hpwdt intel_uncore gpio_ich iTCO_vendor_support pcspkr ipmi_ssif hpilo lpc_ich ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter pcc_cpufreq i7core_edac ip_tables xfs libcrc32c radeon i2c_algo_bit drm_kms_helper cec ttm crc32c_intel serio_raw drm ata_generic pata_acpi mlx4_core bnx2 hpsa scsi_transport_sas
[ 3708.686060] CPU: 3 PID: 60 Comm: kworker/3:1 Kdump: loaded Tainted: G        W I       5.6.0+ #1
[ 3708.688185] Hardware name: HP ProLiant DL380 G6, BIOS P62 08/16/2015
[ 3708.689575] Workqueue: events devlink_port_type_warn
[ 3708.690676] RIP: 0010:devlink_port_type_warn+0x11/0x20
[ 3708.691794] Code: 0f 84 58 ff ff ff 0f 0b e9 51 ff ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 90 66 66 66 66 90 48 c7 c7 c8 78 41 9d e8 21 c4 7f ff <0f> 0b c3 66 66 2e 0f 1f 84 00 00 00 00 00 90 66 66 66 66 90 ba 20
[ 3708.696053] RSP: 0018:ffffadd540307e78 EFLAGS: 00010286
[ 3708.697212] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 3708.698800] RDX: 0000000000000002 RSI: ffffffff9df3e3c2 RDI: 0000000000000246
[ 3708.700503] RBP: ffff9e0f8c1a5d00 R08: 0000035f7e2abfe6 R09: 0000000000000022
[ 3708.702228] R10: ffffadd540307cc0 R11: 0000000000000000 R12: ffff9e0f974ea880
[ 3708.703778] R13: ffff9e0f974f0700 R14: ffff9e0f958a4d80 R15: ffff9e0f8c1a5d08
[ 3708.705410] FS:  0000000000000000(0000) GS:ffff9e0f974c0000(0000) knlGS:0000000000000000
[ 3708.707214] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3708.708462] CR2: 00007f2651a02fa8 CR3: 00000000ac60a000 CR4: 00000000000006e0
[ 3708.710068] Call Trace:
[ 3708.710581]  process_one_work+0x1b4/0x380
[ 3708.711466]  worker_thread+0x50/0x3c0
[ 3708.712307]  kthread+0xf9/0x130
[ 3708.713107]  ? process_one_work+0x380/0x380
[ 3708.714132]  ? kthread_park+0x90/0x90
[ 3708.714938]  ret_from_fork+0x35/0x40
[ 3708.715782] ---[ end trace 0fbe622d402fe0d0 ]---
[10563.256223] perf: interrupt took too long (2566 > 2500), lowering kernel.perf_event_max_sample_rate to 77000
[14719.477168] perf: interrupt took too long (3546 > 3207), lowering kernel.perf_event_max_sample_rate to 56000



* Re: [PATCH v2 0/3] KVM: VMX: Fix for kexec VMCLEAR and VMXON cleanup
From: Vitaly Kuznetsov @ 2020-04-08 19:44 UTC
  To: Baoquan He
  Cc: Wanpeng Li, Jim Mattson, Joerg Roedel, kvm, linux-kernel, kexec,
	dzickus, dyoung, Paolo Bonzini, Sean Christopherson

Baoquan He <bhe@redhat.com> writes:

> On 04/07/20 at 02:04pm, Vitaly Kuznetsov wrote:
>> Baoquan He <bhe@redhat.com> writes:
>> 
>> >
>> > The trace is here. 
>> >
>> > [  132.480817] RIP: 0010:crash_vmclear_local_loaded_vmcss+0x57/0xd0 [kvm_intel] 
>> 
>> This is a known bug,
>> 
>> https://lore.kernel.org/kvm/20200401081348.1345307-1-vkuznets@redhat.com/
>
> Thanks for telling me, Vitaly.
>
> I tested your patch; it works.
>
> One thing I noticed is a warning message when your patch is applied. When
> I reverted this patchset, I didn't see this message. I haven't looked into
> the details of the network core code or the KVM VMX code, so maybe it's
> not relevant.
>
>
> [ 3708.629234] Type was not set for devlink port.
> [ 3708.629258] WARNING: CPU: 3 PID: 60 at net/core/devlink.c:7164 devlink_port_type_warn+0x11/0x20
> [ 3708.632328] Modules linked in: rfkill sunrpc intel_powerclamp coretemp kvm_intel kvm irqbypass intel_cstate iTCO_wdt hpwdt intel_uncore gpio_ich iTCO_vendor_support pcspkr ipmi_ssif hpilo lpc_ich ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter pcc_cpufreq i7core_edac ip_tables xfs libcrc32c radeon i2c_algo_bit drm_kms_helper cec ttm crc32c_intel serio_raw drm ata_generic pata_acpi mlx4_core bnx2 hpsa scsi_transport_sas
> [ 3708.640782] CPU: 3 PID: 60 Comm: kworker/3:1 Kdump: loaded Tainted: G          I       5.6.0+ #1
> [ 3708.642715] Hardware name: HP ProLiant DL380 G6, BIOS P62 08/16/2015
> [ 3708.644222] Workqueue: events devlink_port_type_warn
> [ 3708.645349] RIP: 0010:devlink_port_type_warn+0x11/0x20

What's in the patchset you're testing? Is it Sean's series + my patch,
or just my patch? In case it's the latter, I'm having a hard time seeing
how this can be related, but in case it's the former, the fact that we
do things a little bit differently on kexec may actually be triggering
the issue above. I still think it's not causing the issue, just
triggering it.

-- 
Vitaly



* Re: [PATCH v2 0/3] KVM: VMX: Fix for kexec VMCLEAR and VMXON cleanup
From: Baoquan He @ 2020-04-09  1:20 UTC
  To: Vitaly Kuznetsov
  Cc: Wanpeng Li, Jim Mattson, Joerg Roedel, kvm, linux-kernel, kexec,
	dzickus, dyoung, Paolo Bonzini, Sean Christopherson

On 04/08/20 at 09:44pm, Vitaly Kuznetsov wrote:
> Baoquan He <bhe@redhat.com> writes:
> 
> > On 04/07/20 at 02:04pm, Vitaly Kuznetsov wrote:
> >> Baoquan He <bhe@redhat.com> writes:
> >> 
> >> >
> >> > The trace is here. 
> >> >
> >> > [  132.480817] RIP: 0010:crash_vmclear_local_loaded_vmcss+0x57/0xd0 [kvm_intel] 
> >> 
> >> This is a known bug,
> >> 
> >> https://lore.kernel.org/kvm/20200401081348.1345307-1-vkuznets@redhat.com/
> >
> > Thanks for telling me, Vitaly.
> >
> > I tested your patch; it works.
> >
> > One thing I noticed is a warning message when your patch is applied. When
> > I reverted this patchset, I didn't see this message. I haven't looked into
> > the details of the network core code or the KVM VMX code, so maybe it's
> > not relevant.
> >
> >
> > [ 3708.629234] Type was not set for devlink port.
> > [ 3708.629258] WARNING: CPU: 3 PID: 60 at net/core/devlink.c:7164 devlink_port_type_warn+0x11/0x20
> > [ 3708.632328] Modules linked in: rfkill sunrpc intel_powerclamp coretemp kvm_intel kvm irqbypass intel_cstate iTCO_wdt hpwdt intel_uncore gpio_ich iTCO_vendor_support pcspkr ipmi_ssif hpilo lpc_ich ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter pcc_cpufreq i7core_edac ip_tables xfs libcrc32c radeon i2c_algo_bit drm_kms_helper cec ttm crc32c_intel serio_raw drm ata_generic pata_acpi mlx4_core bnx2 hpsa scsi_transport_sas
> > [ 3708.640782] CPU: 3 PID: 60 Comm: kworker/3:1 Kdump: loaded Tainted: G          I       5.6.0+ #1
> > [ 3708.642715] Hardware name: HP ProLiant DL380 G6, BIOS P62 08/16/2015
> > [ 3708.644222] Workqueue: events devlink_port_type_warn
> > [ 3708.645349] RIP: 0010:devlink_port_type_warn+0x11/0x20
> 
> What's in the patchset you're testing? Is it Sean's series + my patch,
> or just my patch? In case it's the latter, I'm having a hard time seeing
> how this can be related, but in case it's the former, the fact that we
> do things a little bit differently on kexec may actually be triggering
> the issue above. I still think it's not causing the issue, just
> triggering it.

I am testing on Linus's tree; this patchset is already there. I either
reverted the patchset or applied your patch on top of it. Both work.
The devlink warning message is not related to this issue, because I saw
it even with the patchset reverted.

I would suggest adding kexec@lists.infradead.org when code changes are
related to kexec/kdump, since we usually watch that mailing list. LKML
carries too much mail; we may miss this kind of change and then have to
debug and test all over again. Thanks.

Baoquan



* Re: [PATCH v2 0/3] KVM: VMX: Fix for kexec VMCLEAR and VMXON cleanup
From: Vitaly Kuznetsov @ 2020-04-09 11:14 UTC
  To: Baoquan He
  Cc: Wanpeng Li, Jim Mattson, Joerg Roedel, kvm, linux-kernel, kexec,
	dzickus, dyoung, Paolo Bonzini, Sean Christopherson

Baoquan He <bhe@redhat.com> writes:

>
> I would suggest adding kexec@lists.infradead.org when code changes are
> related to kexec/kdump, since we usually watch that mailing list. LKML
> carries too much mail; we may miss this kind of change and then have to
> debug and test all over again.
>

Definitely makes sense, and I'll try my best to remember to do this
myself next time, but the problem is that scripts/checkpatch.pl is not
smart enough: kexec-related bits are scattered all over the kernel and
drivers, so I'm afraid you're missing a lot in kexec@ :-(

-- 
Vitaly



* Re: [PATCH v2 0/3] KVM: VMX: Fix for kexec VMCLEAR and VMXON cleanup
From: Baoquan He @ 2020-04-09 12:57 UTC
  To: Vitaly Kuznetsov
  Cc: Wanpeng Li, Jim Mattson, Joerg Roedel, kvm, linux-kernel, kexec,
	dzickus, dyoung, Paolo Bonzini, Sean Christopherson

On 04/09/20 at 01:14pm, Vitaly Kuznetsov wrote:
> Baoquan He <bhe@redhat.com> writes:
> 
> >
> > I would suggest adding kexec@lists.infradead.org when code changes are
> > related to kexec/kdump, since we usually watch that mailing list. LKML
> > carries too much mail; we may miss this kind of change and then have to
> > debug and test all over again.
> >
> 
> Definitely makes sense, and I'll try my best to remember to do this
> myself next time, but the problem is that scripts/checkpatch.pl is not
> smart enough: kexec-related bits are scattered all over the kernel and
> drivers, so I'm afraid you're missing a lot in kexec@ :-(

Yeah, understood, thanks.


