* [PATCH] KVM: VMX: fix crash cleanup when KVM wasn't used
@ 2020-04-01 8:13 Vitaly Kuznetsov
2020-04-01 15:18 ` Sean Christopherson
` (3 more replies)
0 siblings, 4 replies; 12+ messages in thread
From: Vitaly Kuznetsov @ 2020-04-01 8:13 UTC (permalink / raw)
To: Paolo Bonzini, Sean Christopherson
Cc: Jim Mattson, Wanpeng Li, kvm, linux-kernel
If KVM wasn't used at all before the crash, the cleanup procedure fails with:
BUG: unable to handle page fault for address: ffffffffffffffc8
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 23215067 P4D 23215067 PUD 23217067 PMD 0
Oops: 0000 [#8] SMP PTI
CPU: 0 PID: 3542 Comm: bash Kdump: loaded Tainted: G D 5.6.0-rc2+ #823
RIP: 0010:crash_vmclear_local_loaded_vmcss.cold+0x19/0x51 [kvm_intel]
The root cause is that the loaded_vmcss_on_cpu list is not yet initialized:
it is initialized in hardware_enable(), which only runs when the first VM
is started.
Previously, a bitmap of enabled CPUs masked the issue.
Initialize the loaded_vmcss_on_cpu list earlier, right before the
crash_vmclear_loaded_vmcss pointer is assigned. The blocked_vcpu_on_cpu
list and blocked_vcpu_on_cpu_lock are moved along with it for consistency.
Fixes: 31603d4fc2bb ("KVM: VMX: Always VMCLEAR in-use VMCSes during crash with kexec support")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
arch/x86/kvm/vmx/vmx.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 3aba51d782e2..39a5dde12b79 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2257,10 +2257,6 @@ static int hardware_enable(void)
!hv_get_vp_assist_page(cpu))
return -EFAULT;
- INIT_LIST_HEAD(&per_cpu(loaded_vmcss_on_cpu, cpu));
- INIT_LIST_HEAD(&per_cpu(blocked_vcpu_on_cpu, cpu));
- spin_lock_init(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));
-
r = kvm_cpu_vmxon(phys_addr);
if (r)
return r;
@@ -8006,7 +8002,7 @@ module_exit(vmx_exit);
static int __init vmx_init(void)
{
- int r;
+ int r, cpu;
#if IS_ENABLED(CONFIG_HYPERV)
/*
@@ -8060,6 +8056,12 @@ static int __init vmx_init(void)
return r;
}
+ for_each_possible_cpu(cpu) {
+ INIT_LIST_HEAD(&per_cpu(loaded_vmcss_on_cpu, cpu));
+ INIT_LIST_HEAD(&per_cpu(blocked_vcpu_on_cpu, cpu));
+ spin_lock_init(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));
+ }
+
#ifdef CONFIG_KEXEC_CORE
rcu_assign_pointer(crash_vmclear_loaded_vmcss,
crash_vmclear_local_loaded_vmcss);
--
2.25.1
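The ordering the patch establishes can be illustrated with a small userspace sketch. This is a hypothetical model, not kernel code: NR_CPUS, the plain array standing in for per-CPU storage, and the `_sketch` helper name are illustrative assumptions; only the initialize-before-publish ordering mirrors the patch.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4

/* Minimal circular-list head, mirroring the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Stand-in for the per-CPU loaded_vmcss_on_cpu variables. */
static struct list_head loaded_vmcss_on_cpu[NR_CPUS];

/* Stand-in for the crash_vmclear_loaded_vmcss function pointer. */
static void (*crash_vmclear_cb)(int cpu);

/* Walks the per-CPU list. On a zero-filled, never-initialized head,
 * head->next is NULL and the walk faults; that is the crash fixed here. */
static void crash_vmclear_local_loaded_vmcss(int cpu)
{
	struct list_head *pos;

	for (pos = loaded_vmcss_on_cpu[cpu].next;
	     pos != &loaded_vmcss_on_cpu[cpu]; pos = pos->next)
		; /* a real implementation would VMCLEAR each entry */
}

/* Mirrors the patched vmx_init(): initialize every per-CPU list first,
 * and only then publish the crash callback. */
static void vmx_init_sketch(void)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		INIT_LIST_HEAD(&loaded_vmcss_on_cpu[cpu]);

	crash_vmclear_cb = crash_vmclear_local_loaded_vmcss;
}
```

With this ordering, any observer that sees the published callback is guaranteed to see initialized lists, which is exactly why the patch moves the INIT_LIST_HEAD() calls ahead of the rcu_assign_pointer().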
* Re: [PATCH] KVM: VMX: fix crash cleanup when KVM wasn't used
2020-04-01 8:13 [PATCH] KVM: VMX: fix crash cleanup when KVM wasn't used Vitaly Kuznetsov
@ 2020-04-01 15:18 ` Sean Christopherson
2020-04-07 12:35 ` Paolo Bonzini
` (2 subsequent siblings)
3 siblings, 0 replies; 12+ messages in thread
From: Sean Christopherson @ 2020-04-01 15:18 UTC (permalink / raw)
To: Vitaly Kuznetsov
Cc: Paolo Bonzini, Jim Mattson, Wanpeng Li, kvm, linux-kernel
On Wed, Apr 01, 2020 at 10:13:48AM +0200, Vitaly Kuznetsov wrote:
> If KVM wasn't used at all before the crash, the cleanup procedure fails with:
> BUG: unable to handle page fault for address: ffffffffffffffc8
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 23215067 P4D 23215067 PUD 23217067 PMD 0
> Oops: 0000 [#8] SMP PTI
> CPU: 0 PID: 3542 Comm: bash Kdump: loaded Tainted: G D 5.6.0-rc2+ #823
> RIP: 0010:crash_vmclear_local_loaded_vmcss.cold+0x19/0x51 [kvm_intel]
>
> The root cause is that the loaded_vmcss_on_cpu list is not yet initialized:
> it is initialized in hardware_enable(), which only runs when the first VM
> is started.
>
> Previously, a bitmap of enabled CPUs masked the issue.
>
> Initialize the loaded_vmcss_on_cpu list earlier, right before the
> crash_vmclear_loaded_vmcss pointer is assigned. The blocked_vcpu_on_cpu
> list and blocked_vcpu_on_cpu_lock are moved along with it for consistency.
>
> Fixes: 31603d4fc2bb ("KVM: VMX: Always VMCLEAR in-use VMCSes during crash with kexec support")
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
> arch/x86/kvm/vmx/vmx.c | 12 +++++++-----
> 1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 3aba51d782e2..39a5dde12b79 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2257,10 +2257,6 @@ static int hardware_enable(void)
> !hv_get_vp_assist_page(cpu))
> return -EFAULT;
>
> - INIT_LIST_HEAD(&per_cpu(loaded_vmcss_on_cpu, cpu));
> - INIT_LIST_HEAD(&per_cpu(blocked_vcpu_on_cpu, cpu));
> - spin_lock_init(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));
> -
> r = kvm_cpu_vmxon(phys_addr);
> if (r)
> return r;
> @@ -8006,7 +8002,7 @@ module_exit(vmx_exit);
>
> static int __init vmx_init(void)
> {
> - int r;
> + int r, cpu;
>
> #if IS_ENABLED(CONFIG_HYPERV)
> /*
> @@ -8060,6 +8056,12 @@ static int __init vmx_init(void)
> return r;
> }
>
> + for_each_possible_cpu(cpu) {
> + INIT_LIST_HEAD(&per_cpu(loaded_vmcss_on_cpu, cpu));
> + INIT_LIST_HEAD(&per_cpu(blocked_vcpu_on_cpu, cpu));
> + spin_lock_init(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));
Hmm, part of me thinks the posted interrupt per_cpu variables should
continue to be initialized during hardware_enable(). But it's a small
part of me :-)
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
> + }
> +
> #ifdef CONFIG_KEXEC_CORE
> rcu_assign_pointer(crash_vmclear_loaded_vmcss,
> crash_vmclear_local_loaded_vmcss);
> --
> 2.25.1
>
* Re: [PATCH] KVM: VMX: fix crash cleanup when KVM wasn't used
2020-04-01 8:13 [PATCH] KVM: VMX: fix crash cleanup when KVM wasn't used Vitaly Kuznetsov
2020-04-01 15:18 ` Sean Christopherson
@ 2020-04-07 12:35 ` Paolo Bonzini
2020-04-09 1:22 ` Baoquan He
2020-08-20 20:08 ` Jim Mattson
3 siblings, 0 replies; 12+ messages in thread
From: Paolo Bonzini @ 2020-04-07 12:35 UTC (permalink / raw)
To: Vitaly Kuznetsov, Sean Christopherson
Cc: Jim Mattson, Wanpeng Li, kvm, linux-kernel
On 01/04/20 10:13, Vitaly Kuznetsov wrote:
> If KVM wasn't used at all before the crash, the cleanup procedure fails with:
> BUG: unable to handle page fault for address: ffffffffffffffc8
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 23215067 P4D 23215067 PUD 23217067 PMD 0
> Oops: 0000 [#8] SMP PTI
> CPU: 0 PID: 3542 Comm: bash Kdump: loaded Tainted: G D 5.6.0-rc2+ #823
> RIP: 0010:crash_vmclear_local_loaded_vmcss.cold+0x19/0x51 [kvm_intel]
>
> The root cause is that the loaded_vmcss_on_cpu list is not yet initialized:
> it is initialized in hardware_enable(), which only runs when the first VM
> is started.
>
> Previously, a bitmap of enabled CPUs masked the issue.
>
> Initialize the loaded_vmcss_on_cpu list earlier, right before the
> crash_vmclear_loaded_vmcss pointer is assigned. The blocked_vcpu_on_cpu
> list and blocked_vcpu_on_cpu_lock are moved along with it for consistency.
>
> Fixes: 31603d4fc2bb ("KVM: VMX: Always VMCLEAR in-use VMCSes during crash with kexec support")
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
> arch/x86/kvm/vmx/vmx.c | 12 +++++++-----
> 1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 3aba51d782e2..39a5dde12b79 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2257,10 +2257,6 @@ static int hardware_enable(void)
> !hv_get_vp_assist_page(cpu))
> return -EFAULT;
>
> - INIT_LIST_HEAD(&per_cpu(loaded_vmcss_on_cpu, cpu));
> - INIT_LIST_HEAD(&per_cpu(blocked_vcpu_on_cpu, cpu));
> - spin_lock_init(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));
> -
> r = kvm_cpu_vmxon(phys_addr);
> if (r)
> return r;
> @@ -8006,7 +8002,7 @@ module_exit(vmx_exit);
>
> static int __init vmx_init(void)
> {
> - int r;
> + int r, cpu;
>
> #if IS_ENABLED(CONFIG_HYPERV)
> /*
> @@ -8060,6 +8056,12 @@ static int __init vmx_init(void)
> return r;
> }
>
> + for_each_possible_cpu(cpu) {
> + INIT_LIST_HEAD(&per_cpu(loaded_vmcss_on_cpu, cpu));
> + INIT_LIST_HEAD(&per_cpu(blocked_vcpu_on_cpu, cpu));
> + spin_lock_init(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));
> + }
> +
> #ifdef CONFIG_KEXEC_CORE
> rcu_assign_pointer(crash_vmclear_loaded_vmcss,
> crash_vmclear_local_loaded_vmcss);
>
Queued, thanks.
Paolo
* Re: [PATCH] KVM: VMX: fix crash cleanup when KVM wasn't used
2020-04-01 8:13 [PATCH] KVM: VMX: fix crash cleanup when KVM wasn't used Vitaly Kuznetsov
@ 2020-04-09 1:22 ` Baoquan He
2020-04-07 12:35 ` Paolo Bonzini
` (2 subsequent siblings)
3 siblings, 0 replies; 12+ messages in thread
From: Baoquan He @ 2020-04-09 1:22 UTC (permalink / raw)
To: Vitaly Kuznetsov, kexec
Cc: Paolo Bonzini, Sean Christopherson, Jim Mattson, Wanpeng Li, kvm,
linux-kernel
On 04/01/20 at 10:13am, Vitaly Kuznetsov wrote:
> If KVM wasn't used at all before the crash, the cleanup procedure fails with:
> BUG: unable to handle page fault for address: ffffffffffffffc8
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 23215067 P4D 23215067 PUD 23217067 PMD 0
> Oops: 0000 [#8] SMP PTI
> CPU: 0 PID: 3542 Comm: bash Kdump: loaded Tainted: G D 5.6.0-rc2+ #823
> RIP: 0010:crash_vmclear_local_loaded_vmcss.cold+0x19/0x51 [kvm_intel]
>
> The root cause is that the loaded_vmcss_on_cpu list is not yet initialized:
> it is initialized in hardware_enable(), which only runs when the first VM
> is started.
>
> Previously, a bitmap of enabled CPUs masked the issue.
>
> Initialize the loaded_vmcss_on_cpu list earlier, right before the
> crash_vmclear_loaded_vmcss pointer is assigned. The blocked_vcpu_on_cpu
> list and blocked_vcpu_on_cpu_lock are moved along with it for consistency.
>
> Fixes: 31603d4fc2bb ("KVM: VMX: Always VMCLEAR in-use VMCSes during crash with kexec support")
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
The kdump kernel hang can always be reproduced on an Intel bare-metal machine,
and the issue disappears with this patch applied. Feel free to add:
Tested-by: Baoquan He <bhe@redhat.com>
> ---
> arch/x86/kvm/vmx/vmx.c | 12 +++++++-----
> 1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 3aba51d782e2..39a5dde12b79 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2257,10 +2257,6 @@ static int hardware_enable(void)
> !hv_get_vp_assist_page(cpu))
> return -EFAULT;
>
> - INIT_LIST_HEAD(&per_cpu(loaded_vmcss_on_cpu, cpu));
> - INIT_LIST_HEAD(&per_cpu(blocked_vcpu_on_cpu, cpu));
> - spin_lock_init(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));
> -
> r = kvm_cpu_vmxon(phys_addr);
> if (r)
> return r;
> @@ -8006,7 +8002,7 @@ module_exit(vmx_exit);
>
> static int __init vmx_init(void)
> {
> - int r;
> + int r, cpu;
>
> #if IS_ENABLED(CONFIG_HYPERV)
> /*
> @@ -8060,6 +8056,12 @@ static int __init vmx_init(void)
> return r;
> }
>
> + for_each_possible_cpu(cpu) {
> + INIT_LIST_HEAD(&per_cpu(loaded_vmcss_on_cpu, cpu));
> + INIT_LIST_HEAD(&per_cpu(blocked_vcpu_on_cpu, cpu));
> + spin_lock_init(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));
> + }
> +
> #ifdef CONFIG_KEXEC_CORE
> rcu_assign_pointer(crash_vmclear_loaded_vmcss,
> crash_vmclear_local_loaded_vmcss);
> --
> 2.25.1
>
* Re: [PATCH] KVM: VMX: fix crash cleanup when KVM wasn't used
2020-04-01 8:13 [PATCH] KVM: VMX: fix crash cleanup when KVM wasn't used Vitaly Kuznetsov
` (2 preceding siblings ...)
2020-04-09 1:22 ` Baoquan He
@ 2020-08-20 20:08 ` Jim Mattson
2020-08-22 3:40 ` Sean Christopherson
3 siblings, 1 reply; 12+ messages in thread
From: Jim Mattson @ 2020-08-20 20:08 UTC (permalink / raw)
To: Vitaly Kuznetsov
Cc: Paolo Bonzini, Sean Christopherson, Wanpeng Li, kvm list, LKML
On Wed, Apr 1, 2020 at 1:13 AM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
> If KVM wasn't used at all before the crash, the cleanup procedure fails with:
> BUG: unable to handle page fault for address: ffffffffffffffc8
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 23215067 P4D 23215067 PUD 23217067 PMD 0
> Oops: 0000 [#8] SMP PTI
> CPU: 0 PID: 3542 Comm: bash Kdump: loaded Tainted: G D 5.6.0-rc2+ #823
> RIP: 0010:crash_vmclear_local_loaded_vmcss.cold+0x19/0x51 [kvm_intel]
>
> The root cause is that the loaded_vmcss_on_cpu list is not yet initialized:
> it is initialized in hardware_enable(), which only runs when the first VM
> is started.
>
> Previously, a bitmap of enabled CPUs masked the issue.
>
> Initialize the loaded_vmcss_on_cpu list earlier, right before the
> crash_vmclear_loaded_vmcss pointer is assigned. The blocked_vcpu_on_cpu
> list and blocked_vcpu_on_cpu_lock are moved along with it for consistency.
>
> Fixes: 31603d4fc2bb ("KVM: VMX: Always VMCLEAR in-use VMCSes during crash with kexec support")
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
> arch/x86/kvm/vmx/vmx.c | 12 +++++++-----
> 1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 3aba51d782e2..39a5dde12b79 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2257,10 +2257,6 @@ static int hardware_enable(void)
> !hv_get_vp_assist_page(cpu))
> return -EFAULT;
>
> - INIT_LIST_HEAD(&per_cpu(loaded_vmcss_on_cpu, cpu));
> - INIT_LIST_HEAD(&per_cpu(blocked_vcpu_on_cpu, cpu));
> - spin_lock_init(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));
> -
> r = kvm_cpu_vmxon(phys_addr);
> if (r)
> return r;
> @@ -8006,7 +8002,7 @@ module_exit(vmx_exit);
>
> static int __init vmx_init(void)
> {
> - int r;
> + int r, cpu;
>
> #if IS_ENABLED(CONFIG_HYPERV)
> /*
> @@ -8060,6 +8056,12 @@ static int __init vmx_init(void)
> return r;
> }
>
> + for_each_possible_cpu(cpu) {
> + INIT_LIST_HEAD(&per_cpu(loaded_vmcss_on_cpu, cpu));
> + INIT_LIST_HEAD(&per_cpu(blocked_vcpu_on_cpu, cpu));
> + spin_lock_init(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));
> + }
Just above this chunk, we have:
r = vmx_setup_l1d_flush(vmentry_l1d_flush_param);
if (r) {
vmx_exit();
...
If we take that early exit, because vmx_setup_l1d_flush() fails, we
won't initialize loaded_vmcss_on_cpu. However, vmx_exit() calls
kvm_exit(), which calls on_each_cpu(hardware_disable_nolock, NULL, 1).
hardware_disable_nolock() then calls kvm_arch_hardware_disable(),
which calls kvm_x86_ops.hardware_disable() [the vmx.c
hardware_disable()], which calls vmclear_local_loaded_vmcss().
I believe that vmclear_local_loaded_vmcss() will then try to
dereference a NULL pointer, since per_cpu(loaded_vmcss_on_cpu, cpu) is
uninitialized.
> #ifdef CONFIG_KEXEC_CORE
> rcu_assign_pointer(crash_vmclear_loaded_vmcss,
> crash_vmclear_local_loaded_vmcss);
> --
> 2.25.1
>
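The concern raised above follows from how static per-CPU storage behaves: it starts zero-filled, so a never-initialized list head has next == NULL rather than pointing at itself, and walking it dereferences NULL. A minimal userspace sketch of that distinction (the variable and helper names are illustrative assumptions, not the kernel's):

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Static storage is zero-filled, like uninitialized per-CPU data:
 * next == NULL, not a self-pointer. */
static struct list_head loaded_vmcss;

/* An empty-but-initialized list points at itself; a never-initialized
 * one has next == NULL, so a list walk would fault immediately. */
static int walk_would_fault(const struct list_head *head)
{
	return head->next == NULL;
}
```

This is why reaching vmclear_local_loaded_vmcss() before the INIT_LIST_HEAD() calls run is fatal, whichever path gets there first.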
* Re: [PATCH] KVM: VMX: fix crash cleanup when KVM wasn't used
2020-08-20 20:08 ` Jim Mattson
@ 2020-08-22 3:40 ` Sean Christopherson
2020-08-24 18:57 ` Jim Mattson
0 siblings, 1 reply; 12+ messages in thread
From: Sean Christopherson @ 2020-08-22 3:40 UTC (permalink / raw)
To: Jim Mattson; +Cc: Vitaly Kuznetsov, Paolo Bonzini, Wanpeng Li, kvm list, LKML
On Thu, Aug 20, 2020 at 01:08:22PM -0700, Jim Mattson wrote:
> On Wed, Apr 1, 2020 at 1:13 AM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> > ---
> > arch/x86/kvm/vmx/vmx.c | 12 +++++++-----
> > 1 file changed, 7 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 3aba51d782e2..39a5dde12b79 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -2257,10 +2257,6 @@ static int hardware_enable(void)
> > !hv_get_vp_assist_page(cpu))
> > return -EFAULT;
> >
> > - INIT_LIST_HEAD(&per_cpu(loaded_vmcss_on_cpu, cpu));
> > - INIT_LIST_HEAD(&per_cpu(blocked_vcpu_on_cpu, cpu));
> > - spin_lock_init(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));
> > -
> > r = kvm_cpu_vmxon(phys_addr);
> > if (r)
> > return r;
> > @@ -8006,7 +8002,7 @@ module_exit(vmx_exit);
> >
> > static int __init vmx_init(void)
> > {
> > - int r;
> > + int r, cpu;
> >
> > #if IS_ENABLED(CONFIG_HYPERV)
> > /*
> > @@ -8060,6 +8056,12 @@ static int __init vmx_init(void)
> > return r;
> > }
> >
> > + for_each_possible_cpu(cpu) {
> > + INIT_LIST_HEAD(&per_cpu(loaded_vmcss_on_cpu, cpu));
> > + INIT_LIST_HEAD(&per_cpu(blocked_vcpu_on_cpu, cpu));
> > + spin_lock_init(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));
> > + }
>
> Just above this chunk, we have:
>
> r = vmx_setup_l1d_flush(vmentry_l1d_flush_param);
> if (r) {
> vmx_exit();
> ...
>
> If we take that early exit, because vmx_setup_l1d_flush() fails, we
> won't initialize loaded_vmcss_on_cpu. However, vmx_exit() calls
> kvm_exit(), which calls on_each_cpu(hardware_disable_nolock, NULL, 1).
> hardware_disable_nolock() then calls kvm_arch_hardware_disable(),
> which calls kvm_x86_ops.hardware_disable() [the vmx.c
> hardware_disable()], which calls vmclear_local_loaded_vmcss().
>
> I believe that vmclear_local_loaded_vmcss() will then try to
> dereference a NULL pointer, since per_cpu(loaded_vmcss_on_cpu, cpu) is
> uninitialized.
I agree the code is a mess (kvm_init() and kvm_exit() included), but I'm
pretty sure hardware_disable_nolock() is guaranteed to be a nop as it's
impossible for kvm_usage_count to be non-zero if vmx_init() hasn't
finished.
> > #ifdef CONFIG_KEXEC_CORE
> > rcu_assign_pointer(crash_vmclear_loaded_vmcss,
> > crash_vmclear_local_loaded_vmcss);
> > --
> > 2.25.1
> >
* Re: [PATCH] KVM: VMX: fix crash cleanup when KVM wasn't used
2020-08-22 3:40 ` Sean Christopherson
@ 2020-08-24 18:57 ` Jim Mattson
2020-08-24 22:45 ` Jim Mattson
0 siblings, 1 reply; 12+ messages in thread
From: Jim Mattson @ 2020-08-24 18:57 UTC (permalink / raw)
To: Sean Christopherson
Cc: Vitaly Kuznetsov, Paolo Bonzini, Wanpeng Li, kvm list, LKML
On Fri, Aug 21, 2020 at 8:40 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> On Thu, Aug 20, 2020 at 01:08:22PM -0700, Jim Mattson wrote:
> > On Wed, Apr 1, 2020 at 1:13 AM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> > > ---
> > > arch/x86/kvm/vmx/vmx.c | 12 +++++++-----
> > > 1 file changed, 7 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > > index 3aba51d782e2..39a5dde12b79 100644
> > > --- a/arch/x86/kvm/vmx/vmx.c
> > > +++ b/arch/x86/kvm/vmx/vmx.c
> > > @@ -2257,10 +2257,6 @@ static int hardware_enable(void)
> > > !hv_get_vp_assist_page(cpu))
> > > return -EFAULT;
> > >
> > > - INIT_LIST_HEAD(&per_cpu(loaded_vmcss_on_cpu, cpu));
> > > - INIT_LIST_HEAD(&per_cpu(blocked_vcpu_on_cpu, cpu));
> > > - spin_lock_init(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));
> > > -
> > > r = kvm_cpu_vmxon(phys_addr);
> > > if (r)
> > > return r;
> > > @@ -8006,7 +8002,7 @@ module_exit(vmx_exit);
> > >
> > > static int __init vmx_init(void)
> > > {
> > > - int r;
> > > + int r, cpu;
> > >
> > > #if IS_ENABLED(CONFIG_HYPERV)
> > > /*
> > > @@ -8060,6 +8056,12 @@ static int __init vmx_init(void)
> > > return r;
> > > }
> > >
> > > + for_each_possible_cpu(cpu) {
> > > + INIT_LIST_HEAD(&per_cpu(loaded_vmcss_on_cpu, cpu));
> > > + INIT_LIST_HEAD(&per_cpu(blocked_vcpu_on_cpu, cpu));
> > > + spin_lock_init(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));
> > > + }
> >
> > Just above this chunk, we have:
> >
> > r = vmx_setup_l1d_flush(vmentry_l1d_flush_param);
> > if (r) {
> > vmx_exit();
> > ...
> >
> > If we take that early exit, because vmx_setup_l1d_flush() fails, we
> > won't initialize loaded_vmcss_on_cpu. However, vmx_exit() calls
> > kvm_exit(), which calls on_each_cpu(hardware_disable_nolock, NULL, 1).
> > hardware_disable_nolock() then calls kvm_arch_hardware_disable(),
> > which calls kvm_x86_ops.hardware_disable() [the vmx.c
> > hardware_disable()], which calls vmclear_local_loaded_vmcss().
> >
> > I believe that vmclear_local_loaded_vmcss() will then try to
> > dereference a NULL pointer, since per_cpu(loaded_vmcss_on_cpu, cpu) is
> > uninitialized.
>
> I agree the code is a mess (kvm_init() and kvm_exit() included), but I'm
> pretty sure hardware_disable_nolock() is guaranteed to be a nop as it's
> impossible for kvm_usage_count to be non-zero if vmx_init() hasn't
> finished.
Unless I'm missing something, there's no check for a non-zero
kvm_usage_count on this path. There is such a check in
hardware_disable_all_nolock(), but not in hardware_disable_nolock().
* Re: [PATCH] KVM: VMX: fix crash cleanup when KVM wasn't used
2020-08-24 18:57 ` Jim Mattson
@ 2020-08-24 22:45 ` Jim Mattson
2020-08-25 0:09 ` Sean Christopherson
0 siblings, 1 reply; 12+ messages in thread
From: Jim Mattson @ 2020-08-24 22:45 UTC (permalink / raw)
To: Sean Christopherson
Cc: Vitaly Kuznetsov, Paolo Bonzini, Wanpeng Li, kvm list, LKML
On Mon, Aug 24, 2020 at 11:57 AM Jim Mattson <jmattson@google.com> wrote:
>
> On Fri, Aug 21, 2020 at 8:40 PM Sean Christopherson
> <sean.j.christopherson@intel.com> wrote:
> >
> > On Thu, Aug 20, 2020 at 01:08:22PM -0700, Jim Mattson wrote:
> > > On Wed, Apr 1, 2020 at 1:13 AM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> > > > ---
> > > > arch/x86/kvm/vmx/vmx.c | 12 +++++++-----
> > > > 1 file changed, 7 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > > > index 3aba51d782e2..39a5dde12b79 100644
> > > > --- a/arch/x86/kvm/vmx/vmx.c
> > > > +++ b/arch/x86/kvm/vmx/vmx.c
> > > > @@ -2257,10 +2257,6 @@ static int hardware_enable(void)
> > > > !hv_get_vp_assist_page(cpu))
> > > > return -EFAULT;
> > > >
> > > > - INIT_LIST_HEAD(&per_cpu(loaded_vmcss_on_cpu, cpu));
> > > > - INIT_LIST_HEAD(&per_cpu(blocked_vcpu_on_cpu, cpu));
> > > > - spin_lock_init(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));
> > > > -
> > > > r = kvm_cpu_vmxon(phys_addr);
> > > > if (r)
> > > > return r;
> > > > @@ -8006,7 +8002,7 @@ module_exit(vmx_exit);
> > > >
> > > > static int __init vmx_init(void)
> > > > {
> > > > - int r;
> > > > + int r, cpu;
> > > >
> > > > #if IS_ENABLED(CONFIG_HYPERV)
> > > > /*
> > > > @@ -8060,6 +8056,12 @@ static int __init vmx_init(void)
> > > > return r;
> > > > }
> > > >
> > > > + for_each_possible_cpu(cpu) {
> > > > + INIT_LIST_HEAD(&per_cpu(loaded_vmcss_on_cpu, cpu));
> > > > + INIT_LIST_HEAD(&per_cpu(blocked_vcpu_on_cpu, cpu));
> > > > + spin_lock_init(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));
> > > > + }
> > >
> > > Just above this chunk, we have:
> > >
> > > r = vmx_setup_l1d_flush(vmentry_l1d_flush_param);
> > > if (r) {
> > > vmx_exit();
> > > ...
> > >
> > > If we take that early exit, because vmx_setup_l1d_flush() fails, we
> > > won't initialize loaded_vmcss_on_cpu. However, vmx_exit() calls
> > > kvm_exit(), which calls on_each_cpu(hardware_disable_nolock, NULL, 1).
> > > hardware_disable_nolock() then calls kvm_arch_hardware_disable(),
> > > which calls kvm_x86_ops.hardware_disable() [the vmx.c
> > > hardware_disable()], which calls vmclear_local_loaded_vmcss().
> > >
> > > I believe that vmclear_local_loaded_vmcss() will then try to
> > > dereference a NULL pointer, since per_cpu(loaded_vmcss_on_cpu, cpu) is
> > > uninitialized.
> >
> > I agree the code is a mess (kvm_init() and kvm_exit() included), but I'm
> > pretty sure hardware_disable_nolock() is guaranteed to be a nop as it's
> > impossible for kvm_usage_count to be non-zero if vmx_init() hasn't
> > finished.
>
> Unless I'm missing something, there's no check for a non-zero
> kvm_usage_count on this path. There is such a check in
> hardware_disable_all_nolock(), but not in hardware_disable_nolock().
However, cpus_hardware_enabled shouldn't have any bits set, so
everything's fine. Nothing to see here, after all.
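The saving grace identified here can be sketched as a small userspace model. It is hypothetical: cpus_hardware_enabled is a single word here rather than the kernel's cpumask, and the counter is only for observation; the point is that the bitmap test alone makes the disable path a no-op on CPUs that were never enabled.

```c
#include <assert.h>

/* One bit per CPU; a bit is only ever set by a successful enable. */
static unsigned long cpus_hardware_enabled;
static int arch_disable_calls;

/* Stand-in for kvm_arch_hardware_disable(): counts invocations so a
 * test can observe whether the arch hook actually ran. */
static void arch_hardware_disable(int cpu)
{
	(void)cpu;
	arch_disable_calls++;
}

/* Models hardware_disable_nolock(): even without a kvm_usage_count
 * check, it returns early on any CPU whose enabled bit is clear,
 * so uninitialized per-CPU state is never touched. */
static void hardware_disable_nolock(int cpu)
{
	if (!(cpus_hardware_enabled & (1UL << cpu)))
		return;

	cpus_hardware_enabled &= ~(1UL << cpu);
	arch_hardware_disable(cpu);
}
```

Under this model, an early-exit vmx_exit() broadcast of hardware_disable_nolock() touches nothing, because no CPU ever had its bit set.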
* Re: [PATCH] KVM: VMX: fix crash cleanup when KVM wasn't used
2020-08-24 22:45 ` Jim Mattson
@ 2020-08-25 0:09 ` Sean Christopherson
2020-09-01 10:36 ` Vitaly Kuznetsov
0 siblings, 1 reply; 12+ messages in thread
From: Sean Christopherson @ 2020-08-25 0:09 UTC (permalink / raw)
To: Jim Mattson; +Cc: Vitaly Kuznetsov, Paolo Bonzini, Wanpeng Li, kvm list, LKML
On Mon, Aug 24, 2020 at 03:45:26PM -0700, Jim Mattson wrote:
> On Mon, Aug 24, 2020 at 11:57 AM Jim Mattson <jmattson@google.com> wrote:
> >
> > On Fri, Aug 21, 2020 at 8:40 PM Sean Christopherson
> > <sean.j.christopherson@intel.com> wrote:
> > > I agree the code is a mess (kvm_init() and kvm_exit() included), but I'm
> > > pretty sure hardware_disable_nolock() is guaranteed to be a nop as it's
> > > impossible for kvm_usage_count to be non-zero if vmx_init() hasn't
> > > finished.
> >
> > Unless I'm missing something, there's no check for a non-zero
> > kvm_usage_count on this path. There is such a check in
> > hardware_disable_all_nolock(), but not in hardware_disable_nolock().
>
> However, cpus_hardware_enabled shouldn't have any bits set, so
> everything's fine. Nothing to see here, after all.
Ugh, I forgot that hardware_disable_all_nolock() does a BUG_ON() instead of
bailing on !kvm_usage_count.
* Re: [PATCH] KVM: VMX: fix crash cleanup when KVM wasn't used
2020-08-25 0:09 ` Sean Christopherson
@ 2020-09-01 10:36 ` Vitaly Kuznetsov
2020-09-02 15:18 ` Sean Christopherson
0 siblings, 1 reply; 12+ messages in thread
From: Vitaly Kuznetsov @ 2020-09-01 10:36 UTC (permalink / raw)
To: Sean Christopherson, Jim Mattson
Cc: Paolo Bonzini, Wanpeng Li, kvm list, LKML
Sean Christopherson <sean.j.christopherson@intel.com> writes:
> On Mon, Aug 24, 2020 at 03:45:26PM -0700, Jim Mattson wrote:
>> On Mon, Aug 24, 2020 at 11:57 AM Jim Mattson <jmattson@google.com> wrote:
>> >
>> > On Fri, Aug 21, 2020 at 8:40 PM Sean Christopherson
>> > <sean.j.christopherson@intel.com> wrote:
>> > > I agree the code is a mess (kvm_init() and kvm_exit() included), but I'm
>> > > pretty sure hardware_disable_nolock() is guaranteed to be a nop as it's
>> > > impossible for kvm_usage_count to be non-zero if vmx_init() hasn't
>> > > finished.
>> >
>> > Unless I'm missing something, there's no check for a non-zero
>> > kvm_usage_count on this path. There is such a check in
>> > hardware_disable_all_nolock(), but not in hardware_disable_nolock().
>>
>> However, cpus_hardware_enabled shouldn't have any bits set, so
>> everything's fine. Nothing to see here, after all.
>
> Ugh, I forgot that hardware_disable_all_nolock() does a BUG_ON() instead of
> bailing on !kvm_usage_count.
But we can't hit this BUG_ON(), right? I'm failing to see how
hardware_disable_all_nolock() can be reached with kvm_usage_count==0.
--
Vitaly
* Re: [PATCH] KVM: VMX: fix crash cleanup when KVM wasn't used
2020-09-01 10:36 ` Vitaly Kuznetsov
@ 2020-09-02 15:18 ` Sean Christopherson
0 siblings, 0 replies; 12+ messages in thread
From: Sean Christopherson @ 2020-09-02 15:18 UTC (permalink / raw)
To: Vitaly Kuznetsov; +Cc: Jim Mattson, Paolo Bonzini, Wanpeng Li, kvm list, LKML
On Tue, Sep 01, 2020 at 12:36:40PM +0200, Vitaly Kuznetsov wrote:
> Sean Christopherson <sean.j.christopherson@intel.com> writes:
>
> > On Mon, Aug 24, 2020 at 03:45:26PM -0700, Jim Mattson wrote:
> >> On Mon, Aug 24, 2020 at 11:57 AM Jim Mattson <jmattson@google.com> wrote:
> >> >
> >> > On Fri, Aug 21, 2020 at 8:40 PM Sean Christopherson
> >> > <sean.j.christopherson@intel.com> wrote:
> >> > > I agree the code is a mess (kvm_init() and kvm_exit() included), but I'm
> >> > > pretty sure hardware_disable_nolock() is guaranteed to be a nop as it's
> >> > > impossible for kvm_usage_count to be non-zero if vmx_init() hasn't
> >> > > finished.
> >> >
> >> > Unless I'm missing something, there's no check for a non-zero
> >> > kvm_usage_count on this path. There is such a check in
> >> > hardware_disable_all_nolock(), but not in hardware_disable_nolock().
> >>
> >> However, cpus_hardware_enabled shouldn't have any bits set, so
> >> everything's fine. Nothing to see here, after all.
> >
> > Ugh, I forgot that hardware_disable_all_nolock() does a BUG_ON() instead of
> > bailing on !kvm_usage_count.
>
> But we can't hit this BUG_ON(), right? I'm failing to see how
> hardware_disable_all_nolock() can be reached with kvm_usage_count==0.
Correct, I was mostly talking to myself.
2020-04-01 8:13 [PATCH] KVM: VMX: fix crash cleanup when KVM wasn't used Vitaly Kuznetsov
2020-04-01 15:18 ` Sean Christopherson
2020-04-07 12:35 ` Paolo Bonzini
2020-04-09 1:22 ` Baoquan He
2020-08-20 20:08 ` Jim Mattson
2020-08-22 3:40 ` Sean Christopherson
2020-08-24 18:57 ` Jim Mattson
2020-08-24 22:45 ` Jim Mattson
2020-08-25 0:09 ` Sean Christopherson
2020-09-01 10:36 ` Vitaly Kuznetsov
2020-09-02 15:18 ` Sean Christopherson