All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3] KVM: nVMX: Fix incorrect preemption timer vmexit in nested guest
@ 2016-07-06 11:42 Wanpeng Li
  2016-07-06 13:26 ` Haozhong Zhang
  0 siblings, 1 reply; 8+ messages in thread
From: Wanpeng Li @ 2016-07-06 11:42 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Wanpeng Li, Paolo Bonzini, Radim Krčmář,
	Yunhong Jiang, Jan Kiszka, Haozhong Zhang

From: Wanpeng Li <wanpeng.li@hotmail.com>

BUG: unable to handle kernel NULL pointer dereference at           (null)
IP: [<          (null)>]           (null)
PGD 0 
Oops: 0010 [#1] SMP
Call Trace:
 ? kvm_lapic_expired_hv_timer+0x47/0x90 [kvm]
 handle_preemption_timer+0xe/0x20 [kvm_intel]
 vmx_handle_exit+0x169/0x15a0 [kvm_intel]
 ? kvm_arch_vcpu_ioctl_run+0xd5d/0x19d0 [kvm]
 kvm_arch_vcpu_ioctl_run+0xdee/0x19d0 [kvm]
 ? kvm_arch_vcpu_ioctl_run+0xd5d/0x19d0 [kvm]
 ? vcpu_load+0x1c/0x60 [kvm]
 ? kvm_arch_vcpu_load+0x57/0x260 [kvm]
 kvm_vcpu_ioctl+0x2d3/0x7c0 [kvm]
 do_vfs_ioctl+0x96/0x6a0
 ? __fget_light+0x2a/0x90
 SyS_ioctl+0x79/0x90
 do_syscall_64+0x68/0x180
 entry_SYSCALL64_slow_path+0x25/0x25
Code:  Bad RIP value.
RIP  [<          (null)>]           (null)
 RSP <ffff8800b5263c48>
CR2: 0000000000000000
---[ end trace 9c70c48b1a2bc66e ]---

This can be reproduced readily by preemption timer enabled on L0 and disabled 
on L1.

Preemption timer for nested VMX is emulated by hrtimer which is started on L2
entry, stopped on L2 exit and evaluated via the check_nested_events hook. However,
nested_vmx_exit_handled is always return true for preemption timer vmexit, then 
the L1 preemption timer vmexit is captured and be treated as a L2 preemption 
timer vmexit, incurr a nested vmexit dereference NULL pointer.

This patch fix it by depending on check_nested_events to capture L2 preemption 
timer(emulated hrtimer) expire and nested vmexit.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Yunhong Jiang <yunhong.jiang@intel.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Haozhong Zhang <haozhong.zhang@intel.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
v2 -> v3:
 * update patch subject
v1 -> v2:
 * fix typo in patch description

 arch/x86/kvm/vmx.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 85e2f0a..29c16a8 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -8041,6 +8041,8 @@ static bool nested_vmx_exit_handled(struct kvm_vcpu *vcpu)
 		return nested_cpu_has2(vmcs12, SECONDARY_EXEC_XSAVES);
 	case EXIT_REASON_PCOMMIT:
 		return nested_cpu_has2(vmcs12, SECONDARY_EXEC_PCOMMIT);
+	case EXIT_REASON_PREEMPTION_TIMER:
+		return false;
 	default:
 		return true;
 	}
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v3] KVM: nVMX: Fix incorrect preemption timer vmexit in nested guest
  2016-07-06 11:42 [PATCH v3] KVM: nVMX: Fix incorrect preemption timer vmexit in nested guest Wanpeng Li
@ 2016-07-06 13:26 ` Haozhong Zhang
  2016-07-06 13:32   ` Paolo Bonzini
  2016-07-07  1:05   ` Wanpeng Li
  0 siblings, 2 replies; 8+ messages in thread
From: Haozhong Zhang @ 2016-07-06 13:26 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: linux-kernel, kvm, Wanpeng Li, Paolo Bonzini,
	Radim Krčmář,
	Yunhong Jiang, Jan Kiszka

On 07/06/16 19:42, Wanpeng Li wrote:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
> 
> BUG: unable to handle kernel NULL pointer dereference at           (null)
> IP: [<          (null)>]           (null)
> PGD 0 
> Oops: 0010 [#1] SMP
> Call Trace:
>  ? kvm_lapic_expired_hv_timer+0x47/0x90 [kvm]
>  handle_preemption_timer+0xe/0x20 [kvm_intel]
>  vmx_handle_exit+0x169/0x15a0 [kvm_intel]
>  ? kvm_arch_vcpu_ioctl_run+0xd5d/0x19d0 [kvm]
>  kvm_arch_vcpu_ioctl_run+0xdee/0x19d0 [kvm]
>  ? kvm_arch_vcpu_ioctl_run+0xd5d/0x19d0 [kvm]
>  ? vcpu_load+0x1c/0x60 [kvm]
>  ? kvm_arch_vcpu_load+0x57/0x260 [kvm]
>  kvm_vcpu_ioctl+0x2d3/0x7c0 [kvm]
>  do_vfs_ioctl+0x96/0x6a0
>  ? __fget_light+0x2a/0x90
>  SyS_ioctl+0x79/0x90
>  do_syscall_64+0x68/0x180
>  entry_SYSCALL64_slow_path+0x25/0x25
> Code:  Bad RIP value.
> RIP  [<          (null)>]           (null)
>  RSP <ffff8800b5263c48>
> CR2: 0000000000000000
> ---[ end trace 9c70c48b1a2bc66e ]---
> 
> This can be reproduced readily by preemption timer enabled on L0 and disabled 
> on L1.
> 
> Preemption timer for nested VMX is emulated by hrtimer which is started on L2
> entry, stopped on L2 exit and evaluated via the check_nested_events hook. However,
> nested_vmx_exit_handled is always return true for preemption timer vmexit, then 
> the L1 preemption timer vmexit is captured and be treated as a L2 preemption 
> timer vmexit, incurr a nested vmexit dereference NULL pointer.
> 
> This patch fix it by depending on check_nested_events to capture L2 preemption 
> timer(emulated hrtimer) expire and nested vmexit.
> 
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Cc: Yunhong Jiang <yunhong.jiang@intel.com>
> Cc: Jan Kiszka <jan.kiszka@siemens.com>
> Cc: Haozhong Zhang <haozhong.zhang@intel.com>
> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> ---
> v2 -> v3:
>  * update patch subject
> v1 -> v2:
>  * fix typo in patch description
> 
>  arch/x86/kvm/vmx.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 85e2f0a..29c16a8 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -8041,6 +8041,8 @@ static bool nested_vmx_exit_handled(struct kvm_vcpu *vcpu)
>  		return nested_cpu_has2(vmcs12, SECONDARY_EXEC_XSAVES);
>  	case EXIT_REASON_PCOMMIT:
>  		return nested_cpu_has2(vmcs12, SECONDARY_EXEC_PCOMMIT);
> +	case EXIT_REASON_PREEMPTION_TIMER:
> +		return false;
>  	default:
>  		return true;
>  	}
> -- 
> 1.9.1
> 

This patch also fixed the crash of L1 Xen with L2 HVM guest. Xen does
not enable preemption timer for HVM guests, and will get panic if it
receives a preemption timer vmexit.

Tested-by: Haozhong Zhang <haozhong.zhang@intel.com>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3] KVM: nVMX: Fix incorrect preemption timer vmexit in nested guest
  2016-07-06 13:26 ` Haozhong Zhang
@ 2016-07-06 13:32   ` Paolo Bonzini
  2016-07-06 16:03     ` Haozhong Zhang
  2016-07-07  1:05   ` Wanpeng Li
  1 sibling, 1 reply; 8+ messages in thread
From: Paolo Bonzini @ 2016-07-06 13:32 UTC (permalink / raw)
  To: Wanpeng Li, linux-kernel, kvm, Wanpeng Li,
	Radim Krčmář,
	Yunhong Jiang, Jan Kiszka



On 06/07/2016 15:26, Haozhong Zhang wrote:
> On 07/06/16 19:42, Wanpeng Li wrote:
>> From: Wanpeng Li <wanpeng.li@hotmail.com>
>>
>> BUG: unable to handle kernel NULL pointer dereference at           (null)
>> IP: [<          (null)>]           (null)
>> PGD 0 
>> Oops: 0010 [#1] SMP
>> Call Trace:
>>  ? kvm_lapic_expired_hv_timer+0x47/0x90 [kvm]
>>  handle_preemption_timer+0xe/0x20 [kvm_intel]
>>  vmx_handle_exit+0x169/0x15a0 [kvm_intel]
>>  ? kvm_arch_vcpu_ioctl_run+0xd5d/0x19d0 [kvm]
>>  kvm_arch_vcpu_ioctl_run+0xdee/0x19d0 [kvm]
>>  ? kvm_arch_vcpu_ioctl_run+0xd5d/0x19d0 [kvm]
>>  ? vcpu_load+0x1c/0x60 [kvm]
>>  ? kvm_arch_vcpu_load+0x57/0x260 [kvm]
>>  kvm_vcpu_ioctl+0x2d3/0x7c0 [kvm]
>>  do_vfs_ioctl+0x96/0x6a0
>>  ? __fget_light+0x2a/0x90
>>  SyS_ioctl+0x79/0x90
>>  do_syscall_64+0x68/0x180
>>  entry_SYSCALL64_slow_path+0x25/0x25
>> Code:  Bad RIP value.
>> RIP  [<          (null)>]           (null)
>>  RSP <ffff8800b5263c48>
>> CR2: 0000000000000000
>> ---[ end trace 9c70c48b1a2bc66e ]---
>>
>> This can be reproduced readily by preemption timer enabled on L0 and disabled 
>> on L1.
>>
>> Preemption timer for nested VMX is emulated by hrtimer which is started on L2
>> entry, stopped on L2 exit and evaluated via the check_nested_events hook. However,
>> nested_vmx_exit_handled is always return true for preemption timer vmexit, then 
>> the L1 preemption timer vmexit is captured and be treated as a L2 preemption 
>> timer vmexit, incurr a nested vmexit dereference NULL pointer.
>>
>> This patch fix it by depending on check_nested_events to capture L2 preemption 
>> timer(emulated hrtimer) expire and nested vmexit.
>>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: Radim Krčmář <rkrcmar@redhat.com>
>> Cc: Yunhong Jiang <yunhong.jiang@intel.com>
>> Cc: Jan Kiszka <jan.kiszka@siemens.com>
>> Cc: Haozhong Zhang <haozhong.zhang@intel.com>
>> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
>> ---
>> v2 -> v3:
>>  * update patch subject
>> v1 -> v2:
>>  * fix typo in patch description
>>
>>  arch/x86/kvm/vmx.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 85e2f0a..29c16a8 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -8041,6 +8041,8 @@ static bool nested_vmx_exit_handled(struct kvm_vcpu *vcpu)
>>  		return nested_cpu_has2(vmcs12, SECONDARY_EXEC_XSAVES);
>>  	case EXIT_REASON_PCOMMIT:
>>  		return nested_cpu_has2(vmcs12, SECONDARY_EXEC_PCOMMIT);
>> +	case EXIT_REASON_PREEMPTION_TIMER:
>> +		return false;
>>  	default:
>>  		return true;
>>  	}
>> -- 
>> 1.9.1
>>
> 
> This patch also fixed the crash of L1 Xen with L2 HVM guest. Xen does
> not enable preemption timer for HVM guests, and will get panic if it
> receives a preemption timer vmexit.

Thanks!  I'm still not sure why the bit is set in the vmcs02 though...

Paolo

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3] KVM: nVMX: Fix incorrect preemption timer vmexit in nested guest
  2016-07-06 13:32   ` Paolo Bonzini
@ 2016-07-06 16:03     ` Haozhong Zhang
  2016-07-06 17:11       ` Paolo Bonzini
  0 siblings, 1 reply; 8+ messages in thread
From: Haozhong Zhang @ 2016-07-06 16:03 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Wanpeng Li, linux-kernel, kvm, Wanpeng Li,
	Radim Krčmář,
	Yunhong Jiang, Jan Kiszka, Jim Mattson

On 07/06/16 15:32, Paolo Bonzini wrote:
> 
> 
> On 06/07/2016 15:26, Haozhong Zhang wrote:
> > On 07/06/16 19:42, Wanpeng Li wrote:
> >> From: Wanpeng Li <wanpeng.li@hotmail.com>
> >>
> >> BUG: unable to handle kernel NULL pointer dereference at           (null)
> >> IP: [<          (null)>]           (null)
> >> PGD 0 
> >> Oops: 0010 [#1] SMP
> >> Call Trace:
> >>  ? kvm_lapic_expired_hv_timer+0x47/0x90 [kvm]
> >>  handle_preemption_timer+0xe/0x20 [kvm_intel]
> >>  vmx_handle_exit+0x169/0x15a0 [kvm_intel]
> >>  ? kvm_arch_vcpu_ioctl_run+0xd5d/0x19d0 [kvm]
> >>  kvm_arch_vcpu_ioctl_run+0xdee/0x19d0 [kvm]
> >>  ? kvm_arch_vcpu_ioctl_run+0xd5d/0x19d0 [kvm]
> >>  ? vcpu_load+0x1c/0x60 [kvm]
> >>  ? kvm_arch_vcpu_load+0x57/0x260 [kvm]
> >>  kvm_vcpu_ioctl+0x2d3/0x7c0 [kvm]
> >>  do_vfs_ioctl+0x96/0x6a0
> >>  ? __fget_light+0x2a/0x90
> >>  SyS_ioctl+0x79/0x90
> >>  do_syscall_64+0x68/0x180
> >>  entry_SYSCALL64_slow_path+0x25/0x25
> >> Code:  Bad RIP value.
> >> RIP  [<          (null)>]           (null)
> >>  RSP <ffff8800b5263c48>
> >> CR2: 0000000000000000
> >> ---[ end trace 9c70c48b1a2bc66e ]---
> >>
> >> This can be reproduced readily by preemption timer enabled on L0 and disabled 
> >> on L1.
> >>
> >> Preemption timer for nested VMX is emulated by hrtimer which is started on L2
> >> entry, stopped on L2 exit and evaluated via the check_nested_events hook. However,
> >> nested_vmx_exit_handled is always return true for preemption timer vmexit, then 
> >> the L1 preemption timer vmexit is captured and be treated as a L2 preemption 
> >> timer vmexit, incurr a nested vmexit dereference NULL pointer.
> >>
> >> This patch fix it by depending on check_nested_events to capture L2 preemption 
> >> timer(emulated hrtimer) expire and nested vmexit.
> >>
> >> Cc: Paolo Bonzini <pbonzini@redhat.com>
> >> Cc: Radim Krčmář <rkrcmar@redhat.com>
> >> Cc: Yunhong Jiang <yunhong.jiang@intel.com>
> >> Cc: Jan Kiszka <jan.kiszka@siemens.com>
> >> Cc: Haozhong Zhang <haozhong.zhang@intel.com>
> >> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> >> ---
> >> v2 -> v3:
> >>  * update patch subject
> >> v1 -> v2:
> >>  * fix typo in patch description
> >>
> >>  arch/x86/kvm/vmx.c | 2 ++
> >>  1 file changed, 2 insertions(+)
> >>
> >> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> >> index 85e2f0a..29c16a8 100644
> >> --- a/arch/x86/kvm/vmx.c
> >> +++ b/arch/x86/kvm/vmx.c
> >> @@ -8041,6 +8041,8 @@ static bool nested_vmx_exit_handled(struct kvm_vcpu *vcpu)
> >>  		return nested_cpu_has2(vmcs12, SECONDARY_EXEC_XSAVES);
> >>  	case EXIT_REASON_PCOMMIT:
> >>  		return nested_cpu_has2(vmcs12, SECONDARY_EXEC_PCOMMIT);
> >> +	case EXIT_REASON_PREEMPTION_TIMER:
> >> +		return false;
> >>  	default:
> >>  		return true;
> >>  	}
> >> -- 
> >> 1.9.1
> >>
> > 
> > This patch also fixed the crash of L1 Xen with L2 HVM guest. Xen does
> > not enable preemption timer for HVM guests, and will get panic if it
> > receives a preemption timer vmexit.
> 
> Thanks!  I'm still not sure why the bit is set in the vmcs02 though...
> 

Yes, it looks really weird.

I replaced "return false" in Wanpeng's patch by

    pr_info("VMCS: preemption timer enabled = %d\n",
            !!(vmcs_read32(PIN_BASED_VM_EXEC_CONTROL) & PIN_BASED_VMX_PREEMPTION_TIMER));

and redid my test. As expected, L1 Xen crashed due to the unexpected
preemption timer vmexit. I got a log from above statement just before crash:

    VMCS: preemption timer enabled = 1

which is expected to be 0, because preemption timer is disabled in
vmcs02. I also modified L1 Xen to dump VMCS at crash, and it says
preemption timer is disabled.

I noticed Jim Mattson recently sent a patch "KVM: nVMX: Fix memory
corruption when using VMCS shadowing" to fix the inconsistency between
vmcs12 and its shadow. Is it relevant here? I'll test his patch
tomorrow.

Thanks,
Haozhong

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3] KVM: nVMX: Fix incorrect preemption timer vmexit in nested guest
  2016-07-06 16:03     ` Haozhong Zhang
@ 2016-07-06 17:11       ` Paolo Bonzini
  2016-07-07  1:58         ` Wanpeng Li
  2016-07-07  3:55         ` Wanpeng Li
  0 siblings, 2 replies; 8+ messages in thread
From: Paolo Bonzini @ 2016-07-06 17:11 UTC (permalink / raw)
  To: Wanpeng Li, linux-kernel, kvm, Wanpeng Li,
	Radim Krčmář,
	Yunhong Jiang, Jan Kiszka, Jim Mattson



On 06/07/2016 18:03, Haozhong Zhang wrote:
>>> This patch also fixed the crash of L1 Xen with L2 HVM guest. Xen does
>>> not enable preemption timer for HVM guests, and will get panic if it
>>> receives a preemption timer vmexit.
>>
>> Thanks!  I'm still not sure why the bit is set in the vmcs02 though...
> 
> Yes, it looks really weird.
> 
> I replaced "return false" in Wanpeng's patch by
> 
>     pr_info("VMCS: preemption timer enabled = %d\n",
>             !!(vmcs_read32(PIN_BASED_VM_EXEC_CONTROL) & PIN_BASED_VMX_PREEMPTION_TIMER));
> 
> and redid my test. As expected, L1 Xen crashed due to the unexpected
> preemption timer vmexit. I got a log from above statement just before crash:
> 
>     VMCS: preemption timer enabled = 1
> 
> which is expected to be 0, because preemption timer is disabled in
> vmcs02. I also modified L1 Xen to dump VMCS at crash, and it says
> preemption timer is disabled.
> 
> I noticed Jim Mattson recently sent a patch "KVM: nVMX: Fix memory
> corruption when using VMCS shadowing" to fix the inconsistency between
> vmcs12 and its shadow. Is it relevant here? I'll test his patch
> tomorrow.

No, it shouldn't have any effect.

I think it happens when the post_block hook switches back from sw_timer
to hv_timer, and L2 is running.  So the right fix should be along the
lines of what I posted earlier.  If you don't beat me to it, I'll take
another look tomorrow.

Paolo

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3] KVM: nVMX: Fix incorrect preemption timer vmexit in nested guest
  2016-07-06 13:26 ` Haozhong Zhang
  2016-07-06 13:32   ` Paolo Bonzini
@ 2016-07-07  1:05   ` Wanpeng Li
  1 sibling, 0 replies; 8+ messages in thread
From: Wanpeng Li @ 2016-07-07  1:05 UTC (permalink / raw)
  To: Wanpeng Li, linux-kernel, kvm, Wanpeng Li, Paolo Bonzini,
	Radim Krčmář,
	Yunhong Jiang, Jan Kiszka

2016-07-06 21:26 GMT+08:00 Haozhong Zhang <haozhong.zhang@intel.com>:
> On 07/06/16 19:42, Wanpeng Li wrote:
>> From: Wanpeng Li <wanpeng.li@hotmail.com>
>>
>> BUG: unable to handle kernel NULL pointer dereference at           (null)
>> IP: [<          (null)>]           (null)
>> PGD 0
>> Oops: 0010 [#1] SMP
>> Call Trace:
>>  ? kvm_lapic_expired_hv_timer+0x47/0x90 [kvm]
>>  handle_preemption_timer+0xe/0x20 [kvm_intel]
>>  vmx_handle_exit+0x169/0x15a0 [kvm_intel]
>>  ? kvm_arch_vcpu_ioctl_run+0xd5d/0x19d0 [kvm]
>>  kvm_arch_vcpu_ioctl_run+0xdee/0x19d0 [kvm]
>>  ? kvm_arch_vcpu_ioctl_run+0xd5d/0x19d0 [kvm]
>>  ? vcpu_load+0x1c/0x60 [kvm]
>>  ? kvm_arch_vcpu_load+0x57/0x260 [kvm]
>>  kvm_vcpu_ioctl+0x2d3/0x7c0 [kvm]
>>  do_vfs_ioctl+0x96/0x6a0
>>  ? __fget_light+0x2a/0x90
>>  SyS_ioctl+0x79/0x90
>>  do_syscall_64+0x68/0x180
>>  entry_SYSCALL64_slow_path+0x25/0x25
>> Code:  Bad RIP value.
>> RIP  [<          (null)>]           (null)
>>  RSP <ffff8800b5263c48>
>> CR2: 0000000000000000
>> ---[ end trace 9c70c48b1a2bc66e ]---
>>
>> This can be reproduced readily by preemption timer enabled on L0 and disabled
>> on L1.
>>
>> Preemption timer for nested VMX is emulated by hrtimer which is started on L2
>> entry, stopped on L2 exit and evaluated via the check_nested_events hook. However,
>> nested_vmx_exit_handled is always return true for preemption timer vmexit, then
>> the L1 preemption timer vmexit is captured and be treated as a L2 preemption
>> timer vmexit, incurr a nested vmexit dereference NULL pointer.
>>
>> This patch fix it by depending on check_nested_events to capture L2 preemption
>> timer(emulated hrtimer) expire and nested vmexit.
>>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: Radim Krčmář <rkrcmar@redhat.com>
>> Cc: Yunhong Jiang <yunhong.jiang@intel.com>
>> Cc: Jan Kiszka <jan.kiszka@siemens.com>
>> Cc: Haozhong Zhang <haozhong.zhang@intel.com>
>> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
>> ---
>> v2 -> v3:
>>  * update patch subject
>> v1 -> v2:
>>  * fix typo in patch description
>>
>>  arch/x86/kvm/vmx.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 85e2f0a..29c16a8 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -8041,6 +8041,8 @@ static bool nested_vmx_exit_handled(struct kvm_vcpu *vcpu)
>>               return nested_cpu_has2(vmcs12, SECONDARY_EXEC_XSAVES);
>>       case EXIT_REASON_PCOMMIT:
>>               return nested_cpu_has2(vmcs12, SECONDARY_EXEC_PCOMMIT);
>> +     case EXIT_REASON_PREEMPTION_TIMER:
>> +             return false;
>>       default:
>>               return true;
>>       }
>> --
>> 1.9.1
>>
>
> This patch also fixed the crash of L1 Xen with L2 HVM guest. Xen does
> not enable preemption timer for HVM guests, and will get panic if it
> receives a preemption timer vmexit.
>
> Tested-by: Haozhong Zhang <haozhong.zhang@intel.com>

Thanks. :)

Regards,
Wanpeng Li

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3] KVM: nVMX: Fix incorrect preemption timer vmexit in nested guest
  2016-07-06 17:11       ` Paolo Bonzini
@ 2016-07-07  1:58         ` Wanpeng Li
  2016-07-07  3:55         ` Wanpeng Li
  1 sibling, 0 replies; 8+ messages in thread
From: Wanpeng Li @ 2016-07-07  1:58 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, Wanpeng Li, Radim Krčmář,
	Yunhong Jiang, Jan Kiszka, Jim Mattson

2016-07-07 1:11 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
>
>
> On 06/07/2016 18:03, Haozhong Zhang wrote:
>>>> This patch also fixed the crash of L1 Xen with L2 HVM guest. Xen does
>>>> not enable preemption timer for HVM guests, and will get panic if it
>>>> receives a preemption timer vmexit.
>>>
>>> Thanks!  I'm still not sure why the bit is set in the vmcs02 though...
>>
>> Yes, it looks really weird.
>>
>> I replaced "return false" in Wanpeng's patch by
>>
>>     pr_info("VMCS: preemption timer enabled = %d\n",
>>             !!(vmcs_read32(PIN_BASED_VM_EXEC_CONTROL) & PIN_BASED_VMX_PREEMPTION_TIMER));
>>
>> and redid my test. As expected, L1 Xen crashed due to the unexpected
>> preemption timer vmexit. I got a log from above statement just before crash:
>>
>>     VMCS: preemption timer enabled = 1
>>
>> which is expected to be 0, because preemption timer is disabled in
>> vmcs02. I also modified L1 Xen to dump VMCS at crash, and it says
>> preemption timer is disabled.
>>
>> I noticed Jim Mattson recently sent a patch "KVM: nVMX: Fix memory
>> corruption when using VMCS shadowing" to fix the inconsistency between
>> vmcs12 and its shadow. Is it relevant here? I'll test his patch
>> tomorrow.
>
> No, it shouldn't have any effect.
>
> I think it happens when the post_block hook switches back from sw_timer
> to hv_timer, and L2 is running.  So the right fix should be along the
> lines of what I posted earlier.  If you don't beat me to it, I'll take
> another look tomorrow.

I think I just figure out the root cause, I will send out a patch to fix it.

Regards,
Wanpeng Li

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3] KVM: nVMX: Fix incorrect preemption timer vmexit in nested guest
  2016-07-06 17:11       ` Paolo Bonzini
  2016-07-07  1:58         ` Wanpeng Li
@ 2016-07-07  3:55         ` Wanpeng Li
  1 sibling, 0 replies; 8+ messages in thread
From: Wanpeng Li @ 2016-07-07  3:55 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, Wanpeng Li, Radim Krčmář,
	Yunhong Jiang, Jan Kiszka, Jim Mattson

2016-07-07 1:11 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
>
>
> On 06/07/2016 18:03, Haozhong Zhang wrote:
>>>> This patch also fixed the crash of L1 Xen with L2 HVM guest. Xen does
>>>> not enable preemption timer for HVM guests, and will get panic if it
>>>> receives a preemption timer vmexit.
>>>
>>> Thanks!  I'm still not sure why the bit is set in the vmcs02 though...
>>
>> Yes, it looks really weird.
>>
>> I replaced "return false" in Wanpeng's patch by
>>
>>     pr_info("VMCS: preemption timer enabled = %d\n",
>>             !!(vmcs_read32(PIN_BASED_VM_EXEC_CONTROL) & PIN_BASED_VMX_PREEMPTION_TIMER));
>>
>> and redid my test. As expected, L1 Xen crashed due to the unexpected
>> preemption timer vmexit. I got a log from above statement just before crash:
>>
>>     VMCS: preemption timer enabled = 1
>>
>> which is expected to be 0, because preemption timer is disabled in
>> vmcs02. I also modified L1 Xen to dump VMCS at crash, and it says
>> preemption timer is disabled.
>>
>> I noticed Jim Mattson recently sent a patch "KVM: nVMX: Fix memory
>> corruption when using VMCS shadowing" to fix the inconsistency between
>> vmcs12 and its shadow. Is it relevant here? I'll test his patch
>> tomorrow.
>
> No, it shouldn't have any effect.
>
> I think it happens when the post_block hook switches back from sw_timer

Please review my another patch 'KVM: nVMX: Fix preemption timer bit
set in vmcs02 even if L1 doesn't enable it', which can fix the vmcs02
bit set.

> to hv_timer, and L2 is running.  So the right fix should be along the
> lines of what I posted earlier.  If you don't beat me to it, I'll take
> another look tomorrow.

Maybe you can continue "L1 TSC deadline timer to trigger while L2 is
running" work based on my two bugfixes, however, your patch is still
calltrace on top of my two fixes.

Regards,
Wanpeng Li

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2016-07-07  3:55 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-06 11:42 [PATCH v3] KVM: nVMX: Fix incorrect preemption timer vmexit in nested guest Wanpeng Li
2016-07-06 13:26 ` Haozhong Zhang
2016-07-06 13:32   ` Paolo Bonzini
2016-07-06 16:03     ` Haozhong Zhang
2016-07-06 17:11       ` Paolo Bonzini
2016-07-07  1:58         ` Wanpeng Li
2016-07-07  3:55         ` Wanpeng Li
2016-07-07  1:05   ` Wanpeng Li

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.