* [PATCH] KVM: VMX: Zero host's SYSENTER_ESP iff SYSENTER is NOT used
@ 2022-01-22 1:52 Sean Christopherson
2022-01-22 8:47 ` Vitaly Kuznetsov
2022-01-24 13:52 ` Paolo Bonzini
0 siblings, 2 replies; 4+ messages in thread
From: Sean Christopherson @ 2022-01-22 1:52 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
Joerg Roedel, kvm, linux-kernel, Lai Jiangshan
Zero vmcs.HOST_IA32_SYSENTER_ESP when initializing *constant* host state
if and only if SYSENTER cannot be used, i.e. the kernel is a 64-bit
kernel and is not emulating 32-bit syscalls. As the name suggests,
vmx_set_constant_host_state() is intended for state that is *constant*.
When SYSENTER is used, SYSENTER_ESP isn't constant because stacks are
per-CPU, and the VMCS must be updated whenever the vCPU is migrated to a
new CPU. The logic in vmx_vcpu_load_vmcs() doesn't differentiate between
"never loaded" and "loaded on a different CPU", i.e. setting SYSENTER_ESP
on VMCS load also handles setting correct host state when the VMCS is
first loaded.

Because a VMCS must be loaded before it is initialized during vCPU RESET,
zeroing the field in vmx_set_constant_host_state() obliterates the value
that was written when the VMCS was loaded. If the vCPU is run before it
is migrated, the subsequent VM-Exit will zero out MSR_IA32_SYSENTER_ESP,
leading to a #DF on the next 32-bit syscall.

  double fault: 0000 [#1] SMP
  CPU: 0 PID: 990 Comm: stable Not tainted 5.16.0+ #97
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  EIP: entry_SYSENTER_32+0x0/0xe7
  Code: <9c> 50 eb 17 0f 20 d8 a9 00 10 00 00 74 0d 25 ff ef ff ff 0f 22 d8
  EAX: 000000a2 EBX: a8d1300c ECX: a8d13014 EDX: 00000000
  ESI: a8f87000 EDI: a8d13014 EBP: a8d12fc0 ESP: 00000000
  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 EFLAGS: 00210093
  CR0: 80050033 CR2: fffffffc CR3: 02c3b000 CR4: 00152e90

Fixes: 6ab8a4053f71 ("KVM: VMX: Avoid to rdmsrl(MSR_IA32_SYSENTER_ESP)")
Cc: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/vmx/vmx.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index a02a28ce7cc3..ce2aae12fcc5 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4094,10 +4094,13 @@ void vmx_set_constant_host_state(struct vcpu_vmx *vmx)
 	vmcs_write32(HOST_IA32_SYSENTER_CS, low32);
 
 	/*
-	 * If 32-bit syscall is enabled, vmx_vcpu_load_vcms rewrites
-	 * HOST_IA32_SYSENTER_ESP.
+	 * SYSENTER is used only for (emulating) 32-bit kernels, zero out
+	 * SYSENTER.ESP if it is NOT used. When SYSENTER is used, the per-CPU
+	 * stack is set when the VMCS is loaded (and may already be set!).
 	 */
-	vmcs_writel(HOST_IA32_SYSENTER_ESP, 0);
+	if (!IS_ENABLED(CONFIG_IA32_EMULATION) && !IS_ENABLED(CONFIG_X86_32))
+		vmcs_writel(HOST_IA32_SYSENTER_ESP, 0);
+
 	rdmsrl(MSR_IA32_SYSENTER_EIP, tmpl);
 	vmcs_writel(HOST_IA32_SYSENTER_EIP, tmpl); /* 22.2.3 */
 
base-commit: e2e83a73d7ce66f62c7830a85619542ef59c90e4
--
2.35.0.rc0.227.g00780c9af4-goog
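
The ordering problem described in the changelog can be illustrated with a small user-space model. This is only a sketch under stated assumptions: struct vmcs_model, load_vmcs(), set_constant_host_state(), vm_exit() and msr_after_first_exit() are hypothetical stand-ins rather than KVM functions, and the stack addresses are made up. It models a single VMCS field and the "load, then initialize, then run" sequence, nothing more.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one VMCS host-state field, nothing else. */
struct vmcs_model {
	uint64_t host_sysenter_esp;	/* stands in for HOST_IA32_SYSENTER_ESP */
};

/* Made-up per-CPU entry stack addresses, for illustration only. */
static const uint64_t percpu_entry_stack[2] = { 0xff801000u, 0xff802000u };

/* Models the VMCS load path: write the current CPU's stack on every load. */
static void load_vmcs(struct vmcs_model *v, int cpu)
{
	v->host_sysenter_esp = percpu_entry_stack[cpu];
}

/*
 * Models constant host state setup. Unpatched, the field is always zeroed;
 * patched, it is zeroed only when SYSENTER cannot be used.
 */
static void set_constant_host_state(struct vmcs_model *v, bool sysenter_used,
				    bool patched)
{
	if (!patched || !sysenter_used)
		v->host_sysenter_esp = 0;
}

/* Models VM-Exit: the host MSR is loaded from the VMCS field. */
static uint64_t vm_exit(const struct vmcs_model *v)
{
	return v->host_sysenter_esp;
}

/* RESET ordering: load first, then initialize, then run without migrating. */
static uint64_t msr_after_first_exit(bool sysenter_used, bool patched)
{
	struct vmcs_model v = { 0 };

	load_vmcs(&v, 0);
	set_constant_host_state(&v, sysenter_used, patched);
	return vm_exit(&v);
}
```

In this model, msr_after_first_exit(true, false) yields 0, which corresponds to the ESP: 00000000 in the splat, while msr_after_first_exit(true, true) keeps the value written at load time.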
* Re: [PATCH] KVM: VMX: Zero host's SYSENTER_ESP iff SYSENTER is NOT used
From: Vitaly Kuznetsov @ 2022-01-22 8:47 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: Wanpeng Li, Jim Mattson, Joerg Roedel, kvm, linux-kernel, Lai Jiangshan
Sean Christopherson <seanjc@google.com> writes:
> Zero vmcs.HOST_IA32_SYSENTER_ESP when initializing *constant* host state
> if and only if SYSENTER cannot be used, i.e. the kernel is a 64-bit
> kernel and is not emulating 32-bit syscalls. As the name suggests,
> vmx_set_constant_host_state() is intended for state that is *constant*.
> When SYSENTER is used, SYSENTER_ESP isn't constant because stacks are
> per-CPU, and the VMCS must be updated whenever the vCPU is migrated to a
> new CPU. The logic in vmx_vcpu_load_vmcs() doesn't differentiate between
> "never loaded" and "loaded on a different CPU", i.e. setting SYSENTER_ESP
> on VMCS load also handles setting correct host state when the VMCS is
> first loaded.
>
> Because a VMCS must be loaded before it is initialized during vCPU RESET,
> zeroing the field in vmx_set_constant_host_state() obliterates the value
> that was written when the VMCS was loaded. If the vCPU is run before it
> is migrated, the subsequent VM-Exit will zero out MSR_IA32_SYSENTER_ESP,
> leading to a #DF on the next 32-bit syscall.
>
> double fault: 0000 [#1] SMP
> CPU: 0 PID: 990 Comm: stable Not tainted 5.16.0+ #97
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
> EIP: entry_SYSENTER_32+0x0/0xe7
> Code: <9c> 50 eb 17 0f 20 d8 a9 00 10 00 00 74 0d 25 ff ef ff ff 0f 22 d8
> EAX: 000000a2 EBX: a8d1300c ECX: a8d13014 EDX: 00000000
> ESI: a8f87000 EDI: a8d13014 EBP: a8d12fc0 ESP: 00000000
> DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 EFLAGS: 00210093
> CR0: 80050033 CR2: fffffffc CR3: 02c3b000 CR4: 00152e90
>
> Fixes: 6ab8a4053f71 ("KVM: VMX: Avoid to rdmsrl(MSR_IA32_SYSENTER_ESP)")
> Cc: Lai Jiangshan <laijs@linux.alibaba.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/kvm/vmx/vmx.c | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index a02a28ce7cc3..ce2aae12fcc5 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -4094,10 +4094,13 @@ void vmx_set_constant_host_state(struct vcpu_vmx *vmx)
> vmcs_write32(HOST_IA32_SYSENTER_CS, low32);
>
> /*
> - * If 32-bit syscall is enabled, vmx_vcpu_load_vcms rewrites
> - * HOST_IA32_SYSENTER_ESP.
> + * SYSENTER is used only for (emulating) 32-bit kernels, zero out
> + * SYSENTER.ESP if it is NOT used. When SYSENTER is used, the per-CPU
> + * stack is set when the VMCS is loaded (and may already be set!).

For an unprepared reader, I'd suggest adding something like "This pairs
with how HOST_IA32_SYSENTER_ESP is written in vmx_vcpu_load_vmcs()".

> */
> - vmcs_writel(HOST_IA32_SYSENTER_ESP, 0);
> + if (!IS_ENABLED(CONFIG_IA32_EMULATION) && !IS_ENABLED(CONFIG_X86_32))

Isn't it the same as "!IS_ENABLED(CONFIG_COMPAT_32)"? (same goes to the
check in vmx_vcpu_load_vmcs())

> + vmcs_writel(HOST_IA32_SYSENTER_ESP, 0);
> +
> rdmsrl(MSR_IA32_SYSENTER_EIP, tmpl);
> vmcs_writel(HOST_IA32_SYSENTER_EIP, tmpl); /* 22.2.3 */
>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
--
Vitaly
* Re: [PATCH] KVM: VMX: Zero host's SYSENTER_ESP iff SYSENTER is NOT used
From: Paolo Bonzini @ 2022-01-24 13:46 UTC (permalink / raw)
To: Vitaly Kuznetsov, Sean Christopherson
Cc: Wanpeng Li, Jim Mattson, Joerg Roedel, kvm, linux-kernel, Lai Jiangshan
On 1/22/22 09:47, Vitaly Kuznetsov wrote:
>> */
>> - vmcs_writel(HOST_IA32_SYSENTER_ESP, 0);
>> + if (!IS_ENABLED(CONFIG_IA32_EMULATION) && !IS_ENABLED(CONFIG_X86_32))
> Isn't it the same as "!IS_ENABLED(CONFIG_COMPAT_32)"? (same goes to the
> check in vmx_vcpu_load_vmcs())
>

It is, but I think it's clearer to write it as it's already done in
arch/x86/kvm/vmx/vmx.c, or possibly

	if (IS_ENABLED(CONFIG_X86_64) && !IS_ENABLED(CONFIG_IA32_EMULATION))

CONFIG_COMPAT_32 doesn't say as clearly whether it's enabled for 32-bit
systems or not.

Paolo
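
The equivalence of the three spellings discussed above can be checked by brute force. The following is a minimal sketch, not kernel code: struct config_model and every helper are hypothetical, and the Kconfig relationships it encodes (X86_64 is the opposite of X86_32, IA32_EMULATION is only selectable on 64-bit, COMPAT_32 is set when either X86_32 or IA32_EMULATION is set) are stated here as assumptions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the relevant config options; not real Kconfig. */
struct config_model {
	bool x86_32;		/* CONFIG_X86_32 */
	bool ia32_emulation;	/* CONFIG_IA32_EMULATION */
};

/* Assumption: CONFIG_X86_64 is exactly !CONFIG_X86_32. */
static bool x86_64(struct config_model c)
{
	return !c.x86_32;
}

/* Assumption: IA32 emulation can only be selected on a 64-bit kernel. */
static bool config_is_valid(struct config_model c)
{
	return !c.ia32_emulation || x86_64(c);
}

/* Assumption: COMPAT_32 is enabled when either option is enabled. */
static bool compat_32(struct config_model c)
{
	return c.x86_32 || c.ia32_emulation;
}

/* Spelling used in the patch. */
static bool cond_patch(struct config_model c)
{
	return !c.ia32_emulation && !c.x86_32;
}

/* Spelling suggested in review. */
static bool cond_compat(struct config_model c)
{
	return !compat_32(c);
}

/* Alternative spelling from the reply. */
static bool cond_x86_64(struct config_model c)
{
	return x86_64(c) && !c.ia32_emulation;
}

/* True if all three spellings agree on every valid configuration. */
static bool all_spellings_agree(void)
{
	for (int bits = 0; bits < 4; bits++) {
		struct config_model c = { (bits & 1) != 0, (bits & 2) != 0 };

		if (!config_is_valid(c))
			continue;
		if (cond_patch(c) != cond_compat(c) ||
		    cond_patch(c) != cond_x86_64(c))
			return false;
	}
	return true;
}
```

Under those assumptions the choice between the spellings is purely about readability, which is the point made in the reply.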
* Re: [PATCH] KVM: VMX: Zero host's SYSENTER_ESP iff SYSENTER is NOT used
From: Paolo Bonzini @ 2022-01-24 13:52 UTC (permalink / raw)
To: Sean Christopherson
Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
linux-kernel, Lai Jiangshan
On 1/22/22 02:52, Sean Christopherson wrote:
> Zero vmcs.HOST_IA32_SYSENTER_ESP when initializing *constant* host state
> if and only if SYSENTER cannot be used, i.e. the kernel is a 64-bit
> kernel and is not emulating 32-bit syscalls. As the name suggests,
> vmx_set_constant_host_state() is intended for state that is *constant*.
> When SYSENTER is used, SYSENTER_ESP isn't constant because stacks are
> per-CPU, and the VMCS must be updated whenever the vCPU is migrated to a
> new CPU. The logic in vmx_vcpu_load_vmcs() doesn't differentiate between
> "never loaded" and "loaded on a different CPU", i.e. setting SYSENTER_ESP
> on VMCS load also handles setting correct host state when the VMCS is
> first loaded.
>
> Because a VMCS must be loaded before it is initialized during vCPU RESET,
> zeroing the field in vmx_set_constant_host_state() obliterates the value
> that was written when the VMCS was loaded. If the vCPU is run before it
> is migrated, the subsequent VM-Exit will zero out MSR_IA32_SYSENTER_ESP,
> leading to a #DF on the next 32-bit syscall.
>
> double fault: 0000 [#1] SMP
> CPU: 0 PID: 990 Comm: stable Not tainted 5.16.0+ #97
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
> EIP: entry_SYSENTER_32+0x0/0xe7
> Code: <9c> 50 eb 17 0f 20 d8 a9 00 10 00 00 74 0d 25 ff ef ff ff 0f 22 d8
> EAX: 000000a2 EBX: a8d1300c ECX: a8d13014 EDX: 00000000
> ESI: a8f87000 EDI: a8d13014 EBP: a8d12fc0 ESP: 00000000
> DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 EFLAGS: 00210093
> CR0: 80050033 CR2: fffffffc CR3: 02c3b000 CR4: 00152e90
>
> Fixes: 6ab8a4053f71 ("KVM: VMX: Avoid to rdmsrl(MSR_IA32_SYSENTER_ESP)")
> Cc: Lai Jiangshan <laijs@linux.alibaba.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/kvm/vmx/vmx.c | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index a02a28ce7cc3..ce2aae12fcc5 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -4094,10 +4094,13 @@ void vmx_set_constant_host_state(struct vcpu_vmx *vmx)
> vmcs_write32(HOST_IA32_SYSENTER_CS, low32);
>
> /*
> - * If 32-bit syscall is enabled, vmx_vcpu_load_vcms rewrites
> - * HOST_IA32_SYSENTER_ESP.
> + * SYSENTER is used only for (emulating) 32-bit kernels, zero out
> + * SYSENTER.ESP if it is NOT used. When SYSENTER is used, the per-CPU
> + * stack is set when the VMCS is loaded (and may already be set!).

Slightly clearer:

	/*
	 * SYSENTER is used for 32-bit system calls on either 32-bit or
	 * 64-bit kernels. It is always zero if neither is allowed, otherwise
	 * vmx_vcpu_load_vmcs loads it with the per-CPU entry stack (and may
	 * have already done so!).
	 */

> */
> - vmcs_writel(HOST_IA32_SYSENTER_ESP, 0);
> + if (!IS_ENABLED(CONFIG_IA32_EMULATION) && !IS_ENABLED(CONFIG_X86_32))
> + vmcs_writel(HOST_IA32_SYSENTER_ESP, 0);
> +
> rdmsrl(MSR_IA32_SYSENTER_EIP, tmpl);
> vmcs_writel(HOST_IA32_SYSENTER_EIP, tmpl); /* 22.2.3 */
>
>
> base-commit: e2e83a73d7ce66f62c7830a85619542ef59c90e4

Queued, thanks.

Paolo