* [PATCH] target/i386/kvm: call kvm_put_vcpu_events() before kvm_put_nested_state()
@ 2023-10-26 5:42 Eiichi Tsukata
2023-10-26 5:49 ` Eiichi Tsukata
0 siblings, 1 reply; 9+ messages in thread
From: Eiichi Tsukata @ 2023-10-26 5:42 UTC (permalink / raw)
To: pbonzini, mtosatti, kvm, qemu-devel; +Cc: Eiichi Tsukata
kvm_put_vcpu_events() needs to be called before kvm_put_nested_state()
because the vCPU's hflags are referenced by KVM's vmx_set_nested_state()
validation. Otherwise kvm_put_nested_state() can fail with -EINVAL when
a vCPU is in VMX operation and enters SMM. This leads to live
migration failure.
Signed-off-by: Eiichi Tsukata <eiichi.tsukata@nutanix.com>
---
target/i386/kvm/kvm.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index e7c054cc16..cd635c9142 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -4741,6 +4741,15 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
return ret;
}
+ /*
+ * must be before kvm_put_nested_state so that HF_SMM_MASK is set during
+ * SMM.
+ */
+ ret = kvm_put_vcpu_events(x86_cpu, level);
+ if (ret < 0) {
+ return ret;
+ }
+
if (level >= KVM_PUT_RESET_STATE) {
ret = kvm_put_nested_state(x86_cpu);
if (ret < 0) {
@@ -4787,10 +4796,6 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
if (ret < 0) {
return ret;
}
- ret = kvm_put_vcpu_events(x86_cpu, level);
- if (ret < 0) {
- return ret;
- }
if (level >= KVM_PUT_RESET_STATE) {
ret = kvm_put_mp_state(x86_cpu);
if (ret < 0) {
--
2.41.0
* Re: [PATCH] target/i386/kvm: call kvm_put_vcpu_events() before kvm_put_nested_state()
2023-10-26 5:42 [PATCH] target/i386/kvm: call kvm_put_vcpu_events() before kvm_put_nested_state() Eiichi Tsukata
@ 2023-10-26 5:49 ` Eiichi Tsukata
2023-10-26 5:52 ` Philippe Mathieu-Daudé
0 siblings, 1 reply; 9+ messages in thread
From: Eiichi Tsukata @ 2023-10-26 5:49 UTC (permalink / raw)
To: pbonzini, mtosatti, kvm, qemu-devel
Hi all,
Here are some additional details on the issue.
We found this issue when testing Windows Virtual Secure Mode (VSM) VMs.
We sometimes saw live migration failures of VSM-enabled VMs. It turned
out that the issue happens during live migration when VMs change boot-related
EFI variables (e.g. BootOrder, Boot0001).
After some debugging, I found the race I mentioned in the commit message.
Symptom
=======
When it happens with the latest QEMU, which has commit https://github.com/qemu/qemu/commit/7191f24c7fcfbc1216d09,
QEMU shows the following error message on the destination:
qemu-system-x86_64: Failed to put registers after init: Invalid argument
If it happens with an older QEMU that doesn't have the commit, we instead see a CPU dump something like this:
KVM internal error. Suberror: 3
extra data[0]: 0x0000000080000b0e
extra data[1]: 0x0000000000000031
extra data[2]: 0x0000000000000683
extra data[3]: 0x000000007f809000
extra data[4]: 0x0000000000000026
RAX=0000000000000000 RBX=0000000000000000 RCX=0000000000000000 RDX=0000000000000f61
RSI=0000000000000000 RDI=0000000000000000 RBP=0000000000000000 RSP=0000000000000000
R8 =0000000000000000 R9 =0000000000000000 R10=0000000000000000 R11=0000000000000000
R12=0000000000000000 R13=0000000000000000 R14=0000000000000000 R15=0000000000000000
RIP=000000000000fff0 RFL=00010002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0020 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
CS =0038 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0020 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
DS =0020 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
FS =0020 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
GS =0020 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
LDT=0000 0000000000000000 ffffffff 00c00000
TR =0040 000000007f7df050 00068fff 00808b00 DPL=0 TSS64-busy
GDT= 000000007f7df000 0000004f
IDT= 000000007f836000 000001ff
CR0=80010033 CR2=000000000000fff0 CR3=000000007f809000 CR4=00000668
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000d00
Code=?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? <??> ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ??
In the above dump, CR3 is pointing into the SMRAM region even though SMM=0.
Repro
=====
The repro steps are pretty simple:
* Run an SMM-enabled Linux guest with Secure Boot-enabled OVMF.
* Run the following script in the guest.
/usr/libexec/qemu-kvm &
while true
do
efibootmgr -n 1
done
* Do a live migration.
On my environment, live migration fails about 20% of the time.
VMX specific
============
This issue is VMX specific; SVM is not affected because the validation
in svm_set_nested_state() is a bit different from the VMX one.
VMX:
static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
                                struct kvm_nested_state __user *user_kvm_nested_state,
                                struct kvm_nested_state *kvm_state)
{
        ...
        /*
         * SMM temporarily disables VMX, so we cannot be in guest mode,
         * nor can VMLAUNCH/VMRESUME be pending. Outside SMM, SMM flags
         * must be zero.
         */
        if (is_smm(vcpu) ?
                (kvm_state->flags &
                 (KVM_STATE_NESTED_GUEST_MODE | KVM_STATE_NESTED_RUN_PENDING))
                : kvm_state->hdr.vmx.smm.flags)
                return -EINVAL;
        ...
SVM:
static int svm_set_nested_state(struct kvm_vcpu *vcpu,
                                struct kvm_nested_state __user *user_kvm_nested_state,
                                struct kvm_nested_state *kvm_state)
{
        ...
        /* SMM temporarily disables SVM, so we cannot be in guest mode. */
        if (is_smm(vcpu) && (kvm_state->flags & KVM_STATE_NESTED_GUEST_MODE))
                return -EINVAL;
        ...
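To make the difference concrete, here is a small toy model of the two validation paths (plain Python, not kernel code; the bit values and the -EINVAL return convention are illustrative, only the shape of the checks mirrors the snippets above):

```python
# Toy model of the two KVM nested-state validation paths (simplified,
# NOT kernel code; bit values are invented for illustration).

HF_SMM_MASK = 1 << 1                     # models the SMM bit in vcpu->arch.hflags

KVM_STATE_NESTED_GUEST_MODE = 1 << 0
KVM_STATE_NESTED_RUN_PENDING = 1 << 1
KVM_STATE_NESTED_SMM_VMXON = 1 << 0      # models kvm_state->hdr.vmx.smm.flags

EINVAL = 22

def vmx_set_nested_state(hflags, flags, smm_flags):
    """VMX check: in SMM, no guest mode / run pending; outside SMM, SMM flags must be 0."""
    if hflags & HF_SMM_MASK:
        if flags & (KVM_STATE_NESTED_GUEST_MODE | KVM_STATE_NESTED_RUN_PENDING):
            return -EINVAL
    elif smm_flags:
        return -EINVAL
    return 0

def svm_set_nested_state(hflags, flags):
    """SVM check: in SMM we cannot be in guest mode; SMM flags are not validated."""
    if (hflags & HF_SMM_MASK) and (flags & KVM_STATE_NESTED_GUEST_MODE):
        return -EINVAL
    return 0

# The combination seen in this thread: hflags not synced yet (no SMM bit set),
# but the saved nested state records that the vCPU did VMXON before SMM entry.
assert vmx_set_nested_state(0, 0, KVM_STATE_NESTED_SMM_VMXON) == -EINVAL
assert svm_set_nested_state(0, 0) == 0
# Once HF_SMM_MASK is set (events restored first), VMX accepts the same state.
assert vmx_set_nested_state(HF_SMM_MASK, 0, KVM_STATE_NESTED_SMM_VMXON) == 0
```

This is only a sketch of why the stale-hflags window trips the VMX path but not the SVM path.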
Thanks,
Eiichi
> On Oct 26, 2023, at 14:42, Eiichi Tsukata <eiichi.tsukata@nutanix.com> wrote:
> [...]
* Re: [PATCH] target/i386/kvm: call kvm_put_vcpu_events() before kvm_put_nested_state()
2023-10-26 5:49 ` Eiichi Tsukata
@ 2023-10-26 5:52 ` Philippe Mathieu-Daudé
2023-10-26 8:52 ` Vitaly Kuznetsov
0 siblings, 1 reply; 9+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-10-26 5:52 UTC (permalink / raw)
To: Eiichi Tsukata, pbonzini, mtosatti, kvm, qemu-devel, Vitaly Kuznetsov
Cc'ing Vitaly.
On 26/10/23 07:49, Eiichi Tsukata wrote:
> [...]
* Re: [PATCH] target/i386/kvm: call kvm_put_vcpu_events() before kvm_put_nested_state()
2023-10-26 5:52 ` Philippe Mathieu-Daudé
@ 2023-10-26 8:52 ` Vitaly Kuznetsov
2023-11-01 2:09 ` Eiichi Tsukata
0 siblings, 1 reply; 9+ messages in thread
From: Vitaly Kuznetsov @ 2023-10-26 8:52 UTC (permalink / raw)
To: Philippe Mathieu-Daudé,
Eiichi Tsukata, pbonzini, mtosatti, kvm, qemu-devel,
Maxim Levitsky
Cc'ing Max :-) At first glance the condition in vmx_set_nested_state()
is correct, so I guess we either have a stale
KVM_STATE_NESTED_RUN_PENDING when in SMM or stale smm.flags when outside
of it...
Philippe Mathieu-Daudé <philmd@linaro.org> writes:
> Cc'ing Vitaly.
>
> [...]
--
Vitaly
* Re: [PATCH] target/i386/kvm: call kvm_put_vcpu_events() before kvm_put_nested_state()
2023-10-26 8:52 ` Vitaly Kuznetsov
@ 2023-11-01 2:09 ` Eiichi Tsukata
2023-11-01 14:04 ` Vitaly Kuznetsov
0 siblings, 1 reply; 9+ messages in thread
From: Eiichi Tsukata @ 2023-11-01 2:09 UTC (permalink / raw)
To: Vitaly Kuznetsov
Cc: Philippe Mathieu-Daudé,
pbonzini, mtosatti, kvm, qemu-devel, Maxim Levitsky
FYI: The EINVAL in vmx_set_nested_state() is caused by the following condition:
* vcpu->arch.hflags == 0
* kvm_state->hdr.vmx.smm.flags == KVM_STATE_NESTED_SMM_VMXON
Please feel free to ask for any more data points you need.
Thanks,
Eiichi
> On Oct 26, 2023, at 17:52, Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
> Cc'ing Max :-) At first glance the condition in vmx_set_nested_state()
> is correct so I guess we either have a stale
> KVM_STATE_NESTED_RUN_PENDING when in SMM or stale smm.flags when outside
> of it...
>
> [...]
* Re: [PATCH] target/i386/kvm: call kvm_put_vcpu_events() before kvm_put_nested_state()
2023-11-01 2:09 ` Eiichi Tsukata
@ 2023-11-01 14:04 ` Vitaly Kuznetsov
2023-11-08 1:12 ` Eiichi Tsukata
0 siblings, 1 reply; 9+ messages in thread
From: Vitaly Kuznetsov @ 2023-11-01 14:04 UTC (permalink / raw)
To: Eiichi Tsukata, pbonzini, Maxim Levitsky
Cc: Philippe Mathieu-Daudé, mtosatti, kvm, qemu-devel
Eiichi Tsukata <eiichi.tsukata@nutanix.com> writes:
> FYI: The EINVAL in vmx_set_nested_state() is caused by the following condition:
> * vcpu->arch.hflags == 0
> * kvm_state->hdr.vmx.smm.flags == KVM_STATE_NESTED_SMM_VMXON
This is a weird state indeed:
'vcpu->arch.hflags == 0' means we're not in SMM and not in guest mode,
but kvm_state->hdr.vmx.smm.flags == KVM_STATE_NESTED_SMM_VMXON is a
reflection of vmx->nested.smm.vmxon (see
vmx_get_nested_state()). vmx->nested.smm.vmxon gets set (conditionally)
in vmx_enter_smm() and gets cleared in vmx_leave_smm(), which means the
vCPU must be in SMM to have it set.
In case the vCPU is in SMM upon migration, HF_SMM_MASK must be set from
kvm_vcpu_ioctl_x86_set_vcpu_events() -> kvm_smm_changed(), but QEMU's
kvm_arch_put_registers() calls kvm_put_nested_state() _before_
kvm_put_vcpu_events(). This can explain "vcpu->arch.hflags == 0".
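The restore ordering described above can be sketched as a toy sequence (illustrative Python, not QEMU/KVM code; the function names mirror the QEMU helpers but the bodies only model the hflags/validation interaction):

```python
# Toy restore sequence for a migrated vCPU that was in SMM while in VMX
# operation (illustrative only; bit values and error codes are invented).

HF_SMM_MASK = 1 << 1
KVM_STATE_NESTED_SMM_VMXON = 1 << 0
EINVAL = 22

class VCPU:
    def __init__(self):
        self.hflags = 0              # fresh destination vCPU: SMM bit not set yet

def put_vcpu_events(vcpu, smm_active):
    # Models KVM_SET_VCPU_EVENTS -> kvm_smm_changed(): syncs the SMM flag.
    if smm_active:
        vcpu.hflags |= HF_SMM_MASK
    else:
        vcpu.hflags &= ~HF_SMM_MASK

def put_nested_state(vcpu, smm_flags):
    # Models KVM_SET_NESTED_STATE: outside SMM, SMM flags must be zero.
    if not (vcpu.hflags & HF_SMM_MASK) and smm_flags:
        return -EINVAL
    return 0

def restore(events_first):
    vcpu = VCPU()
    if events_first:                 # the ordering the patch introduces
        put_vcpu_events(vcpu, smm_active=True)
        return put_nested_state(vcpu, KVM_STATE_NESTED_SMM_VMXON)
    ret = put_nested_state(vcpu, KVM_STATE_NESTED_SMM_VMXON)  # old ordering
    put_vcpu_events(vcpu, smm_active=True)                    # too late
    return ret

assert restore(events_first=False) == -EINVAL   # the observed migration failure
assert restore(events_first=True) == 0          # with the patch applied
```

The sketch only shows the race window on the destination; it does not model the other state QEMU restores in kvm_arch_put_registers().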
Paolo, Max, any idea how this is supposed to work?
--
Vitaly
* Re: [PATCH] target/i386/kvm: call kvm_put_vcpu_events() before kvm_put_nested_state()
2023-11-01 14:04 ` Vitaly Kuznetsov
@ 2023-11-08 1:12 ` Eiichi Tsukata
2024-01-16 0:13 ` Eiichi Tsukata
0 siblings, 1 reply; 9+ messages in thread
From: Eiichi Tsukata @ 2023-11-08 1:12 UTC (permalink / raw)
To: Vitaly Kuznetsov
Cc: pbonzini, Maxim Levitsky, Philippe Mathieu-Daudé,
mtosatti, kvm, qemu-devel
Hi all, I'd appreciate any comments or feedback on the patch.
Thanks,
Eiichi
> On Nov 1, 2023, at 23:04, Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
> [...]
* Re: [PATCH] target/i386/kvm: call kvm_put_vcpu_events() before kvm_put_nested_state()
2023-11-08 1:12 ` Eiichi Tsukata
@ 2024-01-16 0:13 ` Eiichi Tsukata
2024-01-16 9:31 ` Vitaly Kuznetsov
0 siblings, 1 reply; 9+ messages in thread
From: Eiichi Tsukata @ 2024-01-16 0:13 UTC (permalink / raw)
To: Vitaly Kuznetsov
Cc: pbonzini, Maxim Levitsky, Philippe Mathieu-Daudé,
mtosatti, kvm, qemu-devel
Ping.
> On Nov 8, 2023, at 10:12, Eiichi Tsukata <eiichi.tsukata@nutanix.com> wrote:
>
> [...]
* Re: [PATCH] target/i386/kvm: call kvm_put_vcpu_events() before kvm_put_nested_state()
2024-01-16 0:13 ` Eiichi Tsukata
@ 2024-01-16 9:31 ` Vitaly Kuznetsov
0 siblings, 0 replies; 9+ messages in thread
From: Vitaly Kuznetsov @ 2024-01-16 9:31 UTC (permalink / raw)
To: Eiichi Tsukata, pbonzini, Maxim Levitsky
Cc: Philippe Mathieu-Daudé, mtosatti, kvm, qemu-devel
As I'm the addressee of the ping for some reason... :-)
The fix looks good to me, but I'm not sure about all the consequences of
moving kvm_put_vcpu_events() to an earlier stage. Max, Paolo, please
take a look!
Eiichi Tsukata <eiichi.tsukata@nutanix.com> writes:
> Ping.
>
> [...]
--
Vitaly