* [PATCH v1 1/1] target/i386: Mask xstate_bv based on the cpu enabled features
@ 2022-01-29  9:46 Leonardo Bras
  2022-01-31 12:53 ` David Edmondson
  2022-02-01 18:31 ` Leonardo Bras Soares Passos
  0 siblings, 2 replies; 8+ messages in thread
From: Leonardo Bras @ 2022-01-29  9:46 UTC (permalink / raw)
  To: Paolo Bonzini, David Edmondson, Leonardo Bras, Peter Xu,
	Dr . David Alan Gilbert
  Cc: qemu-devel

The following steps describe a migration bug:
1 - Bring up a VM with -cpu EPYC on a host with EPYC-Milan cpu
2 - Migrate to a host with EPYC-Naples cpu

The guest kernel crashes shortly after the migration.

The crash happens due to a fault caused by XRSTOR:
A set bit in XSTATE_BV is not set in XCR0.
The faulting bit is FEATURE_PKRU (enabled in Milan, but not in Naples)

To avoid this kind of bug:
In kvm_get_xsave, mask-out from xstate_bv any bits that are not set in
current vcpu's features.

This keeps cpu->env->xstate_bv with feature bits compatible with any
host machine capable of running the vcpu model.

Signed-off-by: Leonardo Bras <leobras@redhat.com>
---
 target/i386/xsave_helper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/xsave_helper.c b/target/i386/xsave_helper.c
index ac61a96344..0628226234 100644
--- a/target/i386/xsave_helper.c
+++ b/target/i386/xsave_helper.c
@@ -167,7 +167,7 @@ void x86_cpu_xrstor_all_areas(X86CPU *cpu, const void *buf, uint32_t buflen)
         env->xmm_regs[i].ZMM_Q(1) = ldq_p(xmm + 8);
     }
 
-    env->xstate_bv = header->xstate_bv;
+    env->xstate_bv = header->xstate_bv & env->features[FEAT_XSAVE_COMP_LO];
 
     e = &x86_ext_save_areas[XSTATE_YMM_BIT];
     if (e->size && e->offset) {
-- 
2.34.1
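
The failure mode and the proposed mask can be looked at in isolation. Below is a minimal sketch, assuming the standard (non-compacted) XSAVE format, in which XRSTOR raises #GP when XSTATE_BV names a component that is not enabled in XCR0. The helper names are hypothetical and are not QEMU or KVM functions; the second helper only mirrors the one-line change in the diff above.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Toy model of the standard-form XRSTOR header check: the restore
 * faults if XSTATE_BV names a component that XCR0 does not enable.
 * Hypothetical helper, for illustration only.
 */
static bool xrstor_would_fault(uint64_t xstate_bv, uint64_t xcr0)
{
    return (xstate_bv & ~xcr0) != 0;
}

/*
 * The idea of the patch: keep only the components that the vCPU
 * model enables (FEAT_XSAVE_COMP_LO in the diff).
 */
static uint64_t mask_xstate_bv(uint64_t header_bv, uint64_t model_components)
{
    return header_bv & model_components;
}
```

With a model that enables only x87/SSE/AVX (0x7), an incoming XSTATE_BV of 0x207 would fault against XCR0 = 0x7, while the masked value 0x7 would not.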




* Re: [PATCH v1 1/1] target/i386: Mask xstate_bv based on the cpu enabled features
  2022-01-29  9:46 [PATCH v1 1/1] target/i386: Mask xstate_bv based on the cpu enabled features Leonardo Bras
@ 2022-01-31 12:53 ` David Edmondson
  2022-02-01  8:29   ` Igor Mammedov
  2022-02-01 19:09   ` Leonardo Brás
  2022-02-01 18:31 ` Leonardo Bras Soares Passos
  1 sibling, 2 replies; 8+ messages in thread
From: David Edmondson @ 2022-01-31 12:53 UTC (permalink / raw)
  To: qemu-devel

On Saturday, 2022-01-29 at 06:46:45 -03, Leonardo Bras wrote:

> The following steps describe a migration bug:
> 1 - Bring up a VM with -cpu EPYC on a host with EPYC-Milan cpu
> 2 - Migrate to a host with EPYC-Naples cpu
>
> The guest kernel crashes shortly after the migration.
>
> The crash happens due to a fault caused by XRSTOR:
> A set bit in XSTATE_BV is not set in XCR0.
> The faulting bit is FEATURE_PKRU (enabled in Milan, but not in Naples)

I'm trying to understand how this happens.

If we boot on EPYC-Milan with "-cpu EPYC", the PKRU feature should not
be exposed to the VM (it is not available in the EPYC CPU).

Given this, how would bit 0x200 (representing PKRU) end up set in
xstate_bv?
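
For reference, the arithmetic behind 0x200 can be sketched as follows. PKRU is XSAVE state component 9, so its bit in xstate_bv is 1 << 9. The macros below are local stand-ins that follow that numbering, and xstate_bv_has_pkru() is a hypothetical helper, not an existing QEMU function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins: PKRU is XSAVE state component number 9. */
#define XSTATE_PKRU_BIT  9
#define XSTATE_PKRU_MASK (UINT64_C(1) << XSTATE_PKRU_BIT)

static_assert(XSTATE_PKRU_MASK == 0x200, "PKRU is bit 9 of XSTATE_BV");

/* Hypothetical helper: does this XSTATE_BV value mark PKRU state as present? */
static bool xstate_bv_has_pkru(uint64_t xstate_bv)
{
    return (xstate_bv & XSTATE_PKRU_MASK) != 0;
}
```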

> To avoid this kind of bug:
> In kvm_get_xsave, mask-out from xstate_bv any bits that are not set in
> current vcpu's features.
>
> This keeps cpu->env->xstate_bv with feature bits compatible with any
> host machine capable of running the vcpu model.
>
> Signed-off-by: Leonardo Bras <leobras@redhat.com>
> ---
>  target/i386/xsave_helper.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/target/i386/xsave_helper.c b/target/i386/xsave_helper.c
> index ac61a96344..0628226234 100644
> --- a/target/i386/xsave_helper.c
> +++ b/target/i386/xsave_helper.c
> @@ -167,7 +167,7 @@ void x86_cpu_xrstor_all_areas(X86CPU *cpu, const void *buf, uint32_t buflen)
>          env->xmm_regs[i].ZMM_Q(1) = ldq_p(xmm + 8);
>      }
>
> -    env->xstate_bv = header->xstate_bv;
> +    env->xstate_bv = header->xstate_bv & env->features[FEAT_XSAVE_COMP_LO];
>
>      e = &x86_ext_save_areas[XSTATE_YMM_BIT];
>      if (e->size && e->offset) {

dme.
-- 
You have underestimated my power, as you shortly will discover.




* Re: [PATCH v1 1/1] target/i386: Mask xstate_bv based on the cpu enabled features
  2022-01-31 12:53 ` David Edmondson
@ 2022-02-01  8:29   ` Igor Mammedov
  2022-02-01 19:17     ` Leonardo Brás
  2022-02-01 19:09   ` Leonardo Brás
  1 sibling, 1 reply; 8+ messages in thread
From: Igor Mammedov @ 2022-02-01  8:29 UTC (permalink / raw)
  To: David Edmondson; +Cc: qemu-devel, Dr . David Alan Gilbert

On Mon, 31 Jan 2022 12:53:31 +0000
David Edmondson <david.edmondson@oracle.com> wrote:

> On Saturday, 2022-01-29 at 06:46:45 -03, Leonardo Bras wrote:
> 
> > The following steps describe a migration bug:
> > 1 - Bring up a VM with -cpu EPYC on a host with EPYC-Milan cpu
> > 2 - Migrate to a host with EPYC-Naples cpu
> >
> > The guest kernel crashes shortly after the migration.
> >
> > The crash happens due to a fault caused by XRSTOR:
> > A set bit in XSTATE_BV is not set in XCR0.
> > The faulting bit is FEATURE_PKRU (enabled in Milan, but not in Naples)  
> 
> I'm trying to understand how this happens.
> 
> If we boot on EPYC-Milan with "-cpu EPYC", the PKRU feature should not
> be exposed to the VM (it is not available in the EPYC CPU).
> 
> Given this, how would bit 0x200 (representing PKRU) end up set in
> xstate_bv?
> 
> > To avoid this kind of bug:
> > In kvm_get_xsave, mask-out from xstate_bv any bits that are not set in
> > current vcpu's features.

In addition to above:

It's not a good idea to silently mask something out.
If we can't ensure the same feature set for a CPU model
and can't verify it by asking QEMU on the source and
target hosts, the next best thing would be to explicitly
fail the migration (e.g. by adding a check to the .post_load hook or
doing some other migration magic, CCing David)
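
A minimal sketch of what such an explicit check might look like, as opposed to masking. This is only an illustration under assumptions: xstate_bv_check() is a hypothetical helper, not an existing QEMU function, and how it would be wired into a post_load hook is left open.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical check in the spirit of a post_load hook.
 * Returns 0 if every component named in the incoming XSTATE_BV is
 * enabled for the vCPU model, and -EINVAL otherwise. The offending
 * bits are reported through *unsupported (if non-NULL) so that the
 * caller can log them before failing the migration.
 */
static int xstate_bv_check(uint64_t incoming_bv, uint64_t supported,
                           uint64_t *unsupported)
{
    uint64_t extra = incoming_bv & ~supported;

    if (unsupported) {
        *unsupported = extra;
    }
    return extra ? -EINVAL : 0;
}
```

Compared with the mask in the patch, the stray bit (0x200 in the reported case) is surfaced as an error instead of being dropped.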

> >
> > This keeps cpu->env->xstate_bv with feature bits compatible with any
> > host machine capable of running the vcpu model.
> >
> > Signed-off-by: Leonardo Bras <leobras@redhat.com>
> > ---
> >  target/i386/xsave_helper.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/target/i386/xsave_helper.c b/target/i386/xsave_helper.c
> > index ac61a96344..0628226234 100644
> > --- a/target/i386/xsave_helper.c
> > +++ b/target/i386/xsave_helper.c
> > @@ -167,7 +167,7 @@ void x86_cpu_xrstor_all_areas(X86CPU *cpu, const void *buf, uint32_t buflen)
> >          env->xmm_regs[i].ZMM_Q(1) = ldq_p(xmm + 8);
> >      }
> >
> > -    env->xstate_bv = header->xstate_bv;
> > +    env->xstate_bv = header->xstate_bv & env->features[FEAT_XSAVE_COMP_LO];
> >
> >      e = &x86_ext_save_areas[XSTATE_YMM_BIT];
> >      if (e->size && e->offset) {  
> 
> dme.




* Re: [PATCH v1 1/1] target/i386: Mask xstate_bv based on the cpu enabled features
  2022-01-29  9:46 [PATCH v1 1/1] target/i386: Mask xstate_bv based on the cpu enabled features Leonardo Bras
  2022-01-31 12:53 ` David Edmondson
@ 2022-02-01 18:31 ` Leonardo Bras Soares Passos
  1 sibling, 0 replies; 8+ messages in thread
From: Leonardo Bras Soares Passos @ 2022-02-01 18:31 UTC (permalink / raw)
  To: Paolo Bonzini, David Edmondson, Leonardo Bras, Peter Xu,
	Dr . David Alan Gilbert, Igor Mammedov
  Cc: qemu-devel

Hello David Edmondson and Igor Mammedov,

Thank you for the feedback!

For some reason I did not receive your comments by email.
I only noticed them when I opened Patchwork to get the link.

Sorry for the delay. I will do my best to address them in a few minutes.

Best regards,
Leo

On Sat, Jan 29, 2022 at 6:47 AM Leonardo Bras <leobras@redhat.com> wrote:
>
> The following steps describe a migration bug:
> 1 - Bring up a VM with -cpu EPYC on a host with EPYC-Milan cpu
> 2 - Migrate to a host with EPYC-Naples cpu
>
> The guest kernel crashes shortly after the migration.
>
> The crash happens due to a fault caused by XRSTOR:
> A set bit in XSTATE_BV is not set in XCR0.
> The faulting bit is FEATURE_PKRU (enabled in Milan, but not in Naples)
>
> To avoid this kind of bug:
> In kvm_get_xsave, mask-out from xstate_bv any bits that are not set in
> current vcpu's features.
>
> This keeps cpu->env->xstate_bv with feature bits compatible with any
> host machine capable of running the vcpu model.
>
> Signed-off-by: Leonardo Bras <leobras@redhat.com>
> ---
>  target/i386/xsave_helper.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/target/i386/xsave_helper.c b/target/i386/xsave_helper.c
> index ac61a96344..0628226234 100644
> --- a/target/i386/xsave_helper.c
> +++ b/target/i386/xsave_helper.c
> @@ -167,7 +167,7 @@ void x86_cpu_xrstor_all_areas(X86CPU *cpu, const void *buf, uint32_t buflen)
>          env->xmm_regs[i].ZMM_Q(1) = ldq_p(xmm + 8);
>      }
>
> -    env->xstate_bv = header->xstate_bv;
> +    env->xstate_bv = header->xstate_bv & env->features[FEAT_XSAVE_COMP_LO];
>
>      e = &x86_ext_save_areas[XSTATE_YMM_BIT];
>      if (e->size && e->offset) {
> --
> 2.34.1
>




* Re: [PATCH v1 1/1] target/i386: Mask xstate_bv based on the cpu enabled features
  2022-01-31 12:53 ` David Edmondson
  2022-02-01  8:29   ` Igor Mammedov
@ 2022-02-01 19:09   ` Leonardo Brás
  2022-02-02 15:46     ` David Edmondson
  1 sibling, 1 reply; 8+ messages in thread
From: Leonardo Brás @ 2022-02-01 19:09 UTC (permalink / raw)
  To: David Edmondson, qemu-devel

Hello David, thanks for this feedback!

On Mon, 2022-01-31 at 12:53 +0000, David Edmondson wrote:
> On Saturday, 2022-01-29 at 06:46:45 -03, Leonardo Bras wrote:
> 
> > The following steps describe a migration bug:
> > 1 - Bring up a VM with -cpu EPYC on a host with EPYC-Milan cpu
> > 2 - Migrate to a host with EPYC-Naples cpu
> > 
> > The guest kernel crashes shortly after the migration.
> > 
> > The crash happens due to a fault caused by XRSTOR:
> > A set bit in XSTATE_BV is not set in XCR0.
> > The faulting bit is FEATURE_PKRU (enabled in Milan, but not in
> > Naples)
> 
> I'm trying to understand how this happens.
> 
> If we boot on EPYC-Milan with "-cpu EPYC", the PKRU feature should
> not
> be exposed to the VM (it is not available in the EPYC CPU).
> 
> Given this, how would bit 0x200 (representing PKRU) end up set in
> xstate_bv?


While debugging, I noticed that this bit gets set before the kernel even
starts.

It's possible that SeaBIOS and/or iPXE are somehow setting 0x200 using the
XRSTOR instruction. I am not sure whether QEMU is able to stop this in KVM mode.

If you have any info on this, please let me know.

Best regards,
Leo

> 
> > To avoid this kind of bug:
> > In kvm_get_xsave, mask-out from xstate_bv any bits that are not set
> > in
> > current vcpu's features.
> > 
> > This keeps cpu->env->xstate_bv with feature bits compatible with
> > any
> > host machine capable of running the vcpu model.
> > 
> > Signed-off-by: Leonardo Bras <leobras@redhat.com>
> > ---
> >  target/i386/xsave_helper.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/target/i386/xsave_helper.c b/target/i386/xsave_helper.c
> > index ac61a96344..0628226234 100644
> > --- a/target/i386/xsave_helper.c
> > +++ b/target/i386/xsave_helper.c
> > @@ -167,7 +167,7 @@ void x86_cpu_xrstor_all_areas(X86CPU *cpu, const void *buf, uint32_t buflen)
> >          env->xmm_regs[i].ZMM_Q(1) = ldq_p(xmm + 8);
> >      }
> > 
> > -    env->xstate_bv = header->xstate_bv;
> > +    env->xstate_bv = header->xstate_bv & env->features[FEAT_XSAVE_COMP_LO];
> > 
> >      e = &x86_ext_save_areas[XSTATE_YMM_BIT];
> >      if (e->size && e->offset) {
> 
> dme.




* Re: [PATCH v1 1/1] target/i386: Mask xstate_bv based on the cpu enabled features
  2022-02-01  8:29   ` Igor Mammedov
@ 2022-02-01 19:17     ` Leonardo Brás
  0 siblings, 0 replies; 8+ messages in thread
From: Leonardo Brás @ 2022-02-01 19:17 UTC (permalink / raw)
  To: Igor Mammedov, David Edmondson, Paolo Bonzini, Leonardo Bras,
	Peter Xu, Dr . David Alan Gilbert
  Cc: qemu-devel

Hello Igor,

On Tue, 2022-02-01 at 09:29 +0100, Igor Mammedov wrote:
> On Mon, 31 Jan 2022 12:53:31 +0000
> David Edmondson <david.edmondson@oracle.com> wrote:
> 
> > On Saturday, 2022-01-29 at 06:46:45 -03, Leonardo Bras wrote:
> > 
> > > The following steps describe a migration bug:
> > > 1 - Bring up a VM with -cpu EPYC on a host with EPYC-Milan cpu
> > > 2 - Migrate to a host with EPYC-Naples cpu
> > > 
> > > The guest kernel crashes shortly after the migration.
> > > 
> > > The crash happens due to a fault caused by XRSTOR:
> > > A set bit in XSTATE_BV is not set in XCR0.
> > > The faulting bit is FEATURE_PKRU (enabled in Milan, but not in
> > > Naples)  
> > 
> > I'm trying to understand how this happens.
> > 
> > If we boot on EPYC-Milan with "-cpu EPYC", the PKRU feature should
> > not
> > be exposed to the VM (it is not available in the EPYC CPU).
> > 
> > Given this, how would bit 0x200 (representing PKRU) end up set in
> > xstate_bv?
> > 
> > > To avoid this kind of bug:
> > > In kvm_get_xsave, mask-out from xstate_bv any bits that are not
> > > set in
> > > current vcpu's features.
> 
> In addition to above:
> 
> It's not a good idea to silently mask something out.
> If we can't ensure the same feature set for a CPU model
> and can't verify it by asking QEMU on the source and
> target hosts, the next best thing would be to explicitly
> fail the migration (e.g. by adding a check to the .post_load hook or
> doing some other migration magic, CCing David)

Maybe this has something to do with the host kernel (KVM) doing
something unexpected.

IIRC QEMU ended up getting a masked version to use for migration,
since it was not failing as expected.

I will try to investigate further.
Please let me know if you have any information on that.

Best regards,
Leo

> 
> > > 
> > > This keeps cpu->env->xstate_bv with feature bits compatible with
> > > any
> > > host machine capable of running the vcpu model.
> > > 
> > > Signed-off-by: Leonardo Bras <leobras@redhat.com>
> > > ---
> > >  target/i386/xsave_helper.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/target/i386/xsave_helper.c b/target/i386/xsave_helper.c
> > > index ac61a96344..0628226234 100644
> > > --- a/target/i386/xsave_helper.c
> > > +++ b/target/i386/xsave_helper.c
> > > @@ -167,7 +167,7 @@ void x86_cpu_xrstor_all_areas(X86CPU *cpu, const void *buf, uint32_t buflen)
> > >          env->xmm_regs[i].ZMM_Q(1) = ldq_p(xmm + 8);
> > >      }
> > > 
> > > -    env->xstate_bv = header->xstate_bv;
> > > +    env->xstate_bv = header->xstate_bv & env->features[FEAT_XSAVE_COMP_LO];
> > > 
> > >      e = &x86_ext_save_areas[XSTATE_YMM_BIT];
> > >      if (e->size && e->offset) {  
> > 
> > dme.
> 
> 




* Re: [PATCH v1 1/1] target/i386: Mask xstate_bv based on the cpu enabled features
  2022-02-01 19:09   ` Leonardo Brás
@ 2022-02-02 15:46     ` David Edmondson
  2022-02-05  8:22       ` Leonardo Bras Soares Passos
  0 siblings, 1 reply; 8+ messages in thread
From: David Edmondson @ 2022-02-02 15:46 UTC (permalink / raw)
  To: Leonardo Brás; +Cc: qemu-devel

On Tuesday, 2022-02-01 at 16:09:57 -03, Leonardo Brás wrote:

> Hello David, thanks for this feedback!
>
> On Mon, 2022-01-31 at 12:53 +0000, David Edmondson wrote:
>> On Saturday, 2022-01-29 at 06:46:45 -03, Leonardo Bras wrote:
>> 
>> > The following steps describe a migration bug:
>> > 1 - Bring up a VM with -cpu EPYC on a host with EPYC-Milan cpu
>> > 2 - Migrate to a host with EPYC-Naples cpu
>> > 
>> > The guest kernel crashes shortly after the migration.
>> > 
>> > The crash happens due to a fault caused by XRSTOR:
>> > A set bit in XSTATE_BV is not set in XCR0.
>> > The faulting bit is FEATURE_PKRU (enabled in Milan, but not in
>> > Naples)
>> 
>> I'm trying to understand how this happens.
>> 
>> If we boot on EPYC-Milan with "-cpu EPYC", the PKRU feature should
>> not
>> be exposed to the VM (it is not available in the EPYC CPU).
>> 
>> Given this, how would bit 0x200 (representing PKRU) end up set in
>> xstate_bv?
>
> While debugging, I noticed that this bit gets set before the kernel even
> starts.
>
> It's possible that SeaBIOS and/or iPXE are somehow setting 0x200 using the
> XRSTOR instruction. I am not sure whether QEMU is able to stop this in KVM mode.

I don't believe that this should be possible.

If the CPU is set to EPYC in QEMU then .features[FEAT_7_0_ECX] does not
include CPUID_7_0_ECX_PKU, which in turn means that when
x86_cpu_enable_xsave_components() generates FEAT_XSAVE_COMP_LO it should
not set XSTATE_PKRU_BIT.

Given that, KVM's vcpu->arch.guest_supported_xcr0 will not include
XSTATE_PKRU_BIT, and __kvm_set_xcr() should not allow that bit to be
set when it intercepts the guest xsetbv instruction.
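
The chain of reasoning above can be sketched as a toy model. This is not KVM or QEMU code: struct toy_vcpu, toy_xsave_components() and toy_set_xcr0() are hypothetical names, and the checks are a simplified subset of what real XCR0 validation does.

```c
#include <stdbool.h>
#include <stdint.h>

/* XSAVE state-component bits (architectural numbering). */
#define XSTATE_FP_MASK   (UINT64_C(1) << 0)
#define XSTATE_SSE_MASK  (UINT64_C(1) << 1)
#define XSTATE_YMM_MASK  (UINT64_C(1) << 2)
#define XSTATE_PKRU_MASK (UINT64_C(1) << 9)

/* Hypothetical, heavily simplified vCPU state; not a KVM structure. */
struct toy_vcpu {
    uint64_t guest_supported_xcr0;
    uint64_t xcr0;
};

/* Derive the permitted components from CPUID-style feature flags. */
static uint64_t toy_xsave_components(bool has_avx, bool has_pku)
{
    uint64_t mask = XSTATE_FP_MASK | XSTATE_SSE_MASK;

    if (has_avx) {
        mask |= XSTATE_YMM_MASK;
    }
    if (has_pku) {
        mask |= XSTATE_PKRU_MASK;
    }
    return mask;
}

/*
 * Model of an intercepted XSETBV write to XCR0.
 * Returns 0 on success, -1 if the write would be refused.
 */
static int toy_set_xcr0(struct toy_vcpu *vcpu, uint64_t value)
{
    if (!(value & XSTATE_FP_MASK)) {
        return -1;              /* the x87 bit must always be set */
    }
    if ((value & XSTATE_YMM_MASK) && !(value & XSTATE_SSE_MASK)) {
        return -1;              /* AVX state requires SSE state */
    }
    if (value & ~vcpu->guest_supported_xcr0) {
        return -1;              /* component not offered to this guest */
    }
    vcpu->xcr0 = value;
    return 0;
}
```

In this model, a guest whose CPU model lacks PKU gets a supported mask of 0x7, so an attempt to write 0x207 to XCR0 is refused and XCR0 stays unchanged.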

dme.
-- 
Please forgive me if I act a little strange, for I know not what I do.



* Re: [PATCH v1 1/1] target/i386: Mask xstate_bv based on the cpu enabled features
  2022-02-02 15:46     ` David Edmondson
@ 2022-02-05  8:22       ` Leonardo Bras Soares Passos
  0 siblings, 0 replies; 8+ messages in thread
From: Leonardo Bras Soares Passos @ 2022-02-05  8:22 UTC (permalink / raw)
  To: David Edmondson; +Cc: qemu-devel

Hello David, thank you for the feedback.

On Wed, Feb 2, 2022 at 12:47 PM David Edmondson
<david.edmondson@oracle.com> wrote:
>
> On Tuesday, 2022-02-01 at 16:09:57 -03, Leonardo Brás wrote:
>
> > Hello David, thanks for this feedback!
> >
> > On Mon, 2022-01-31 at 12:53 +0000, David Edmondson wrote:
> >> On Saturday, 2022-01-29 at 06:46:45 -03, Leonardo Bras wrote:
> >>
> >> > The following steps describe a migration bug:
> >> > 1 - Bring up a VM with -cpu EPYC on a host with EPYC-Milan cpu
> >> > 2 - Migrate to a host with EPYC-Naples cpu
> >> >
> >> > The guest kernel crashes shortly after the migration.
> >> >
> >> > The crash happens due to a fault caused by XRSTOR:
> >> > A set bit in XSTATE_BV is not set in XCR0.
> >> > The faulting bit is FEATURE_PKRU (enabled in Milan, but not in
> >> > Naples)
> >>
> >> I'm trying to understand how this happens.
> >>
> >> If we boot on EPYC-Milan with "-cpu EPYC", the PKRU feature should
> >> not
> >> be exposed to the VM (it is not available in the EPYC CPU).
> >>
> >> Given this, how would bit 0x200 (representing PKRU) end up set in
> >> xstate_bv?
> >
> > While debugging, I noticed that this bit gets set before the kernel even
> > starts.
> >
> > It's possible that SeaBIOS and/or iPXE are somehow setting 0x200 using the
> > XRSTOR instruction. I am not sure whether QEMU is able to stop this in KVM mode.
>
> I don't believe that this should be possible.
>
> If the CPU is set to EPYC in QEMU then .features[FEAT_7_0_ECX] does not
> include CPUID_7_0_ECX_PKU, which in turn means that when
> x86_cpu_enable_xsave_components() generates FEAT_XSAVE_COMP_LO it should
> not set XSTATE_PKRU_BIT.
>
> Given that, KVM's vcpu->arch.guest_supported_xcr0 will not include
> XSTATE_PKRU_BIT, and __kvm_set_xcr() should not allow that bit to be
> set when it intercepts the guest xsetbv instruction.

Thanks for sharing those details; they helped me with the kernel side of this bug.

FWIW, I sent a patchset fixing this bug to the kernel list:
https://patchwork.kernel.org/project/kvm/list/?series=611524&state=%2A&archive=both


Best regards,
Leo



