All of lore.kernel.org
 help / color / mirror / Atom feed
* odd behaviour of virtualized CPUs
@ 2023-10-23  7:43 Gerrit Slomma
  2023-10-23 15:19 ` Sean Christopherson
  0 siblings, 1 reply; 5+ messages in thread
From: Gerrit Slomma @ 2023-10-23  7:43 UTC (permalink / raw)
  To: kvm

Hello

I came upon following behaviour, i think this is a bug, but where to 
file it?
I filed it against qemu-kvm at Red Hat-jira for the time being, but this 
is a closed environment as it seems.

Sourcecode first:
#include <stdio.h>
#include <string.h>
#include <immintrin.h>

void main(void) {
         __m256i test1 = _mm256_set_epi32(1,2,3,4,5,6,7,8);
         __m256i test2 = _mm256_set_epi32(1,2,3,4,5,6,7,8);

         for (int count = 0; count < 8; count++) {
                 printf("[%d] %d ", count, *((int*)(&test1) + count));
         }

         printf("\n");

         for (int count = 0; count < 8; count++) {
                 printf("[%d] %d ", count, *((int*)(&test2) + count));
         }

         printf("\n");
         test1 = _mm256_add_epi32(test1, test2);
         test2 = _mm256_mullo_epi32(test1, test2);

         for (int count = 0; count < 8; count++) {
                 printf("[%d] %d ", count, *((int*)(&test1) + count));
         }

         printf("\n");

         for (int count = 0; count < 8; count++) {
                 printf("[%d] %d ", count, *((int*)(&test2) + count));
         }

         printf("\n");
}

Compilation with "gcc -mavx -i avx2 avx2.c" fails, due to used 
intrinsics are AVX2-intrinsics.
When compiled with "gcc -mavx2 -o avx2 avx2.c" an run on a E7-4880v2 
this yields "illegal instruction".
When run on a KVM-virtualized "Sandy Bridge"-CPU, but the underlying CPU 
is capable of AVX2 (i.e. Haswell or Skylake) this runs, despite 
advertised flag is only avx:
$ ./avx2
[0] 8 [1] 7 [2] 6 [3] 5 [4] 4 [5] 3 [6] 2 [7] 1
[0] 8 [1] 7 [2] 6 [3] 5 [4] 4 [5] 3 [6] 2 [7] 1
[0] 16 [1] 14 [2] 12 [3] 10 [4] 8 [5] 6 [6] 4 [7] 2
[0] 128 [1] 98 [2] 72 [3] 50 [4] 32 [5] 18 [6] 8 [7] 2

this holds for FMA3-instructions (i used intrinsic is 
_mm256_fmadd_pd(a,b,c).)

When i emulate the CPU as Westmere it yields "illegal instruction".

Regards, Gerrit

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: odd behaviour of virtualized CPUs
  2023-10-23  7:43 odd behaviour of virtualized CPUs Gerrit Slomma
@ 2023-10-23 15:19 ` Sean Christopherson
  2023-10-23 16:29   ` Jim Mattson
  0 siblings, 1 reply; 5+ messages in thread
From: Sean Christopherson @ 2023-10-23 15:19 UTC (permalink / raw)
  To: Gerrit Slomma; +Cc: kvm

On Mon, Oct 23, 2023, Gerrit Slomma wrote:
> Compilation with "gcc -mavx -i avx2 avx2.c" fails, due to used intrinsics
> are AVX2-intrinsics.
> When compiled with "gcc -mavx2 -o avx2 avx2.c" an run on a E7-4880v2 this
> yields "illegal instruction".
> When run on a KVM-virtualized "Sandy Bridge"-CPU, but the underlying CPU is
> capable of AVX2 (i.e. Haswell or Skylake) this runs, despite advertised flag
> is only avx:

This is expected.  Many AVX instructions have virtualization holes, i.e. hardware
doesn't provide controls that allow the hypervisor (KVM) to precisely disable (or
intercept) specific sets of AVX instructions.  The virtualization holes are "safe"
because the instructions don't grant access to novel CPU state, just new ways of
manipulating existing state.  E.g. AVX2 instructions operate on existing AVX state
(YMM registers).

AVX512 on the other hand does introduce new state (ZMM registers) and so hardware
provides a control (XCR0.AVX512) that KVM can use to prevent the guest from
accessing the new state.

In other words, a misbehaving guest that ignores CPUID can hose itself, e.g. if
the VM gets live migrated to a host that _doesn't_ natively support AVX2, then
the workload will suddenly start getting #UDs.  But the integrity of the host and
the VM's state is not in danger.

> $ ./avx2
> [0] 8 [1] 7 [2] 6 [3] 5 [4] 4 [5] 3 [6] 2 [7] 1
> [0] 8 [1] 7 [2] 6 [3] 5 [4] 4 [5] 3 [6] 2 [7] 1
> [0] 16 [1] 14 [2] 12 [3] 10 [4] 8 [5] 6 [6] 4 [7] 2
> [0] 128 [1] 98 [2] 72 [3] 50 [4] 32 [5] 18 [6] 8 [7] 2
> 
> this holds for FMA3-instructions (i used intrinsic is
> _mm256_fmadd_pd(a,b,c).)
> 
> When i emulate the CPU as Westmere it yields "illegal instruction".

This is also expected.  Westmere doesn't support AVX, and so KVM disallows the
guest from setting XCR0.YMM.  Buried in the "PROGRAMMING WITH INTEL® AVX, FMA,
AND INTEL® AVX2" section of the SDM is this snippet:

  If YMM state management is not enabled by an operating systems, Intel AVX
  instructions will #UD regardless of CPUID.1:ECX.AVX[bit 28].

I.e. Westmere doesn't have an AVX2 virtualization hole because it doesn't support
AVX in the first place.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: odd behaviour of virtualized CPUs
  2023-10-23 15:19 ` Sean Christopherson
@ 2023-10-23 16:29   ` Jim Mattson
  2023-10-23 17:43     ` Gerrit Slomma
  0 siblings, 1 reply; 5+ messages in thread
From: Jim Mattson @ 2023-10-23 16:29 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Gerrit Slomma, kvm

On Mon, Oct 23, 2023 at 8:19 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Mon, Oct 23, 2023, Gerrit Slomma wrote:
> > Compilation with "gcc -mavx -i avx2 avx2.c" fails, due to used intrinsics
> > are AVX2-intrinsics.
> > When compiled with "gcc -mavx2 -o avx2 avx2.c" an run on a E7-4880v2 this
> > yields "illegal instruction".
> > When run on a KVM-virtualized "Sandy Bridge"-CPU, but the underlying CPU is
> > capable of AVX2 (i.e. Haswell or Skylake) this runs, despite advertised flag
> > is only avx:
>
> This is expected.  Many AVX instructions have virtualization holes, i.e. hardware
> doesn't provide controls that allow the hypervisor (KVM) to precisely disable (or
> intercept) specific sets of AVX instructions.  The virtualization holes are "safe"
> because the instructions don't grant access to novel CPU state, just new ways of
> manipulating existing state.  E.g. AVX2 instructions operate on existing AVX state
> (YMM registers).
>
> AVX512 on the other hand does introduce new state (ZMM registers) and so hardware
> provides a control (XCR0.AVX512) that KVM can use to prevent the guest from
> accessing the new state.
>
> In other words, a misbehaving guest that ignores CPUID can hose itself, e.g. if
> the VM gets live migrated to a host that _doesn't_ natively support AVX2, then
> the workload will suddenly start getting #UDs.  But the integrity of the host and
> the VM's state is not in danger.

One could argue that trying to virtualize a Sandy Bridge CPU on
Haswell hardware is simply user error, since the virtualization
hardware doesn't support that masquerade.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: odd behaviour of virtualized CPUs
  2023-10-23 16:29   ` Jim Mattson
@ 2023-10-23 17:43     ` Gerrit Slomma
  2023-10-23 20:06       ` Jim Mattson
  0 siblings, 1 reply; 5+ messages in thread
From: Gerrit Slomma @ 2023-10-23 17:43 UTC (permalink / raw)
  To: Jim Mattson, Sean Christopherson; +Cc: kvm

Why?
As Sean pointed out if you have older CPUs that don't support a specific 
instruction set you need to restrict the capabilities in order to 
support live migration.
"this is expected" is a bit far fetched, it is not expected but it is 
observed and real behaviour.
I came across this when testing virtualized systems for performance with 
the sllr-application from primegrid which only ran with SSE-code (plain 
code flavour for me), not using AVX on a CPU that said to me via lscpu 
it was a E5-2660v2.
(This was VMWare for that matter and i wrote the test app i posted and 
tested in KVM-qemu-virtualized on other hosts).

Regards, Gerrit.

On 23.10.23 18:29, Jim Mattson wrote:
> On Mon, Oct 23, 2023 at 8:19 AM Sean Christopherson <seanjc@google.com> wrote:
>> On Mon, Oct 23, 2023, Gerrit Slomma wrote:
>>> Compilation with "gcc -mavx -i avx2 avx2.c" fails, due to used intrinsics
>>> are AVX2-intrinsics.
>>> When compiled with "gcc -mavx2 -o avx2 avx2.c" an run on a E7-4880v2 this
>>> yields "illegal instruction".
>>> When run on a KVM-virtualized "Sandy Bridge"-CPU, but the underlying CPU is
>>> capable of AVX2 (i.e. Haswell or Skylake) this runs, despite advertised flag
>>> is only avx:
>> This is expected.  Many AVX instructions have virtualization holes, i.e. hardware
>> doesn't provide controls that allow the hypervisor (KVM) to precisely disable (or
>> intercept) specific sets of AVX instructions.  The virtualization holes are "safe"
>> because the instructions don't grant access to novel CPU state, just new ways of
>> manipulating existing state.  E.g. AVX2 instructions operate on existing AVX state
>> (YMM registers).
>>
>> AVX512 on the other hand does introduce new state (ZMM registers) and so hardware
>> provides a control (XCR0.AVX512) that KVM can use to prevent the guest from
>> accessing the new state.
>>
>> In other words, a misbehaving guest that ignores CPUID can hose itself, e.g. if
>> the VM gets live migrated to a host that _doesn't_ natively support AVX2, then
>> the workload will suddenly start getting #UDs.  But the integrity of the host and
>> the VM's state is not in danger.
> One could argue that trying to virtualize a Sandy Bridge CPU on
> Haswell hardware is simply user error, since the virtualization
> hardware doesn't support that masquerade.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: odd behaviour of virtualized CPUs
  2023-10-23 17:43     ` Gerrit Slomma
@ 2023-10-23 20:06       ` Jim Mattson
  0 siblings, 0 replies; 5+ messages in thread
From: Jim Mattson @ 2023-10-23 20:06 UTC (permalink / raw)
  To: Gerrit Slomma; +Cc: Sean Christopherson, kvm

On Mon, Oct 23, 2023 at 10:43 AM Gerrit Slomma
<gerrit.slomma@itsslomma.de> wrote:
>
> Why?
> As Sean pointed out if you have older CPUs that don't support a specific
> instruction set you need to restrict the capabilities in order to
> support live migration.

The x86 hardware virtualization facilities do not allow the hypervisor
to restrict capabilities a la carte. Some capabilities do have a
"gatekeeper," like a CR4 bit or an XCR0 bit, which, when clear, will
induce an exception if that capability is used. However, many
capabilities do not. Take the SERIALIZE instruction, for example. It
should raise #UD on platforms older than Sapphire Rapids, but if your
virtual machine is masquerading as an older microarchitecture on a
Sapphire Rapids host, you will find that the SERIALIZE instruction is
available, does not raise #UD, and works just as it does on bare
metal.

As a result, there is no way for a virtual CPU to masquerade as an
older microarchitecture when running on Sapphire Rapids.

It can come close enough to be acceptable for a heterogenous migration
pool, but it's still a virtualization hole.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-10-23 20:07 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-23  7:43 odd behaviour of virtualized CPUs Gerrit Slomma
2023-10-23 15:19 ` Sean Christopherson
2023-10-23 16:29   ` Jim Mattson
2023-10-23 17:43     ` Gerrit Slomma
2023-10-23 20:06       ` Jim Mattson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.