From: Yang Zhong <yang.zhong@intel.com>
To: x86@kernel.org, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org, tglx@linutronix.de,
mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
pbonzini@redhat.com
Cc: seanjc@google.com, jun.nakajima@intel.com, kevin.tian@intel.com,
jing2.liu@linux.intel.com, jing2.liu@intel.com,
yang.zhong@intel.com
Subject: [PATCH 00/19] AMX Support in KVM
Date: Tue, 7 Dec 2021 19:03:40 -0500
Message-ID: <20211208000359.2853257-1-yang.zhong@intel.com>
(sent on behalf of Jing, who is currently on leave)
This series brings AMX (Advanced Matrix eXtensions) virtualization
support to KVM. The three preparation patches in fpu core from
Thomas [1] are also included.
A large portion of the changes in this series deals with eXtended
Feature Disable (XFD), which allows resizing of the fpstate buffer to
support dynamically-enabled XSTATE features with large state components
(e.g. 8K for AMX).
The support is based on several key changes (design discussions can be
found in [2]):
- Guest permissions for dynamically-enabled XSAVE features
Native tasks have to request permission via prctl() before touching
a dynamically resized XSTATE component. Introduce guest permissions
for a similar purpose. The userspace VMM is expected to request guest
permission only once, when the first vCPU is created.
KVM checks guest permission in KVM_SET_CPUID2. Setting XFD in guest
cpuid without proper permissions causes this operation to fail.
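For illustration only (not part of the series): a minimal sketch of how
a userspace VMM might request guest permission for AMX tile data before
creating its first vCPU. The option names and numeric values below are
assumptions modelled on the existing native ARCH_REQ_XCOMP_PERM
interface; they are not stated in this cover letter, so check patch 01
for the actual definitions. request_guest_amx_permission() is a
hypothetical helper name.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumed values, not taken from this cover letter; see patch 01. */
#define ARCH_GET_XCOMP_GUEST_PERM 0x1024
#define ARCH_REQ_XCOMP_GUEST_PERM 0x1025

/* XSTATE component number of AMX tile data (state component 18). */
#define XFEATURE_XTILEDATA 18

/*
 * Hypothetical helper: ask the kernel for permission to expose AMX tile
 * data to guests of this process. Returns 0 on success or a negative
 * errno value on failure (e.g. on hardware or kernels without AMX).
 */
static int request_guest_amx_permission(void)
{
#ifdef SYS_arch_prctl
	if (syscall(SYS_arch_prctl, ARCH_REQ_XCOMP_GUEST_PERM,
		    XFEATURE_XTILEDATA) != 0)
		return -errno;
	return 0;
#else
	return -ENOSYS;	/* arch_prctl() is not available on this architecture */
#endif
}
```

A VMM would call this once per process, before the KVM_SET_CPUID2 call
that exposes XFD and AMX to the guest, and treat a negative return as
"do not advertise AMX".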
- Extend fpstate reallocation mechanism to cover guest fpu
Unlike native tasks which have reallocation triggered from #NM
handler, guest fpstate reallocation is requested by KVM when it
detects the guest intention to use dynamically-enabled XSAVE
features.
The reallocation request is handled when exiting to the userspace
VMM. This implies that KVM must break out of the vcpu_run() loop and
exit to the userspace VMM instead of immediately resuming the guest
when reallocation is required.
- Detect fpstate reallocation in the emulation code
Because guest #NM is not trapped in KVM (costly), the guest's
intention to use a dynamically-enabled XSAVE feature[i] can be
indirectly represented by guest XCR0[i]=1 and XFD[i]=0. This
requires the emulation logic of both WRMSR(IA32_XFD) and XSETBV
to check whether reallocation is required whenever either of the
two conditions changes.
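The XCR0[i]=1 and XFD[i]=0 condition above can be sketched as plain bit
arithmetic. This is a standalone model, not the code in the patches:
the function names and the "allocated" parameter (the set of features
the current guest fpstate buffer already covers) are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* AMX tile data is XSTATE component 18. */
#define XFEATURE_MASK_XTILEDATA (1ULL << 18)

/*
 * Hypothetical helper: dynamic features the guest has signalled it
 * intends to use, i.e. enabled in XCR0 and not disabled in XFD.
 */
static uint64_t guest_intended_features(uint64_t xcr0, uint64_t xfd,
					uint64_t dynamic_mask)
{
	return xcr0 & ~xfd & dynamic_mask;
}

/*
 * Hypothetical check run from the WRMSR(IA32_XFD) and XSETBV emulation
 * paths: reallocation is needed only if an intended dynamic feature is
 * not yet covered by the currently allocated fpstate buffer.
 */
static bool fpstate_realloc_needed(uint64_t xcr0, uint64_t xfd,
				   uint64_t dynamic_mask, uint64_t allocated)
{
	return (guest_intended_features(xcr0, xfd, dynamic_mask) &
		~allocated) != 0;
}
```

Both emulation paths can share the same check because either write can
be the one that makes the condition true.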
- Disable WRMSR interception for IA32_XFD
IA32_XFD can be updated frequently by the guest, as it is part of
the task state and is swapped on context switch when prev and next
have different XFD settings. Always intercepting WRMSR can easily
cause non-negligible overhead.
Disable WRMSR interception for IA32_XFD after fpstate reallocation
succeeds. After that point the guest writes IA32_XFD directly without
causing VM-exits.
However MSR passthrough implies that guest_fpstate::xfd and per-cpu
xfd cache might be out of sync with the current IA32_XFD value set by
the guest. This suggests KVM needs to re-sync the software state
with IA32_XFD before the vCPU thread might be preempted or interrupted.
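To make the out-of-sync problem concrete, here is a small standalone
model. It is not kernel code: struct xfd_model and both functions are
hypothetical, and reading hw_msr stands in for an RDMSR of IA32_XFD.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the three places an XFD value lives. */
struct xfd_model {
	uint64_t hw_msr;	/* value in the IA32_XFD MSR itself */
	uint64_t fpstate_xfd;	/* guest_fpstate::xfd (software copy) */
	uint64_t percpu_cache;	/* per-cpu xfd cache (software copy) */
	bool passthrough;	/* WRMSR interception disabled? */
};

/* Guest executes WRMSR(IA32_XFD, val). */
static void guest_wrmsr_xfd(struct xfd_model *m, uint64_t val)
{
	m->hw_msr = val;
	if (!m->passthrough) {
		/* Intercepted: the emulation path updates software state. */
		m->fpstate_xfd = val;
		m->percpu_cache = val;
	}
	/* Passed through: software copies are now potentially stale. */
}

/*
 * Re-sync after VM-exit, before the vCPU thread can be preempted or
 * interrupted. Only needed once the MSR is passed through.
 */
static void xfd_sync_after_exit(struct xfd_model *m)
{
	if (!m->passthrough)
		return;
	m->fpstate_xfd = m->hw_msr;
	m->percpu_cache = m->hw_msr;
}
```

In the model, a passthrough write leaves fpstate_xfd and percpu_cache
at their old values until xfd_sync_after_exit() runs, which is the
window KVM has to close.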
- Save/restore guest XFD_ERR
When XFD causes an instruction to generate #NM, XFD_ERR contains
information about which disabled state components are being accessed.
The #NM handler is expected to check this information and then enable
the state components by clearing IA32_XFD for the faulting task (if
it has permission).
#NM can be triggered in both host and guest. It would be problematic
if the XFD_ERR value generated in the guest were consumed or clobbered
by the host before the guest itself reads it. This may lead to a
non-XFD-related #NM being treated as an XFD #NM in the host (due to the
guest XFD_ERR value), or an XFD-related #NM being treated as a non-XFD
#NM in the guest (XFD_ERR cleared by the host #NM handler).
KVM needs to save the guest XFD_ERR value before this register
might be accessed by the host and restore it before entering the
guest.
One open question remains in this area: when to start saving and
restoring guest XFD_ERR. Several options are discussed in patch 15.
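A standalone model of the save/restore ordering described above. This
is a sketch, not the code in patch 15: struct xfd_err_model and the
function names are hypothetical, and clearing the hardware register
right after saving is an assumption made here so that the host never
sees a guest value.

```c
#include <stdint.h>

/* Hypothetical model: the XFD_ERR MSR plus KVM's saved guest copy. */
struct xfd_err_model {
	uint64_t hw_xfd_err;	/* value in the XFD_ERR MSR */
	uint64_t guest_xfd_err;	/* guest value saved by KVM */
};

/* After VM-exit, before any host code can take an #NM. */
static void save_guest_xfd_err(struct xfd_err_model *m)
{
	m->guest_xfd_err = m->hw_xfd_err;
	/* Assumption: clear it so a host #NM is not misread as XFD-related. */
	m->hw_xfd_err = 0;
}

/* Stand-in for the host #NM handler, which consumes and clears XFD_ERR. */
static void host_nm_handler(struct xfd_err_model *m)
{
	m->hw_xfd_err = 0;
}

/* Before VM-entry: give the guest back the value it has not yet read. */
static void restore_guest_xfd_err(struct xfd_err_model *m)
{
	m->hw_xfd_err = m->guest_xfd_err;
}
```

With this ordering, a host #NM between exit and entry cannot destroy
the guest's pending XFD_ERR, and the host handler never observes it.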
- Expose related cpuid bits to guest
The last step is to allow exposing XFD, AMX_TILE, AMX_INT8 and
AMX_BF16 in guest cpuid. Adding those bits into kvm_cpu_caps finally
activates all the preceding logic in this series.
To verify AMX virtualization overhead on non-AMX usages, we ran the
Phoronix kernel build test in the guest with and without AMX in cpuid.
The result shows no observable difference between the two
configurations.
Live migration support is still being worked on. The userspace VMM
needs to use the new KVM_{G|S}ET_XSAVE2 ioctl in this series to migrate
state for dynamically-enabled XSAVE features.
Thanks to Thomas for the thoughts and patches on the KVM FPU and AMX
support. Thanks to Jun Nakajima for the design suggestions.
[1] git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git x86/fpu-kvm
[2] https://www.spinics.net/lists/kvm/msg259015.html
Thanks,
Yang
---
Jing Liu (13):
kvm: x86: Fix xstate_required_size() to follow XSTATE alignment rule
kvm: x86: Check guest xstate permissions when KVM_SET_CPUID2
x86/fpu: Move xfd initialization out of __fpstate_reset() to the
callers
kvm: x86: Propagate fpstate reallocation error to userspace
x86/fpu: Move xfd_update_state() to xstate.c and export symbol
kvm: x86: Prepare reallocation check
kvm: x86: Emulate WRMSR of guest IA32_XFD
kvm: x86: Disable WRMSR interception for IA32_XFD on demand
x86/fpu: Prepare for KVM XFD_ERR handling
kvm: x86: Introduce KVM_{G|S}ET_XSAVE2 ioctl
docs: virt: api.rst: Document the new KVM_{G, S}ET_XSAVE2 ioctls
kvm: x86: AMX XCR0 support for guest
kvm: x86: Add AMX CPUIDs support
Thomas Gleixner (4):
x86/fpu: Extend prctl() with guest permissions
x86/fpu: Prepare KVM for dynamically enabled states
x86/fpu: Add reallocation mechanims for KVM
x86/fpu: Prepare KVM for bringing XFD state back in-sync
Yang Zhong (2):
kvm: x86: Check fpstate reallocation in XSETBV emulation
kvm: x86: Save and restore guest XFD_ERR properly
Documentation/virt/kvm/api.rst | 47 +++++++
arch/x86/include/asm/cpufeatures.h | 2 +
arch/x86/include/asm/fpu/api.h | 12 ++
arch/x86/include/asm/fpu/types.h | 56 +++++++++
arch/x86/include/asm/fpu/xstate.h | 2 +
arch/x86/include/asm/kvm-x86-ops.h | 1 +
arch/x86/include/asm/kvm_host.h | 2 +
arch/x86/include/uapi/asm/kvm.h | 6 +
arch/x86/include/uapi/asm/prctl.h | 26 ++--
arch/x86/kernel/fpu/core.c | 109 ++++++++++++++++-
arch/x86/kernel/fpu/xstate.c | 119 +++++++++++++++---
arch/x86/kernel/fpu/xstate.h | 29 +++--
arch/x86/kernel/process.c | 2 +
arch/x86/kvm/cpuid.c | 36 +++++-
arch/x86/kvm/vmx/vmx.c | 20 +++
arch/x86/kvm/vmx/vmx.h | 2 +-
arch/x86/kvm/x86.c | 189 ++++++++++++++++++++++++++++-
arch/x86/kvm/x86.h | 2 +
include/uapi/linux/kvm.h | 8 +-
19 files changed, 607 insertions(+), 63 deletions(-)
Thread overview (80+ messages):
2021-12-08 0:03 Yang Zhong [this message]
2021-12-08 0:03 ` [PATCH 01/19] x86/fpu: Extend prctl() with guest permissions Yang Zhong
2021-12-14 0:16 ` Thomas Gleixner
2021-12-08 0:03 ` [PATCH 02/19] x86/fpu: Prepare KVM for dynamically enabled states Yang Zhong
2021-12-13 9:12 ` Paolo Bonzini
2021-12-13 12:00 ` Thomas Gleixner
2021-12-13 12:45 ` Paolo Bonzini
2021-12-13 19:50 ` Thomas Gleixner
2021-12-08 0:03 ` [PATCH 03/19] kvm: x86: Fix xstate_required_size() to follow XSTATE alignment rule Yang Zhong
2021-12-08 0:03 ` [PATCH 04/19] kvm: x86: Check guest xstate permissions when KVM_SET_CPUID2 Yang Zhong
2021-12-08 0:03 ` [PATCH 05/19] x86/fpu: Move xfd initialization out of __fpstate_reset() to the callers Yang Zhong
2021-12-10 22:33 ` Thomas Gleixner
2021-12-08 0:03 ` [PATCH 06/19] x86/fpu: Add reallocation mechanims for KVM Yang Zhong
2021-12-08 0:03 ` [PATCH 07/19] kvm: x86: Propagate fpstate reallocation error to userspace Yang Zhong
2021-12-10 15:44 ` Paolo Bonzini
2021-12-08 0:03 ` [PATCH 08/19] x86/fpu: Move xfd_update_state() to xstate.c and export symbol Yang Zhong
2021-12-10 22:44 ` Thomas Gleixner
2021-12-08 0:03 ` [PATCH 09/19] kvm: x86: Prepare reallocation check Yang Zhong
2021-12-13 9:16 ` Paolo Bonzini
2021-12-14 7:06 ` Tian, Kevin
2021-12-14 10:16 ` Paolo Bonzini
2021-12-14 14:41 ` Liu, Jing2
2021-12-15 7:09 ` Tian, Kevin
2021-12-08 0:03 ` [PATCH 10/19] kvm: x86: Emulate WRMSR of guest IA32_XFD Yang Zhong
2021-12-10 16:02 ` Paolo Bonzini
2021-12-13 7:51 ` Liu, Jing2
2021-12-13 9:01 ` Paolo Bonzini
2021-12-14 10:26 ` Yang Zhong
2021-12-14 11:24 ` Paolo Bonzini
2021-12-10 23:09 ` Thomas Gleixner
2021-12-13 15:06 ` Paolo Bonzini
2021-12-13 19:45 ` Thomas Gleixner
2021-12-13 21:23 ` Thomas Gleixner
2021-12-14 7:16 ` Tian, Kevin
2021-12-08 0:03 ` [PATCH 11/19] kvm: x86: Check fpstate reallocation in XSETBV emulation Yang Zhong
2021-12-08 0:03 ` [PATCH 12/19] x86/fpu: Prepare KVM for bringing XFD state back in-sync Yang Zhong
2021-12-10 23:11 ` Thomas Gleixner
2021-12-08 0:03 ` [PATCH 13/19] kvm: x86: Disable WRMSR interception for IA32_XFD on demand Yang Zhong
2021-12-08 7:23 ` Liu, Jing2
2021-12-08 0:03 ` [PATCH 14/19] x86/fpu: Prepare for KVM XFD_ERR handling Yang Zhong
2021-12-10 16:16 ` Paolo Bonzini
2021-12-10 23:20 ` Thomas Gleixner
2021-12-08 0:03 ` [PATCH 15/19] kvm: x86: Save and restore guest XFD_ERR properly Yang Zhong
2021-12-10 16:23 ` Paolo Bonzini
2021-12-10 22:01 ` Paolo Bonzini
2021-12-12 13:10 ` Yang Zhong
2021-12-11 0:10 ` Thomas Gleixner
2021-12-11 1:31 ` Paolo Bonzini
2021-12-11 3:23 ` Tian, Kevin
2021-12-11 13:10 ` Thomas Gleixner
2021-12-11 3:07 ` Tian, Kevin
2021-12-11 13:29 ` Thomas Gleixner
2021-12-12 1:50 ` Tian, Kevin
2021-12-12 9:10 ` Paolo Bonzini
2021-12-08 0:03 ` [PATCH 16/19] kvm: x86: Introduce KVM_{G|S}ET_XSAVE2 ioctl Yang Zhong
2021-12-10 16:25 ` Paolo Bonzini
2021-12-10 16:30 ` Paolo Bonzini
2021-12-10 22:13 ` Paolo Bonzini
2021-12-13 8:23 ` Wang, Wei W
2021-12-13 9:24 ` Paolo Bonzini
2021-12-14 6:06 ` Wang, Wei W
2021-12-14 6:18 ` Paolo Bonzini
2021-12-15 2:39 ` Wang, Wei W
2021-12-15 13:42 ` Paolo Bonzini
2021-12-16 8:25 ` Wang, Wei W
2021-12-16 10:28 ` Paolo Bonzini
2021-12-20 17:54 ` State Component 18 and Palette 1 (Re: [PATCH 16/19] kvm: x86: Introduce KVM_{G|S}ET_XSAVE2 ioctl) Nakajima, Jun
2021-12-22 14:44 ` Paolo Bonzini
2021-12-22 23:47 ` Nakajima, Jun
2021-12-22 14:52 ` Dave Hansen
2021-12-22 23:51 ` Nakajima, Jun
2021-12-13 10:10 ` [PATCH 16/19] kvm: x86: Introduce KVM_{G|S}ET_XSAVE2 ioctl Thomas Gleixner
2021-12-13 10:43 ` Paolo Bonzini
2021-12-13 12:40 ` Thomas Gleixner
2021-12-08 0:03 ` [PATCH 17/19] docs: virt: api.rst: Document the new KVM_{G, S}ET_XSAVE2 ioctls Yang Zhong
2021-12-08 0:03 ` [PATCH 18/19] kvm: x86: AMX XCR0 support for guest Yang Zhong
2021-12-10 16:30 ` Paolo Bonzini
2021-12-08 0:03 ` [PATCH 19/19] kvm: x86: Add AMX CPUIDs support Yang Zhong
2021-12-10 21:52 ` Paolo Bonzini
2021-12-11 21:20 ` [PATCH 00/19] AMX Support in KVM Thomas Gleixner