linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Yang Zhong <yang.zhong@intel.com>
To: x86@kernel.org, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, tglx@linutronix.de,
	mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
	pbonzini@redhat.com
Cc: seanjc@google.com, jun.nakajima@intel.com, kevin.tian@intel.com,
	jing2.liu@linux.intel.com, jing2.liu@intel.com,
	yang.zhong@intel.com
Subject: [PATCH 00/19] AMX Support in KVM
Date: Tue,  7 Dec 2021 19:03:40 -0500	[thread overview]
Message-ID: <20211208000359.2853257-1-yang.zhong@intel.com> (raw)

(send on behalf of Jing who is currently on leave)

This series brings AMX (Advanced Matrix eXtensions) virtualization
support to KVM. The three preparation patches in fpu core from 
Thomas [1] are also included. 

A large portion of the changes in this series is to deal with eXtended
Feature Disable (XFD) which allows resizing of the fpstate buffer to 
support dynamically-enabled XSTATE features with large state component
(e.g. 8K for AMX).

The support is based on several key changes (design discussions can be
found in [2]):

  - Guest permissions for dynamically-enabled XSAVE features

    Native tasks have to request permission via prctl() before touching
    a dynamic-resized XSTATE compoenent. Introduce guest permissions 
    for the similar purpose. Userspace VMM is expected to request guest
    permission only once when the first vCPU is created.

    KVM checks guest permission in KVM_SET_CPUID2. Setting XFD in guest
    cpuid w/o proper permissions fails this operation.

  - Extend fpstate reallocation mechanism to cover guest fpu

    Unlike native tasks which have reallocation triggered from #NM 
    handler, guest fpstate reallocation is requested by KVM when it
    detects the guest intention to use dynamically-enabled XSAVE
    features.

    The reallocation request is handled when exiting to userspace
    VMM. This implies that KVM must break vcpu_run() loop and exit
    to userspace VMM instead of immediately resuming back to the guest
    when reallocation is required.

  - Detect fpstate reallocation in the emulation code

    Because guest #NM is not trapped in KVM (costly), the guest 
    intention of using a dynamically-enabled XSAVE feature[i] can be
    indirectly represented by guest XCR0[i]=1 and XFD[i]=0. This 
    requires the emulation logic of both WRMSR(IA32_XFD) and XSETBV 
    to check reallocation requirement when one of the two conditions
    is changed.

  - Disable WRMSR interception for IA32_XFD

    IA32_XFD can be frequently updated by the guest, as it is part of
    the task state and swapped in context switch when prev and next have
    different XFD setting. Always intercepting WRMSR can easily cause
    non-negligible overhead.

    Disable WRMSR interception for IA32_XFD after fpstate reallocation
    succeeds. After that point the guest direct writes IA32_XFD without
    causing VM-exits.

    However MSR passthrough implies that guest_fpstate::xfd and per-cpu
    xfd cache might be out of sync with the current IA32_XFD value set by
    the guest. This suggests KVM needs to re-sync the software state
    with IA32_XFD before the vCPU thread might be preempted or interrupted.

  - Save/restore guest XFD_ERR

    When XFD causes an instruction to generate #NM, XFD_ERR contains
    information about which disabled state components are being accessed.
    The #NM handler is expected to check this information and then enable
    the state components by clearing IA32_XFD for the faulting task (if 
    having permission).

    #NM can be triggered in both host and guest. It'd be problematic if
    the XFD_ERR value generated in guest is consumed/clobbered by the 
    host before the guest itself doing so. This may lead to non-XFD-
    related #NM treated as XFD #NM in host (due to guest XFD_ERR value),
    or XFD-related #NM treated as non-XFD #NM in guest (XFD_ERR cleared 
    by the host #NM handler).

    KVM needs to save the guest XFD_ERR value before this register
    might be accessed by the host and restore it before entering the 
    guest.

    One open remains in this area about when to start saving/restoring
    guest XFD_ERR. Several options are discussed in patch 15.

  - Expose related cpuid bits to guest

    The last step is to allow exposing XFD, AMX_TILE, AMX_INT8 and
    AMX_BF16 in guest cpuid. Adding those bits into kvm_cpu_caps finally
    activates all previous logics in this series

To verify AMX virtualization overhead on non-AMX usages, we run the
Phoronix kernel build test in the guest w/ and w/o AMX in cpuid. The 
result shows no observable difference between two configurations.

Live migration support is still being worked on. Userspace VMM needs
to use the new KVM_{G|S}SET_XSAVE2 ioctl in this series to migrate state
for dynamically-enabled XSAVE features.

Thanks Thomas for the thoughts and patches on the KVM FPU and AMX
support. Thanks Jun Nakajima for the design suggestions.

[1] git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git x86/fpu-kvm
[2] https://www.spinics.net/lists/kvm/msg259015.html

Thanks,
Yang

---
Jing Liu (13):
  kvm: x86: Fix xstate_required_size() to follow XSTATE alignment rule
  kvm: x86: Check guest xstate permissions when KVM_SET_CPUID2
  x86/fpu: Move xfd initialization out of __fpstate_reset() to the
    callers
  kvm: x86: Propagate fpstate reallocation error to userspace
  x86/fpu: Move xfd_update_state() to xstate.c and export symbol
  kvm: x86: Prepare reallocation check
  kvm: x86: Emulate WRMSR of guest IA32_XFD
  kvm: x86: Disable WRMSR interception for IA32_XFD on demand
  x86/fpu: Prepare for KVM XFD_ERR handling
  kvm: x86: Introduce KVM_{G|S}ET_XSAVE2 ioctl
  docs: virt: api.rst: Document the new KVM_{G, S}ET_XSAVE2 ioctls
  kvm: x86: AMX XCR0 support for guest
  kvm: x86: Add AMX CPUIDs support

Thomas Gleixner (4):
  x86/fpu: Extend prctl() with guest permissions
  x86/fpu: Prepare KVM for dynamically enabled states
  x86/fpu: Add reallocation mechanims for KVM
  x86/fpu: Prepare KVM for bringing XFD state back in-sync

Yang Zhong (2):
  kvm: x86: Check fpstate reallocation in XSETBV emulation
  kvm: x86: Save and restore guest XFD_ERR properly

 Documentation/virt/kvm/api.rst     |  47 +++++++
 arch/x86/include/asm/cpufeatures.h |   2 +
 arch/x86/include/asm/fpu/api.h     |  12 ++
 arch/x86/include/asm/fpu/types.h   |  56 +++++++++
 arch/x86/include/asm/fpu/xstate.h  |   2 +
 arch/x86/include/asm/kvm-x86-ops.h |   1 +
 arch/x86/include/asm/kvm_host.h    |   2 +
 arch/x86/include/uapi/asm/kvm.h    |   6 +
 arch/x86/include/uapi/asm/prctl.h  |  26 ++--
 arch/x86/kernel/fpu/core.c         | 109 ++++++++++++++++-
 arch/x86/kernel/fpu/xstate.c       | 119 +++++++++++++++---
 arch/x86/kernel/fpu/xstate.h       |  29 +++--
 arch/x86/kernel/process.c          |   2 +
 arch/x86/kvm/cpuid.c               |  36 +++++-
 arch/x86/kvm/vmx/vmx.c             |  20 +++
 arch/x86/kvm/vmx/vmx.h             |   2 +-
 arch/x86/kvm/x86.c                 | 189 ++++++++++++++++++++++++++++-
 arch/x86/kvm/x86.h                 |   2 +
 include/uapi/linux/kvm.h           |   8 +-
 19 files changed, 607 insertions(+), 63 deletions(-)


             reply	other threads:[~2021-12-07 15:09 UTC|newest]

Thread overview: 80+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-08  0:03 Yang Zhong [this message]
2021-12-08  0:03 ` [PATCH 01/19] x86/fpu: Extend prctl() with guest permissions Yang Zhong
2021-12-14  0:16   ` Thomas Gleixner
2021-12-08  0:03 ` [PATCH 02/19] x86/fpu: Prepare KVM for dynamically enabled states Yang Zhong
2021-12-13  9:12   ` Paolo Bonzini
2021-12-13 12:00     ` Thomas Gleixner
2021-12-13 12:45       ` Paolo Bonzini
2021-12-13 19:50         ` Thomas Gleixner
2021-12-08  0:03 ` [PATCH 03/19] kvm: x86: Fix xstate_required_size() to follow XSTATE alignment rule Yang Zhong
2021-12-08  0:03 ` [PATCH 04/19] kvm: x86: Check guest xstate permissions when KVM_SET_CPUID2 Yang Zhong
2021-12-08  0:03 ` [PATCH 05/19] x86/fpu: Move xfd initialization out of __fpstate_reset() to the callers Yang Zhong
2021-12-10 22:33   ` Thomas Gleixner
2021-12-08  0:03 ` [PATCH 06/19] x86/fpu: Add reallocation mechanims for KVM Yang Zhong
2021-12-08  0:03 ` [PATCH 07/19] kvm: x86: Propagate fpstate reallocation error to userspace Yang Zhong
2021-12-10 15:44   ` Paolo Bonzini
2021-12-08  0:03 ` [PATCH 08/19] x86/fpu: Move xfd_update_state() to xstate.c and export symbol Yang Zhong
2021-12-10 22:44   ` Thomas Gleixner
2021-12-08  0:03 ` [PATCH 09/19] kvm: x86: Prepare reallocation check Yang Zhong
2021-12-13  9:16   ` Paolo Bonzini
2021-12-14  7:06     ` Tian, Kevin
2021-12-14 10:16       ` Paolo Bonzini
2021-12-14 14:41         ` Liu, Jing2
2021-12-15  7:09           ` Tian, Kevin
2021-12-08  0:03 ` [PATCH 10/19] kvm: x86: Emulate WRMSR of guest IA32_XFD Yang Zhong
2021-12-10 16:02   ` Paolo Bonzini
2021-12-13  7:51     ` Liu, Jing2
2021-12-13  9:01       ` Paolo Bonzini
2021-12-14 10:26     ` Yang Zhong
2021-12-14 11:24       ` Paolo Bonzini
2021-12-10 23:09   ` Thomas Gleixner
2021-12-13 15:06   ` Paolo Bonzini
2021-12-13 19:45     ` Thomas Gleixner
2021-12-13 21:23       ` Thomas Gleixner
2021-12-14  7:16         ` Tian, Kevin
2021-12-08  0:03 ` [PATCH 11/19] kvm: x86: Check fpstate reallocation in XSETBV emulation Yang Zhong
2021-12-08  0:03 ` [PATCH 12/19] x86/fpu: Prepare KVM for bringing XFD state back in-sync Yang Zhong
2021-12-10 23:11   ` Thomas Gleixner
2021-12-08  0:03 ` [PATCH 13/19] kvm: x86: Disable WRMSR interception for IA32_XFD on demand Yang Zhong
2021-12-08  7:23   ` Liu, Jing2
2021-12-08  0:03 ` [PATCH 14/19] x86/fpu: Prepare for KVM XFD_ERR handling Yang Zhong
2021-12-10 16:16   ` Paolo Bonzini
2021-12-10 23:20   ` Thomas Gleixner
2021-12-08  0:03 ` [PATCH 15/19] kvm: x86: Save and restore guest XFD_ERR properly Yang Zhong
2021-12-10 16:23   ` Paolo Bonzini
2021-12-10 22:01   ` Paolo Bonzini
2021-12-12 13:10     ` Yang Zhong
2021-12-11  0:10   ` Thomas Gleixner
2021-12-11  1:31     ` Paolo Bonzini
2021-12-11  3:23       ` Tian, Kevin
2021-12-11 13:10       ` Thomas Gleixner
2021-12-11  3:07     ` Tian, Kevin
2021-12-11 13:29       ` Thomas Gleixner
2021-12-12  1:50         ` Tian, Kevin
2021-12-12  9:10           ` Paolo Bonzini
2021-12-08  0:03 ` [PATCH 16/19] kvm: x86: Introduce KVM_{G|S}ET_XSAVE2 ioctl Yang Zhong
2021-12-10 16:25   ` Paolo Bonzini
2021-12-10 16:30   ` Paolo Bonzini
2021-12-10 22:13     ` Paolo Bonzini
2021-12-13  8:23       ` Wang, Wei W
2021-12-13  9:24         ` Paolo Bonzini
2021-12-14  6:06           ` Wang, Wei W
2021-12-14  6:18             ` Paolo Bonzini
2021-12-15  2:39               ` Wang, Wei W
2021-12-15 13:42                 ` Paolo Bonzini
2021-12-16  8:25                   ` Wang, Wei W
2021-12-16 10:28                     ` Paolo Bonzini
2021-12-20 17:54       ` State Component 18 and Palette 1 (Re: [PATCH 16/19] kvm: x86: Introduce KVM_{G|S}ET_XSAVE2 ioctl) Nakajima, Jun
2021-12-22 14:44         ` Paolo Bonzini
2021-12-22 23:47           ` Nakajima, Jun
2021-12-22 14:52         ` Dave Hansen
2021-12-22 23:51           ` Nakajima, Jun
2021-12-13 10:10     ` [PATCH 16/19] kvm: x86: Introduce KVM_{G|S}ET_XSAVE2 ioctl Thomas Gleixner
2021-12-13 10:43       ` Paolo Bonzini
2021-12-13 12:40         ` Thomas Gleixner
2021-12-08  0:03 ` [PATCH 17/19] docs: virt: api.rst: Document the new KVM_{G, S}ET_XSAVE2 ioctls Yang Zhong
2021-12-08  0:03 ` [PATCH 18/19] kvm: x86: AMX XCR0 support for guest Yang Zhong
2021-12-10 16:30   ` Paolo Bonzini
2021-12-08  0:03 ` [PATCH 19/19] kvm: x86: Add AMX CPUIDs support Yang Zhong
2021-12-10 21:52   ` Paolo Bonzini
2021-12-11 21:20 ` [PATCH 00/19] AMX Support in KVM Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211208000359.2853257-1-yang.zhong@intel.com \
    --to=yang.zhong@intel.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=jing2.liu@intel.com \
    --cc=jing2.liu@linux.intel.com \
    --cc=jun.nakajima@intel.com \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).