From: Jing Liu <jing2.liu@intel.com>
To: x86@kernel.org, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-kselftest@vger.kernel.org, tglx@linutronix.de,
	mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
	pbonzini@redhat.com, corbet@lwn.net, shuah@kernel.org
Cc: seanjc@google.com, jun.nakajima@intel.com, kevin.tian@intel.com,
	jing2.liu@linux.intel.com, jing2.liu@intel.com,
	guang.zeng@intel.com, wei.w.wang@intel.com, yang.zhong@intel.com
Subject: [PATCH v3 19/22] kvm: x86: Get/set expanded xstate buffer
Date: Wed, 22 Dec 2021 04:40:49 -0800
Message-ID: <20211222124052.644626-20-jing2.liu@intel.com>
In-Reply-To: <20211222124052.644626-1-jing2.liu@intel.com>

From: Guang Zeng <guang.zeng@intel.com>

When AMX is enabled, the guest requires an xstate buffer larger
than the legacy hardcoded 4KB one. The existing KVM ioctls
(KVM_[G|S]ET_XSAVE under KVM_CAP_XSAVE) copy a fixed 4KB and are
therefore not suitable for this purpose.

A new capability (KVM_CAP_XSAVE2) is introduced to mark an
extended kvm_xsave format to support >4KB fpstate. The expanded
fpstate size is returned to userspace when it checks the
KVM_CAP_XSAVE2 capability.

Introduce a new KVM_GET_XSAVE2 ioctl under this capability that uses
the new format to copy guest fpstate to userspace. Reuse
KVM_SET_XSAVE for both the old and new formats by reimplementing it
to do a properly sized memdup_user() based on the guest fpu container.

Also, update the api doc with the new KVM_GET_XSAVE2 ioctl.

Signed-off-by: Guang Zeng <guang.zeng@intel.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Jing Liu <jing2.liu@intel.com>
---
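Note for readers (not part of the commit message): the sizing contract
described above can be sketched from the userspace side. This is only a
minimal illustration. It assumes the integer passed in is whatever
KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2) returned on the vm file descriptor.
The helper names xsave2_buf_size() and xsave2_buf_alloc() are
hypothetical, and the ioctl calls themselves are left out so the snippet
builds without KVM headers.

```c
#include <stddef.h>
#include <stdlib.h>

/* Size of the legacy struct kvm_xsave: __u32 region[1024]. */
#define LEGACY_XSAVE_SIZE 4096

/*
 * Hypothetical helper: turn the return value of
 * KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2) into a buffer size.
 * A return of 0 (capability absent) or an error (< 0) falls back to
 * the legacy 4KB layout used by KVM_GET_XSAVE.
 */
static size_t xsave2_buf_size(int cap_ret)
{
	if (cap_ret < LEGACY_XSAVE_SIZE)
		return LEGACY_XSAVE_SIZE;
	return (size_t)cap_ret;
}

/* Zeroed buffer large enough for KVM_GET_XSAVE2 / KVM_SET_XSAVE. */
static void *xsave2_buf_alloc(int cap_ret)
{
	return calloc(1, xsave2_buf_size(cap_ret));
}
```

A caller would hand the same buffer to KVM_GET_XSAVE2 and later to
KVM_SET_XSAVE, since both transfer the number of bytes reported by the
capability check.
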
 Documentation/virt/kvm/api.rst  | 42 +++++++++++++++++++++++++++++++--
 arch/x86/include/uapi/asm/kvm.h | 16 ++++++++++++-
 arch/x86/kvm/x86.c              | 39 +++++++++++++++++++++++++++++-
 include/uapi/linux/kvm.h        |  4 ++++
 4 files changed, 97 insertions(+), 4 deletions(-)
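
The documentation added by this patch says that the offsets of the state
save areas follow CPUID leaf 0xD on the host. A bounds-checked lookup
might look like the sketch below. CPUID.(EAX=0xD, ECX=i) reports the size
of state component i in EAX and its offset in the non-compacted XSAVE
layout in EBX. The struct and function names are hypothetical, and the
caller is assumed to have already executed CPUID on the host.

```c
#include <stddef.h>
#include <stdint.h>

/* Values reported by CPUID.(EAX=0xD, ECX=i) for state component i. */
struct xstate_component {
	uint32_t size;   /* EAX: size of the component in bytes */
	uint32_t offset; /* EBX: offset in the non-compacted XSAVE layout */
};

/*
 * Return a pointer to the component inside a buffer filled by
 * KVM_GET_XSAVE2, or NULL if the component is not enumerated
 * (size 0) or does not fit within buf_size bytes.
 */
static void *xstate_component_ptr(void *buf, size_t buf_size,
				  struct xstate_component c)
{
	if (c.size == 0 || c.offset > buf_size ||
	    c.size > buf_size - c.offset)
		return NULL;
	return (uint8_t *)buf + c.offset;
}
```

The check matters when a component lies beyond the first 4096 bytes: a
buffer sized for the legacy format yields NULL instead of an
out-of-bounds pointer.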

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 1cf2483246cd..e48f7de5f23a 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -1566,6 +1566,7 @@ otherwise it will return EBUSY error.
 
   struct kvm_xsave {
 	__u32 region[1024];
+	__u32 extra[0];
   };
 
 This ioctl would copy current vcpu's xsave struct to the userspace.
@@ -1574,7 +1575,7 @@ This ioctl would copy current vcpu's xsave struct to the userspace.
 4.43 KVM_SET_XSAVE
 ------------------
 
-:Capability: KVM_CAP_XSAVE
+:Capability: KVM_CAP_XSAVE and KVM_CAP_XSAVE2
 :Architectures: x86
 :Type: vcpu ioctl
 :Parameters: struct kvm_xsave (in)
@@ -1585,9 +1586,18 @@ This ioctl would copy current vcpu's xsave struct to the userspace.
 
   struct kvm_xsave {
 	__u32 region[1024];
+	__u32 extra[0];
   };
 
-This ioctl would copy userspace's xsave struct to the kernel.
+This ioctl would copy userspace's xsave struct to the kernel. It copies
+as many bytes as are returned by KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2),
+when invoked on the vm file descriptor. The size value returned by
+KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2) will always be at least 4096.
+Currently, it is only greater than 4096 if a dynamic feature has been
+enabled with ``arch_prctl()``, but this may change in the future.
+
+The offsets of the state save areas in struct kvm_xsave follow the
+contents of CPUID leaf 0xD on the host.
 
 
 4.44 KVM_GET_XCRS
@@ -5507,6 +5517,34 @@ the trailing ``'\0'``, is indicated by ``name_size`` in the header.
 The Stats Data block contains an array of 64-bit values in the same order
 as the descriptors in Descriptors block.
 
+4.42 KVM_GET_XSAVE2
+-------------------
+
+:Capability: KVM_CAP_XSAVE2
+:Architectures: x86
+:Type: vcpu ioctl
+:Parameters: struct kvm_xsave (out)
+:Returns: 0 on success, -1 on error
+
+
+::
+
+  struct kvm_xsave {
+	__u32 region[1024];
+	__u32 extra[0];
+  };
+
+This ioctl copies the current vcpu's xsave struct to userspace. It
+copies as many bytes as are returned by KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2)
+when invoked on the vm file descriptor. The size value returned by
+KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2) will always be at least 4096.
+Currently, it is only greater than 4096 if a dynamic feature has been
+enabled with ``arch_prctl()``, but this may change in the future.
+
+The offsets of the state save areas in struct kvm_xsave follow the contents
+of CPUID leaf 0xD on the host.
+
+
 5. The kvm_run structure
 ========================
 
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 5a776a08f78c..2da3316bb559 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -373,9 +373,23 @@ struct kvm_debugregs {
 	__u64 reserved[9];
 };
 
-/* for KVM_CAP_XSAVE */
+/* for KVM_CAP_XSAVE and KVM_CAP_XSAVE2 */
 struct kvm_xsave {
+	/*
+	 * KVM_GET_XSAVE2 and KVM_SET_XSAVE write and read as many bytes
+	 * as are returned by KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2)
+	 * respectively, when invoked on the vm file descriptor.
+	 *
+	 * The size value returned by KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2)
+	 * will always be at least 4096. Currently, it is only greater
+	 * than 4096 if a dynamic feature has been enabled with
+	 * ``arch_prctl()``, but this may change in the future.
+	 *
+	 * The offsets of the state save areas in struct kvm_xsave follow
+	 * the contents of CPUID leaf 0xD on the host.
+	 */
 	__u32 region[1024];
+	__u32 extra[0];
 };
 
 #define KVM_MAX_XCRS	16
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c558c098979a..3b756ff13103 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4297,6 +4297,11 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 		else
 			r = 0;
 		break;
+	case KVM_CAP_XSAVE2:
+		r = kvm->vcpus[0]->arch.guest_fpu.uabi_size;
+		if (r < sizeof(struct kvm_xsave))
+			r = sizeof(struct kvm_xsave);
+		break;
 	default:
 		break;
 	}
@@ -4900,6 +4905,16 @@ static void kvm_vcpu_ioctl_x86_get_xsave(struct kvm_vcpu *vcpu,
 				       vcpu->arch.pkru);
 }
 
+static void kvm_vcpu_ioctl_x86_get_xsave2(struct kvm_vcpu *vcpu,
+					  u8 *state, unsigned int size)
+{
+	if (fpstate_is_confidential(&vcpu->arch.guest_fpu))
+		return;
+
+	fpu_copy_guest_fpstate_to_uabi(&vcpu->arch.guest_fpu,
+				       state, size, vcpu->arch.pkru);
+}
+
 static int kvm_vcpu_ioctl_x86_set_xsave(struct kvm_vcpu *vcpu,
 					struct kvm_xsave *guest_xsave)
 {
@@ -5367,7 +5382,9 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		break;
 	}
 	case KVM_SET_XSAVE: {
-		u.xsave = memdup_user(argp, sizeof(*u.xsave));
+		int size = vcpu->arch.guest_fpu.uabi_size;
+
+		u.xsave = memdup_user(argp, size);
 		if (IS_ERR(u.xsave)) {
 			r = PTR_ERR(u.xsave);
 			goto out_nofree;
@@ -5376,6 +5393,26 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		r = kvm_vcpu_ioctl_x86_set_xsave(vcpu, u.xsave);
 		break;
 	}
+
+	case KVM_GET_XSAVE2: {
+		int size = vcpu->arch.guest_fpu.uabi_size;
+
+		u.xsave = kzalloc(size, GFP_KERNEL_ACCOUNT);
+		if (!u.xsave) {
+			r = -ENOMEM;
+			break;
+		}
+
+		kvm_vcpu_ioctl_x86_get_xsave2(vcpu, u.buffer, size);
+
+		if (copy_to_user(argp, u.xsave, size)) {
+			r = -EFAULT;
+			break;
+		}
+		r = 0;
+		break;
+	}
+
 	case KVM_GET_XCRS: {
 		u.xcrs = kzalloc(sizeof(struct kvm_xcrs), GFP_KERNEL_ACCOUNT);
 		r = -ENOMEM;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 1daa45268de2..9d1c01669560 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1131,6 +1131,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_EXIT_ON_EMULATION_FAILURE 204
 #define KVM_CAP_ARM_MTE 205
 #define KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM 206
+#define KVM_CAP_XSAVE2 207
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -1610,6 +1611,9 @@ struct kvm_enc_region {
 #define KVM_S390_NORMAL_RESET	_IO(KVMIO,   0xc3)
 #define KVM_S390_CLEAR_RESET	_IO(KVMIO,   0xc4)
 
+/* Available with KVM_CAP_XSAVE2 */
+#define KVM_GET_XSAVE2		  _IOR(KVMIO,  0xcf, struct kvm_xsave)
+
 struct kvm_s390_pv_sec_parm {
 	__u64 origin;
 	__u64 length;
-- 
2.27.0
