linux-kselftest.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Yang Zhong <yang.zhong@intel.com>
To: x86@kernel.org, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-kselftest@vger.kernel.org, tglx@linutronix.de,
	mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
	pbonzini@redhat.com, corbet@lwn.net, shuah@kernel.org
Cc: seanjc@google.com, jun.nakajima@intel.com, kevin.tian@intel.com,
	jing2.liu@linux.intel.com, jing2.liu@intel.com,
	guang.zeng@intel.com, wei.w.wang@intel.com, yang.zhong@intel.com
Subject: [PATCH v4 18/21] kvm: x86: Add support for getting/setting expanded xstate buffer
Date: Wed, 29 Dec 2021 05:13:25 -0800	[thread overview]
Message-ID: <20211229131328.12283-19-yang.zhong@intel.com> (raw)
In-Reply-To: <20211229131328.12283-1-yang.zhong@intel.com>

From: Guang Zeng <guang.zeng@intel.com>

With KVM_CAP_XSAVE, userspace uses a hardcoded 4KB buffer to get/set
xstate data from/to KVM. This doesn't work when dynamic xfeatures
(e.g. AMX) are exposed to the guest as they require a larger buffer
size.

Introduce a new capability (KVM_CAP_XSAVE2). Userspace VMM gets the
required xstate buffer size via KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2).
KVM_SET_XSAVE is extended to work with both legacy and new capabilities
by doing properly-sized memdup_user() based on the guest fpu container.
KVM_GET_XSAVE is kept for backward-compatible reason. Instead,
KVM_GET_XSAVE2 is introduced under KVM_CAP_XSAVE2 as the preferred
interface for getting xstate buffer (4KB or larger size) from KVM
(Link: https://lkml.org/lkml/2021/12/15/510)

Also, update the api doc with the new KVM_GET_XSAVE2 ioctl.

Signed-off-by: Guang Zeng <guang.zeng@intel.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Jing Liu <jing2.liu@intel.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
---
 Documentation/virt/kvm/api.rst  | 42 ++++++++++++++++++++++++++++++--
 arch/x86/include/uapi/asm/kvm.h | 16 +++++++++++-
 arch/x86/kvm/x86.c              | 43 ++++++++++++++++++++++++++++++++-
 include/uapi/linux/kvm.h        |  4 +++
 4 files changed, 101 insertions(+), 4 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 1cf2483246cd..e48f7de5f23a 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -1566,6 +1566,7 @@ otherwise it will return EBUSY error.
 
   struct kvm_xsave {
 	__u32 region[1024];
+	__u32 extra[0];
   };
 
 This ioctl would copy current vcpu's xsave struct to the userspace.
@@ -1574,7 +1575,7 @@ This ioctl would copy current vcpu's xsave struct to the userspace.
 4.43 KVM_SET_XSAVE
 ------------------
 
-:Capability: KVM_CAP_XSAVE
+:Capability: KVM_CAP_XSAVE and KVM_CAP_XSAVE2
 :Architectures: x86
 :Type: vcpu ioctl
 :Parameters: struct kvm_xsave (in)
@@ -1585,9 +1586,18 @@ This ioctl would copy current vcpu's xsave struct to the userspace.
 
   struct kvm_xsave {
 	__u32 region[1024];
+	__u32 extra[0];
   };
 
-This ioctl would copy userspace's xsave struct to the kernel.
+This ioctl would copy userspace's xsave struct to the kernel. It copies
+as many bytes as are returned by KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2),
+when invoked on the vm file descriptor. The size value returned by
+KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2) will always be at least 4096.
+Currently, it is only greater than 4096 if a dynamic feature has been
+enabled with ``arch_prctl()``, but this may change in the future.
+
+The offsets of the state save areas in struct kvm_xsave follow the
+contents of CPUID leaf 0xD on the host.
 
 
 4.44 KVM_GET_XCRS
@@ -5507,6 +5517,34 @@ the trailing ``'\0'``, is indicated by ``name_size`` in the header.
 The Stats Data block contains an array of 64-bit values in the same order
 as the descriptors in Descriptors block.
 
+4.42 KVM_GET_XSAVE2
+------------------
+
+:Capability: KVM_CAP_XSAVE2
+:Architectures: x86
+:Type: vcpu ioctl
+:Parameters: struct kvm_xsave (out)
+:Returns: 0 on success, -1 on error
+
+
+::
+
+  struct kvm_xsave {
+	__u32 region[1024];
+	__u32 extra[0];
+  };
+
+This ioctl would copy current vcpu's xsave struct to the userspace. It
+copies as many bytes as are returned by KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2)
+when invoked on the vm file descriptor. The size value returned by
+KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2) will always be at least 4096.
+Currently, it is only greater than 4096 if a dynamic feature has been
+enabled with ``arch_prctl()``, but this may change in the future.
+
+The offsets of the state save areas in struct kvm_xsave follow the contents
+of CPUID leaf 0xD on the host.
+
+
 5. The kvm_run structure
 ========================
 
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 5a776a08f78c..2da3316bb559 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -373,9 +373,23 @@ struct kvm_debugregs {
 	__u64 reserved[9];
 };
 
-/* for KVM_CAP_XSAVE */
+/* for KVM_CAP_XSAVE and KVM_CAP_XSAVE2 */
 struct kvm_xsave {
+	/*
+	 * KVM_GET_XSAVE2 and KVM_SET_XSAVE write and read as many bytes
+	 * as are returned by KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2)
+	 * respectively, when invoked on the vm file descriptor.
+	 *
+	 * The size value returned by KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2)
+	 * will always be at least 4096. Currently, it is only greater
+	 * than 4096 if a dynamic feature has been enabled with
+	 * ``arch_prctl()``, but this may change in the future.
+	 *
+	 * The offsets of the state save areas in struct kvm_xsave follow
+	 * the contents of CPUID leaf 0xD on the host.
+	 */
 	__u32 region[1024];
+	__u32 extra[0];
 };
 
 #define KVM_MAX_XCRS	16
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index bdf89c28d2ce..76e1941db223 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4296,6 +4296,11 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 		else
 			r = 0;
 		break;
+	case KVM_CAP_XSAVE2:
+		r = kvm->vcpus[0]->arch.guest_fpu.uabi_size;
+		if (r < sizeof(struct kvm_xsave))
+			r = sizeof(struct kvm_xsave);
+		break;
 	default:
 		break;
 	}
@@ -4899,6 +4904,16 @@ static void kvm_vcpu_ioctl_x86_get_xsave(struct kvm_vcpu *vcpu,
 				       vcpu->arch.pkru);
 }
 
+static void kvm_vcpu_ioctl_x86_get_xsave2(struct kvm_vcpu *vcpu,
+					  u8 *state, unsigned int size)
+{
+	if (fpstate_is_confidential(&vcpu->arch.guest_fpu))
+		return;
+
+	fpu_copy_guest_fpstate_to_uabi(&vcpu->arch.guest_fpu,
+				       state, size, vcpu->arch.pkru);
+}
+
 static int kvm_vcpu_ioctl_x86_set_xsave(struct kvm_vcpu *vcpu,
 					struct kvm_xsave *guest_xsave)
 {
@@ -5352,6 +5367,10 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		break;
 	}
 	case KVM_GET_XSAVE: {
+		r = -EINVAL;
+		if (vcpu->arch.guest_fpu.uabi_size > sizeof(struct kvm_xsave))
+			break;
+
 		u.xsave = kzalloc(sizeof(struct kvm_xsave), GFP_KERNEL_ACCOUNT);
 		r = -ENOMEM;
 		if (!u.xsave)
@@ -5366,7 +5385,9 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		break;
 	}
 	case KVM_SET_XSAVE: {
-		u.xsave = memdup_user(argp, sizeof(*u.xsave));
+		int size = vcpu->arch.guest_fpu.uabi_size;
+
+		u.xsave = memdup_user(argp, size);
 		if (IS_ERR(u.xsave)) {
 			r = PTR_ERR(u.xsave);
 			goto out_nofree;
@@ -5375,6 +5396,26 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		r = kvm_vcpu_ioctl_x86_set_xsave(vcpu, u.xsave);
 		break;
 	}
+
+	case KVM_GET_XSAVE2: {
+		int size = vcpu->arch.guest_fpu.uabi_size;
+
+		u.xsave = kzalloc(size, GFP_KERNEL_ACCOUNT);
+		if (!u.xsave) {
+			r = -ENOMEM;
+			break;
+		}
+
+		kvm_vcpu_ioctl_x86_get_xsave2(vcpu, u.buffer, size);
+
+		if (copy_to_user(argp, u.xsave, size)) {
+			r = -EFAULT;
+			break;
+		}
+		r = 0;
+		break;
+	}
+
 	case KVM_GET_XCRS: {
 		u.xcrs = kzalloc(sizeof(struct kvm_xcrs), GFP_KERNEL_ACCOUNT);
 		r = -ENOMEM;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 1daa45268de2..9d1c01669560 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1131,6 +1131,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_EXIT_ON_EMULATION_FAILURE 204
 #define KVM_CAP_ARM_MTE 205
 #define KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM 206
+#define KVM_CAP_XSAVE2 207
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -1610,6 +1611,9 @@ struct kvm_enc_region {
 #define KVM_S390_NORMAL_RESET	_IO(KVMIO,   0xc3)
 #define KVM_S390_CLEAR_RESET	_IO(KVMIO,   0xc4)
 
+/* Available with KVM_CAP_XSAVE2 */
+#define KVM_GET_XSAVE2		  _IOR(KVMIO,  0xcf, struct kvm_xsave)
+
 struct kvm_s390_pv_sec_parm {
 	__u64 origin;
 	__u64 length;

  parent reply	other threads:[~2021-12-29 13:15 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-29 13:13 [PATCH v4 00/21] AMX Support in KVM Yang Zhong
2021-12-29 13:13 ` [PATCH v4 01/21] x86/fpu: Extend fpu_xstate_prctl() with guest permissions Yang Zhong
2021-12-29 13:13 ` [PATCH v4 02/21] x86/fpu: Prepare guest FPU for dynamically enabled FPU features Yang Zhong
2021-12-29 13:13 ` [PATCH v4 03/21] kvm: x86: Fix xstate_required_size() to follow XSTATE alignment rule Yang Zhong
2022-01-04 19:54   ` Sean Christopherson
2021-12-29 13:13 ` [PATCH v4 04/21] kvm: x86: Exclude unpermitted xfeatures at KVM_GET_SUPPORTED_CPUID Yang Zhong
2021-12-29 13:13 ` [PATCH v4 05/21] x86/fpu: Make XFD initialization in __fpstate_reset() a function argument Yang Zhong
2021-12-29 13:13 ` [PATCH v4 06/21] x86/fpu: Add guest support to xfd_enable_feature() Yang Zhong
2021-12-29 13:13 ` [PATCH v4 07/21] x86/fpu: Provide fpu_update_guest_perm_features() for guest Yang Zhong
2021-12-29 13:13 ` [PATCH v4 08/21] kvm: x86: Check and enable permitted dynamic xfeatures at KVM_SET_CPUID2 Yang Zhong
2021-12-29 16:55   ` Sean Christopherson
2021-12-30  2:28     ` Tian, Kevin
2021-12-29 13:13 ` [PATCH v4 09/21] x86/fpu: Provide fpu_update_guest_xfd() for IA32_XFD emulation Yang Zhong
2021-12-29 13:13 ` [PATCH v4 10/21] kvm: x86: Add emulation for IA32_XFD Yang Zhong
2022-01-04 19:32   ` Sean Christopherson
2022-01-05  0:22     ` Tian, Kevin
2021-12-29 13:13 ` [PATCH v4 11/21] x86/fpu: Prepare xfd_err in struct fpu_guest Yang Zhong
2021-12-29 13:13 ` [PATCH v4 12/21] kvm: x86: Intercept #NM for saving IA32_XFD_ERR Yang Zhong
2022-01-04 20:01   ` Sean Christopherson
2022-01-05  0:27     ` Tian, Kevin
2021-12-29 13:13 ` [PATCH v4 13/21] kvm: x86: Emulate IA32_XFD_ERR for guest Yang Zhong
2021-12-29 13:13 ` [PATCH v4 14/21] kvm: x86: Disable RDMSR interception of IA32_XFD_ERR Yang Zhong
2022-01-04 19:34   ` Sean Christopherson
2022-01-05  0:23     ` Tian, Kevin
2021-12-29 13:13 ` [PATCH v4 15/21] kvm: x86: Add XCR0 support for Intel AMX Yang Zhong
2021-12-29 13:13 ` [PATCH v4 16/21] kvm: x86: Add CPUID " Yang Zhong
2021-12-29 13:13 ` [PATCH v4 17/21] x86/fpu: Add uabi_size to guest_fpu Yang Zhong
2021-12-29 13:13 ` Yang Zhong [this message]
2022-01-04 19:46   ` [PATCH v4 18/21] kvm: x86: Add support for getting/setting expanded xstate buffer Sean Christopherson
2022-01-04 20:45     ` Paolo Bonzini
2022-01-05  4:03       ` Tian, Kevin
2021-12-29 13:13 ` [PATCH v4 19/21] kvm: selftests: Add support for KVM_CAP_XSAVE2 Yang Zhong
2021-12-29 13:13 ` [PATCH v4 20/21] x86/fpu: Provide fpu_sync_guest_vmexit_xfd_state() Yang Zhong
2021-12-29 13:13 ` [PATCH v4 21/21] kvm: x86: Disable interception for IA32_XFD on demand Yang Zhong
2021-12-30  8:46 ` [PATCH v4 00/21] AMX Support in KVM Tian, Kevin
2022-01-04 18:36 ` Paolo Bonzini
2022-01-04 18:54   ` Sean Christopherson
2022-01-05  0:10     ` Tian, Kevin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211229131328.12283-19-yang.zhong@intel.com \
    --to=yang.zhong@intel.com \
    --cc=bp@alien8.de \
    --cc=corbet@lwn.net \
    --cc=dave.hansen@linux.intel.com \
    --cc=guang.zeng@intel.com \
    --cc=jing2.liu@intel.com \
    --cc=jing2.liu@linux.intel.com \
    --cc=jun.nakajima@intel.com \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=shuah@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=wei.w.wang@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).