linux-kselftest.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5 1/2] x86/fpu: Allow PKRU to be (once again) written by ptrace.
@ 2022-08-08 14:15 Kyle Huey
  2022-08-08 14:15 ` [PATCH v5 2/2] selftests/vm/pkeys: Add a regression test for setting PKRU through ptrace Kyle Huey
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Kyle Huey @ 2022-08-08 14:15 UTC (permalink / raw)
  To: Dave Hansen, Thomas Gleixner, Borislav Petkov
  Cc: Ingo Molnar, x86, H. Peter Anvin, Paolo Bonzini, Andy Lutomirski,
	Peter Zijlstra, linux-kernel, linux-kselftest,
	Robert O'Callahan, David Manouchehri, Kyle Huey,
	Borislav Petkov, kvm, stable

From: Kyle Huey <me@kylehuey.com>

When management of the PKRU register was moved away from XSTATE, emulation
of PKRU's existence in XSTATE was added for APIs that read XSTATE, but not
for APIs that write XSTATE. This can be seen by running gdb and executing
`p $pkru`, `set $pkru = 42`, and `p $pkru`. On affected kernels (5.14+) the
write to the PKRU register (which gdb performs through ptrace) is ignored.

There are three relevant APIs: PTRACE_SETREGSET with NT_X86_XSTATE,
sigreturn, and KVM_SET_XSAVE. KVM_SET_XSAVE has its own special handling to
make PKRU writes take effect (in fpu_copy_uabi_to_guest_fpstate). Push that
down into copy_uabi_to_xstate and have PTRACE_SETREGSET with NT_X86_XSTATE
and sigreturn pass in pointers to the appropriate PKRU value.

This also adds code to initialize the PKRU value to the hardware init value
(namely 0) if the PKRU bit is not set in the XSTATE header to match XRSTOR.
This is a change to the current KVM_SET_XSAVE behavior.

Changelog since v4:
- Selftest additionally checks PKRU readbacks through ptrace.
- Selftest flips all PKRU bits (except the key used for PROT_EXEC).

Changelog since v3:
- The v3 patch is now part 1 of 2.
- Adds a selftest in part 2 of 2.

Changelog since v2:
- Removed now unused variables in fpu_copy_uabi_to_guest_fpstate

Changelog since v1:
- Handles the error case of copy_to_buffer().

Signed-off-by: Kyle Huey <me@kylehuey.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: kvm@vger.kernel.org # For edge case behavior of KVM_SET_XSAVE
Cc: stable@vger.kernel.org # 5.14+
Fixes: e84ba47e313d ("x86/fpu: Hook up PKRU into ptrace()")
---
 arch/x86/kernel/fpu/core.c   | 13 +------------
 arch/x86/kernel/fpu/regset.c |  2 +-
 arch/x86/kernel/fpu/signal.c |  2 +-
 arch/x86/kernel/fpu/xstate.c | 28 +++++++++++++++++++++++-----
 arch/x86/kernel/fpu/xstate.h |  4 ++--
 5 files changed, 28 insertions(+), 21 deletions(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 3b28c5b25e12..46b935bc87c8 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -391,8 +391,6 @@ int fpu_copy_uabi_to_guest_fpstate(struct fpu_guest *gfpu, const void *buf,
 {
 	struct fpstate *kstate = gfpu->fpstate;
 	const union fpregs_state *ustate = buf;
-	struct pkru_state *xpkru;
-	int ret;
 
 	if (!cpu_feature_enabled(X86_FEATURE_XSAVE)) {
 		if (ustate->xsave.header.xfeatures & ~XFEATURE_MASK_FPSSE)
@@ -406,16 +404,7 @@ int fpu_copy_uabi_to_guest_fpstate(struct fpu_guest *gfpu, const void *buf,
 	if (ustate->xsave.header.xfeatures & ~xcr0)
 		return -EINVAL;
 
-	ret = copy_uabi_from_kernel_to_xstate(kstate, ustate);
-	if (ret)
-		return ret;
-
-	/* Retrieve PKRU if not in init state */
-	if (kstate->regs.xsave.header.xfeatures & XFEATURE_MASK_PKRU) {
-		xpkru = get_xsave_addr(&kstate->regs.xsave, XFEATURE_PKRU);
-		*vpkru = xpkru->pkru;
-	}
-	return 0;
+	return copy_uabi_from_kernel_to_xstate(kstate, ustate, vpkru);
 }
 EXPORT_SYMBOL_GPL(fpu_copy_uabi_to_guest_fpstate);
 #endif /* CONFIG_KVM */
diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
index 75ffaef8c299..6d056b68f4ed 100644
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -167,7 +167,7 @@ int xstateregs_set(struct task_struct *target, const struct user_regset *regset,
 	}
 
 	fpu_force_restore(fpu);
-	ret = copy_uabi_from_kernel_to_xstate(fpu->fpstate, kbuf ?: tmpbuf);
+	ret = copy_uabi_from_kernel_to_xstate(fpu->fpstate, kbuf ?: tmpbuf, &target->thread.pkru);
 
 out:
 	vfree(tmpbuf);
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 91d4b6de58ab..558076dbde5b 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -396,7 +396,7 @@ static bool __fpu_restore_sig(void __user *buf, void __user *buf_fx,
 
 	fpregs = &fpu->fpstate->regs;
 	if (use_xsave() && !fx_only) {
-		if (copy_sigframe_from_user_to_xstate(fpu->fpstate, buf_fx))
+		if (copy_sigframe_from_user_to_xstate(tsk, buf_fx))
 			return false;
 	} else {
 		if (__copy_from_user(&fpregs->fxsave, buf_fx,
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index c8340156bfd2..e01d3514ae68 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1197,7 +1197,7 @@ static int copy_from_buffer(void *dst, unsigned int offset, unsigned int size,
 
 
 static int copy_uabi_to_xstate(struct fpstate *fpstate, const void *kbuf,
-			       const void __user *ubuf)
+			       const void __user *ubuf, u32 *pkru)
 {
 	struct xregs_state *xsave = &fpstate->regs.xsave;
 	unsigned int offset, size;
@@ -1235,6 +1235,24 @@ static int copy_uabi_to_xstate(struct fpstate *fpstate, const void *kbuf,
 	for (i = 0; i < XFEATURE_MAX; i++) {
 		mask = BIT_ULL(i);
 
+		if (i == XFEATURE_PKRU) {
+			/*
+			 * Retrieve PKRU if not in init state, otherwise
+			 * initialize it.
+			 */
+			if (hdr.xfeatures & mask) {
+				struct pkru_state xpkru = {0};
+
+				if (copy_from_buffer(&xpkru, xstate_offsets[i],
+						     sizeof(xpkru), kbuf, ubuf))
+					return -EFAULT;
+
+				*pkru = xpkru.pkru;
+			} else {
+				*pkru = 0;
+			}
+		}
+
 		if (hdr.xfeatures & mask) {
 			void *dst = __raw_xsave_addr(xsave, i);
 
@@ -1264,9 +1282,9 @@ static int copy_uabi_to_xstate(struct fpstate *fpstate, const void *kbuf,
  * Convert from a ptrace standard-format kernel buffer to kernel XSAVE[S]
  * format and copy to the target thread. Used by ptrace and KVM.
  */
-int copy_uabi_from_kernel_to_xstate(struct fpstate *fpstate, const void *kbuf)
+int copy_uabi_from_kernel_to_xstate(struct fpstate *fpstate, const void *kbuf, u32 *pkru)
 {
-	return copy_uabi_to_xstate(fpstate, kbuf, NULL);
+	return copy_uabi_to_xstate(fpstate, kbuf, NULL, pkru);
 }
 
 /*
@@ -1274,10 +1292,10 @@ int copy_uabi_from_kernel_to_xstate(struct fpstate *fpstate, const void *kbuf)
  * XSAVE[S] format and copy to the target thread. This is called from the
  * sigreturn() and rt_sigreturn() system calls.
  */
-int copy_sigframe_from_user_to_xstate(struct fpstate *fpstate,
+int copy_sigframe_from_user_to_xstate(struct task_struct *tsk,
 				      const void __user *ubuf)
 {
-	return copy_uabi_to_xstate(fpstate, NULL, ubuf);
+	return copy_uabi_to_xstate(tsk->thread.fpu.fpstate, NULL, ubuf, &tsk->thread.pkru);
 }
 
 static bool validate_independent_components(u64 mask)
diff --git a/arch/x86/kernel/fpu/xstate.h b/arch/x86/kernel/fpu/xstate.h
index 5ad47031383b..a4ecb04d8d64 100644
--- a/arch/x86/kernel/fpu/xstate.h
+++ b/arch/x86/kernel/fpu/xstate.h
@@ -46,8 +46,8 @@ extern void __copy_xstate_to_uabi_buf(struct membuf to, struct fpstate *fpstate,
 				      u32 pkru_val, enum xstate_copy_mode copy_mode);
 extern void copy_xstate_to_uabi_buf(struct membuf to, struct task_struct *tsk,
 				    enum xstate_copy_mode mode);
-extern int copy_uabi_from_kernel_to_xstate(struct fpstate *fpstate, const void *kbuf);
-extern int copy_sigframe_from_user_to_xstate(struct fpstate *fpstate, const void __user *ubuf);
+extern int copy_uabi_from_kernel_to_xstate(struct fpstate *fpstate, const void *kbuf, u32 *pkru);
+extern int copy_sigframe_from_user_to_xstate(struct task_struct *tsk, const void __user *ubuf);
 
 
 extern void fpu__init_cpu_xstate(void);
-- 
2.37.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v5 2/2] selftests/vm/pkeys: Add a regression test for setting PKRU through ptrace
  2022-08-08 14:15 [PATCH v5 1/2] x86/fpu: Allow PKRU to be (once again) written by ptrace Kyle Huey
@ 2022-08-08 14:15 ` Kyle Huey
  2022-08-18  4:02 ` [PATCH v5 1/2] x86/fpu: Allow PKRU to be (once again) written by ptrace Kyle Huey
  2022-08-18 10:57 ` Thomas Gleixner
  2 siblings, 0 replies; 7+ messages in thread
From: Kyle Huey @ 2022-08-08 14:15 UTC (permalink / raw)
  To: Dave Hansen, Thomas Gleixner, Borislav Petkov
  Cc: Ingo Molnar, x86, H. Peter Anvin, Paolo Bonzini, Andy Lutomirski,
	Peter Zijlstra, linux-kernel, linux-kselftest,
	Robert O'Callahan, David Manouchehri, Kyle Huey

From: Kyle Huey <me@kylehuey.com>

This tests PTRACE_SETREGSET with NT_X86_XSTATE modifying PKRU directly and
removing the PKRU bit from XSTATE_BV.

Signed-off-by: Kyle Huey <me@kylehuey.com>
---
 tools/testing/selftests/vm/pkey-x86.h        |  12 ++
 tools/testing/selftests/vm/protection_keys.c | 131 ++++++++++++++++++-
 2 files changed, 141 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/vm/pkey-x86.h b/tools/testing/selftests/vm/pkey-x86.h
index b078ce9c6d2a..72c14cd3ddc7 100644
--- a/tools/testing/selftests/vm/pkey-x86.h
+++ b/tools/testing/selftests/vm/pkey-x86.h
@@ -104,6 +104,18 @@ static inline int cpu_has_pkeys(void)
 	return 1;
 }
 
+static inline int cpu_max_xsave_size(void)
+{
+	unsigned long XSTATE_CPUID = 0xd;
+	unsigned int eax;
+	unsigned int ebx;
+	unsigned int ecx;
+	unsigned int edx;
+
+	__cpuid_count(XSTATE_CPUID, 0, eax, ebx, ecx, edx);
+	return ecx;
+}
+
 static inline u32 pkey_bit_position(int pkey)
 {
 	return pkey * PKEY_BITS_PER_PKEY;
diff --git a/tools/testing/selftests/vm/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
index 291bc1e07842..95f403a0c46d 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -18,12 +18,13 @@
  *	do a plain mprotect() to a mprotect_pkey() area and make sure the pkey sticks
  *
  * Compile like this:
- *	gcc      -o protection_keys    -O2 -g -std=gnu99 -pthread -Wall protection_keys.c -lrt -ldl -lm
- *	gcc -m32 -o protection_keys_32 -O2 -g -std=gnu99 -pthread -Wall protection_keys.c -lrt -ldl -lm
+ *	gcc -mxsave      -o protection_keys    -O2 -g -std=gnu99 -pthread -Wall protection_keys.c -lrt -ldl -lm
+ *	gcc -mxsave -m32 -o protection_keys_32 -O2 -g -std=gnu99 -pthread -Wall protection_keys.c -lrt -ldl -lm
  */
 #define _GNU_SOURCE
 #define __SANE_USERSPACE_TYPES__
 #include <errno.h>
+#include <linux/elf.h>
 #include <linux/futex.h>
 #include <time.h>
 #include <sys/time.h>
@@ -1550,6 +1551,129 @@ void test_implicit_mprotect_exec_only_memory(int *ptr, u16 pkey)
 	do_not_expect_pkey_fault("plain read on recently PROT_EXEC area");
 }
 
+#if defined(__i386__) || defined(__x86_64__)
+void test_ptrace_modifies_pkru(int *ptr, u16 pkey)
+{
+	u32 new_pkru;
+	pid_t child;
+	int status, ret;
+	int pkey_offset = pkey_reg_xstate_offset();
+	size_t xsave_size = cpu_max_xsave_size();
+	void *xsave;
+	u32 *pkey_register;
+	u64 *xstate_bv;
+	struct iovec iov;
+
+	new_pkru = ~read_pkey_reg();
+	/* Don't make PROT_EXEC mappings inaccessible */
+	new_pkru &= ~3;
+
+	child = fork();
+	pkey_assert(child >= 0);
+	dprintf3("[%d] fork() ret: %d\n", getpid(), child);
+	if (!child) {
+		ptrace(PTRACE_TRACEME, 0, 0, 0);
+		/* Stop and allow the tracer to modify PKRU directly */
+		raise(SIGSTOP);
+
+		/*
+		 * need __read_pkey_reg() version so we do not do shadow_pkey_reg
+		 * checking
+		 */
+		if (__read_pkey_reg() != new_pkru)
+			exit(1);
+
+		/* Stop and allow the tracer to clear XSTATE_BV for PKRU */
+		raise(SIGSTOP);
+
+		if (__read_pkey_reg() != 0)
+			exit(1);
+
+		/* Stop and allow the tracer to examine PKRU */
+		raise(SIGSTOP);
+
+		exit(0);
+	}
+
+	pkey_assert(child == waitpid(child, &status, 0));
+	dprintf3("[%d] waitpid(%d) status: %x\n", getpid(), child, status);
+	pkey_assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGSTOP);
+
+	xsave = (void *)malloc(xsave_size);
+	pkey_assert(xsave > 0);
+
+	/* Modify the PKRU register directly */
+	iov.iov_base = xsave;
+	iov.iov_len = xsave_size;
+	ret = ptrace(PTRACE_GETREGSET, child, (void *)NT_X86_XSTATE, &iov);
+	pkey_assert(ret == 0);
+
+	pkey_register = (u32 *)(xsave + pkey_offset);
+	pkey_assert(*pkey_register == read_pkey_reg());
+
+	*pkey_register = new_pkru;
+
+	ret = ptrace(PTRACE_SETREGSET, child, (void *)NT_X86_XSTATE, &iov);
+	pkey_assert(ret == 0);
+
+	/* Test that the modification is visible in ptrace before any execution */
+	memset(xsave, 0xCC, xsave_size);
+	ret = ptrace(PTRACE_GETREGSET, child, (void *)NT_X86_XSTATE, &iov);
+	pkey_assert(ret == 0);
+	pkey_assert(*pkey_register == new_pkru);
+
+	/* Execute the tracee */
+	ret = ptrace(PTRACE_CONT, child, 0, 0);
+	pkey_assert(ret == 0);
+
+	/* Test that the tracee saw the PKRU value change */
+	pkey_assert(child == waitpid(child, &status, 0));
+	dprintf3("[%d] waitpid(%d) status: %x\n", getpid(), child, status);
+	pkey_assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGSTOP);
+
+	/* Test that the modification is visible in ptrace after execution */
+	memset(xsave, 0xCC, xsave_size);
+	ret = ptrace(PTRACE_GETREGSET, child, (void *)NT_X86_XSTATE, &iov);
+	pkey_assert(ret == 0);
+	pkey_assert(*pkey_register == new_pkru);
+
+	/* Clear the PKRU bit from XSTATE_BV */
+	xstate_bv = (u64 *)(xsave + 512);
+	*xstate_bv &= ~(1 << 9);
+
+	ret = ptrace(PTRACE_SETREGSET, child, (void *)NT_X86_XSTATE, &iov);
+	pkey_assert(ret == 0);
+
+	/* Test that the modification is visible in ptrace before any execution */
+	memset(xsave, 0xCC, xsave_size);
+	ret = ptrace(PTRACE_GETREGSET, child, (void *)NT_X86_XSTATE, &iov);
+	pkey_assert(ret == 0);
+	pkey_assert(*pkey_register == 0);
+
+	ret = ptrace(PTRACE_CONT, child, 0, 0);
+	pkey_assert(ret == 0);
+
+	/* Test that the tracee saw the PKRU value go to 0 */
+	pkey_assert(child == waitpid(child, &status, 0));
+	dprintf3("[%d] waitpid(%d) status: %x\n", getpid(), child, status);
+	pkey_assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGSTOP);
+
+	/* Test that the modification is visible in ptrace after execution */
+	memset(xsave, 0xCC, xsave_size);
+	ret = ptrace(PTRACE_GETREGSET, child, (void *)NT_X86_XSTATE, &iov);
+	pkey_assert(ret == 0);
+	pkey_assert(*pkey_register == 0);
+
+	ret = ptrace(PTRACE_CONT, child, 0, 0);
+	pkey_assert(ret == 0);
+	pkey_assert(child == waitpid(child, &status, 0));
+	dprintf3("[%d] waitpid(%d) status: %x\n", getpid(), child, status);
+	pkey_assert(WIFEXITED(status));
+	pkey_assert(WEXITSTATUS(status) == 0);
+	free(xsave);
+}
+#endif
+
 void test_mprotect_pkey_on_unsupported_cpu(int *ptr, u16 pkey)
 {
 	int size = PAGE_SIZE;
@@ -1585,6 +1709,9 @@ void (*pkey_tests[])(int *ptr, u16 pkey) = {
 	test_pkey_syscalls_bad_args,
 	test_pkey_alloc_exhaust,
 	test_pkey_alloc_free_attach_pkey0,
+#if defined(__i386__) || defined(__x86_64__)
+	test_ptrace_modifies_pkru,
+#endif
 };
 
 void run_tests_once(void)
-- 
2.37.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v5 1/2] x86/fpu: Allow PKRU to be (once again) written by ptrace.
  2022-08-08 14:15 [PATCH v5 1/2] x86/fpu: Allow PKRU to be (once again) written by ptrace Kyle Huey
  2022-08-08 14:15 ` [PATCH v5 2/2] selftests/vm/pkeys: Add a regression test for setting PKRU through ptrace Kyle Huey
@ 2022-08-18  4:02 ` Kyle Huey
  2022-08-18 10:57 ` Thomas Gleixner
  2 siblings, 0 replies; 7+ messages in thread
From: Kyle Huey @ 2022-08-18  4:02 UTC (permalink / raw)
  To: Dave Hansen, Thomas Gleixner, Borislav Petkov
  Cc: Ingo Molnar, x86, H. Peter Anvin, Paolo Bonzini, Andy Lutomirski,
	Peter Zijlstra, linux-kernel, linux-kselftest,
	Robert O'Callahan, David Manouchehri, Borislav Petkov, kvm,
	stable

On Mon, Aug 8, 2022 at 7:15 AM Kyle Huey <me@kylehuey.com> wrote:
>
> From: Kyle Huey <me@kylehuey.com>
>
> When management of the PKRU register was moved away from XSTATE, emulation
> of PKRU's existence in XSTATE was added for APIs that read XSTATE, but not
> for APIs that write XSTATE. This can be seen by running gdb and executing
> `p $pkru`, `set $pkru = 42`, and `p $pkru`. On affected kernels (5.14+) the
> write to the PKRU register (which gdb performs through ptrace) is ignored.
>
> There are three relevant APIs: PTRACE_SETREGSET with NT_X86_XSTATE,
> sigreturn, and KVM_SET_XSAVE. KVM_SET_XSAVE has its own special handling to
> make PKRU writes take effect (in fpu_copy_uabi_to_guest_fpstate). Push that
> down into copy_uabi_to_xstate and have PTRACE_SETREGSET with NT_X86_XSTATE
> and sigreturn pass in pointers to the appropriate PKRU value.
>
> This also adds code to initialize the PKRU value to the hardware init value
> (namely 0) if the PKRU bit is not set in the XSTATE header to match XRSTOR.
> This is a change to the current KVM_SET_XSAVE behavior.
>
> Changelog since v4:
> - Selftest additionally checks PKRU readbacks through ptrace.
> - Selftest flips all PKRU bits (except the key used for PROT_EXEC).
>
> Changelog since v3:
> - The v3 patch is now part 1 of 2.
> - Adds a selftest in part 2 of 2.
>
> Changelog since v2:
> - Removed now unused variables in fpu_copy_uabi_to_guest_fpstate
>
> Changelog since v1:
> - Handles the error case of copy_to_buffer().
>
> Signed-off-by: Kyle Huey <me@kylehuey.com>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: kvm@vger.kernel.org # For edge case behavior of KVM_SET_XSAVE
> Cc: stable@vger.kernel.org # 5.14+
> Fixes: e84ba47e313d ("x86/fpu: Hook up PKRU into ptrace()")
> ---
>  arch/x86/kernel/fpu/core.c   | 13 +------------
>  arch/x86/kernel/fpu/regset.c |  2 +-
>  arch/x86/kernel/fpu/signal.c |  2 +-
>  arch/x86/kernel/fpu/xstate.c | 28 +++++++++++++++++++++++-----
>  arch/x86/kernel/fpu/xstate.h |  4 ++--
>  5 files changed, 28 insertions(+), 21 deletions(-)
>
> diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
> index 3b28c5b25e12..46b935bc87c8 100644
> --- a/arch/x86/kernel/fpu/core.c
> +++ b/arch/x86/kernel/fpu/core.c
> @@ -391,8 +391,6 @@ int fpu_copy_uabi_to_guest_fpstate(struct fpu_guest *gfpu, const void *buf,
>  {
>         struct fpstate *kstate = gfpu->fpstate;
>         const union fpregs_state *ustate = buf;
> -       struct pkru_state *xpkru;
> -       int ret;
>
>         if (!cpu_feature_enabled(X86_FEATURE_XSAVE)) {
>                 if (ustate->xsave.header.xfeatures & ~XFEATURE_MASK_FPSSE)
> @@ -406,16 +404,7 @@ int fpu_copy_uabi_to_guest_fpstate(struct fpu_guest *gfpu, const void *buf,
>         if (ustate->xsave.header.xfeatures & ~xcr0)
>                 return -EINVAL;
>
> -       ret = copy_uabi_from_kernel_to_xstate(kstate, ustate);
> -       if (ret)
> -               return ret;
> -
> -       /* Retrieve PKRU if not in init state */
> -       if (kstate->regs.xsave.header.xfeatures & XFEATURE_MASK_PKRU) {
> -               xpkru = get_xsave_addr(&kstate->regs.xsave, XFEATURE_PKRU);
> -               *vpkru = xpkru->pkru;
> -       }
> -       return 0;
> +       return copy_uabi_from_kernel_to_xstate(kstate, ustate, vpkru);
>  }
>  EXPORT_SYMBOL_GPL(fpu_copy_uabi_to_guest_fpstate);
>  #endif /* CONFIG_KVM */
> diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
> index 75ffaef8c299..6d056b68f4ed 100644
> --- a/arch/x86/kernel/fpu/regset.c
> +++ b/arch/x86/kernel/fpu/regset.c
> @@ -167,7 +167,7 @@ int xstateregs_set(struct task_struct *target, const struct user_regset *regset,
>         }
>
>         fpu_force_restore(fpu);
> -       ret = copy_uabi_from_kernel_to_xstate(fpu->fpstate, kbuf ?: tmpbuf);
> +       ret = copy_uabi_from_kernel_to_xstate(fpu->fpstate, kbuf ?: tmpbuf, &target->thread.pkru);
>
>  out:
>         vfree(tmpbuf);
> diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
> index 91d4b6de58ab..558076dbde5b 100644
> --- a/arch/x86/kernel/fpu/signal.c
> +++ b/arch/x86/kernel/fpu/signal.c
> @@ -396,7 +396,7 @@ static bool __fpu_restore_sig(void __user *buf, void __user *buf_fx,
>
>         fpregs = &fpu->fpstate->regs;
>         if (use_xsave() && !fx_only) {
> -               if (copy_sigframe_from_user_to_xstate(fpu->fpstate, buf_fx))
> +               if (copy_sigframe_from_user_to_xstate(tsk, buf_fx))
>                         return false;
>         } else {
>                 if (__copy_from_user(&fpregs->fxsave, buf_fx,
> diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
> index c8340156bfd2..e01d3514ae68 100644
> --- a/arch/x86/kernel/fpu/xstate.c
> +++ b/arch/x86/kernel/fpu/xstate.c
> @@ -1197,7 +1197,7 @@ static int copy_from_buffer(void *dst, unsigned int offset, unsigned int size,
>
>
>  static int copy_uabi_to_xstate(struct fpstate *fpstate, const void *kbuf,
> -                              const void __user *ubuf)
> +                              const void __user *ubuf, u32 *pkru)
>  {
>         struct xregs_state *xsave = &fpstate->regs.xsave;
>         unsigned int offset, size;
> @@ -1235,6 +1235,24 @@ static int copy_uabi_to_xstate(struct fpstate *fpstate, const void *kbuf,
>         for (i = 0; i < XFEATURE_MAX; i++) {
>                 mask = BIT_ULL(i);
>
> +               if (i == XFEATURE_PKRU) {
> +                       /*
> +                        * Retrieve PKRU if not in init state, otherwise
> +                        * initialize it.
> +                        */
> +                       if (hdr.xfeatures & mask) {
> +                               struct pkru_state xpkru = {0};
> +
> +                               if (copy_from_buffer(&xpkru, xstate_offsets[i],
> +                                                    sizeof(xpkru), kbuf, ubuf))
> +                                       return -EFAULT;
> +
> +                               *pkru = xpkru.pkru;
> +                       } else {
> +                               *pkru = 0;
> +                       }
> +               }
> +
>                 if (hdr.xfeatures & mask) {
>                         void *dst = __raw_xsave_addr(xsave, i);
>
> @@ -1264,9 +1282,9 @@ static int copy_uabi_to_xstate(struct fpstate *fpstate, const void *kbuf,
>   * Convert from a ptrace standard-format kernel buffer to kernel XSAVE[S]
>   * format and copy to the target thread. Used by ptrace and KVM.
>   */
> -int copy_uabi_from_kernel_to_xstate(struct fpstate *fpstate, const void *kbuf)
> +int copy_uabi_from_kernel_to_xstate(struct fpstate *fpstate, const void *kbuf, u32 *pkru)
>  {
> -       return copy_uabi_to_xstate(fpstate, kbuf, NULL);
> +       return copy_uabi_to_xstate(fpstate, kbuf, NULL, pkru);
>  }
>
>  /*
> @@ -1274,10 +1292,10 @@ int copy_uabi_from_kernel_to_xstate(struct fpstate *fpstate, const void *kbuf)
>   * XSAVE[S] format and copy to the target thread. This is called from the
>   * sigreturn() and rt_sigreturn() system calls.
>   */
> -int copy_sigframe_from_user_to_xstate(struct fpstate *fpstate,
> +int copy_sigframe_from_user_to_xstate(struct task_struct *tsk,
>                                       const void __user *ubuf)
>  {
> -       return copy_uabi_to_xstate(fpstate, NULL, ubuf);
> +       return copy_uabi_to_xstate(tsk->thread.fpu.fpstate, NULL, ubuf, &tsk->thread.pkru);
>  }
>
>  static bool validate_independent_components(u64 mask)
> diff --git a/arch/x86/kernel/fpu/xstate.h b/arch/x86/kernel/fpu/xstate.h
> index 5ad47031383b..a4ecb04d8d64 100644
> --- a/arch/x86/kernel/fpu/xstate.h
> +++ b/arch/x86/kernel/fpu/xstate.h
> @@ -46,8 +46,8 @@ extern void __copy_xstate_to_uabi_buf(struct membuf to, struct fpstate *fpstate,
>                                       u32 pkru_val, enum xstate_copy_mode copy_mode);
>  extern void copy_xstate_to_uabi_buf(struct membuf to, struct task_struct *tsk,
>                                     enum xstate_copy_mode mode);
> -extern int copy_uabi_from_kernel_to_xstate(struct fpstate *fpstate, const void *kbuf);
> -extern int copy_sigframe_from_user_to_xstate(struct fpstate *fpstate, const void __user *ubuf);
> +extern int copy_uabi_from_kernel_to_xstate(struct fpstate *fpstate, const void *kbuf, u32 *pkru);
> +extern int copy_sigframe_from_user_to_xstate(struct task_struct *tsk, const void __user *ubuf);
>
>
>  extern void fpu__init_cpu_xstate(void);
> --
> 2.37.1
>

Bump.

If there are no further comments/complaints, can we get this queued up
via x86/urgent (or something)? We're eager to get this moving and
eventually onto stable to fix rr users on 5.15.

- Kyle

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v5 1/2] x86/fpu: Allow PKRU to be (once again) written by ptrace.
  2022-08-08 14:15 [PATCH v5 1/2] x86/fpu: Allow PKRU to be (once again) written by ptrace Kyle Huey
  2022-08-08 14:15 ` [PATCH v5 2/2] selftests/vm/pkeys: Add a regression test for setting PKRU through ptrace Kyle Huey
  2022-08-18  4:02 ` [PATCH v5 1/2] x86/fpu: Allow PKRU to be (once again) written by ptrace Kyle Huey
@ 2022-08-18 10:57 ` Thomas Gleixner
  2022-08-18 19:48   ` Kyle Huey
  2 siblings, 1 reply; 7+ messages in thread
From: Thomas Gleixner @ 2022-08-18 10:57 UTC (permalink / raw)
  To: Kyle Huey, Dave Hansen, Borislav Petkov
  Cc: Ingo Molnar, x86, H. Peter Anvin, Paolo Bonzini, Andy Lutomirski,
	Peter Zijlstra, linux-kernel, linux-kselftest,
	Robert O'Callahan, David Manouchehri, Kyle Huey,
	Borislav Petkov, kvm, stable

Kyle!

On Mon, Aug 08 2022 at 07:15, Kyle Huey wrote:
> When management of the PKRU register was moved away from XSTATE, emulation
> of PKRU's existence in XSTATE was added for APIs that read XSTATE, but not
> for APIs that write XSTATE. This can be seen by running gdb and executing
> `p $pkru`, `set $pkru = 42`, and `p $pkru`. On affected kernels (5.14+) the
> write to the PKRU register (which gdb performs through ptrace) is ignored.
>
> There are three relevant APIs: PTRACE_SETREGSET with NT_X86_XSTATE,
> sigreturn, and KVM_SET_XSAVE. KVM_SET_XSAVE has its own special handling to
> make PKRU writes take effect (in fpu_copy_uabi_to_guest_fpstate). Push that
> down into copy_uabi_to_xstate and have PTRACE_SETREGSET with NT_X86_XSTATE
> and sigreturn pass in pointers to the appropriate PKRU value.
>
> This also adds code to initialize the PKRU value to the hardware init value
> (namely 0) if the PKRU bit is not set in the XSTATE header to match XRSTOR.
> This is a change to the current KVM_SET_XSAVE behavior.

You are stating a fact here, but provide 0 justification why this is
correct.

>
> Changelog since v4:

Can you please put the change log past the --- seperator line, so it
gets stripped off when the patch is applied? That spares manual fixups.

>
> Signed-off-by: Kyle Huey <me@kylehuey.com>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: kvm@vger.kernel.org # For edge case behavior of KVM_SET_XSAVE
> Cc: stable@vger.kernel.org # 5.14+
> Fixes: e84ba47e313d ("x86/fpu: Hook up PKRU into ptrace()")

Can you please use the documented tag ordering?

https://www.kernel.org/doc/html/latest/process/maintainer-tip.html#patch-submission-notes

> @@ -1235,6 +1235,24 @@ static int copy_uabi_to_xstate(struct fpstate *fpstate, const void *kbuf,
>  	for (i = 0; i < XFEATURE_MAX; i++) {
>  		mask = BIT_ULL(i);
>  
> +		if (i == XFEATURE_PKRU) {
> +			/*
> +			 * Retrieve PKRU if not in init state, otherwise
> +			 * initialize it.
> +			 */
> +			if (hdr.xfeatures & mask) {
> +				struct pkru_state xpkru = {0};
> +
> +				if (copy_from_buffer(&xpkru, xstate_offsets[i],
> +						     sizeof(xpkru), kbuf, ubuf))
> +					return -EFAULT;
> +
> +				*pkru = xpkru.pkru;
> +			} else {
> +				*pkru = 0;
> +			}
> +		}

That's really horrible and there is no point in copying the stuff from
the buffer twice:

@@ -1246,6 +1246,15 @@ static int copy_uabi_to_xstate(struct fp
 		}
 	}
 
+	/* Update the user protection key storage */
+	*pkru = 0;
+	if (hdr.xfeatures & XFEATURE_MASK_PKRU) {
+		struct pkru_state *xpkru;
+
+		xpkru = get_xsave_addr(xsave, XFEATURE_PKRU);
+		*pkru = xpkru->pkru;
+	}
+

Hmm?

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v5 1/2] x86/fpu: Allow PKRU to be (once again) written by ptrace.
  2022-08-18 10:57 ` Thomas Gleixner
@ 2022-08-18 19:48   ` Kyle Huey
  2022-08-18 21:19     ` Sean Christopherson
  0 siblings, 1 reply; 7+ messages in thread
From: Kyle Huey @ 2022-08-18 19:48 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Dave Hansen, Borislav Petkov, Ingo Molnar, x86, H. Peter Anvin,
	Paolo Bonzini, Andy Lutomirski, Peter Zijlstra, linux-kernel,
	linux-kselftest, Robert O'Callahan, David Manouchehri,
	Borislav Petkov, kvm, stable

On Thu, Aug 18, 2022 at 3:57 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Kyle!

Hi.

> On Mon, Aug 08 2022 at 07:15, Kyle Huey wrote:
> > When management of the PKRU register was moved away from XSTATE, emulation
> > of PKRU's existence in XSTATE was added for APIs that read XSTATE, but not
> > for APIs that write XSTATE. This can be seen by running gdb and executing
> > `p $pkru`, `set $pkru = 42`, and `p $pkru`. On affected kernels (5.14+) the
> > write to the PKRU register (which gdb performs through ptrace) is ignored.
> >
> > There are three relevant APIs: PTRACE_SETREGSET with NT_X86_XSTATE,
> > sigreturn, and KVM_SET_XSAVE. KVM_SET_XSAVE has its own special handling to
> > make PKRU writes take effect (in fpu_copy_uabi_to_guest_fpstate). Push that
> > down into copy_uabi_to_xstate and have PTRACE_SETREGSET with NT_X86_XSTATE
> > and sigreturn pass in pointers to the appropriate PKRU value.
> >
> > This also adds code to initialize the PKRU value to the hardware init value
> > (namely 0) if the PKRU bit is not set in the XSTATE header to match XRSTOR.
> > This is a change to the current KVM_SET_XSAVE behavior.
>
> You are stating a fact here, but provide 0 justification why this is
> correct.

Well, the justification is that this *is* the behavior we want for
ptrace/sigreturn, and it's very likely the existing KVM_SET_XSAVE
behavior in this edge case is an oversight rather than intentional,
and in the absence of confirmation that KVM wants the existing
behavior (the KVM mailing list and maintainer are CCd) one correct
code path is better than one correct code path and one buggy code
path.

> >
> > Changelog since v4:
>
> Can you please put the change log past the --- seperator line, so it
> gets stripped off when the patch is applied? That spares manual fixups.

Ok.

> >
> > Signed-off-by: Kyle Huey <me@kylehuey.com>
> > Cc: Dave Hansen <dave.hansen@linux.intel.com>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Borislav Petkov <bp@suse.de>
> > Cc: kvm@vger.kernel.org # For edge case behavior of KVM_SET_XSAVE
> > Cc: stable@vger.kernel.org # 5.14+
> > Fixes: e84ba47e313d ("x86/fpu: Hook up PKRU into ptrace()")
>
> Can you please use the documented tag ordering?
>
> https://www.kernel.org/doc/html/latest/process/maintainer-tip.html#patch-submission-notes

Ok.

> > @@ -1235,6 +1235,24 @@ static int copy_uabi_to_xstate(struct fpstate *fpstate, const void *kbuf,
> >       for (i = 0; i < XFEATURE_MAX; i++) {
> >               mask = BIT_ULL(i);
> >
> > +             if (i == XFEATURE_PKRU) {
> > +                     /*
> > +                      * Retrieve PKRU if not in init state, otherwise
> > +                      * initialize it.
> > +                      */
> > +                     if (hdr.xfeatures & mask) {
> > +                             struct pkru_state xpkru = {0};
> > +
> > +                             if (copy_from_buffer(&xpkru, xstate_offsets[i],
> > +                                                  sizeof(xpkru), kbuf, ubuf))
> > +                                     return -EFAULT;
> > +
> > +                             *pkru = xpkru.pkru;
> > +                     } else {
> > +                             *pkru = 0;
> > +                     }
> > +             }
>
> That's really horrible and there is no point in copying the stuff from
> the buffer twice:
>
> @@ -1246,6 +1246,15 @@ static int copy_uabi_to_xstate(struct fp
>                 }
>         }
>
> +       /* Update the user protection key storage */
> +       *pkru = 0;
> +       if (hdr.xfeatures & XFEATURE_MASK_PKRU) {
> +               struct pkru_state *xpkru;
> +
> +               xpkru = get_xsave_addr(xsave, XFEATURE_PKRU);
> +               *pkru = xpkru->pkru;
> +       }
> +
>
> Hmm?

It took me a bit to figure out what this is actually trying to do. To
work, it would need to come at the very end of copy_uabi_to_xstate
after xsave->header.xfeatures is updated. If you just want to avoid
two copies I would counter-propose this though:

@@ -1235,7 +1235,19 @@ static int copy_uabi_to_xstate(struct fpstate
*fpstate, const void *kbuf,
        for (i = 0; i < XFEATURE_MAX; i++) {
                mask = BIT_ULL(i);

-               if (hdr.xfeatures & mask) {
+               if (i == XFEATURE_PKRU) {
+                       /* Update the user protection key storage */
+                       *pkru = 0;
+                       if (hdr.xfeatures & XFEATURE_MASK_PKRU) {
+                               struct pkru_state xpkru = {0};
+
+                               if (copy_from_buffer(&xpkru, xstate_offsets[i],
+                                                    sizeof(xpkru), kbuf, ubuf))
+                                       return -EFAULT;
+
+                               *pkru = xpkru.pkru;
+                       }
+               } else if (hdr.xfeatures & mask) {
                        void *dst = __raw_xsave_addr(xsave, i);

                        offset = xstate_offsets[i];

Thoughts? This avoids a second copy and avoids having to calculate the
offset into the (now potentially compressed) XSTATE.

- Kyle

>
> Thanks,
>
>         tglx

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v5 1/2] x86/fpu: Allow PKRU to be (once again) written by ptrace.
  2022-08-18 19:48   ` Kyle Huey
@ 2022-08-18 21:19     ` Sean Christopherson
  2022-08-26  4:47       ` Kyle Huey
  0 siblings, 1 reply; 7+ messages in thread
From: Sean Christopherson @ 2022-08-18 21:19 UTC (permalink / raw)
  To: Kyle Huey
  Cc: Thomas Gleixner, Dave Hansen, Borislav Petkov, Ingo Molnar, x86,
	H. Peter Anvin, Paolo Bonzini, Andy Lutomirski, Peter Zijlstra,
	linux-kernel, linux-kselftest, Robert O'Callahan,
	David Manouchehri, Borislav Petkov, kvm, stable

On Thu, Aug 18, 2022, Kyle Huey wrote:
> On Thu, Aug 18, 2022 at 3:57 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> > On Mon, Aug 08 2022 at 07:15, Kyle Huey wrote:
> > > When management of the PKRU register was moved away from XSTATE, emulation
> > > of PKRU's existence in XSTATE was added for APIs that read XSTATE, but not
> > > for APIs that write XSTATE. This can be seen by running gdb and executing
> > > `p $pkru`, `set $pkru = 42`, and `p $pkru`. On affected kernels (5.14+) the
> > > write to the PKRU register (which gdb performs through ptrace) is ignored.
> > >
> > > There are three relevant APIs: PTRACE_SETREGSET with NT_X86_XSTATE,
> > > sigreturn, and KVM_SET_XSAVE. KVM_SET_XSAVE has its own special handling to
> > > make PKRU writes take effect (in fpu_copy_uabi_to_guest_fpstate). Push that
> > > down into copy_uabi_to_xstate and have PTRACE_SETREGSET with NT_X86_XSTATE
> > > and sigreturn pass in pointers to the appropriate PKRU value.
> > >
> > > This also adds code to initialize the PKRU value to the hardware init value
> > > (namely 0) if the PKRU bit is not set in the XSTATE header to match XRSTOR.
> > > This is a change to the current KVM_SET_XSAVE behavior.
> >
> > You are stating a fact here, but provide 0 justification why this is
> > correct.
> 
> Well, the justification is that this *is* the behavior we want for
> ptrace/sigreturn, and it's very likely the existing KVM_SET_XSAVE
> behavior in this edge case is an oversight rather than intentional,
> and in the absence of confirmation that KVM wants the existing
> behavior (the KVM mailing list and maintainer are CCd) one correct
> code path is better than one correct code path and one buggy code
> path.

Sorry, I missed the KVM-relevant flags.

Hrm, the current behavior has been KVM ABI for a very long time.

It's definitely odd because all other components will be initialized due to their
bits being cleared in the header during kvm_load_guest_fpu(), and it probably
wouldn't cause problems in practice as most VMMs likely do "all or nothing" loads.
But, in theory, userspace could save/restore a subset of guest XSTATE and rely on
the kernel not overwriting guest PKRU when its bit is cleared in the header.

All that said, I don't see any reason to force KVM to change at this time, it's
trivial enough to handle KVM's oddities while providing sane behavior for others.
Nullify the pointer in the guest path and then update copy_uabi_to_xstate() to
play nice with a NULL pointer, e.g. 

	/*
	 * Nullify @vpkru to preserve its current value if PKRU's bit isn't set
	 * in the header.  KVM's odd ABI is to leave PKRU untouched in this
	 * case (all other components are eventually re-initialized).
	 */
	if (!(kstate->regs.xsave.header.xfeatures & XFEATURE_MASK_PKRU))
		vpkru = NULL;

	return copy_uabi_from_kernel_to_xstate(kstate, ustate, vpkru);

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v5 1/2] x86/fpu: Allow PKRU to be (once again) written by ptrace.
  2022-08-18 21:19     ` Sean Christopherson
@ 2022-08-26  4:47       ` Kyle Huey
  0 siblings, 0 replies; 7+ messages in thread
From: Kyle Huey @ 2022-08-26  4:47 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Thomas Gleixner, Dave Hansen, Borislav Petkov, Ingo Molnar, x86,
	H. Peter Anvin, Paolo Bonzini, Andy Lutomirski, Peter Zijlstra,
	linux-kernel, linux-kselftest, Robert O'Callahan,
	David Manouchehri, Borislav Petkov, kvm, stable

On Thu, Aug 18, 2022 at 2:19 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Thu, Aug 18, 2022, Kyle Huey wrote:
> > On Thu, Aug 18, 2022 at 3:57 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> > > On Mon, Aug 08 2022 at 07:15, Kyle Huey wrote:
> > > > When management of the PKRU register was moved away from XSTATE, emulation
> > > > of PKRU's existence in XSTATE was added for APIs that read XSTATE, but not
> > > > for APIs that write XSTATE. This can be seen by running gdb and executing
> > > > `p $pkru`, `set $pkru = 42`, and `p $pkru`. On affected kernels (5.14+) the
> > > > write to the PKRU register (which gdb performs through ptrace) is ignored.
> > > >
> > > > There are three relevant APIs: PTRACE_SETREGSET with NT_X86_XSTATE,
> > > > sigreturn, and KVM_SET_XSAVE. KVM_SET_XSAVE has its own special handling to
> > > > make PKRU writes take effect (in fpu_copy_uabi_to_guest_fpstate). Push that
> > > > down into copy_uabi_to_xstate and have PTRACE_SETREGSET with NT_X86_XSTATE
> > > > and sigreturn pass in pointers to the appropriate PKRU value.
> > > >
> > > > This also adds code to initialize the PKRU value to the hardware init value
> > > > (namely 0) if the PKRU bit is not set in the XSTATE header to match XRSTOR.
> > > > This is a change to the current KVM_SET_XSAVE behavior.
> > >
> > > You are stating a fact here, but provide 0 justification why this is
> > > correct.
> >
> > Well, the justification is that this *is* the behavior we want for
> > ptrace/sigreturn, and it's very likely the existing KVM_SET_XSAVE
> > behavior in this edge case is an oversight rather than intentional,
> > and in the absence of confirmation that KVM wants the existing
> > behavior (the KVM mailing list and maintainer are CCd) one correct
> > code path is better than one correct code path and one buggy code
> > path.
>
> Sorry, I missed the KVM-relevant flags.
>
> Hrm, the current behavior has been KVM ABI for a very long time.
>
> It's definitely odd because all other components will be initialized due to their
> bits being cleared in the header during kvm_load_guest_fpu(), and it probably
> wouldn't cause problems in practice as most VMMs likely do "all or nothing" loads.
> But, in theory, userspace could save/restore a subset of guest XSTATE and rely on
> the kernel not overwriting guest PKRU when its bit is cleared in the header.

This seems extremely conservative, but ok. As you note, PKRU is the
only XSTATE component you could theoretically do this subset
save/restore with in the KVM ABI since all the others really do have
their hardware behavior.

> All that said, I don't see any reason to force KVM to change at this time, it's
> trivial enough to handle KVM's oddities while providing sane behavior for others.
> Nullify the pointer in the guest path and then update copy_uabi_to_xstate() to
> play nice with a NULL pointer, e.g.
>
>         /*
>          * Nullify @vpkru to preserve its current value if PKRU's bit isn't set
>          * in the header.  KVM's odd ABI is to leave PKRU untouched in this
>          * case (all other components are eventually re-initialized).
>          */
>         if (!(kstate->regs.xsave.header.xfeatures & XFEATURE_MASK_PKRU))
>                 vpkru = NULL;

You meant ustate->... here (since this is before the copy now), but
yes, ok, I will do that.

>         return copy_uabi_from_kernel_to_xstate(kstate, ustate, vpkru);

- Kyle

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-08-26  4:48 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-08 14:15 [PATCH v5 1/2] x86/fpu: Allow PKRU to be (once again) written by ptrace Kyle Huey
2022-08-08 14:15 ` [PATCH v5 2/2] selftests/vm/pkeys: Add a regression test for setting PKRU through ptrace Kyle Huey
2022-08-18  4:02 ` [PATCH v5 1/2] x86/fpu: Allow PKRU to be (once again) written by ptrace Kyle Huey
2022-08-18 10:57 ` Thomas Gleixner
2022-08-18 19:48   ` Kyle Huey
2022-08-18 21:19     ` Sean Christopherson
2022-08-26  4:47       ` Kyle Huey

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).