All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] selftests/vm/pkeys: Bug fixes and a new test
@ 2021-06-11 16:41 ` Dave Hansen
  0 siblings, 0 replies; 11+ messages in thread
From: Dave Hansen @ 2021-06-11 16:41 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, Dave Hansen, tglx, linuxram, sandipan, akpm,
	fweimer, desnesn, mingo, bauerman, aneesh.kumar, mpe, mhocko,
	msuchanek, shuah, x86

There has been a lot of activity on the x86 front around the XSAVE
architecture which is used to context-switch processor state (among
other things).  In addition, AMD has recently joined the protection
keys club by adding processor support for PKU.

The AMD implementation helped uncover a kernel bug around the PKRU
"init state", which actually applied to Intel's implementation but
was just harder to hit.  This series adds a test which is expected
to help find this class of bug both on AMD and Intel.  All the work
around pkeys on x86 also uncovered a few bugs in the selftest.

Any testing of this new code (especially from my powerpc friends)
would be appreciated.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: "Desnes A. Nunes do Rosario" <desnesn@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Suchanek <msuchanek@suse.de>
Cc: Shuah Khan <shuah@kernel.org>
Cc: x86@kernel.org

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 0/4] selftests/vm/pkeys: Bug fixes and a new test
@ 2021-06-11 16:41 ` Dave Hansen
  0 siblings, 0 replies; 11+ messages in thread
From: Dave Hansen @ 2021-06-11 16:41 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, Dave Hansen, tglx, linuxram, sandipan, akpm,
	fweimer, desnesn, mingo, bauerman, aneesh.kumar, mpe, mhocko,
	msuchanek, shuah, x86

There has been a lot of activity on the x86 front around the XSAVE
architecture which is used to context-switch processor state (among
other things).  In addition, AMD has recently joined the protection
keys club by adding processor support for PKU.

The AMD implementation helped uncover a kernel bug around the PKRU
"init state", which actually applied to Intel's implementation but
was just harder to hit.  This series adds a test which is expected
to help find this class of bug both on AMD and Intel.  All the work
around pkeys on x86 also uncovered a few bugs in the selftest.

Any testing of this new code (especially from my powerpc friends)
would be appreciated.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: "Desnes A. Nunes do Rosario" <desnesn@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Suchanek <msuchanek@suse.de>
Cc: Shuah Khan <shuah@kernel.org>
Cc: x86@kernel.org


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/4] selftests/vm/pkeys: Fix alloc_random_pkey() to make it really, really random
  2021-06-11 16:41 ` Dave Hansen
@ 2021-06-11 16:41   ` Dave Hansen
  -1 siblings, 0 replies; 11+ messages in thread
From: Dave Hansen @ 2021-06-11 16:41 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, Dave Hansen, tglx, linuxram, sandipan, akpm,
	fweimer, desnesn, mingo, bauerman, aneesh.kumar, mpe, mhocko,
	msuchanek, shuah, x86


From: Dave Hansen <dave.hansen@linux.intel.com>

The "random" pkey allocation code currently does the good old:

	srand((unsigned int)time(NULL));

*But*, it unfortunately does this on every random pkey allocation.

There may be thousands of these a second.  time() has a one second
resolution.  So, each time alloc_random_pkey() is called, the PRNG is
*RESET* to time().  This is nasty.  Normally, if you do:

	srand(<ANYTHING>);
	foo = rand();
	bar = rand();

You'll be quite guaranteed that 'foo' and 'bar' are different.
But, if you do:

	srand(1);
	foo = rand();
	srand(1);
	bar = rand();

You are quite guaranteed that 'foo' and 'bar' are the *SAME*.
The recent "fix" effectively forced the test case to use the
same "random" pkey for the whole test, unless the test run
crossed a second boundary.

Only run srand() once at program startup.

This explains some very odd and persistent test failures I've been
seeing.

Fixes: 6e373263ce07 ("selftests/vm/pkeys: fix alloc_random_pkey() to make it really random")
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: "Desnes A. Nunes do Rosario" <desnesn@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Suchanek <msuchanek@suse.de>
Cc: Shuah Khan <shuah@kernel.org>
Cc: x86@kernel.org
---

 b/tools/testing/selftests/vm/protection_keys.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff -puN tools/testing/selftests/vm/protection_keys.c~selftests_vm_pkeys_Fix_alloc_random_pkey_to_make_it_really_really_random-1 tools/testing/selftests/vm/protection_keys.c
--- a/tools/testing/selftests/vm/protection_keys.c~selftests_vm_pkeys_Fix_alloc_random_pkey_to_make_it_really_really_random-1	2021-06-11 09:41:31.385468066 -0700
+++ b/tools/testing/selftests/vm/protection_keys.c	2021-06-11 09:41:31.389468066 -0700
@@ -561,7 +561,6 @@ int alloc_random_pkey(void)
 	int nr_alloced = 0;
 	int random_index;
 	memset(alloced_pkeys, 0, sizeof(alloced_pkeys));
-	srand((unsigned int)time(NULL));
 
 	/* allocate every possible key and make a note of which ones we got */
 	max_nr_pkey_allocs = NR_PKEYS;
@@ -1552,6 +1551,8 @@ int main(void)
 	int nr_iterations = 22;
 	int pkeys_supported = is_pkeys_supported();
 
+	srand((unsigned int)time(NULL));
+
 	setup_handlers();
 
 	printf("has pkeys: %d\n", pkeys_supported);
_

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/4] selftests/vm/pkeys: Fix alloc_random_pkey() to make it really, really random
@ 2021-06-11 16:41   ` Dave Hansen
  0 siblings, 0 replies; 11+ messages in thread
From: Dave Hansen @ 2021-06-11 16:41 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, Dave Hansen, tglx, linuxram, sandipan, akpm,
	fweimer, desnesn, mingo, bauerman, aneesh.kumar, mpe, mhocko,
	msuchanek, shuah, x86


From: Dave Hansen <dave.hansen@linux.intel.com>

The "random" pkey allocation code currently does the good old:

	srand((unsigned int)time(NULL));

*But*, it unfortunately does this on every random pkey allocation.

There may be thousands of these a second.  time() has a one second
resolution.  So, each time alloc_random_pkey() is called, the PRNG is
*RESET* to time().  This is nasty.  Normally, if you do:

	srand(<ANYTHING>);
	foo = rand();
	bar = rand();

You'll be quite guaranteed that 'foo' and 'bar' are different.
But, if you do:

	srand(1);
	foo = rand();
	srand(1);
	bar = rand();

You are quite guaranteed that 'foo' and 'bar' are the *SAME*.
The recent "fix" effectively forced the test case to use the
same "random" pkey for the whole test, unless the test run
crossed a second boundary.

Only run srand() once at program startup.

This explains some very odd and persistent test failures I've been
seeing.

Fixes: 6e373263ce07 ("selftests/vm/pkeys: fix alloc_random_pkey() to make it really random")
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: "Desnes A. Nunes do Rosario" <desnesn@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Suchanek <msuchanek@suse.de>
Cc: Shuah Khan <shuah@kernel.org>
Cc: x86@kernel.org
---

 b/tools/testing/selftests/vm/protection_keys.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff -puN tools/testing/selftests/vm/protection_keys.c~selftests_vm_pkeys_Fix_alloc_random_pkey_to_make_it_really_really_random-1 tools/testing/selftests/vm/protection_keys.c
--- a/tools/testing/selftests/vm/protection_keys.c~selftests_vm_pkeys_Fix_alloc_random_pkey_to_make_it_really_really_random-1	2021-06-11 09:41:31.385468066 -0700
+++ b/tools/testing/selftests/vm/protection_keys.c	2021-06-11 09:41:31.389468066 -0700
@@ -561,7 +561,6 @@ int alloc_random_pkey(void)
 	int nr_alloced = 0;
 	int random_index;
 	memset(alloced_pkeys, 0, sizeof(alloced_pkeys));
-	srand((unsigned int)time(NULL));
 
 	/* allocate every possible key and make a note of which ones we got */
 	max_nr_pkey_allocs = NR_PKEYS;
@@ -1552,6 +1551,8 @@ int main(void)
 	int nr_iterations = 22;
 	int pkeys_supported = is_pkeys_supported();
 
+	srand((unsigned int)time(NULL));
+
 	setup_handlers();
 
 	printf("has pkeys: %d\n", pkeys_supported);
_


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 2/4] selftests/vm/pkeys: Handle negative sys_pkey_alloc() return code
  2021-06-11 16:41 ` Dave Hansen
@ 2021-06-11 16:41   ` Dave Hansen
  -1 siblings, 0 replies; 11+ messages in thread
From: Dave Hansen @ 2021-06-11 16:41 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, Dave Hansen, tglx, linuxram, sandipan, akpm,
	fweimer, desnesn, mingo, bauerman, aneesh.kumar, mpe, mhocko,
	msuchanek, shuah, x86


From: Dave Hansen <dave.hansen@linux.intel.com>

The alloc_pkey() sefltest function wraps the sys_pkey_alloc() system
call.  On success, it updates its "shadow" register value because
sys_pkey_alloc() updates the real register.

But, the success check is wrong.  pkey_alloc() considers any
non-zero return code to indicate success where the pkey register
will be modified.  This fails to take negative return codes into
account.

Consider only a positive return value as a successful call.

Fixes: 5f23f6d082a9 ("x86/pkeys: Add self-tests")
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: "Desnes A. Nunes do Rosario" <desnesn@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Suchanek <msuchanek@suse.de>
Cc: Shuah Khan <shuah@kernel.org>
Cc: x86@kernel.org
---

 b/tools/testing/selftests/vm/protection_keys.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN tools/testing/selftests/vm/protection_keys.c~selftests_vm_pkeys_ret-code tools/testing/selftests/vm/protection_keys.c
--- a/tools/testing/selftests/vm/protection_keys.c~selftests_vm_pkeys_ret-code	2021-06-11 09:41:32.448468063 -0700
+++ b/tools/testing/selftests/vm/protection_keys.c	2021-06-11 09:41:32.453468063 -0700
@@ -510,7 +510,7 @@ int alloc_pkey(void)
 			" shadow: 0x%016llx\n",
 			__func__, __LINE__, ret, __read_pkey_reg(),
 			shadow_pkey_reg);
-	if (ret) {
+	if (ret > 0) {
 		/* clear both the bits: */
 		shadow_pkey_reg = set_pkey_bits(shadow_pkey_reg, ret,
 						~PKEY_MASK);
_

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 2/4] selftests/vm/pkeys: Handle negative sys_pkey_alloc() return code
@ 2021-06-11 16:41   ` Dave Hansen
  0 siblings, 0 replies; 11+ messages in thread
From: Dave Hansen @ 2021-06-11 16:41 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, Dave Hansen, tglx, linuxram, sandipan, akpm,
	fweimer, desnesn, mingo, bauerman, aneesh.kumar, mpe, mhocko,
	msuchanek, shuah, x86


From: Dave Hansen <dave.hansen@linux.intel.com>

The alloc_pkey() sefltest function wraps the sys_pkey_alloc() system
call.  On success, it updates its "shadow" register value because
sys_pkey_alloc() updates the real register.

But, the success check is wrong.  pkey_alloc() considers any
non-zero return code to indicate success where the pkey register
will be modified.  This fails to take negative return codes into
account.

Consider only a positive return value as a successful call.

Fixes: 5f23f6d082a9 ("x86/pkeys: Add self-tests")
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: "Desnes A. Nunes do Rosario" <desnesn@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Suchanek <msuchanek@suse.de>
Cc: Shuah Khan <shuah@kernel.org>
Cc: x86@kernel.org
---

 b/tools/testing/selftests/vm/protection_keys.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN tools/testing/selftests/vm/protection_keys.c~selftests_vm_pkeys_ret-code tools/testing/selftests/vm/protection_keys.c
--- a/tools/testing/selftests/vm/protection_keys.c~selftests_vm_pkeys_ret-code	2021-06-11 09:41:32.448468063 -0700
+++ b/tools/testing/selftests/vm/protection_keys.c	2021-06-11 09:41:32.453468063 -0700
@@ -510,7 +510,7 @@ int alloc_pkey(void)
 			" shadow: 0x%016llx\n",
 			__func__, __LINE__, ret, __read_pkey_reg(),
 			shadow_pkey_reg);
-	if (ret) {
+	if (ret > 0) {
 		/* clear both the bits: */
 		shadow_pkey_reg = set_pkey_bits(shadow_pkey_reg, ret,
 						~PKEY_MASK);
_


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 3/4] selftests/vm/pkeys: Refill shadow register after implicit kernel write
  2021-06-11 16:41 ` Dave Hansen
@ 2021-06-11 16:42   ` Dave Hansen
  -1 siblings, 0 replies; 11+ messages in thread
From: Dave Hansen @ 2021-06-11 16:42 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, Dave Hansen, tglx, linuxram, sandipan, akpm,
	fweimer, desnesn, mingo, bauerman, aneesh.kumar, mpe, mhocko,
	msuchanek, shuah, x86


From: Dave Hansen <dave.hansen@linux.intel.com>

The pkey test code keeps a "shadow" of the pkey register around.  This
ensures that any bugs which might write to the register can be caught
more quickly.

Generally, userspace has a good idea when the kernel is going to write
to the register.  For instance, alloc_pkey() is passed a permission
mask.  The caller of alloc_pkey() can update the shadow based on the
return value and the mask.

But, the kernel can also modify the pkey register in a more sneaky
way.  For mprotect(PROT_EXEC) mappings, the kernel will allocate a
pkey and write the pkey register to create an execute-only mapping.
The kernel never tells userspace what key it uses for this.

This can cause the test to fail with messages like:

	protection_keys_64.2: pkey-helpers.h:132: _read_pkey_reg: Assertion `pkey_reg == shadow_pkey_reg' failed.

because the shadow was not updated with the new kernel-set value.

Forcibly update the shadow value immediately after an mprotect().

Fixes: 6af17cf89e99 ("x86/pkeys/selftests: Add PROT_EXEC test")
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: "Desnes A. Nunes do Rosario" <desnesn@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Suchanek <msuchanek@suse.de>
Cc: Shuah Khan <shuah@kernel.org>
Cc: x86@kernel.org
---

 b/tools/testing/selftests/vm/protection_keys.c |    7 +++++++
 1 file changed, 7 insertions(+)

diff -puN tools/testing/selftests/vm/protection_keys.c~selftests_vm_pkeys_Refill_shadow_register_after_implict_kernel_write-1 tools/testing/selftests/vm/protection_keys.c
--- a/tools/testing/selftests/vm/protection_keys.c~selftests_vm_pkeys_Refill_shadow_register_after_implict_kernel_write-1	2021-06-11 09:41:33.508468061 -0700
+++ b/tools/testing/selftests/vm/protection_keys.c	2021-06-11 09:41:33.517468061 -0700
@@ -1448,6 +1448,13 @@ void test_implicit_mprotect_exec_only_me
 	ret = mprotect(p1, PAGE_SIZE, PROT_EXEC);
 	pkey_assert(!ret);
 
+	/*
+	 * Reset the shadow, assuming that the above mprotect()
+	 * correctly changed PKRU, but to an unknown value since
+	 * the actual alllocated pkey is unknown.
+	 */
+	shadow_pkey_reg = __read_pkey_reg();
+
 	dprintf2("pkey_reg: %016llx\n", read_pkey_reg());
 
 	/* Make sure this is an *instruction* fault */
_

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 3/4] selftests/vm/pkeys: Refill shadow register after implicit kernel write
@ 2021-06-11 16:42   ` Dave Hansen
  0 siblings, 0 replies; 11+ messages in thread
From: Dave Hansen @ 2021-06-11 16:42 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, Dave Hansen, tglx, linuxram, sandipan, akpm,
	fweimer, desnesn, mingo, bauerman, aneesh.kumar, mpe, mhocko,
	msuchanek, shuah, x86


From: Dave Hansen <dave.hansen@linux.intel.com>

The pkey test code keeps a "shadow" of the pkey register around.  This
ensures that any bugs which might write to the register can be caught
more quickly.

Generally, userspace has a good idea when the kernel is going to write
to the register.  For instance, alloc_pkey() is passed a permission
mask.  The caller of alloc_pkey() can update the shadow based on the
return value and the mask.

But, the kernel can also modify the pkey register in a more sneaky
way.  For mprotect(PROT_EXEC) mappings, the kernel will allocate a
pkey and write the pkey register to create an execute-only mapping.
The kernel never tells userspace what key it uses for this.

This can cause the test to fail with messages like:

	protection_keys_64.2: pkey-helpers.h:132: _read_pkey_reg: Assertion `pkey_reg == shadow_pkey_reg' failed.

because the shadow was not updated with the new kernel-set value.

Forcibly update the shadow value immediately after an mprotect().

Fixes: 6af17cf89e99 ("x86/pkeys/selftests: Add PROT_EXEC test")
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: "Desnes A. Nunes do Rosario" <desnesn@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Suchanek <msuchanek@suse.de>
Cc: Shuah Khan <shuah@kernel.org>
Cc: x86@kernel.org
---

 b/tools/testing/selftests/vm/protection_keys.c |    7 +++++++
 1 file changed, 7 insertions(+)

diff -puN tools/testing/selftests/vm/protection_keys.c~selftests_vm_pkeys_Refill_shadow_register_after_implict_kernel_write-1 tools/testing/selftests/vm/protection_keys.c
--- a/tools/testing/selftests/vm/protection_keys.c~selftests_vm_pkeys_Refill_shadow_register_after_implict_kernel_write-1	2021-06-11 09:41:33.508468061 -0700
+++ b/tools/testing/selftests/vm/protection_keys.c	2021-06-11 09:41:33.517468061 -0700
@@ -1448,6 +1448,13 @@ void test_implicit_mprotect_exec_only_me
 	ret = mprotect(p1, PAGE_SIZE, PROT_EXEC);
 	pkey_assert(!ret);
 
+	/*
+	 * Reset the shadow, assuming that the above mprotect()
+	 * correctly changed PKRU, but to an unknown value since
+	 * the actual alllocated pkey is unknown.
+	 */
+	shadow_pkey_reg = __read_pkey_reg();
+
 	dprintf2("pkey_reg: %016llx\n", read_pkey_reg());
 
 	/* Make sure this is an *instruction* fault */
_


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 4/4] selftests/vm/pkeys: Exercise x86 XSAVE init state
  2021-06-11 16:41 ` Dave Hansen
@ 2021-06-11 16:42   ` Dave Hansen
  -1 siblings, 0 replies; 11+ messages in thread
From: Dave Hansen @ 2021-06-11 16:42 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, Dave Hansen, tglx, linuxram, sandipan, akpm,
	fweimer, desnesn, mingo, bauerman, aneesh.kumar, mpe, mhocko,
	msuchanek, shuah, x86


From: Dave Hansen <dave.hansen@linux.intel.com>

On x86, there is a set of instructions used to save and restore register
state collectively known as the XSAVE architecture.  There are about a
dozen different features managed with XSAVE.  The protection keys
register, PKRU, is one of those features.

The hardware optimizes XSAVE by tracking when the state has not changed
from its initial (init) state.  In this case, it can avoid the cost of
writing state to memory (it would usually just be a bunch of 0's).

When the pkey register is 0x0 the hardware optionally choose to track
the register as being in the init state (optimize away the writes).
AMD CPUs do this more aggressively compared to Intel.

On x86, PKRU is rarely in its (very permissive) init state.  Instead,
the value defaults to something very restrictive.  It is not surprising
that bugs have popped up in the rare cases when PKRU reaches its init
state.

Add a protection key selftest which gets the protection keys register
into its init state in a way that should work on Intel and AMD.  Then,
do a bunch of pkey register reads to watch for inadvertent changes.

This adds "-mxsave" to CFLAGS for all the x86 vm selftests in order
to allow use of the XSAVE instruction __builtin functions.  This will
make the builtins available on all of the vm selftests, but is
expected to be harmless.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: "Desnes A. Nunes do Rosario" <desnesn@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Suchanek <msuchanek@suse.de>
Cc: Shuah Khan <shuah@kernel.org>
Cc: x86@kernel.org
---

 b/tools/testing/selftests/vm/Makefile          |    4 -
 b/tools/testing/selftests/vm/pkey-x86.h        |    1 
 b/tools/testing/selftests/vm/protection_keys.c |   73 +++++++++++++++++++++++++
 3 files changed, 76 insertions(+), 2 deletions(-)

diff -puN tools/testing/selftests/vm/Makefile~selftests_vm_pkeys_Exercise_x86_XSAVE_init_state-1 tools/testing/selftests/vm/Makefile
--- a/tools/testing/selftests/vm/Makefile~selftests_vm_pkeys_Exercise_x86_XSAVE_init_state-1	2021-06-11 09:41:34.574468058 -0700
+++ b/tools/testing/selftests/vm/Makefile	2021-06-11 09:41:34.588468058 -0700
@@ -100,7 +100,7 @@ $(1) $(1)_64: $(OUTPUT)/$(1)_64
 endef
 
 ifeq ($(CAN_BUILD_I386),1)
-$(BINARIES_32): CFLAGS += -m32
+$(BINARIES_32): CFLAGS += -m32 -mxsave
 $(BINARIES_32): LDLIBS += -lrt -ldl -lm
 $(BINARIES_32): $(OUTPUT)/%_32: %.c
 	$(CC) $(CFLAGS) $(EXTRA_CFLAGS) $(notdir $^) $(LDLIBS) -o $@
@@ -108,7 +108,7 @@ $(foreach t,$(TARGETS),$(eval $(call gen
 endif
 
 ifeq ($(CAN_BUILD_X86_64),1)
-$(BINARIES_64): CFLAGS += -m64
+$(BINARIES_64): CFLAGS += -m64 -mxsave
 $(BINARIES_64): LDLIBS += -lrt -ldl
 $(BINARIES_64): $(OUTPUT)/%_64: %.c
 	$(CC) $(CFLAGS) $(EXTRA_CFLAGS) $(notdir $^) $(LDLIBS) -o $@
diff -puN tools/testing/selftests/vm/pkey-x86.h~selftests_vm_pkeys_Exercise_x86_XSAVE_init_state-1 tools/testing/selftests/vm/pkey-x86.h
--- a/tools/testing/selftests/vm/pkey-x86.h~selftests_vm_pkeys_Exercise_x86_XSAVE_init_state-1	2021-06-11 09:41:34.576468058 -0700
+++ b/tools/testing/selftests/vm/pkey-x86.h	2021-06-11 09:41:34.590468058 -0700
@@ -126,6 +126,7 @@ static inline u32 pkey_bit_position(int
 
 #define XSTATE_PKEY_BIT	(9)
 #define XSTATE_PKEY	0x200
+#define XSTATE_BV_OFFSET	512
 
 int pkey_reg_xstate_offset(void)
 {
diff -puN tools/testing/selftests/vm/protection_keys.c~selftests_vm_pkeys_Exercise_x86_XSAVE_init_state-1 tools/testing/selftests/vm/protection_keys.c
--- a/tools/testing/selftests/vm/protection_keys.c~selftests_vm_pkeys_Exercise_x86_XSAVE_init_state-1	2021-06-11 09:41:34.578468058 -0700
+++ b/tools/testing/selftests/vm/protection_keys.c	2021-06-11 09:41:34.593468058 -0700
@@ -1277,6 +1277,78 @@ void test_pkey_alloc_exhaust(int *ptr, u
 	}
 }
 
+void arch_force_pkey_reg_init(void)
+{
+#if defined(__i386__) || defined(__x86_64__) /* arch */
+	u64 *buf;
+
+	/*
+	 * All keys should be allocated and set to allow reads and
+	 * writes, so the register should be all 0.  If not, just
+	 * skip the test.
+	 */
+	if (read_pkey_reg())
+		return;
+
+	/*
+	 * Just allocate an absurd about of memory rather than
+	 * doing the XSAVE size enumeration dance.
+	 */
+	buf = mmap(NULL, 1*MB, PROT_READ|PROT_WRITE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
+
+	/* These __builtins require compiling with -mxsave */
+
+	/* XSAVE to build a valid buffer: */
+	__builtin_ia32_xsave(buf, XSTATE_PKEY);
+	/* Clear XSTATE_BV[PKRU]: */
+	buf[XSTATE_BV_OFFSET/sizeof(u64)] &= ~XSTATE_PKEY;
+	/* XRSTOR will likely get PKRU back to the init state: */
+	__builtin_ia32_xrstor(buf, XSTATE_PKEY);
+
+	munmap(buf, 1*MB);
+#endif
+}
+
+
+/*
+ * This is mostly useless on ppc for now.  But it will not
+ * hurt anything and should give some better coverage as
+ * a long-running test that continually checks the pkey
+ * register.
+ */
+void test_pkey_init_state(int *ptr, u16 pkey)
+{
+	int err;
+	int allocated_pkeys[NR_PKEYS] = {0};
+	int nr_allocated_pkeys = 0;
+	int i;
+
+	for (i = 0; i < NR_PKEYS; i++) {
+		int new_pkey = alloc_pkey();
+
+		if (new_pkey < 0)
+			continue;
+		allocated_pkeys[nr_allocated_pkeys++] = new_pkey;
+	}
+
+	dprintf3("%s()::%d\n", __func__, __LINE__);
+
+	arch_force_pkey_reg_init();
+
+	/*
+	 * Loop for a bit, hoping to get exercise the kernel
+	 * context switch code.
+	 */
+	for (i = 0; i < 1000000; i++)
+		read_pkey_reg();
+
+	for (i = 0; i < nr_allocated_pkeys; i++) {
+		err = sys_pkey_free(allocated_pkeys[i]);
+		pkey_assert(!err);
+		read_pkey_reg(); /* for shadow checking */
+	}
+}
+
 /*
  * pkey 0 is special.  It is allocated by default, so you do not
  * have to call pkey_alloc() to use it first.  Make sure that it
@@ -1508,6 +1580,7 @@ void (*pkey_tests[])(int *ptr, u16 pkey)
 	test_implicit_mprotect_exec_only_memory,
 	test_mprotect_with_pkey_0,
 	test_ptrace_of_child,
+	test_pkey_init_state,
 	test_pkey_syscalls_on_non_allocated_pkey,
 	test_pkey_syscalls_bad_args,
 	test_pkey_alloc_exhaust,
_

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 4/4] selftests/vm/pkeys: Exercise x86 XSAVE init state
@ 2021-06-11 16:42   ` Dave Hansen
  0 siblings, 0 replies; 11+ messages in thread
From: Dave Hansen @ 2021-06-11 16:42 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, Dave Hansen, tglx, linuxram, sandipan, akpm,
	fweimer, desnesn, mingo, bauerman, aneesh.kumar, mpe, mhocko,
	msuchanek, shuah, x86


From: Dave Hansen <dave.hansen@linux.intel.com>

On x86, there is a set of instructions used to save and restore register
state collectively known as the XSAVE architecture.  There are about a
dozen different features managed with XSAVE.  The protection keys
register, PKRU, is one of those features.

The hardware optimizes XSAVE by tracking when the state has not changed
from its initial (init) state.  In this case, it can avoid the cost of
writing state to memory (it would usually just be a bunch of 0's).

When the pkey register is 0x0 the hardware optionally choose to track
the register as being in the init state (optimize away the writes).
AMD CPUs do this more aggressively compared to Intel.

On x86, PKRU is rarely in its (very permissive) init state.  Instead,
the value defaults to something very restrictive.  It is not surprising
that bugs have popped up in the rare cases when PKRU reaches its init
state.

Add a protection key selftest which gets the protection keys register
into its init state in a way that should work on Intel and AMD.  Then,
do a bunch of pkey register reads to watch for inadvertent changes.

This adds "-mxsave" to CFLAGS for all the x86 vm selftests in order
to allow use of the XSAVE instruction __builtin functions.  This will
make the builtins available on all of the vm selftests, but is
expected to be harmless.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: "Desnes A. Nunes do Rosario" <desnesn@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Suchanek <msuchanek@suse.de>
Cc: Shuah Khan <shuah@kernel.org>
Cc: x86@kernel.org
---

 b/tools/testing/selftests/vm/Makefile          |    4 -
 b/tools/testing/selftests/vm/pkey-x86.h        |    1 
 b/tools/testing/selftests/vm/protection_keys.c |   73 +++++++++++++++++++++++++
 3 files changed, 76 insertions(+), 2 deletions(-)

diff -puN tools/testing/selftests/vm/Makefile~selftests_vm_pkeys_Exercise_x86_XSAVE_init_state-1 tools/testing/selftests/vm/Makefile
--- a/tools/testing/selftests/vm/Makefile~selftests_vm_pkeys_Exercise_x86_XSAVE_init_state-1	2021-06-11 09:41:34.574468058 -0700
+++ b/tools/testing/selftests/vm/Makefile	2021-06-11 09:41:34.588468058 -0700
@@ -100,7 +100,7 @@ $(1) $(1)_64: $(OUTPUT)/$(1)_64
 endef
 
 ifeq ($(CAN_BUILD_I386),1)
-$(BINARIES_32): CFLAGS += -m32
+$(BINARIES_32): CFLAGS += -m32 -mxsave
 $(BINARIES_32): LDLIBS += -lrt -ldl -lm
 $(BINARIES_32): $(OUTPUT)/%_32: %.c
 	$(CC) $(CFLAGS) $(EXTRA_CFLAGS) $(notdir $^) $(LDLIBS) -o $@
@@ -108,7 +108,7 @@ $(foreach t,$(TARGETS),$(eval $(call gen
 endif
 
 ifeq ($(CAN_BUILD_X86_64),1)
-$(BINARIES_64): CFLAGS += -m64
+$(BINARIES_64): CFLAGS += -m64 -mxsave
 $(BINARIES_64): LDLIBS += -lrt -ldl
 $(BINARIES_64): $(OUTPUT)/%_64: %.c
 	$(CC) $(CFLAGS) $(EXTRA_CFLAGS) $(notdir $^) $(LDLIBS) -o $@
diff -puN tools/testing/selftests/vm/pkey-x86.h~selftests_vm_pkeys_Exercise_x86_XSAVE_init_state-1 tools/testing/selftests/vm/pkey-x86.h
--- a/tools/testing/selftests/vm/pkey-x86.h~selftests_vm_pkeys_Exercise_x86_XSAVE_init_state-1	2021-06-11 09:41:34.576468058 -0700
+++ b/tools/testing/selftests/vm/pkey-x86.h	2021-06-11 09:41:34.590468058 -0700
@@ -126,6 +126,7 @@ static inline u32 pkey_bit_position(int
 
 #define XSTATE_PKEY_BIT	(9)
 #define XSTATE_PKEY	0x200
+#define XSTATE_BV_OFFSET	512
 
 int pkey_reg_xstate_offset(void)
 {
diff -puN tools/testing/selftests/vm/protection_keys.c~selftests_vm_pkeys_Exercise_x86_XSAVE_init_state-1 tools/testing/selftests/vm/protection_keys.c
--- a/tools/testing/selftests/vm/protection_keys.c~selftests_vm_pkeys_Exercise_x86_XSAVE_init_state-1	2021-06-11 09:41:34.578468058 -0700
+++ b/tools/testing/selftests/vm/protection_keys.c	2021-06-11 09:41:34.593468058 -0700
@@ -1277,6 +1277,78 @@ void test_pkey_alloc_exhaust(int *ptr, u
 	}
 }
 
+void arch_force_pkey_reg_init(void)
+{
+#if defined(__i386__) || defined(__x86_64__) /* arch */
+	u64 *buf;
+
+	/*
+	 * All keys should be allocated and set to allow reads and
+	 * writes, so the register should be all 0.  If not, just
+	 * skip the test.
+	 */
+	if (read_pkey_reg())
+		return;
+
+	/*
+	 * Just allocate an absurd about of memory rather than
+	 * doing the XSAVE size enumeration dance.
+	 */
+	buf = mmap(NULL, 1*MB, PROT_READ|PROT_WRITE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
+
+	/* These __builtins require compiling with -mxsave */
+
+	/* XSAVE to build a valid buffer: */
+	__builtin_ia32_xsave(buf, XSTATE_PKEY);
+	/* Clear XSTATE_BV[PKRU]: */
+	buf[XSTATE_BV_OFFSET/sizeof(u64)] &= ~XSTATE_PKEY;
+	/* XRSTOR will likely get PKRU back to the init state: */
+	__builtin_ia32_xrstor(buf, XSTATE_PKEY);
+
+	munmap(buf, 1*MB);
+#endif
+}
+
+
+/*
+ * This is mostly useless on ppc for now.  But it will not
+ * hurt anything and should give some better coverage as
+ * a long-running test that continually checks the pkey
+ * register.
+ */
+void test_pkey_init_state(int *ptr, u16 pkey)
+{
+	int err;
+	int allocated_pkeys[NR_PKEYS] = {0};
+	int nr_allocated_pkeys = 0;
+	int i;
+
+	for (i = 0; i < NR_PKEYS; i++) {
+		int new_pkey = alloc_pkey();
+
+		if (new_pkey < 0)
+			continue;
+		allocated_pkeys[nr_allocated_pkeys++] = new_pkey;
+	}
+
+	dprintf3("%s()::%d\n", __func__, __LINE__);
+
+	arch_force_pkey_reg_init();
+
+	/*
+	 * Loop for a bit, hoping to get exercise the kernel
+	 * context switch code.
+	 */
+	for (i = 0; i < 1000000; i++)
+		read_pkey_reg();
+
+	for (i = 0; i < nr_allocated_pkeys; i++) {
+		err = sys_pkey_free(allocated_pkeys[i]);
+		pkey_assert(!err);
+		read_pkey_reg(); /* for shadow checking */
+	}
+}
+
 /*
  * pkey 0 is special.  It is allocated by default, so you do not
  * have to call pkey_alloc() to use it first.  Make sure that it
@@ -1508,6 +1580,7 @@ void (*pkey_tests[])(int *ptr, u16 pkey)
 	test_implicit_mprotect_exec_only_memory,
 	test_mprotect_with_pkey_0,
 	test_ptrace_of_child,
+	test_pkey_init_state,
 	test_pkey_syscalls_on_non_allocated_pkey,
 	test_pkey_syscalls_bad_args,
 	test_pkey_alloc_exhaust,
_


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/4] selftests/vm/pkeys: Bug fixes and a new test
  2021-06-11 16:41 ` Dave Hansen
                   ` (4 preceding siblings ...)
  (?)
@ 2021-06-13  8:54 ` Aneesh Kumar K.V
  -1 siblings, 0 replies; 11+ messages in thread
From: Aneesh Kumar K.V @ 2021-06-13  8:54 UTC (permalink / raw)
  To: Dave Hansen, linux-mm
  Cc: linux-kernel, Dave Hansen, tglx, linuxram, sandipan, akpm,
	fweimer, desnesn, mingo, bauerman, mpe, mhocko, msuchanek, shuah,
	x86

Dave Hansen <dave.hansen@linux.intel.com> writes:

> There has been a lot of activity on the x86 front around the XSAVE
> architecture which is used to context-switch processor state (among
> other things).  In addition, AMD has recently joined the protection
> keys club by adding processor support for PKU.
>
> The AMD implementation helped uncover a kernel bug around the PKRU
> "init state", which actually applied to Intel's implementation but
> was just harder to hit.  This series adds a test which is expected
> to help find this class of bug both on AMD and Intel.  All the work
> around pkeys on x86 also uncovered a few bugs in the selftest.
>
> Any testing of this new code (especially from my powerpc friends)
> would be appreciated.

I tested this on ppc64.

Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>

>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ram Pai <linuxram@us.ibm.com>
> Cc: Sandipan Das <sandipan@linux.ibm.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Florian Weimer <fweimer@redhat.com>
> Cc: "Desnes A. Nunes do Rosario" <desnesn@linux.vnet.ibm.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Michal Suchanek <msuchanek@suse.de>
> Cc: Shuah Khan <shuah@kernel.org>
> Cc: x86@kernel.org

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-06-13  9:02 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-11 16:41 [PATCH 0/4] selftests/vm/pkeys: Bug fixes and a new test Dave Hansen
2021-06-11 16:41 ` Dave Hansen
2021-06-11 16:41 ` [PATCH 1/4] selftests/vm/pkeys: Fix alloc_random_pkey() to make it really, really random Dave Hansen
2021-06-11 16:41   ` Dave Hansen
2021-06-11 16:41 ` [PATCH 2/4] selftests/vm/pkeys: Handle negative sys_pkey_alloc() return code Dave Hansen
2021-06-11 16:41   ` Dave Hansen
2021-06-11 16:42 ` [PATCH 3/4] selftests/vm/pkeys: Refill shadow register after implicit kernel write Dave Hansen
2021-06-11 16:42   ` Dave Hansen
2021-06-11 16:42 ` [PATCH 4/4] selftests/vm/pkeys: Exercise x86 XSAVE init state Dave Hansen
2021-06-11 16:42   ` Dave Hansen
2021-06-13  8:54 ` [PATCH 0/4] selftests/vm/pkeys: Bug fixes and a new test Aneesh Kumar K.V

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.